Science.gov

Sample records for personal genomics bioinformatics

  1. Surveying Recent Themes in Translational Bioinformatics: Big Data in EHRs, Omics for Drugs, and Personal Genomics

    PubMed Central

    2014-01-01

    Summary. Objective: To provide a survey of recent progress in the use of large-scale biologic data to impact clinical care, and the impact the reuse of electronic health record data has made in genomic discovery. Method: Survey of key themes in translational bioinformatics, primarily from 2012 and 2013. Result: This survey focuses on four major themes: the growing use of Electronic Health Records (EHRs) as a source for genomic discovery, adoption of genomics and pharmacogenomics in clinical practice, the possible use of genomic technologies for drug repurposing, and the use of personal genomics to guide care. Conclusion: Reuse of abundant clinical data for research is speeding discovery, and implementation of genomic data into clinical medicine is impacting care with new classes of data rarely used previously in medicine. PMID:25123743

  2. Bioinformatics Workflow for Clinical Whole Genome Sequencing at Partners HealthCare Personalized Medicine

    PubMed Central

    Tsai, Ellen A.; Shakbatyan, Rimma; Evans, Jason; Rossetti, Peter; Graham, Chet; Sharma, Himanshu; Lin, Chiao-Feng; Lebo, Matthew S.

    2016-01-01

    Effective implementation of precision medicine will be enhanced by a thorough understanding of each patient’s genetic composition to better treat his or her presenting symptoms or mitigate the onset of disease. This ideally includes the sequence information of a complete genome for each individual. At Partners HealthCare Personalized Medicine, we have developed a clinical process for whole genome sequencing (WGS) with application in both healthy individuals and those with disease. In this manuscript, we will describe our bioinformatics strategy to efficiently process and deliver genomic data to geneticists for clinical interpretation. We describe the handling of data from FASTQ to the final variant list for clinical review for the final report. We will also discuss our methodology for validating this workflow and the cost implications of running WGS. PMID:26927186
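    As an illustration of the first step in a FASTQ-to-variant-list workflow like the one described above, the following is a minimal sketch of reading FASTQ records and filtering reads by mean base quality. It is not the Partners HealthCare pipeline: the function names, the Phred+33 quality encoding, the four-lines-per-record layout, and the quality cutoff of 20 are all assumptions chosen for the example.

```python
import io
from typing import Iterator, List, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input (a blank line is also treated as the end here)
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield header[1:], sequence, quality


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (Phred+33 encoding is assumed)."""
    if not quality:
        return 0.0
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def filter_reads(handle: TextIO, min_mean_quality: float = 20.0) -> List[str]:
    """Return the IDs of reads whose mean quality meets the (hypothetical) cutoff."""
    return [
        read_id
        for read_id, _sequence, quality in read_fastq(handle)
        if mean_phred(quality) >= min_mean_quality
    ]


# Two made-up reads: 'I' encodes Phred 40 and '!' encodes Phred 0 under Phred+33.
EXAMPLE_FASTQ = "@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\n!!!!\n"
kept = filter_reads(io.StringIO(EXAMPLE_FASTQ))
print(kept)
```

    A production workflow would go on to alignment, variant calling and annotation with dedicated tools; this sketch covers only the read-level quality check.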

  3. Getting personalized cancer genome analysis into the clinic: the challenges in bioinformatics

    PubMed Central

    2012-01-01

    Progress in genomics has raised expectations in many fields, and particularly in personalized cancer research. The new technologies available make it possible to combine information about potential disease markers, altered function and accessible drug targets, which, coupled with pathological and medical information, will help produce more appropriate clinical decisions. The accessibility of such experimental techniques makes it all the more necessary to improve and adapt computational strategies to the new challenges. This review focuses on the critical issues associated with the standard pipeline, which includes: DNA sequencing analysis; analysis of mutations in coding regions; the study of genome rearrangements; extrapolating information on mutations to the functional and signaling level; and predicting the effects of therapies using mouse tumor models. We describe the possibilities, limitations and future challenges of current bioinformatics strategies for each of these issues. Furthermore, we emphasize the need for the collaboration between the bioinformaticians who implement the software and use the data resources, the computational biologists who develop the analytical methods, and the clinicians, the systems' end users and those ultimately responsible for taking medical decisions. Finally, the different steps in cancer genome analysis are illustrated through examples of applications in cancer genome analysis. PMID:22839973

  4. Bioinformatics and genomic medicine.

    PubMed

    Kim, Ju Han

    2002-01-01

    Bioinformatics is a rapidly emerging field of biomedical research. A flood of large-scale genomic and postgenomic data means that many of the challenges in biomedical research are now challenges in computational science. Clinical informatics has long developed methodologies to improve biomedical research and clinical care by integrating experimental and clinical information systems. The informatics revolution in both bioinformatics and clinical informatics will eventually change the current practice of medicine, including diagnostics, therapeutics, and prognostics. Postgenome informatics, powered by high-throughput technologies and genomic-scale databases, is likely to transform our biomedical understanding forever, in much the same way that biochemistry did a generation ago. This paper describes how these technologies will impact biomedical research and clinical care, emphasizing recent advances in biochip-based functional genomics and proteomics. Basic data preprocessing with normalization and filtering, primary pattern analysis, and machine-learning algorithms are discussed. Use of integrative biochip informatics technologies, including multivariate data projection, gene-metabolic pathway mapping, automated biomolecular annotation, text mining of factual and literature databases, and the integrated management of biomolecular databases, are also discussed. PMID:12544491
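    The "normalization and filtering" preprocessing mentioned above can be sketched as follows. This is one simple variant, assumed for illustration and not taken from the paper: each array is log2-transformed and median-centered, then genes with low variance across arrays are dropped. The sample names, gene names, intensities, and the variance cutoff are hypothetical.

```python
import math
import statistics
from typing import Dict, List


def log2_median_center(intensities: List[float]) -> List[float]:
    """Log2-transform one array's intensities and subtract the array median."""
    logged = [math.log2(value) for value in intensities]
    median = statistics.median(logged)
    return [value - median for value in logged]


def by_gene(samples: Dict[str, List[float]], gene_names: List[str]) -> Dict[str, List[float]]:
    """Regroup per-sample value lists into per-gene value lists."""
    return {
        gene: [values[index] for values in samples.values()]
        for index, gene in enumerate(gene_names)
    }


def filter_by_variance(genes: Dict[str, List[float]], min_variance: float) -> Dict[str, List[float]]:
    """Keep genes whose population variance across samples reaches the cutoff."""
    return {
        gene: values
        for gene, values in genes.items()
        if statistics.pvariance(values) >= min_variance
    }


# Hypothetical raw intensities for three genes measured on two arrays.
gene_names = ["g1", "g2", "g3"]
raw = {"s1": [8.0, 16.0, 32.0], "s2": [16.0, 32.0, 256.0]}

normalized = {name: log2_median_center(values) for name, values in raw.items()}
variable_genes = filter_by_variance(by_gene(normalized, gene_names), min_variance=0.5)
print(variable_genes)
```

    Median centering removes array-wide intensity shifts, so only g3, whose relative level differs between the two arrays, survives the variance filter in this toy data set.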

  5. Genomics, molecular imaging, bioinformatics, and bio-nano-info integration are synergistic components of translational medicine and personalized healthcare research

    PubMed Central

    2008-01-01

    Supported by National Science Foundation (NSF), International Society of Intelligent Biological Medicine (ISIBM), International Journal of Computational Biology and Drug Design and International Journal of Functional Informatics and Personalized Medicine, IEEE 7th Bioinformatics and Bioengineering attracted more than 600 papers and 500 researchers and medical doctors. It was the only synergistic inter/multidisciplinary IEEE conference with 24 Keynote Lectures, 7 Tutorials, 5 Cutting-Edge Research Workshops and 32 Scientific Sessions including 11 Special Research Interest Sessions that were designed dynamically at Harvard in response to the current research trends and advances. The committee was very grateful for the IEEE Plenary Keynote Lectures given by: Dr. A. Keith Dunker (Indiana), Dr. Jun Liu (Harvard), Dr. Brian Athey (Michigan), Dr. Mark Borodovsky (Georgia Tech and President of ISIBM), Dr. Hamid Arabnia (Georgia and Vice-President of ISIBM), Dr. Ruzena Bajcsy (Berkeley and Member of United States National Academy of Engineering and Member of United States Institute of Medicine of the National Academies), Dr. Mary Yang (United States National Institutes of Health and Oak Ridge, DOE), Dr. Chih-Ming Ho (UCLA and Member of United States National Academy of Engineering and Academician of Academia Sinica), Dr. Andy Baxevanis (United States National Institutes of Health), Dr. Arif Ghafoor (Purdue), Dr. John Quackenbush (Harvard), Dr. Eric Jakobsson (UIUC), Dr. Vladimir Uversky (Indiana), Dr. Laura Elnitski (United States National Institutes of Health) and other world-class scientific leaders. The Harvard meeting was a large academic event 100% full-sponsored by IEEE financially and academically. After a rigorous peer-review process, the committee selected 27 high-quality research papers from 600 submissions. The committee is grateful for contributions from keynote speakers Dr. Russ Altman (IEEE BIBM conference keynote lecturer on combining simulation and machine

  6. Bioinformatics Approach in Plant Genomic Research.

    PubMed

    Ong, Quang; Nguyen, Phuc; Thao, Nguyen Phuong; Le, Ly

    2016-08-01

    Advances in genomics technology have led to dramatic changes in plant biology research. Plant biologists now have easy access to enormous amounts of genomic data for studying high-density genetic variation in plants at the molecular level. A thorough understanding and skilled use of bioinformatics tools to manage and analyze these data are therefore essential in current plant genome research. Many plant genome databases have been established and continue to expand. Meanwhile, bioinformatics-based analytical methods are well developed in many areas of plant genomic research, including comparative genomic analysis, phylogenomics and evolutionary analysis, and genome-wide association studies. However, the constant need to upgrade computational infrastructure, such as high-capacity data storage and high-performance analysis software, is a real challenge for plant genome research. This review focuses on the challenges and opportunities that knowledge and skills in bioinformatics can bring to plant scientists in the present plant genomics era, as well as on the critical future need for effective tools to translate knowledge from new sequencing data into enhanced plant productivity. PMID:27499685

  7. Personalized medicine: challenges and opportunities for translational bioinformatics

    PubMed Central

    Overby, Casey Lynnette; Tarczy-Hornoch, Peter

    2013-01-01

    Personalized medicine can be defined broadly as a model of healthcare that is predictive, personalized, preventive and participatory. Two US President’s Council of Advisors on Science and Technology reports illustrate challenges in personalized medicine (in a 2008 report) and in use of health information technology (in a 2010 report). Translational bioinformatics is a field that can help address these challenges and is defined by the American Medical Informatics Association as “the development of storage, analytic and interpretive methods to optimize the transformation of increasing voluminous biomedical data into proactive, predictive, preventative and participatory health.” This article discusses barriers to implementing genomics applications and current progress toward overcoming barriers, describes lessons learned from early experiences of institutions engaged in personalized medicine and provides example areas for translational bioinformatics research inquiry. PMID:24039624

  8. Bioinformatics tools for analysing viral genomic data.

    PubMed

    Orton, R J; Gu, Q; Hughes, J; Maabar, M; Modha, S; Vattipally, S B; Wilkie, G S; Davison, A J

    2016-04-01

    The field of viral genomics and bioinformatics is experiencing a strong resurgence due to high-throughput sequencing (HTS) technology, which enables the rapid and cost-effective sequencing and subsequent assembly of large numbers of viral genomes. In addition, the unprecedented power of HTS technologies has enabled the analysis of intra-host viral diversity and quasispecies dynamics in relation to important biological questions on viral transmission, vaccine resistance and host jumping. HTS also enables the rapid identification of both known and potentially new viruses from field and clinical samples, thus adding new tools to the fields of viral discovery and metagenomics. Bioinformatics has been central to the rise of HTS applications because new algorithms and software tools are continually needed to process and analyse the large, complex datasets generated in this rapidly evolving area. In this paper, the authors give a brief overview of the main bioinformatics tools available for viral genomic research, with a particular emphasis on HTS technologies and their main applications. They summarise the major steps in various HTS analyses, starting with quality control of raw reads and encompassing activities ranging from consensus and de novo genome assembly to variant calling and metagenomics, as well as RNA sequencing. PMID:27217183

  9. Translational bioinformatics applications in genome medicine

    PubMed Central

    2009-01-01

    Although investigators using methodologies in bioinformatics have always been useful in genomic experimentation in analytic, engineering, and infrastructure support roles, only recently have bioinformaticians been able to have a primary scientific role in asking and answering questions on human health and disease. Here, I argue that this shift in role towards asking questions in medicine is now the next step needed for the field of bioinformatics. I outline four reasons why bioinformaticians are newly enabled to drive the questions in primary medical discovery: public availability of data, intersection of data across experiments, commoditization of methods, and streamlined validation. I also list four recommendations for bioinformaticians wishing to get more involved in translational research. PMID:19566916

  10. Genomics and Bioinformatics Resources for Crop Improvement

    PubMed Central

    Mochida, Keiichi; Shinozaki, Kazuo

    2010-01-01

    Recent remarkable innovations in platforms for omics-based research and application development provide crucial resources to promote research in model and applied plant species. A combinatorial approach using multiple omics platforms and integration of their outcomes is now an effective strategy for clarifying molecular systems integral to improving plant productivity. Furthermore, promotion of comparative genomics among model and applied plants allows us to grasp the biological properties of each species and to accelerate gene discovery and functional analyses of genes. Bioinformatics platforms and their associated databases are also essential for the effective design of approaches making the best use of genomic resources, including resource integration. We review recent advances in research platforms and resources in plant omics together with related databases and advances in technology. PMID:20208064

  11. Bacterial bioinformatics: pathogenesis and the genome.

    PubMed

    Paine, Kelly; Flower, Darren R

    2002-07-01

    As the number of completed microbial genome sequences continues to grow, there is a pressing need for the exploitation of this wealth of data through a synergistic interaction between the well-established science of bacteriology and the emergent discipline of bioinformatics. Antibiotic resistance and pathogenicity in virulent bacteria have become an increasing problem, with even the strongest drugs useless against some species, such as multi-drug resistant Enterococcus faecium and Mycobacterium tuberculosis. The global spread of Human Immunodeficiency Virus (HIV) and Acquired Immune Deficiency Syndrome (AIDS) has contributed to the re-emergence of tuberculosis and the threat from new and emergent diseases. To address these problems, bacterial pathogenicity requires redefinition as Koch's postulates become obsolete. This review discusses how the use of bacterial genomic information, and the in silico tools available at present, may aid in determining the definition of a current pathogen. The combination of both fields should provide a rapid and efficient way of assisting in the future development of antimicrobial therapies. PMID:12125816

  12. Online Bioinformatics Tutorials | Office of Cancer Genomics

    Cancer.gov

    Bioinformatics is a scientific discipline that applies computer science and information technology to help understand biological processes. The NIH provides a list of free online bioinformatics tutorials, either generated by the NIH Library or other institutes, which includes introductory lectures and "how to" videos on using various tools.

  13. Skate Genome Project: Cyber-Enabled Bioinformatics Collaboration

    PubMed Central

    Vincent, J.

    2011-01-01

    The Skate Genome Project, a pilot project of the North East Cyberinfrastructure Consortium (NECC), aims to produce a draft genome sequence of Leucoraja erinacea, the Little Skate. The pilot project was also designed to develop expertise in large-scale collaborations across the NECC region. An overview of the bioinformatics and infrastructure challenges faced during the first year of the project will be presented. Results to date and lessons learned from the perspective of a bioinformatics core will be highlighted.

  14. Using bioinformatics for drug target identification from the genome.

    PubMed

    Jiang, Zhenran; Zhou, Yanhong

    2005-01-01

    Genomics and proteomics technologies have created a paradigm shift in the drug discovery process, with bioinformatics having a key role in the exploitation of genomic, transcriptomic, and proteomic data to gain insights into the molecular mechanisms that underlie disease and to identify potential drug targets. We discuss the current state of the art for some of the bioinformatic approaches to identifying drug targets, including identifying new members of successful target classes and their functions, predicting disease relevant genes, and constructing gene networks and protein interaction networks. In addition, we introduce drug target discovery using the strategy of systems biology, and discuss some of the data resources for the identification of drug targets. Although bioinformatics tools and resources can be used to identify putative drug targets, validating targets is still a process that requires an understanding of the role of the gene or protein in the disease process and is heavily dependent on laboratory-based work. PMID:16336003

  15. Design and bioinformatics analysis of genome-wide CLIP experiments

    PubMed Central

    Wang, Tao; Xiao, Guanghua; Chu, Yongjun; Zhang, Michael Q.; Corey, David R.; Xie, Yang

    2015-01-01

    The past decades have witnessed a surge of discoveries revealing RNA regulation as a central player in cellular processes. RNAs are regulated by RNA-binding proteins (RBPs) at all post-transcriptional stages, including splicing, transportation, stabilization and translation. Defects in the functions of these RBPs underlie a broad spectrum of human pathologies. Systematic identification of RBP functional targets is among the key biomedical research questions and provides a new direction for drug discovery. The advent of cross-linking immunoprecipitation coupled with high-throughput sequencing (genome-wide CLIP) technology has recently enabled the investigation of genome-wide RBP–RNA binding at single base-pair resolution. This technology has evolved through the development of three distinct versions: HITS-CLIP, PAR-CLIP and iCLIP. Meanwhile, numerous bioinformatics pipelines for handling the genome-wide CLIP data have also been developed. In this review, we discuss the genome-wide CLIP technology and focus on bioinformatics analysis. Specifically, we compare the strengths and weaknesses, as well as the scopes, of various bioinformatics tools. To assist readers in choosing optimal procedures for their analysis, we also review experimental design and procedures that affect bioinformatics analyses. PMID:25958398

  16. The bioinformatics of psychosocial genomics in alternative and complementary medicine.

    PubMed

    Rossi, E

    2003-06-01

    The bioinformatics of alternative and complementary medicine is outlined in 3 hypotheses that extend the molecular-genomic revolution initiated by Watson and Crick 50 years ago to include psychology in the new discipline of psychosocial and cultural genomics. Stress-induced changes in the alternative splicing of genes demonstrate how psychosomatic stress in humans modulates activity-dependent gene expression, protein formation, physiological function, and psychological experience. The molecular messengers generated by stress, injury, and disease can activate immediate early genes within stem cells so that they then signal the target genes required to synthesize the proteins that will transform (differentiate) stem cells into mature well-functioning tissues. Such activity-dependent gene expression and its consequent activity-dependent neurogenesis and stem cell healing is proposed as the molecular-genomic-cellular basis of rehabilitative medicine, physical, and occupational therapy as well as the many alternative and complementary approaches to mind-body healing. The therapeutic replaying of enriching life experiences that evoke the novelty-numinosum-neurogenesis effect during creative moments of art, music, dance, drama, humor, literature, poetry, and spirituality, as well as cultural rituals of life transitions (birth, puberty, marriage, illness, healing, and death) can optimize consciousness, personal relationships, and healing in a manner that has much in common with the psychogenomic foundations of naturalistic and complementary medicine. The entire history of alternative and complementary approaches to healing is consistent with this new neuroscience world view about the role of psychological arousal and fascination in modulating gene expression, neurogenesis, and healing via the psychosocial and cultural rites of human societies. PMID:12853721

  17. Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum

    SciTech Connect

    Ditty, Jayna L.; Kvaal, Christopher A.; Goodner, Brad; Freyermuth, Sharyn K.; Bailey, Cheryl; Britton, Robert A.; Gordon, Stuart G.; Heinhorst, Sabine; Reed, Kelynne; Xu, Zhaohui; Sanders-Lorenz, Erin R.; Axen, Seth; Kim, Edwin; Johns, Mitrick; Scott, Kathleen; Kerfeld, Cheryl A.

    2011-08-01

    Undergraduate life sciences education needs an overhaul, as clearly described in the National Research Council of the National Academies publication BIO 2010: Transforming Undergraduate Education for Future Research Biologists. Among BIO 2010's top recommendations is the need to involve students in working with real data and tools that reflect the nature of life sciences research in the 21st century. Education research studies support the importance of utilizing primary literature, designing and implementing experiments, and analyzing results in the context of a bona fide scientific question in cultivating the analytical skills necessary to become a scientist. Incorporating these basic scientific methodologies in undergraduate education leads to increased undergraduate and post-graduate retention in the sciences. Toward this end, many undergraduate teaching organizations offer training and suggestions for faculty to update and improve their teaching approaches to help students learn as scientists, through design and discovery (e.g., Council of Undergraduate Research [www.cur.org] and Project Kaleidoscope [www.pkal.org]). With the advent of genome sequencing and bioinformatics, many scientists now formulate biological questions and interpret research results in the context of genomic information. Just as the use of bioinformatic tools and databases changed the way scientists investigate problems, it must change how scientists teach to create new opportunities for students to gain experiences reflecting the influence of genomics, proteomics, and bioinformatics on modern life sciences research. Educators have responded by incorporating bioinformatics into diverse life science curricula. While these published exercises in, and guidelines for, bioinformatics curricula are helpful and inspirational, faculty new to the area of bioinformatics inevitably need training in the theoretical underpinnings of the algorithms. Moreover, effectively integrating bioinformatics into

  18. Bioinformatics tools for small genomes, such as hepatitis B virus.

    PubMed

    Bell, Trevor G; Kramvis, Anna

    2015-02-01

    DNA sequence analysis is undertaken in many biological research laboratories. The workflow consists of several steps involving the bioinformatic processing of biological data. We have developed a suite of web-based online bioinformatic tools to assist with processing, analysis and curation of DNA sequence data. Most of these tools are genome-agnostic, with two tools specifically designed for hepatitis B virus sequence data. Tools in the suite are able to process sequence data from Sanger sequencing, ultra-deep amplicon resequencing (pyrosequencing) and chromatograph (trace files), as appropriate. The tools are available online at no cost and are aimed at researchers without specialist technical computer knowledge. The tools can be accessed at http://hvdr.bioinf.wits.ac.za/SmallGenomeTools, and the source code is available online at https://github.com/DrTrevorBell/SmallGenomeTools. PMID:25690798

  19. Making sense of genomes of parasitic worms: Tackling bioinformatic challenges.

    PubMed

    Korhonen, Pasi K; Young, Neil D; Gasser, Robin B

    2016-01-01

    Billions of people and animals are infected with parasitic worms (helminths). Many of these worms cause diseases that have a major socioeconomic impact worldwide, and are challenging to control because existing treatment methods are often inadequate. There is, therefore, a need to work toward developing new intervention methods, built on a sound understanding of parasitic worms at molecular level, the relationships that they have with their animal hosts and/or the diseases that they cause. Decoding the genomes and transcriptomes of these parasites brings us a step closer to this goal. The key focus of this article is to critically review and discuss bioinformatic tools used for the assembly and annotation of these genomes and transcriptomes, as well as various post-genomic analyses of transcription profiles, biological pathways, synteny, phylogeny, biogeography and the prediction and prioritisation of drug target candidates. Bioinformatic pipelines implemented and established recently provide practical and efficient tools for the assembly and annotation of genomes of parasitic worms, and will be applicable to a wide range of other parasites and eukaryotic organisms. Future research will need to assess the utility of long-read sequence data sets for enhanced genomic assemblies, and develop improved algorithms for gene prediction and post-genomic analyses, to enable comprehensive systems biology explorations of parasitic organisms. PMID:26956711

  20. Public Access for Teaching Genomics, Proteomics, and Bioinformatics

    PubMed Central

    Campbell, A. Malcolm

    2003-01-01

    When the human genome project was conceived, its leaders wanted all researchers to have equal access to the data and associated research tools. Their vision of equal access provides an unprecedented teaching opportunity. Teachers and students have free access to the same databases that researchers are using. Furthermore, the recent movement to deliver scientific publications freely has presented a second source of current information for teaching. I have developed a genomics course that incorporates many of the public-domain databases, research tools, and peer-reviewed journals. These online resources provide students with exciting entree into the new fields of genomics, proteomics, and bioinformatics. In this essay, I outline how these fields are especially well suited for inclusion in the undergraduate curriculum. Assessment data indicate that my students were able to utilize online information to achieve the educational goals of the course and that the experience positively influenced their perceptions of how they might contribute to biology. PMID:12888845

  1. MEMOSys: Bioinformatics platform for genome-scale metabolic models

    PubMed Central

    2011-01-01

    Background: Recent advances in genomic sequencing have enabled the use of genome sequencing in standard biological and biotechnological research projects. The challenge is how to integrate the large amount of data in order to gain novel biological insights. One way to leverage sequence data is to use genome-scale metabolic models. We have therefore designed and implemented a bioinformatics platform which supports the development of such metabolic models. Results: MEMOSys (MEtabolic MOdel research and development System) is a versatile platform for the management, storage, and development of genome-scale metabolic models. It supports the development of new models by providing a built-in version control system which offers access to the complete developmental history. Moreover, the integrated web board, the authorization system, and the definition of user roles allow collaborations across departments and institutions. Research on existing models is facilitated by a search system, references to external databases, and a feature-rich comparison mechanism. MEMOSys provides customizable data exchange mechanisms using the SBML format to enable analysis in external tools. The web application is based on the Java EE framework and offers an intuitive user interface. It currently contains six annotated microbial metabolic models. Conclusions: We have developed a web-based system designed to provide researchers a novel application facilitating the management and development of metabolic models. The system is freely available at http://www.icbi.at/MEMOSys. PMID:21276275

  2. Genomics Virtual Laboratory: A Practical Bioinformatics Workbench for the Cloud

    PubMed Central

    Afgan, Enis; Sloggett, Clare; Goonasekera, Nuwan; Makunin, Igor; Benson, Derek; Crowe, Mark; Gladman, Simon; Kowsar, Yousef; Pheasant, Michael; Horst, Ron; Lonie, Andrew

    2015-01-01

    Background: Analyzing high throughput genomics data is a complex and compute intensive task, generally requiring numerous software tools and large reference data sets, tied together in successive stages of data transformation and visualisation. A computational platform enabling best practice genomics analysis ideally meets a number of requirements, including: a wide range of analysis and visualisation tools, closely linked to large user and reference data sets; workflow platform(s) enabling accessible, reproducible, portable analyses, through a flexible set of interfaces; highly available, scalable computational resources; and flexibility and versatility in the use of these resources to meet demands and expertise of a variety of users. Access to an appropriate computational platform can be a significant barrier to researchers, as establishing such a platform requires a large upfront investment in hardware, experience, and expertise. Results: We designed and implemented the Genomics Virtual Laboratory (GVL) as a middleware layer of machine images, cloud management tools, and online services that enable researchers to build arbitrarily sized compute clusters on demand, pre-populated with fully configured bioinformatics tools, reference datasets and workflow and visualisation options. The platform is flexible in that users can conduct analyses through web-based (Galaxy, RStudio, IPython Notebook) or command-line interfaces, and add/remove compute nodes and data resources as required. Best-practice tutorials and protocols provide a path from introductory training to practice. The GVL is available on the OpenStack-based Australian Research Cloud (http://nectar.org.au) and the Amazon Web Services cloud. The principles, implementation and build process are designed to be cloud-agnostic. Conclusions: This paper provides a blueprint for the design and implementation of a cloud-based Genomics Virtual Laboratory. We discuss scope, design considerations and technical and

  3. Bioinformatics challenges for genome-wide association studies

    PubMed Central

    Moore, Jason H.; Asselbergs, Folkert W.; Williams, Scott M.

    2010-01-01

    Motivation: The sequencing of the human genome has made it possible to identify an informative set of >1 million single nucleotide polymorphisms (SNPs) across the genome that can be used to carry out genome-wide association studies (GWASs). The availability of massive amounts of GWAS data has necessitated the development of new biostatistical methods for quality control, imputation and analysis issues including multiple testing. This work has been successful and has enabled the discovery of new associations that have been replicated in multiple studies. However, it is now recognized that most SNPs discovered via GWAS have small effects on disease susceptibility and thus may not be suitable for improving health care through genetic testing. One likely explanation for the mixed results of GWAS is that the current biostatistical analysis paradigm is by design agnostic or unbiased in that it ignores all prior knowledge about disease pathobiology. Further, the linear modeling framework that is employed in GWAS often considers only one SNP at a time thus ignoring their genomic and environmental context. There is now a shift away from the biostatistical approach toward a more holistic approach that recognizes the complexity of the genotype–phenotype relationship that is characterized by significant heterogeneity and gene–gene and gene–environment interaction. We argue here that bioinformatics has an important role to play in addressing the complexity of the underlying genetic basis of common human diseases. The goal of this review is to identify and discuss those GWAS challenges that will require computational methods. Contact: jason.h.moore@dartmouth.edu PMID:20053841
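    To make the multiple-testing issue mentioned above concrete, here is a minimal sketch of a Bonferroni correction applied to single-SNP association p-values. Bonferroni is used only as one common example of a correction; the abstract does not name a specific method. The SNP identifiers and p-values are hypothetical, and the count of one million tests is taken from the ">1 million SNPs" figure in the abstract.

```python
from typing import Dict, List, Optional


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def significant_snps(
    p_values: Dict[str, float],
    alpha: float = 0.05,
    n_tests: Optional[int] = None,
) -> List[str]:
    """Return SNP IDs whose p-value is below the Bonferroni threshold.

    n_tests defaults to the number of p-values supplied; pass the full number
    of tested SNPs when only a subset of the results is given.
    """
    if n_tests is None:
        n_tests = len(p_values)
    threshold = bonferroni_threshold(alpha, n_tests)
    return sorted(snp for snp, p in p_values.items() if p < threshold)


# Hypothetical p-values for three SNPs out of one million tested.
example_p_values = {
    "rs_hypothetical_1": 3e-9,
    "rs_hypothetical_2": 4e-7,
    "rs_hypothetical_3": 0.02,
}
threshold = bonferroni_threshold(0.05, 1_000_000)
hits = significant_snps(example_p_values, n_tests=1_000_000)
print(threshold, hits)
```

    With one million tests, a family-wise error rate of 0.05 corresponds to a per-SNP threshold of about 5e-8. This stringency, applied one SNP at a time, is part of why the review argues for methods that take genomic and environmental context into account.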

  4. Integrative genomic analysis by interoperation of bioinformatics tools in GenomeSpace.

    PubMed

    Qu, Kun; Garamszegi, Sara; Wu, Felix; Thorvaldsdottir, Helga; Liefeld, Ted; Ocana, Marco; Borges-Rivera, Diego; Pochet, Nathalie; Robinson, James T; Demchak, Barry; Hull, Tim; Ben-Artzi, Gil; Blankenberg, Daniel; Barber, Galt P; Lee, Brian T; Kuhn, Robert M; Nekrutenko, Anton; Segal, Eran; Ideker, Trey; Reich, Michael; Regev, Aviv; Chang, Howard Y; Mesirov, Jill P

    2016-03-01

    Complex biomedical analyses require the use of multiple software tools in concert and remain challenging for much of the biomedical research community. We introduce GenomeSpace (http://www.genomespace.org), a cloud-based, cooperative community resource that currently supports the streamlined interaction of 20 bioinformatics tools and data resources. To facilitate integrative analysis by non-programmers, it offers a growing set of 'recipes', short workflows to guide investigators through high-utility analysis tasks. PMID:26780094

  5. Integrative genomic analysis by interoperation of bioinformatics tools in GenomeSpace

    PubMed Central

    Thorvaldsdottir, Helga; Liefeld, Ted; Ocana, Marco; Borges-Rivera, Diego; Pochet, Nathalie; Robinson, James T.; Demchak, Barry; Hull, Tim; Ben-Artzi, Gil; Blankenberg, Daniel; Barber, Galt P.; Lee, Brian T.; Kuhn, Robert M.; Nekrutenko, Anton; Segal, Eran; Ideker, Trey; Reich, Michael; Regev, Aviv; Chang, Howard Y.; Mesirov, Jill P.

    2015-01-01

    Integrative analysis of multiple data types to address complex biomedical questions requires the use of multiple software tools in concert and remains an enormous challenge for most of the biomedical research community. Here we introduce GenomeSpace (http://www.genomespace.org), a cloud-based, cooperative community resource. Seeded as a collaboration of six of the most popular genomics analysis tools, GenomeSpace now supports the streamlined interaction of 20 bioinformatics tools and data resources. To help non-programming users leverage GenomeSpace in integrative analysis, it offers a growing set of ‘recipes’, short workflows involving a few tools and steps to guide investigators through high-utility analysis tasks. PMID:26780094

  6. Genomics and natural products: role of bioinformatics and recent patents.

    PubMed

    Preuss, Charles; Das, Malay K; Pathak, Yashwant V

    2014-01-01

    The post-genomic era has promised major breakthroughs in personalized medicine, which will improve a patient's health by selecting treatments, including diet, based on the patient's unique DNA sequence. The post-genomic era is allowing scientists and clinicians to examine an individual's DNA and then recommend the best diet in order to remain healthy and attenuate disease processes to which the individual might be predisposed because of their genetic make-up, e.g., cardiovascular disease. Nutrigenomics and nutrigenetics are terms related to pharmacogenomics and pharmacogenetics, with an emphasis on diet or nutrition. There has been increasing consumer interest in natural medicines or nutraceuticals in order to remain healthy. The post-genomic era will allow a patient to visit their physician, who will screen the patient's DNA on a silicon chip. This will indicate which of the patient's genes have polymorphisms, e.g., a single nucleotide polymorphism (SNP), that might make the patient more susceptible to certain diseases, and the physician could then prescribe the appropriate dietary supplements to prevent or diminish these potential diseases. Several recently published patents covering recent developments in the field are discussed in the article. PMID:25185982

  7. Evolutionary genomics of animal personality.

    PubMed

    van Oers, Kees; Mueller, Jakob C

    2010-12-27

    Research on animal personality can be approached from both a phenotypic and a genetic perspective. Using a phenotypic approach, one can measure present selection on personality traits and their combinations; however, this approach cannot reconstruct the historical trajectory that was taken by evolution. Therefore, it is essential for our understanding of the causes and consequences of personality diversity to link phenotypic variation in personality traits with polymorphisms in genomic regions that code for this trait variation. Identifying genes or genome regions that underlie personality traits will open exciting possibilities to study natural selection at the molecular level, gene-gene and gene-environment interactions, pleiotropic effects and how gene expression shapes personality phenotypes. In this paper, we will discuss how genome information revealed by already established approaches and some more recent techniques such as high-throughput sequencing of genomic regions in a large number of individuals can be used to infer micro-evolutionary processes, historical selection and finally the maintenance of personality trait variation. We will do this by reviewing recent advances in molecular genetics of animal personality, but will also use advanced human personality studies as case studies of how molecular information may be used in animal personality research in the near future. PMID:21078651

  8. FDA Bioinformatics Tool for Microbial Genomics Research on Molecular Characterization of Bacterial Foodborne Pathogens Using Microarrays

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: Advances in microbial genomics and bioinformatics are offering greater insights into the emergence and spread of foodborne pathogens in outbreak scenarios. The Food and Drug Administration (FDA) has developed the genomics tool ArrayTrack™, which provides extensive functionalities to man...

  9. Computational and Bioinformatics Frameworks for Next-Generation Whole Exome and Genome Sequencing

    PubMed Central

    Dolled-Filhart, Marisa P.; Lee, Michael; Ou-yang, Chih-wen; Haraksingh, Rajini Rani; Lin, Jimmy Cheng-Ho

    2013-01-01

    It has become increasingly apparent that one of the major hurdles in the genomic age will be the bioinformatics challenges of next-generation sequencing. We provide an overview of a general framework of bioinformatics analysis. For each of the three stages of (1) alignment, (2) variant calling, and (3) filtering and annotation, we describe the analysis required and survey the different software packages that are used. Furthermore, we discuss possible future developments as data sources grow and highlight opportunities for new bioinformatics tools to be developed. PMID:23365548
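
    The third stage named in this abstract, filtering, can be sketched in a few lines. The example below is a minimal illustration under stated assumptions and is not the authors' framework: the `Variant` record, its field names, and the default cutoffs (quality 30, depth 10, population allele frequency 1%) are all hypothetical, and a real pipeline would read these values from a VCF file produced by the variant-calling stage.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Variant:
    """Hypothetical, simplified variant call (not a full VCF record)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    qual: float           # caller-reported quality score
    depth: int            # read depth at the site
    population_af: float  # allele frequency in a reference population


def filter_variants(
    variants: Iterable[Variant],
    min_qual: float = 30.0,
    min_depth: int = 10,
    max_af: float = 0.01,
) -> List[Variant]:
    """Keep well-supported calls that are rare in the reference population."""
    return [
        v for v in variants
        if v.qual >= min_qual
        and v.depth >= min_depth
        and v.population_af <= max_af
    ]


# Illustrative calls only; positions and values are made up.
calls = [
    Variant("chr1", 12345, "A", "G", qual=50.0, depth=35, population_af=0.001),
    Variant("chr1", 22222, "C", "T", qual=12.0, depth=40, population_af=0.001),  # low quality
    Variant("chr2", 33333, "G", "A", qual=60.0, depth=30, population_af=0.20),   # common variant
]
kept = filter_variants(calls)
```

    Keeping the cutoffs as parameters makes it easy to see how sensitive the final variant list is to each filter.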

  10. A Critical Analysis of Assessment Quality in Genomics and Bioinformatics Education Research

    ERIC Educational Resources Information Center

    Campbell, Chad E.; Nehm, Ross H.

    2013-01-01

    The growing importance of genomics and bioinformatics methods and paradigms in biology has been accompanied by an explosion of new curricula and pedagogies. An important question to ask about these educational innovations is whether they are having a meaningful impact on students' knowledge, attitudes, or skills. Although assessments are…

  11. The application of genomics and bioinformatics to accelerate crop improvement in a changing climate.

    PubMed

    Batley, Jacqueline; Edwards, David

    2016-04-01

    The changing climate and growing global population will increase pressure on our ability to produce sufficient food. The breeding of novel crops and the adaptation of current crops to the new environment are required to ensure continued food production. Advances in genomics offer the potential to accelerate the genomics based breeding of crop plants. However, relating genomic data to climate related agronomic traits for use in breeding remains a huge challenge, and one which will require coordination of diverse skills and expertise. Bioinformatics, when combined with genomics has the potential to help maintain food security in the face of climate change through the accelerated production of climate ready crops. PMID:26926905

  12. The discrepancies in the results of bioinformatics tools for genomic structural annotation

    NASA Astrophysics Data System (ADS)

    Pawełkowicz, Magdalena; Nowak, Robert; Osipowski, Paweł; Rymuszka, Jacek; Świerkula, Katarzyna; Wojcieszek, Michał; Przybecki, Zbigniew

    2014-11-01

    A major focus of sequencing projects is to identify genes in genomes. However, it is necessary to define the variety of genes and the criteria for identifying them. In this work we present discrepancies and dependencies arising from the application of different bioinformatic programs for structural annotation performed on the cucumber data set from the Polish Consortium of Cucumber Genome Sequencing. We used Fgenesh, GenScan and GeneMark for automated structural annotation, and the results were compared to the reference annotation.

  13. Exploring laccase genes from plant pathogen genomes: a bioinformatic approach.

    PubMed

    Feng, B Z; Li, P Q; Fu, L; Yu, X M

    2015-01-01

    To date, research on laccases has mostly been focused on plant and fungal laccases and their current use in biotechnological applications. In contrast, little is known about laccases from plant pathogens, although recent rapid progress in whole genome sequencing of an increasing number of organisms has facilitated their identification and ascertainment of their origins. In this study, a comparative analysis was performed to elucidate the distribution of laccases among bacteria, fungi, and oomycetes, and, through comparison of their amino acids, to determine the relationships between them. We retrieved the laccase genes for the 20 publicly available plant pathogen genomes. From these, 125 laccase genes were identified in total, including seven in bacterial genomes, 101 in fungal genomes, and 17 in oomycete genomes. Most of the predicted protein models of these genes shared typical fungal laccase characteristics, possessing four conserved domains with one cysteine and ten histidine residues at these domains. Phylogenetic analysis illustrated that laccases from bacteria and oomycetes were grouped into two distinct clades, whereas fungal laccases clustered in three main clades. These results provide the theoretical groundwork regarding the role of laccases in plant pathogens and might be used to guide future research into these enzymes. PMID:26535716

  14. Silicon Era of Carbon-Based Life: Application of Genomics and Bioinformatics in Crop Stress Research

    PubMed Central

    Li, Man-Wah; Qi, Xinpeng; Ni, Meng; Lam, Hon-Ming

    2013-01-01

    Abiotic and biotic stresses lead to massive reprogramming of different life processes and are the major limiting factors hampering crop productivity. Omics-based research platforms allow for a holistic and comprehensive survey on crop stress responses and hence may bring forth better crop improvement strategies. Since high-throughput approaches generate considerable amounts of data, bioinformatics tools will play an essential role in storing, retrieving, sharing, processing, and analyzing them. Genomic and functional genomic studies in crops still lag far behind similar studies in humans and other animals. In this review, we summarize some useful genomics and bioinformatics resources available to crop scientists. In addition, we also discuss the major challenges and advancements in the “-omics” studies, with an emphasis on their possible impacts on crop stress research and crop improvement. PMID:23759993

  15. Public Access for Teaching Genomics, Proteomics, and Bioinformatics

    ERIC Educational Resources Information Center

    Campbell, A. Malcolm

    2003-01-01

    When the human genome project was conceived, its leaders wanted all researchers to have equal access to the data and associated research tools. Their vision of equal access provides an unprecedented teaching opportunity. Teachers and students have free access to the same databases that researchers are using. Furthermore, the recent movement to…

  16. A new set of bioinformatics tools for genome projects.

    PubMed

    Almeida, Luiz G P; Paixão, Roger; Souza, Rangel C; Costa, Gisele C da; Almeida, Darcy F de; Vasconcelos, Ana T R de

    2004-01-01

    A new tool called System for Automated Bacterial Integrated Annotation--SABIA (SABIA being a very well-known bird in Brazil) was developed for the assembly and annotation of bacterial genomes. This system performs automatic tasks of assembly analysis, ORF identification/analysis, and extragenic region analyses. Genome assembly and automatic contig annotation data are also available in the same working environment. The system integrates several public-domain and newly developed software programs capable of dealing with several types of databases, and it is portable to other operating systems. These programs interact with most of the well-known biological databases and software, such as Glimmer, Genemark, the BLAST family programs, InterPro, COG, Kegg, PSORT, GO, tRNAScan and RBSFinder, and can also be used to identify metabolic pathways. PMID:15100986

  17. Meet me halfway: when genomics meets structural bioinformatics.

    PubMed

    Gong, Sungsam; Worth, Catherine L; Cheng, Tammy M K; Blundell, Tom L

    2011-06-01

    The DNA sequencing technology developed by Frederick Sanger in the 1970s established genomics as the basis of comparative genetics. The recent invention of next-generation sequencing (NGS) platform has added a new dimension to genome research by generating ultra-fast and high-throughput sequencing data in an unprecedented manner. The advent of NGS technology also provides the opportunity to study genetic diseases where sequence variants or mutations are sought to establish a causal relationship with disease phenotypes. However, it is not a trivial task to seek genetic variants responsible for genetic diseases and even harder for complex diseases such as diabetes and cancers. In such polygenic diseases, multiple genes and alleles, which can exist in healthy individuals, come together to contribute to common disease phenotypes in a complex manner. Hence, it is desirable to have an approach that integrates omics data with both knowledge of protein structure and function and an understanding of networks/pathways, i.e. functional genomics and systems biology; in this way, genotype-phenotype relationships can be better understood. In this review, we bring this 'bottom-up' approach alongside the current NGS-driven genetic study of genetic variations and disease aetiology. We describe experimental and computational techniques for assessing genetic variants and their deleterious effects on protein structure and function. PMID:21350909

  18. Tissue Banking, Bioinformatics, and Electronic Medical Records: The Front-End Requirements for Personalized Medicine

    PubMed Central

    Suh, K. Stephen; Sarojini, Sreeja; Youssif, Maher; Nalley, Kip; Milinovikj, Natasha; Elloumi, Fathi; Russell, Steven; Pecora, Andrew; Schecter, Elyssa; Goy, Andre

    2013-01-01

    Personalized medicine promises patient-tailored treatments that enhance patient care and decrease overall treatment costs by focusing on genetics and “-omics” data obtained from patient biospecimens and records to guide therapy choices that generate good clinical outcomes. The approach relies on diagnostic and prognostic use of novel biomarkers discovered through combinations of tissue banking, bioinformatics, and electronic medical records (EMRs). The analytical power of bioinformatic platforms combined with patient clinical data from EMRs can reveal potential biomarkers and clinical phenotypes that allow researchers to develop experimental strategies using selected patient biospecimens stored in tissue banks. For cancer, high-quality biospecimens collected at diagnosis, first relapse, and various treatment stages provide crucial resources for study designs. To enlarge biospecimen collections, patient education regarding the value of specimen donation is vital. One approach for increasing consent is to offer publically available illustrations and game-like engagements demonstrating how wider sample availability facilitates development of novel therapies. The critical value of tissue bank samples, bioinformatics, and EMR in the early stages of the biomarker discovery process for personalized medicine is often overlooked. The data obtained also require cross-disciplinary collaborations to translate experimental results into clinical practice and diagnostic and prognostic use in personalized medicine. PMID:23818899

  19. proBAMsuite, a Bioinformatics Framework for Genome-Based Representation and Analysis of Proteomics Data

    PubMed Central

    Wang, Xiaojing; Slebos, Robbert J. C.; Chambers, Matthew C.; Tabb, David L.; Liebler, Daniel C.; Zhang, Bing

    2016-01-01

    To facilitate genome-based representation and analysis of proteomics data, we developed a new bioinformatics framework, proBAMsuite, in which a central component is the protein BAM (proBAM) file format for organizing peptide spectrum matches (PSMs) within the context of the genome. proBAMsuite also includes two R packages, proBAMr and proBAMtools, for generating and analyzing proBAM files, respectively. Applying proBAMsuite to three recently published proteomics datasets, we demonstrated its utility in facilitating efficient genome-based sharing, interpretation, and integration of proteomics data. First, the interpretation of proteomics data is significantly enhanced with the rich genomic annotation information. Second, PSMs can be easily reannotated using user-specified gene annotation schemes and assembled into both protein and gene identifications. Third, using the genome as a common reference, proBAMsuite facilitates seamless proteomics and proteogenomics data integration. Finally, proBAM files can be readily visualized in genome browsers and thus bring proteomics data analysis to a general audience beyond the proteomics community. Results from this study establish proBAMsuite as a useful bioinformatics framework for proteomics and proteogenomics research. PMID:26657539

  20. VectorBase: improvements to a bioinformatics resource for invertebrate vector genomics.

    PubMed

    Megy, Karine; Emrich, Scott J; Lawson, Daniel; Campbell, David; Dialynas, Emmanuel; Hughes, Daniel S T; Koscielny, Gautier; Louis, Christos; Maccallum, Robert M; Redmond, Seth N; Sheehan, Andrew; Topalis, Pantelis; Wilson, Derek

    2012-01-01

    VectorBase (http://www.vectorbase.org) is a NIAID-supported bioinformatics resource for invertebrate vectors of human pathogens. It hosts data for nine genomes: mosquitoes (three Anopheles gambiae genomes, Aedes aegypti and Culex quinquefasciatus), tick (Ixodes scapularis), body louse (Pediculus humanus), kissing bug (Rhodnius prolixus) and tsetse fly (Glossina morsitans). Hosted data range from genomic features and expression data to population genetics and ontologies. We describe improvements and integration of new data that expand our taxonomic coverage. Releases are bi-monthly and include the delivery of preliminary data for emerging genomes. Frequent updates of the genome browser provide VectorBase users with increasing options for visualizing their own high-throughput data. One major development is a new population biology resource for storing genomic variations, insecticide resistance data and their associated metadata. It takes advantage of improved ontologies and controlled vocabularies. Combined, these new features ensure timely release of multiple types of data in the public domain while helping overcome the bottlenecks of bioinformatics and annotation by engaging with our user community. PMID:22135296

  1. Genome-wide variant analysis of simplex autism families with an integrative clinical-bioinformatics pipeline

    PubMed Central

    Jiménez-Barrón, Laura T.; O'Rawe, Jason A.; Wu, Yiyang; Yoon, Margaret; Fang, Han; Iossifov, Ivan; Lyon, Gholson J.

    2015-01-01

    Autism spectrum disorders (ASDs) are a group of developmental disabilities that affect social interaction and communication and are characterized by repetitive behaviors. There is now a large body of evidence that suggests a complex role of genetics in ASDs, in which many different loci are involved. Although many current population-scale genomic studies have been demonstrably fruitful, these studies generally focus on analyzing a limited part of the genome or use a limited set of bioinformatics tools. These limitations preclude the analysis of genome-wide perturbations that may contribute to the development and severity of ASD-related phenotypes. To overcome these limitations, we have developed and utilized an integrative clinical and bioinformatics pipeline for generating a more complete and reliable set of genomic variants for downstream analyses. Our study focuses on the analysis of three simplex autism families consisting of one affected child, unaffected parents, and one unaffected sibling. All members were clinically evaluated and widely phenotyped. Genotyping arrays and whole-genome sequencing were performed on each member, and the resulting sequencing data were analyzed using a variety of available bioinformatics tools. We searched for rare variants of putative functional impact that were found to be segregating according to de novo, autosomal recessive, X-linked, mitochondrial, and compound heterozygote transmission models. The resulting candidate variants included three small heterozygous copy-number variations (CNVs), a rare heterozygous de novo nonsense mutation in MYBBP1A located within exon 1, and a novel de novo missense variant in LAMB3. Our work demonstrates how more comprehensive analyses that include rich clinical data and whole-genome sequencing data can generate reliable results for use in downstream investigations. PMID:27148569
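
    The transmission models listed in this abstract can be made concrete with a small genotype check on a parent-child trio. The sketch below is a deliberately simplified illustration and is not the authors' pipeline: genotypes are assumed to be encoded as the number of alternate alleles (0, 1 or 2) at an autosomal site, only the de novo and homozygous-recessive patterns are covered, and the function name `classify_trio` is hypothetical. Compound heterozygous, X-linked and mitochondrial models need more information (gene membership, sex, maternal lineage) and are left out.

```python
VALID_GENOTYPES = {0, 1, 2}  # number of alternate alleles at an autosomal site


def classify_trio(child: int, mother: int, father: int) -> str:
    """Label one site by a simplified inheritance pattern.

    Returns "de_novo" when the child carries an alternate allele that neither
    parent carries, "autosomal_recessive" when the child is homozygous and
    both parents are heterozygous carriers, and "other" for anything else.
    """
    for genotype in (child, mother, father):
        if genotype not in VALID_GENOTYPES:
            raise ValueError(f"genotype must be 0, 1 or 2, got {genotype!r}")

    if child > 0 and mother == 0 and father == 0:
        return "de_novo"
    if child == 2 and mother == 1 and father == 1:
        return "autosomal_recessive"
    return "other"


# Illustrative sites only: (child, mother, father) alternate-allele counts.
sites = {
    "site_1": (1, 0, 0),  # present in the child only
    "site_2": (2, 1, 1),  # both parents are carriers
    "site_3": (1, 1, 0),  # inherited heterozygous variant
}
labels = {name: classify_trio(*gts) for name, gts in sites.items()}
```

    In practice, apparent de novo calls are often sequencing or genotyping errors, so a check like this is usually a starting point for further validation.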

  2. AphidBase: A centralized bioinformatic resource for annotation of the pea aphid genome

    PubMed Central

    Legeai, Fabrice; Shigenobu, Shuji; Gauthier, Jean-Pierre; Colbourne, John; Rispe, Claude; Collin, Olivier; Richards, Stephen; Wilson, Alex C. C.; Tagu, Denis

    2015-01-01

    AphidBase is a centralized bioinformatic resource that was developed to facilitate community annotation of the pea aphid genome by the International Aphid Genomics Consortium (IAGC). The AphidBase Information System, designed to organize and distribute genomic data and annotations for a large international community, was constructed using open source software tools from the Generic Model Organism Database (GMOD). The system includes Apollo and GBrowse utilities as well as a wiki, BLAST search capabilities and a full text search engine. AphidBase strongly supported community cooperation and coordination in the curation of gene models during community annotation of the pea aphid genome. AphidBase can be accessed at http://www.aphidbase.com. PMID:20482635

  3. Integrating genomics, proteomics and bioinformatics in translational studies of molecular medicine.

    PubMed

    Ostrowski, Jerzy; Wyrwicz, Lucjan S

    2009-09-01

    Understanding the molecular mechanisms of disease requires the introduction of molecular diagnostics into medical practice. Current medicine employs only elements of molecular diagnostics, which are usually applied on the scale of single genes. Medicine in the postgenomic era will utilize thousands of disease-associated molecular markers provided by high-throughput sequencing and functional genomic, proteomic and metabolomic studies. Such a spectrum of techniques will link clinical medicine based on molecularly oriented diagnostics with the prediction and prevention of disease. To achieve this task, large-scale and genome-wide biological and medical data must be combined with biostatistical and bioinformatic analyses to model biological systems. Collecting, cataloging and comparing data from molecular studies, and the subsequent development of conclusions, creates the fundamentals of systems biology. This highly complex analytical process reflects a new scientific paradigm known as integrative genomics. PMID:19732006

  4. Widening participation would be key in enhancing bioinformatics and genomics research in Africa

    PubMed Central

    Karikari, Thomas K.; Quansah, Emmanuel; Mohamed, Wael M.Y.

    2015-01-01

    Bioinformatics and genome science (BGS) are gradually gaining roots in Africa, contributing to studies that are leading to improved understanding of health, disease, agriculture and food security. While a few African countries have established foundations for research and training in these areas, BGS appear to be limited to only a few institutions in specific African countries. However, improving the disciplines in Africa will require pragmatic efforts to expand training and research partnerships to scientists in yet-unreached institutions. Here, we discuss the need to expand BGS programmes in Africa, and propose mechanisms to do so. PMID:26767163

  5. Edge Bioinformatics

    Energy Science and Technology Software Center (ESTSC)

    2015-08-03

    Edge Bioinformatics is a developmental bioinformatics and data management platform which seeks to supply laboratories with bioinformatics pipelines for analyzing data associated with common samples case goals. Edge Bioinformatics enables sequencing as a solution and forward-deployed situations where human resources, space, bandwidth, and time are limited. The Edge bioinformatics pipeline was designed based on the following use cases and is specific to Illumina sequencing reads. 1. Assay performance adjudication (PCR): Analysis of an existing PCR assay in a genomic context, and automated design of a new assay to resolve conflicting results; 2. Clinical presentation with extreme symptoms: Characterization of a known pathogen or co-infection with a. Novel emerging disease outbreak or b. Environmental surveillance

  6. Edge Bioinformatics

    SciTech Connect

    Lo, Chien-Chi

    2015-08-03

    Edge Bioinformatics is a developmental bioinformatics and data management platform which seeks to supply laboratories with bioinformatics pipelines for analyzing data associated with common samples case goals. Edge Bioinformatics enables sequencing as a solution and forward-deployed situations where human resources, space, bandwidth, and time are limited. The Edge bioinformatics pipeline was designed based on the following use cases and is specific to Illumina sequencing reads. 1. Assay performance adjudication (PCR): Analysis of an existing PCR assay in a genomic context, and automated design of a new assay to resolve conflicting results; 2. Clinical presentation with extreme symptoms: Characterization of a known pathogen or co-infection with a. Novel emerging disease outbreak or b. Environmental surveillance

  7. Bioinformatics visualization and integration with open standards: the Bluejay genomic browser.

    PubMed

    Turinsky, Andrei L; Ah-Seng, Andrew C; Gordon, Paul M K; Stromer, Julie N; Taschuk, Morgan L; Xu, Emily W; Sensen, Christoph W

    2005-01-01

    We have created a new Java-based integrated computational environment for the exploration of genomic data, called Bluejay. The system is capable of using almost any XML file related to genomic data. Non-XML data sources can be accessed via a proxy server. Bluejay has several features, which are new to Bioinformatics, including an unlimited semantic zoom capability, coupled with Scalable Vector Graphics (SVG) outputs; an implementation of the XLink standard, which features access to MAGPIE Genecards as well as any BioMOBY service accessible over the Internet; and the integration of gene chip analysis tools with the functional assignments. The system can be used as a signed web applet, Web Start, and a local stand-alone application, with or without connection to the Internet. It is available free of charge and as open source via http://bluejay.ucalgary.ca. PMID:15972014

  8. A critical analysis of assessment quality in genomics and bioinformatics education research.

    PubMed

    Campbell, Chad E; Nehm, Ross H

    2013-01-01

    The growing importance of genomics and bioinformatics methods and paradigms in biology has been accompanied by an explosion of new curricula and pedagogies. An important question to ask about these educational innovations is whether they are having a meaningful impact on students' knowledge, attitudes, or skills. Although assessments are necessary tools for answering this question, their outputs are dependent on their quality. Our study 1) reviews the central importance of reliability and construct validity evidence in the development and evaluation of science assessments and 2) examines the extent to which published assessments in genomics and bioinformatics education (GBE) have been developed using such evidence. We identified 95 GBE articles (out of 226) that contained claims of knowledge increases, affective changes, or skill acquisition. We found that 1) the purpose of most of these studies was to assess summative learning gains associated with curricular change at the undergraduate level, and 2) a minority (<10%) of studies provided any reliability or validity evidence, and only one study out of the 95 sampled mentioned both validity and reliability. Our findings raise concerns about the quality of evidence derived from these instruments. We end with recommendations for improving assessment quality in GBE. PMID:24006400

  9. Neurogenomics: An opportunity to integrate neuroscience, genomics and bioinformatics research in Africa

    PubMed Central

    Karikari, Thomas K.; Aleksic, Jelena

    2015-01-01

    Modern genomic approaches have made enormous contributions to improving our understanding of the function, development and evolution of the nervous system, and the diversity within and between species. However, most of these research advances have been recorded in countries with advanced scientific resources and funding support systems. On the contrary, little is known about, for example, the possible interplay between different genes, non-coding elements and environmental factors in modulating neurological diseases among populations in low-income countries, including many African countries. The unique ancestry of African populations suggests that improved inclusion of these populations in neuroscience-related genomic studies would significantly help to identify novel factors that might shape the future of neuroscience research and neurological healthcare. This perspective is strongly supported by the recent identification that diseased individuals and their kindred from specific sub-Saharan African populations lack common neurological disease-associated genetic mutations. This indicates that there may be population-specific causes of neurological diseases, necessitating further investigations into the contribution of additional, presently-unknown genomic factors. Here, we discuss how the development of neurogenomics research in Africa would help to elucidate disease-related genomic variants, and also provide a good basis to develop more effective therapies. Furthermore, neurogenomics would harness African scientists' expertise in neuroscience, genomics and bioinformatics to extend our understanding of the neural basis of behaviour, development and evolution. PMID:26937352

  10. Basics of Genome Sequence Analysis in Bioinformatics -- its Fundamental Ideas and Problems

    NASA Astrophysics Data System (ADS)

    Suzuki, Tomonori; Miyazaki, Satoru

    2009-02-01

    Genome sequences are among the most fundamental data in omics analyses, and basic bioinformatics tools have been developed to handle them. The first step of genome sequence analysis is to predict or assign "genes" on genome sequences. In eukaryotes, genes can be identified by using full-length cDNA sequences with local alignment tools such as search, BLAST and FASTA. In prokaryotes, however, it is difficult to capture mRNAs (transcripts), so computational gene prediction is the first choice for starting genome sequence analysis. In this review, we first survey methods for computational gene prediction. Once genes are predicted, the next step is to assign functions to the proteins or RNAs they encode. How the distance between gene sequences is defined is then very important for further analysis, so we describe the basic mathematical concepts for gene comparison. We also introduce our novel concept for biological sequence comparison from the viewpoint of information theory. In the post-genome era, many researchers are interested not only in gene functions but also in gene regulation, the information for which is also encoded in genome sequences. Cis-regulatory elements, however, are too short for mathematical rules to be found, so computationally predicted cis-elements tend to include many false positives. To reduce the proportion of false positives, we need a reliable database of sets of cis-regulatory elements, called cis-regulatory modules, for each gene. We are therefore developing the Cis-Regulatory Elements Module Reference Database. In the third section, we introduce the procedure for constructing the Cis-Regulatory Elements Module Reference Database and its user interfaces.
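
    The simplest form of the computational gene prediction discussed in this abstract is an open reading frame (ORF) scan. The sketch below is only an illustration of that idea and is not a method from the review: it assumes the standard start codon ATG and stop codons TAA, TAG and TGA, searches the forward strand only, and uses a hypothetical `min_codons` length cutoff. Real gene finders for prokaryotes also use statistical models of coding sequence, alternative start codons and the reverse strand.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(seq: str, min_codons: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) half-open intervals of forward-strand ORFs.

    An ORF runs from the first ATG in a reading frame to the next in-frame
    stop codon (inclusive). min_codons counts codons before the stop.
    """
    seq = seq.upper()
    orfs: List[Tuple[int, int]] = []
    for frame in range(3):
        start = None  # position of the currently open start codon, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close the ORF and look for the next start
    return sorted(orfs)


# Illustrative sequence only: one ORF in reading frame 2.
dna = "CCATGAAATTTGGGTAACC"
orfs = find_orfs(dna)
orf_sequences = [dna[s:e] for s, e in orfs]
```

    Raising `min_codons` trades sensitivity for fewer spurious short ORFs, which is the same false-positive concern the abstract raises for short cis-regulatory elements.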

  11. Personal genomes, quantitative dynamic omics and personalized medicine

    PubMed Central

    Mias, George I.; Snyder, Michael

    2015-01-01

    The rapid technological developments following the Human Genome Project have made possible the availability of personalized genomes. As the focus now shifts from characterizing genomes to making personalized disease associations, in combination with the availability of other omics technologies, the next big push will be not only to obtain a personalized genome, but to quantitatively follow other omics. This will include transcriptomes, proteomes, metabolomes, antibodyomes, and new emerging technologies, enabling the profiling of thousands of molecular components in individuals. Furthermore, omics profiling performed longitudinally can probe the temporal patterns associated with both molecular changes and associated physiological health and disease states. Such data necessitates the development of computational methodology to not only handle and descriptively assess such data, but also construct quantitative biological models. Here we describe the availability of personal genomes and developing omics technologies that can be brought together for personalized implementations and how these novel integrated approaches may effectively provide a precise personalized medicine that focuses on not only characterization and treatment but ultimately the prevention of disease. PMID:25798291

  12. Personalized cloud-based bioinformatics services for research and education: use cases and the elasticHPC package

    PubMed Central

    2012-01-01

    Background Bioinformatics services have been traditionally provided in the form of a web-server that is hosted at institutional infrastructure and serves multiple users. This model, however, is not flexible enough to cope with the increasing number of users, increasing data size, and new requirements in terms of speed and availability of service. The advent of cloud computing suggests a new service model that provides an efficient solution to these problems, based on the concepts of "resources-on-demand" and "pay-as-you-go". However, cloud computing has not yet been introduced within bioinformatics servers due to the lack of usage scenarios and software layers that address the requirements of the bioinformatics domain. Results In this paper, we provide different use case scenarios for providing cloud computing based services, considering both the technical and financial aspects of the cloud computing service model. These scenarios are for individual users seeking computational power as well as bioinformatics service providers aiming at provision of personalized bioinformatics services to their users. We also present elasticHPC, a software package and a library that facilitates the use of high performance cloud computing resources in general and the implementation of the suggested bioinformatics scenarios in particular. Concrete examples that demonstrate the suggested use case scenarios with whole bioinformatics servers and major sequence analysis tools like BLAST are presented. Experimental results with large datasets are also included to show the advantages of the cloud model. Conclusions Our use case scenarios and the elasticHPC package are steps towards the provision of cloud based bioinformatics services, which would help in overcoming the data challenge of recent biological research. All resources related to elasticHPC and its web-interface are available at http://www.elasticHPC.org. PMID:23281941

  13. Challenges, Solutions, and Quality Metrics of Personal Genome Assembly in Advancing Precision Medicine

    PubMed Central

    Xiao, Wenming; Wu, Leihong; Yavas, Gokhan; Simonyan, Vahan; Ning, Baitang; Hong, Huixiao

    2016-01-01

    -response, tailoring drug therapy and detecting tumors. We believe the precision medicine would largely benefit from bioinformatics solutions, particularly for personal genome assembly. PMID:27110816

  14. Challenges, Solutions, and Quality Metrics of Personal Genome Assembly in Advancing Precision Medicine.

    PubMed

    Xiao, Wenming; Wu, Leihong; Yavas, Gokhan; Simonyan, Vahan; Ning, Baitang; Hong, Huixiao

    2016-01-01

    drug therapy and detecting tumors. We believe the precision medicine would largely benefit from bioinformatics solutions, particularly for personal genome assembly. PMID:27110816

  15. The Translational Genomics Core at Partners Personalized Medicine: Facilitating the Transition of Research towards Personalized Medicine.

    PubMed

    Blau, Ashley; Brown, Alison; Mahanta, Lisa; Amr, Sami S

    2016-01-01

    The Translational Genomics Core (TGC) at Partners Personalized Medicine (PPM) serves as a fee-for-service core laboratory for Partners Healthcare researchers, providing access to technology platforms and analysis pipelines for genomic, transcriptomic, and epigenomic research projects. The interaction of the TGC with various components of PPM provides it with a unique infrastructure that allows for greater IT and bioinformatics opportunities, such as sample tracking and data analysis. The following article describes some of the unique opportunities available to an academic research core operating within PPM, such as the ability to develop analysis pipelines with a dedicated bioinformatics team and maintain a flexible Laboratory Information Management System (LIMS) with the support of an internal IT team, as well as the operational challenges encountered to respond to emerging technologies, diverse investigator needs, and high staff turnover. In addition, the implementation and operational role of the TGC in the Partners Biobank genotyping project of over 25,000 samples is presented as an example of core activities working with other components of PPM. PMID:26927185

  16. The Translational Genomics Core at Partners Personalized Medicine: Facilitating the Transition of Research towards Personalized Medicine

    PubMed Central

    Blau, Ashley; Brown, Alison; Mahanta, Lisa; Amr, Sami S.

    2016-01-01

    The Translational Genomics Core (TGC) at Partners Personalized Medicine (PPM) serves as a fee-for-service core laboratory for Partners Healthcare researchers, providing access to technology platforms and analysis pipelines for genomic, transcriptomic, and epigenomic research projects. The interaction of the TGC with various components of PPM provides it with a unique infrastructure that allows for greater IT and bioinformatics opportunities, such as sample tracking and data analysis. The following article describes some of the unique opportunities available to an academic research core operating within PPM, such as the ability to develop analysis pipelines with a dedicated bioinformatics team and maintain a flexible Laboratory Information Management System (LIMS) with the support of an internal IT team, as well as the operational challenges encountered to respond to emerging technologies, diverse investigator needs, and high staff turnover. In addition, the implementation and operational role of the TGC in the Partners Biobank genotyping project of over 25,000 samples is presented as an example of core activities working with other components of PPM. PMID:26927185

  17. In the Spotlight: Bioinformatics

    PubMed Central

    Wang, May Dongmei

    2016-01-01

    During 2012, next generation sequencing (NGS) attracted great attention in the biomedical research community, especially for personalized medicine. Third generation sequencing also became available. Therefore, state-of-the-art sequencing technology and analysis are reviewed in this Bioinformatics spotlight on 2012. Next-generation sequencing (NGS) is a high-throughput nucleic acid sequencing technology with wide dynamic range and single-base resolution. The full promise of NGS depends on the optimization of NGS platforms, sequence alignment and assembly algorithms, data analytics, novel algorithms for integrating NGS data with existing genomic, proteomic, or metabolomic data, and quantitative assessment of NGS technology in comparison to more established technologies such as microarrays. NGS technology has been predicted to become a cornerstone of personalized medicine. It is argued that NGS is a promising field for motivated young researchers who are looking for opportunities in bioinformatics. PMID:23192635

  18. Empowered genome community: leveraging a bioinformatics platform as a citizen-scientist collaboration tool.

    PubMed

    Wendelsdorf, Katherine; Shah, Sohela

    2015-09-01

    There is an ongoing effort in the biomedical research community to leverage Next Generation Sequencing (NGS) technology to identify genetic variants that affect our health. The main challenge facing researchers is getting enough samples from individuals, either sick or healthy, to be able to reliably identify the few variants that are causal for a phenotype among all other variants typically seen among individuals. At the same time, more and more individuals are having their genome sequenced either out of curiosity or to identify the cause of an illness. These individuals may benefit from a way to view and understand their data. QIAGEN's Ingenuity Variant Analysis is an online application that allows users with and without extensive bioinformatics training to incorporate information from published experiments, genetic databases, and a variety of statistical models to identify variants, from a long list of candidates, that are most likely causal for a phenotype, as well as annotate variants with what is already known about them in the literature and databases. Ingenuity Variant Analysis is also an information-sharing platform where users may exchange samples and analyses. The Empowered Genome Community (EGC) is a new program in which QIAGEN is making this online tool freely available to any individual who wishes to analyze their own genetic sequence. EGC members are then able to make their data available to other Ingenuity Variant Analysis users to be used in research. Here we present and describe the Empowered Genome Community in detail. We also present a preliminary, proof-of-concept study that utilizes the 200 genomes currently available through the EGC. The goal of this program is to allow individuals to access and understand their own data as well as facilitate citizen-scientist collaborations that can drive research forward and spur quality scientific dialogue in the general public. PMID:27054071

  19. Personalized medicine: new genomics, old lessons.

    PubMed

    Offit, Kenneth

    2011-07-01

    Personalized medicine uses traditional, as well as emerging, concepts of the genetic and environmental basis of disease to individualize prevention, diagnosis and treatment. Personalized genomics plays a vital, but not exclusive, role in this evolving model of personalized medicine. The distinctions between genetic and genomic medicine are more quantitative than qualitative. Personalized genomics builds on principles established by the integration of genetics into medical practice. Principles shared by genetic and genomic aspects of medicine include the use of variants as markers for diagnosis, prognosis, prevention, as well as targets for treatment, the use of clinically validated variants that may not be functionally characterized, the segregation of these variants in non-Mendelian as well as Mendelian patterns, the role of gene-environment interactions, the dependence on evidence for clinical utility, the critical translational role of behavioral science, and common ethical considerations. During the current period of transition from investigation to practice, consumers should be protected from harms of premature translation of research findings, while encouraging the innovative and cost-effective application of those genomic discoveries that improve personalized medical care. PMID:21706342

  20. Evaluating the utility of personal genomic information.

    PubMed

    Foster, Morris W; Mulvihill, John J; Sharp, Richard R

    2009-08-01

    In evaluating the utility of human genome-wide assays, the answer will differ depending on why the question is asked. For purposes of regulating medical tests, a restrictive sense of clinical utility is used, although it may be possible to have clinical utility without changing patients' outcomes, and clinical utility may vary between patients. For purposes of using limited third party or public health resources, cost effectiveness should be evaluated in a societal rather than individual context. However, for other health uses of genomic information a broader sense of overall utility should be used. Behavioral changes and increased individual awareness of health-related choices are relevant metrics for evaluating the personal utility of genomic information, even when traditional clinical benefits are not manifested. In taking account of personal utility, cost effectiveness may be calculated on an individual and societal basis. Overall measures of utility (including both restrictive clinical measures and measures of personal utility) may vary significantly between individuals depending on potential changes in lifestyle, health awareness and behaviors, family dynamics, and personal choice and interest as well as the psychological effects of disease risk perception. That interindividual variation suggests that a more expansive overall measure of utility could be used to identify individuals who are more likely to benefit from personal genomic information as well as those for whom the risks of personal information may be greater than any benefits. PMID:19478683

  1. Genome-wide bioinformatic and molecular analysis of introns in Saccharomyces cerevisiae.

    PubMed Central

    Spingola, M; Grate, L; Haussler, D; Ares, M

    1999-01-01

    Introns have typically been discovered in an ad hoc fashion: introns are found as a gene is characterized for other reasons. As complete eukaryotic genome sequences become available, better methods for predicting RNA processing signals in raw sequence will be necessary in order to discover genes and predict their expression. Here we present a catalog of 228 yeast introns, arrived at through a combination of bioinformatic and molecular analysis. Introns annotated in the Saccharomyces Genome Database (SGD) were evaluated, questionable introns were removed after failing a test for splicing in vivo, and known introns absent from the SGD annotation were added. A novel branchpoint sequence, AAUUAAC, was identified within an annotated intron that lacks a six-of-seven match to the highly conserved branchpoint consensus UACUAAC. Analysis of the database corroborates many conclusions about pre-mRNA substrate requirements for splicing derived from experimental studies, but indicates that splicing in yeast may not be as rigidly determined by splice-site conservation as had previously been thought. Using this database and a molecular technique that directly displays the lariat intron products of spliced transcripts (intron display), we suggest that the current set of 228 introns is still not complete, and that additional intron-containing genes remain to be discovered in yeast. The database can be accessed at http://www.cse.ucsc.edu/research/compbio/yeast_introns.html. PMID:10024174
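
    The "six-of-seven match" criterion mentioned in the abstract above can be illustrated with a position-by-position comparison. The following is only a minimal sketch: the function name count_matches is hypothetical, and the simple ungapped comparison is an assumption for illustration, not the authors' published method.

```python
def count_matches(candidate: str, consensus: str) -> int:
    """Count positions at which two equal-length RNA motifs agree."""
    if len(candidate) != len(consensus):
        raise ValueError("motifs must have the same length")
    return sum(a == b for a, b in zip(candidate, consensus))


# Sequences quoted in the abstract; the variable names are illustrative.
CONSENSUS = "UACUAAC"  # highly conserved branchpoint consensus
NOVEL = "AAUUAAC"      # novel branchpoint reported in the catalog

matches = count_matches(NOVEL, CONSENSUS)
print(f"{NOVEL} agrees with {CONSENSUS} at {matches} of {len(CONSENSUS)} positions")
```

    Under this simple comparison the novel sequence agrees with the consensus at fewer than six positions, consistent with the abstract's statement that the intron lacks a six-of-seven match.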

  2. Mi-DISCOVERER: A bioinformatics tool for the detection of mi-RNA in human genome.

    PubMed

    Arshad, Saadia; Mumtaz, Asia; Ahmad, Freed; Liaquat, Sadia; Nadeem, Shahid; Mehboob, Shahid; Afzal, Muhammad

    2010-01-01

    MicroRNAs (miRNAs) are 22-nucleotide non-coding RNAs that play pivotal regulatory roles in diverse organisms, including humans, and are difficult to identify because of a lack of either distinctive sequence features or robust algorithms. Therefore, we built a tool, Mi-Discoverer, for the detection of miRNAs in the human genome. The software was developed using Microsoft Office Access 2003, JDK version 1.6.0, BioJava version 1.0, and NetBeans IDE version 6.0. All existing miRNA software was web-based, so the advantage of our project is that it gives the user a desktop facility for sequence alignment searches against the already identified human miRNAs in the database. The user can also insert and update newly discovered human miRNAs in the database. Mi-Discoverer successfully identifies human miRNAs based on multiple sequence alignment searches. It is a non-redundant database containing a large collection of publicly available human miRNAs. PMID:21364831

  3. Using Informatics-, Bioinformatics- and Genomics-Based Approaches for the Molecular Surveillance and Detection of Biothreat Agents

    NASA Astrophysics Data System (ADS)

    Seto, Donald

    The convergence and wealth of informatics, bioinformatics and genomics methods and associated resources allow a comprehensive and rapid approach for the surveillance and detection of bacterial and viral organisms. Coupled with the continuing race for the fastest, most cost-efficient and highest-quality DNA sequencing technology, that is, "next generation sequencing", the detection of biological threat agents by "cheaper and faster" means is possible. With the application of improved bioinformatic tools for the understanding of these genomes and for parsing unique pathogen genome signatures, along with "state-of-the-art" informatics which include faster computational methods, equipment and databases, it is feasible to apply new algorithms to biothreat agent detection. Two such methods are high-throughput DNA sequencing-based and resequencing microarray-based identification. These are illustrated and validated by two examples involving human adenoviruses, both from real-world test beds.

  4. Forward Individualized Medicine from Personal Genomes to Interactomes

    PubMed Central

    Zhang, Xiang; Kuivenhoven, Jan A.; Groen, Albert K.

    2015-01-01

    When considering the variation in the genome, transcriptome, proteome and metabolome, and their interaction with the environment, every individual can be rightfully considered as a unique biological entity. Individualized medicine promises to take this uniqueness into account to optimize disease treatment and thereby improve health benefits for every patient. The success of individualized medicine relies on a precise understanding of the genotype-phenotype relationship. Although omics technologies advance rapidly, there are several challenges that need to be overcome: Next generation sequencing can efficiently decipher genomic sequences, epigenetic changes, and transcriptomic variation in patients, but it does not automatically indicate how or whether the identified variation will cause pathological changes. This is likely due to the inability to account for (1) the consequences of gene-gene and gene-environment interactions, and (2) (post)transcriptional as well as (post)translational processes that eventually determine the concentration of key metabolites. The technologies to accurately measure changes in these latter layers are still under development, and such measurements in humans are also mainly restricted to blood and circulating cells. Despite these challenges, it is already possible to track dynamic changes in the human interactome in healthy and diseased states by using the integration of multi-omics data. In this review, we evaluate the potential value of current major bioinformatics and systems biology-based approaches, including genome wide association studies, epigenetics, gene regulatory and protein-protein interaction networks, and genome-scale metabolic modeling. Moreover, we address the question whether integrative analysis of personal multi-omics data will help understanding of personal genotype-phenotype relationships. PMID:26696898

  5. Genomes, Populations and Diseases: Ethnic Genomics and Personalized Medicine

    PubMed Central

    Stepanov, V.A.

    2010-01-01

    This review discusses the progress of ethnic genetics, the genetics of common diseases, and the concepts of personalized medicine. We show the relationship between the structure of genetic diversity in human populations and the varying frequencies of Mendelian and multifactor diseases. We also examine the population basis of pharmacogenetics and evaluate the effectiveness of pharmacotherapy, along with a review of new achievements and prospects in personalized genomics. PMID:22649660

  6. The Human Genome Project, and recent advances in personalized genomics.

    PubMed

    Wilson, Brenda J; Nicholls, Stuart G

    2015-01-01

    The language of "personalized medicine" and "personal genomics" has now entered the common lexicon. The idea of personalized medicine is the integration of genomic risk assessment alongside other clinical investigations. Consistent with this approach, testing is delivered by health care professionals who are not medical geneticists, and where results represent risks, as opposed to clinical diagnosis of disease, to be interpreted alongside the entirety of a patient's health and medical data. In this review we consider the evidence concerning the application of such personalized genomics within the context of population screening, and potential implications that arise from this. We highlight two general approaches which illustrate potential uses of genomic information in screening. The first is a narrowly targeted approach in which genetic profiling is linked with standard population-based screening for diseases; the second is a broader targeting of variants associated with multiple single gene disorders, performed opportunistically on patients being investigated for unrelated conditions. In doing so we consider the organization and evaluation of tests and services, the challenge of interpretation with less targeted testing, professional confidence, barriers in practice, and education needs. We conclude by discussing several issues pertinent to health policy, namely: avoiding the conflation of genetics with biological determinism, resisting the "technological imperative", due consideration of the organization of screening services, the need for professional education, as well as informed decision making and public understanding. PMID:25733939

  7. Personal genomes in progress: from the Human Genome Project to the Personal Genome Project

    PubMed Central

    Lunshof (Co-first author), Jeantine E.; Bobe (Co-first author), Jason; Aach, John; Angrist, Misha; V. Thakuria, Joseph; Vorhaus, Daniel B.; R. Hoehe (Co-last author), Margret; Church (Co-last author), George M.

    2010-01-01

    The cost of a diploid human genome sequence has dropped from about $70M to $2000 since 2007, even as the standards for redundancy have increased from 7x to 40x in order to improve call rates. Coupled with the low return on investment for common single-nucleotide polymorphisms, this has caused a significant rise in interest in correlating genome sequences with comprehensive environmental and trait data (GET). The costs of electronic health records, imaging, and microbial, immunological, and behavioral data are also dropping quickly. Sharing such integrated GET datasets and their interpretations with a diversity of researchers and research subjects highlights the need for informed-consent models capable of addressing novel privacy and other issues, as well as for flexible data-sharing resources that make materials and data available with minimum restrictions on use. This article examines the Personal Genome Project's effort to develop a GET database as a public genomics resource broadly accessible to both researchers and research participants, while pursuing the highest standards in research ethics. PMID:20373666

  8. Elucidating ANTs in worms using genomic and bioinformatic tools--biotechnological prospects?

    PubMed

    Hu, Min; Zhong, Weiwei; Campbell, Bronwyn E; Sternberg, Paul W; Pellegrino, Mark W; Gasser, Robin B

    2010-01-01

    Adenine nucleotide translocators (ANTs) belong to the mitochondrial carrier family (MCF) of proteins. ATP production and consumption are tightly linked to ANTs, the kinetics of which have been proposed to play a key regulatory role in mitochondrial oxidative phosphorylation. ANTs are also recognized as a central component of the mitochondrial permeability transition pore associated with apoptosis. Although ANTs have been investigated in a range of vertebrates, including human, mouse and cattle, and invertebrates, such as Drosophila melanogaster (vinegar fly), Saccharomyces cerevisiae (yeast) and Caenorhabditis elegans (free-living nematode), there has been a void of information on these molecules for parasitic nematodes of socio-economic importance. Exploring ANTs in nematodes has the potential to lead to a better understanding of their fundamental roles in key biological pathways and might provide an avenue for the identification of targets for the rational design of nematocidal drugs. In the present article, we describe the discovery of an ANT from Haemonchus contortus (one of the most economically important parasitic nematodes of sheep and goats), conduct a comparative analysis of key ANTs and their genes (particularly ant-1.1) in nematodes and other organisms, predict the functional roles utilizing a combined genomic-bioinformatic approach and propose ANTs and associated molecules as possible drug targets, with the potential for biotechnological outcomes. PMID:19770033

  9. Clinical genomics: from a truly personal genome viewpoint.

    PubMed

    Lupski, James R

    2016-06-01

    The path to Clinical Genomics is punctuated by our understanding of what types of DNA structural and sequence variation contribute to disease, the many technical challenges to detect such variation genome-wide, and the initial struggles to interpret personal genome variation in the context of disease. This review describes one perspective of the development of clinical genomics; whereas the experimental challenges, and hurdles to overcoming them, might be deemed readily apparent, the non-technical issues for clinical implementation may be less obvious. Some of these latter challenges, including: (1) informed consent, (2) privacy, (3) what constitutes potentially pathogenic variation contributing to disease, (4) disease penetrance in populations, and (5) the genetic architecture of disease, and the struggles sometimes faced for solutions, are highlighted using illustrative examples. PMID:27221143

  10. The Human Genome Project, and recent advances in personalized genomics

    PubMed Central

    Wilson, Brenda J; Nicholls, Stuart G

    2015-01-01

    The language of “personalized medicine” and “personal genomics” has now entered the common lexicon. The idea of personalized medicine is the integration of genomic risk assessment alongside other clinical investigations. Consistent with this approach, testing is delivered by health care professionals who are not medical geneticists, and where results represent risks, as opposed to clinical diagnosis of disease, to be interpreted alongside the entirety of a patient’s health and medical data. In this review we consider the evidence concerning the application of such personalized genomics within the context of population screening, and potential implications that arise from this. We highlight two general approaches which illustrate potential uses of genomic information in screening. The first is a narrowly targeted approach in which genetic profiling is linked with standard population-based screening for diseases; the second is a broader targeting of variants associated with multiple single gene disorders, performed opportunistically on patients being investigated for unrelated conditions. In doing so we consider the organization and evaluation of tests and services, the challenge of interpretation with less targeted testing, professional confidence, barriers in practice, and education needs. We conclude by discussing several issues pertinent to health policy, namely: avoiding the conflation of genetics with biological determinism, resisting the “technological imperative”, due consideration of the organization of screening services, the need for professional education, as well as informed decision making and public understanding. PMID:25733939

  11. UTGB toolkit for personalized genome browsers

    PubMed Central

    Saito, Taro L.; Yoshimura, Jun; Sasaki, Shin; Ahsan, Budrul; Sasaki, Atsushi; Kuroshu, Reginaldo; Morishita, Shinichi

    2009-01-01

    The advent of high-throughput DNA sequencers has increased the pace of collecting enormous amounts of genomic information, yielding billions of nucleotides on a weekly basis. This advance represents an improvement of two orders of magnitude over traditional Sanger sequencers in terms of the number of nucleotides per unit time, allowing even small groups of researchers to obtain huge volumes of genomic data over a fairly short period. Consequently, a pressing need exists for the development of personalized genome browsers for analyzing these immense amounts of locally stored data. The UTGB (University of Tokyo Genome Browser) Toolkit is designed to meet three major requirements for personalization of genome browsers: easy installation of the system with minimal effort, browsing of locally stored data, and rapid interactive design of web interfaces tailored to individual needs. The UTGB Toolkit is licensed under an open source license. Availability: The software is freely available at http://utgenome.org/. Contact: moris@cb.k.u-tokyo.ac.jp PMID:19497937

  12. 2010 Translational bioinformatics year in review

    PubMed Central

    Miller, Katharine S

    2011-01-01

    A review of 2010 research in translational bioinformatics provides much to marvel at. We have seen notable advances in personal genomics, pharmacogenetics, and sequencing. At the same time, the infrastructure for the field has burgeoned. While acknowledging that, according to researchers, the members of this field tend to be overly optimistic, the authors predict a bright future. PMID:21672905

  13. Optimal Drug Prediction from Personal Genomics Profiles

    PubMed Central

    Sheng, Jianting; Li, Fuhai; Wong, Stephen T.C.

    2015-01-01

    Cancer patients often show heterogeneous drug responses such that only a small subset of patients is sensitive to a given anti-cancer drug. With the availability of large-scale genomic profiling via next generation sequencing (NGS), it is now economically feasible to profile the whole transcriptome and genome of individual patients in order to identify their unique genetic mutations and differentially expressed genes, which are believed to be responsible for heterogeneous drug responses. Although subtyping analysis has identified patient subgroups sharing common biomarkers, there is no effective method to predict the drug response of individual patients precisely and reliably. Herein, we propose a novel computational algorithm to predict the drug response of individual patients based on personal genomic profiles, as well as pharmacogenomic and drug sensitivity data. Specifically, more than 600 cancer cell lines (viewed as individual patients) across over 50 types of cancers and their responses to 75 drugs were obtained from the Genomics of Drug Sensitivity in Cancer (GDSC) database. The drug-specific sensitivity signatures were determined from the changes in genomic profiles of individual cell lines in response to a specific drug. The optimal drugs for individual cell lines were predicted by integrating the votes from other cell lines. The experimental results show that the proposed drug prediction algorithm can be used to improve greatly the reliability of finding optimal drugs for individual patients and will thus form a key component in the precision medicine infrastructure for oncology care. PMID:25781964

  14. MSeqDR: A Centralized Knowledge Repository and Bioinformatics Web Resource to Facilitate Genomic Investigations in Mitochondrial Disease.

    PubMed

    Shen, Lishuang; Diroma, Maria Angela; Gonzalez, Michael; Navarro-Gomez, Daniel; Leipzig, Jeremy; Lott, Marie T; van Oven, Mannis; Wallace, Douglas C; Muraresku, Colleen Clarke; Zolkipli-Cunningham, Zarazuela; Chinnery, Patrick F; Attimonelli, Marcella; Zuchner, Stephan; Falk, Marni J; Gai, Xiaowu

    2016-06-01

    MSeqDR is the Mitochondrial Disease Sequence Data Resource, a centralized and comprehensive genome and phenome bioinformatics resource built by the mitochondrial disease community to facilitate clinical diagnosis and research investigations of individual patient phenotypes, genomes, genes, and variants. A central Web portal (https://mseqdr.org) integrates community knowledge from expert-curated databases with genomic and phenotype data shared by clinicians and researchers. MSeqDR also functions as a centralized application server for Web-based tools to analyze data across both mitochondrial and nuclear DNA, including investigator-driven whole exome or genome dataset analyses through MSeqDR-Genesis. MSeqDR-GBrowse genome browser supports interactive genomic data exploration and visualization with custom tracks relevant to mtDNA variation and mitochondrial disease. MSeqDR-LSDB is a locus-specific database that currently manages 178 mitochondrial diseases, 1,363 genes associated with mitochondrial biology or disease, and 3,711 pathogenic variants in those genes. MSeqDR Disease Portal allows hierarchical tree-style disease exploration to evaluate their unique descriptions, phenotypes, and causative variants. Automated genomic data submission tools are provided that capture ClinVar compliant variant annotations. PhenoTips will be used for phenotypic data submission on deidentified patients using human phenotype ontology terminology. The development of a dynamic informed patient consent process to guide data access is underway to realize the full potential of these resources. PMID:26919060

  15. Ethical Considerations Regarding Classroom Use of Personal Genomic Information

    PubMed Central

    Parker, Lisa S.; Grubs, Robin

    2014-01-01

    Rapidly decreasing costs of genetic technologies—especially next-generation sequencing—and intensifying need for a clinical workforce trained in genomic medicine have increased interest in having students use personal genomic information to motivate and enhance genomics education. Numerous ethical issues attend classroom/pedagogical use of students’ personal genomic information, including their informed decision to participate, pressures to participate, privacy concerns, and psychosocial sequelae of learning genomic information. This paper addresses these issues, advocates explicit discussion of these issues to cultivate students’ ethical reasoning skills, suggests ways to mitigate potential harms, and recommends collection of ethically relevant data regarding pedagogical use of personal genomic information. PMID:25574277

  16. Genomic expression profiling and bioinformatics analysis on diabetic nephrology with ginsenoside Rg3

    PubMed Central

    Wang, Juan; Cui, Chunli; Fu, Li; Xiao, Zili; Xie, Nanzi; Liu, Yang; Yu, Lu; Wang, Haifeng; Luo, Bangzhen

    2016-01-01

    Diabetic nephropathy (DN), a common diabetes-related complication, is the leading cause of progressive chronic kidney disease (CKD) and end-stage renal disease. Despite the rapid development in the treatment of DN, currently available therapies used in early DN cannot prevent progressive CKD. The exact pathogenic mechanisms and the molecular events underlying DN development remain unclear. Ginsenoside Rg3 is a herbal medicine with numerous pharmacological effects. To gain a greater understanding of the molecular mechanism and signaling pathway underlying the effect of ginsenoside Rg3 in DN therapy, an RNA sequencing approach was used to screen differential gene expression in a rat model of DN treated with ginsenoside Rg3. A combined bioinformatics analysis was then conducted to obtain insights into the underlying molecular mechanisms of the disease development, in order to identify potential novel targets for the treatment of DN. Six Sprague-Dawley male rats were randomly divided into 3 groups: normal control group, DN group and ginsenoside-Rg3 treatment group, with two rats in each group. RNA sequencing was adopted for transcriptome profiling of cells from the renal cortex of the DN rat model. Differentially expressed genes were screened out. Cluster analysis, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis were used to analyze the differentially expressed genes. In total, 78 differentially expressed genes in the DN control group were identified when compared with the normal control group, of which 52 genes were upregulated and 26 genes were downregulated. Differential expression of 43 genes was observed in the ginsenoside-Rg3 treatment group when compared with the DN control group, consisting of 10 upregulated genes and 33 downregulated genes. Notably, 21 genes that were downregulated in the DN control group compared with the control were then shown to be upregulated in the ginsenoside-Rg3 treatment group compared with the DN

  17. Breast Cancer in the Personal Genomics Era

    PubMed Central

    Ellsworth, Rachel E.; Decewicz, David J.; Shriver, Craig D.; Ellsworth, Darrell L.

    2010-01-01

    Breast cancer is a heterogeneous disease with a complex etiology that develops from different cellular lineages, progresses along multiple molecular pathways, and demonstrates wide variability in response to treatment. The “standard of care” approach to breast cancer treatment in which all patients receive similar interventions is rapidly being replaced by personalized medicine, based on molecular characteristics of individual patients. Both inherited and somatic genomic variation is providing useful information for customizing treatment regimens for breast cancer to maximize efficacy and minimize adverse side effects. In this article, we review (1) hereditary breast cancer and current use of inherited susceptibility genes in patient management; (2) the potential of newly-identified breast cancer-susceptibility variants for improving risk assessment; (3) advantages and disadvantages of direct-to-consumer testing; (4) molecular characterization of sporadic breast cancer through immunohistochemistry and gene expression profiling and opportunities for personalized prognostics; and (5) pharmacogenomic influences on the effectiveness of current breast cancer treatments. Molecular genomics has the potential to revolutionize clinical practice and improve the lives of women with breast cancer. PMID:21037853

  18. Putative lipoproteins identified by bioinformatic genome analysis of Leifsonia xyli ssp. xyli, the causative agent of sugarcane ratoon stunting disease.

    PubMed

    Sutcliffe, Iain C; Hutchings, Matthew I

    2007-01-01

    SUMMARY Leifsonia xyli ssp. xyli is the causative agent of ratoon stunting disease, a major cause of economic loss in sugarcane crops. Understanding of the biology of this pathogen has been hampered by its fastidious growth characteristics in vitro. However, the recent release of a genome sequence for this organism has allowed significant novel insights. Further to this, we have performed a bioinformatic analysis of the lipoproteins encoded in the L. xyli genome. These analyses suggest that lipoproteins represent c. 2.0% of the L. xyli predicted proteome. Functional analyses suggest that lipoproteins make an important contribution to the physiology of the pathogen and may influence its ability to cause disease in planta. PMID:20507484

  19. Personalized genomic disease risk of volunteers

    PubMed Central

    Gonzalez-Garay, Manuel L.; McGuire, Amy L.; Pereira, Stacey; Caskey, C. Thomas

    2013-01-01

    Next-generation sequencing (NGS) is commonly used for researching the causes of genetic disorders. However, its usefulness in clinical practice for medical diagnosis is in early development. In this report, we demonstrate the value of NGS for genetic risk assessment and evaluate the limitations and barriers for the adoption of this technology into medical practice. We performed whole exome sequencing (WES) on 81 volunteers, and for each volunteer, we requested personal medical histories, constructed a three-generation pedigree, and required their participation in a comprehensive educational program. We limited our clinical reporting to disease risks based on only rare damaging mutations and known pathogenic variations in genes previously reported to be associated with human disorders. We identified 271 recessive risk alleles (214 genes), 126 dominant risk alleles (101 genes), and 3 X-recessive risk alleles (3 genes). We linked personal disease histories with causative disease genes in 18 volunteers. Furthermore, by incorporating family histories into our genetic analyses, we identified an additional five heritable diseases. Traditional genetic counseling and disease education were provided in verbal and written reports to all volunteers. Our report demonstrates that when genome results are carefully interpreted and integrated with an individual’s medical records and pedigree data, NGS is a valuable diagnostic tool for genetic disease risk. PMID:24082139

  20. Integrated Bioinformatics, Environmental Epidemiologic and Genomic Approaches to Identify Environmental and Molecular Links between Endometriosis and Breast Cancer

    PubMed Central

    Roy, Deodutta; Morgan, Marisa; Yoo, Changwon; Deoraj, Alok; Roy, Sandhya; Yadav, Vijay Kumar; Garoub, Mohannad; Assaggaf, Hamza; Doke, Mayur

    2015-01-01

    We present a combined environmental epidemiologic, genomic, and bioinformatics approach to identify: exposure to environmental chemicals with estrogenic activity; epidemiologic associations between endocrine disrupting chemicals (EDCs) and health effects, such as breast cancer or endometriosis; and gene-EDC interactions and disease associations. Human exposure measurement and modeling confirmed estrogenic activity of three selected classes of environmental chemicals: polychlorinated biphenyls (PCBs), bisphenols (BPs), and phthalates. Meta-analysis showed that exposure to PCBs, but not to bisphenol A (BPA) or phthalates, increased the summary odds ratio for breast cancer and endometriosis. Bioinformatics analysis of gene-EDC interactions and disease associations identified several hundred genes that were altered by exposure to PCBs, phthalates, or BPA. EDC-modified genes in breast neoplasms and endometriosis are part of steroid hormone signaling and inflammation pathways. All three EDCs (PCB 153, phthalates, and BPA) influenced five common genes (CYP19A1, EGFR, ESR2, FOS, and IGF1) in breast cancer as well as in endometriosis. These genes are environmentally and estrogen responsive, altered in human breast and uterine tumors and endometriosis lesions, and part of Mitogen Activated Protein Kinase (MAPK) signaling pathways in cancer. Our findings suggest that breast cancer and endometriosis share some common environmental and molecular risk factors. PMID:26512648

  1. Genomic Discoveries and Personalized Medicine in Neurological Diseases

    PubMed Central

    Zhang, Li; Hong, Huixiao

    2015-01-01

    In the past decades, we have witnessed dramatic changes in clinical diagnoses and treatments due to the revolutions in genomics and personalized medicine. Undoubtedly, we have also met many challenges in using those advanced technologies in drug discovery and development. In this review, we describe when genomic information is applied in personal healthcare in general. We illustrate some case examples of genomic discoveries and promising personalized medicine applications in the area of neurological disease in particular. Available data suggest that individual genomics can be applied to better treat patients in the near future. PMID:26690205

  2. Personalized medicine, genomics, and pharmacogenomics: a primer for nurses.

    PubMed

    Blix, Andrew

    2014-08-01

    Personalized medicine is the study of patients' unique environmental influences as well as the totality of their genetic code-their genome-to tailor personalized risk assessments, diagnoses, prognoses, and treatments. The study of how patients' genomes affect responses to medications, or pharmacogenomics, is a related field. Personalized medicine and genomics are particularly relevant in oncology because of the genetic basis of cancer. Nurses need to understand related issues such as the role of genetic and genomic counseling, the ethical and legal questions surrounding genomics, and the growing direct-to-consumer genomics industry. As genomics research is incorporated into health care, nurses need to understand the technology to provide advocacy and education for patients and their families. PMID:25095297

  3. New bioinformatic tool for quick identification of functionally relevant endogenous retroviral inserts in human genome

    PubMed Central

    Garazha, Andrew; Ivanova, Alena; Suntsova, Maria; Malakhova, Galina; Roumiantsev, Sergey; Zhavoronkov, Alex; Buzdin, Anton

    2015-01-01

    Endogenous retroviruses (ERVs) and LTR retrotransposons (LRs) occupy ∼8% of the human genome. Deep sequencing technologies provide clues to understanding the functional relevance of individual ERVs/LRs by enabling direct identification of transcription factor binding sites (TFBS) and other landmarks of functional genomic elements. Here, we performed a genome-wide identification of human ERVs/LRs containing TFBS according to the ENCODE project. We created the first interactive ERV/LR database that groups the individual inserts according to their familial nomenclature, number of mapped TFBS and divergence from their consensus sequence. Information on any particular element can be easily extracted by the user. We also created a genome browser tool, which enables quick mapping of any ERV/LR insert according to genomic coordinates, known human genes and TFBS. These tools can be used to easily explore functionally relevant individual ERV/LRs, and for studying their impact on the regulation of human genes. Overall, we identified ∼110,000 ERV/LR genomic elements having TFBS. We propose a hypothesis of “domestication” of ERV/LR TFBS by the genome milieu, including subsequent stages of initial epigenetic repression, partial functional release, and further mutation-driven reshaping of TFBS in tight coevolution with the enclosing genomic loci. PMID:25853282

  4. New bioinformatic tool for quick identification of functionally relevant endogenous retroviral inserts in human genome.

    PubMed

    Garazha, Andrew; Ivanova, Alena; Suntsova, Maria; Malakhova, Galina; Roumiantsev, Sergey; Zhavoronkov, Alex; Buzdin, Anton

    2015-01-01

    Endogenous retroviruses (ERVs) and LTR retrotransposons (LRs) occupy ∼8% of the human genome. Deep sequencing technologies provide clues to understanding the functional relevance of individual ERVs/LRs by enabling direct identification of transcription factor binding sites (TFBS) and other landmarks of functional genomic elements. Here, we performed a genome-wide identification of human ERVs/LRs containing TFBS according to the ENCODE project. We created the first interactive ERV/LR database that groups the individual inserts according to their familial nomenclature, number of mapped TFBS and divergence from their consensus sequence. Information on any particular element can be easily extracted by the user. We also created a genome browser tool, which enables quick mapping of any ERV/LR insert according to genomic coordinates, known human genes and TFBS. These tools can be used to easily explore functionally relevant individual ERV/LRs, and for studying their impact on the regulation of human genes. Overall, we identified ∼110,000 ERV/LR genomic elements having TFBS. We propose a hypothesis of "domestication" of ERV/LR TFBS by the genome milieu, including subsequent stages of initial epigenetic repression, partial functional release, and further mutation-driven reshaping of TFBS in tight coevolution with the enclosing genomic loci. PMID:25853282

  5. A Novel Bioinformatics Method for Efficient Knowledge Discovery by BLSOM from Big Genomic Sequence Data

    PubMed Central

    Iwasaki, Yuki; Kanaya, Shigehiko; Zhao, Yue; Ikemura, Toshimichi

    2014-01-01

    With the remarkable increase in genomic sequence data from a wide range of species, novel tools are needed for comprehensive analyses of big sequence data. The Self-Organizing Map (SOM) is an effective tool for clustering and visualizing high-dimensional data, such as oligonucleotide composition, on one map. By modifying the conventional SOM, we have previously developed the Batch-Learning SOM (BLSOM), which allows classification of sequence fragments according to species, depending solely on the oligonucleotide composition. In the present study, we introduce the oligonucleotide BLSOM used for characterization of vertebrate genome sequences. We first analyzed pentanucleotide compositions in 100 kb sequences derived from a wide range of vertebrate genomes and then the compositions in the human and mouse genomes in order to investigate an efficient method for detecting differences between closely related genomes. BLSOM can recognize the species-specific key combination of oligonucleotide frequencies in each genome, which is called a “genome signature,” and the regions specifically enriched in transcription-factor-binding sequences. Because its classification and visualization power is very high, BLSOM is an efficient and powerful tool for extracting a wide range of information from massive amounts of genomic sequences (i.e., big sequence data). PMID:24804244
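    The oligonucleotide composition that serves as BLSOM input can be illustrated with a short sketch. The Python snippet below is not the authors' code. It is a minimal illustration, assuming a plain DNA string, that counts overlapping k-mers (k = 5 for pentanucleotides) in non-overlapping fixed-size windows and normalizes the counts to frequencies. The function name `kmer_frequencies` and the rule of skipping k-mers that contain non-ACGT characters are assumptions made for this example.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=5, window=100_000):
    """Yield one k-mer frequency vector (a dict) per non-overlapping window.

    Hypothetical helper for illustration only. k-mers containing
    characters other than A, C, G, T are skipped, and a trailing
    partial window is ignored.
    """
    alphabet = "ACGT"
    all_kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        counts = Counter(
            chunk[i:i + k]
            for i in range(len(chunk) - k + 1)
            if set(chunk[i:i + k]) <= set(alphabet)
        )
        total = sum(counts.values())
        if total == 0:
            continue  # window contained no unambiguous k-mers
        # Every possible k-mer gets an entry so vectors are comparable.
        yield {kmer: counts[kmer] / total for kmer in all_kmers}
```

    With the defaults, each 100 kb window yields a vector of 4^5 = 1,024 pentanucleotide frequencies, which is the kind of high-dimensional input a SOM would then cluster.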

  6. WordSeeker: concurrent bioinformatics software for discovering genome-wide patterns and word-based genomic signatures

    PubMed Central

    2010-01-01

    Background: An important focus of genomic science is the discovery and characterization of all functional elements within genomes. In silico methods are used in genome studies to discover putative regulatory genomic elements (called words or motifs). Although a number of methods have been developed for motif discovery, most of them lack the scalability needed to analyze large genomic data sets. Methods: This manuscript presents WordSeeker, an enumerative motif discovery toolkit that utilizes multi-core and distributed computational platforms to enable scalable analysis of genomic data. A controller task coordinates activities of worker nodes, each of which (1) enumerates a subset of the DNA word space and (2) scores words with a distributed Markov chain model. Results: A comprehensive suite of performance tests was conducted to demonstrate the performance, speedup and efficiency of WordSeeker. The scalability of the toolkit enabled the analysis of the entire genome of Arabidopsis thaliana; the results of the analysis were integrated into The Arabidopsis Gene Regulatory Information Server (AGRIS). A public version of WordSeeker was deployed on the Glenn cluster at the Ohio Supercomputer Center. Conclusion: WordSeeker effectively utilizes concurrent computing platforms to enable the identification of putative functional elements in genomic data sets. This capability facilitates the analysis of the large quantity of sequenced genomic data. PMID:21210985
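    The two worker steps named above, enumerating DNA words and scoring them against a Markov chain model, can be sketched on a single machine. The Python snippet below is not WordSeeker's implementation. It is a minimal illustration that assumes a first-order Markov chain estimated from the input sequence itself and uses the observed/expected count ratio as the score. The function name `markov_word_scores`, the model order, and the scoring statistic are all assumptions.

```python
from collections import Counter
from itertools import product


def markov_word_scores(seq, word_len=3):
    """Score every DNA word of length `word_len` by observed/expected count.

    Expected counts come from a first-order Markov chain estimated from
    the same sequence. Illustrative only: the model order and the
    observed/expected ratio are assumptions, not WordSeeker's scoring.
    Words with zero expected count are omitted from the result.
    """
    seq = seq.upper()
    n_positions = len(seq) - word_len + 1
    base = Counter(seq)                                   # single-base counts
    pair = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    first = Counter(seq[:-1])                             # bases that have a successor
    observed = Counter(seq[i:i + word_len] for i in range(n_positions))

    scores = {}
    for letters in product("ACGT", repeat=word_len):      # enumerate the word space
        word = "".join(letters)
        prob = base[word[0]] / len(seq)                   # P(first base)
        for a, b in zip(word, word[1:]):                  # times P(b | a) per step
            prob *= pair[a + b] / first[a] if first[a] else 0.0
        expected = prob * n_positions
        if expected > 0:
            scores[word] = observed[word] / expected
    return scores
```

    In a distributed setting, each worker could be handed a slice of the `product("ACGT", repeat=word_len)` word space; here the whole space is enumerated in one loop for clarity.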

  7. Personal utility in genomic testing: is there such a thing?

    PubMed

    Bunnik, Eline M; Janssens, A Cecile J W; Schermer, Maartje H N

    2015-04-01

    In ethical and regulatory discussions on new applications of genomic testing technologies, the notion of 'personal utility' has been mentioned repeatedly. It has been used to justify direct access to commercially offered genomic testing or feedback of individual research results to research or biobank participants. Sometimes research participants or consumers claim a right to genomic information with an appeal to personal utility. As of yet, no systematic account of the umbrella notion of personal utility has been given. This paper offers a definition of personal utility that places it in the middle of the spectrum between clinical utility and personal perceptions of utility, and that acknowledges its normative charge. The paper discusses two perspectives on personal utility, the healthcare perspective and the consumer perspective, and argues that these are too narrow and too wide, respectively. Instead, it proposes a normative definition of personal utility that postulates information and potential use as necessary conditions of utility. This definition entails that perceived utility does not equal personal utility, and that expert judgment may be necessary to help determine whether a genomic test can have personal utility for someone. Two examples of genomic tests are presented to illustrate the discrepancies between perceived utility and our proposed definition of personal utility. The paper concludes that while there is room for the notion of personal utility in the ethical evaluation and regulation of genomic tests, the justificatory role of personal utility is not unlimited. For in the absence of clinical validity and reasonable potential use of information, there is no personal utility. PMID:24872596

  8. Personal Genomic Information Management and Personalized Medicine: Challenges, Current Solutions, and Roles of HIM Professionals

    PubMed Central

    Alzu'bi, Amal; Zhou, Leming; Watzlaf, Valerie

    2014-01-01

    In recent years, the term personalized medicine has received more and more attention in the field of healthcare. The increasing use of this term is closely related to the astonishing advancement in DNA sequencing technologies and other high-throughput biotechnologies. A large amount of personal genomic data can be generated by these technologies in a short time. Consequently, the need to manage, analyze, and interpret these personal genomic data to facilitate personalized care has escalated. In this article, we discuss the challenges for implementing genomics-based personalized medicine in healthcare, current solutions to these challenges, and the roles of health information management (HIM) professionals in genomics-based personalized medicine. PMID:24808804

  9. Bioinformatic genome comparisons for taxonomic and phylogenic assignments using Aeromonas as a test case

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Prokaryotic taxonomy is the underpinning of microbiology, providing a framework for the proper identification and naming of organisms. The 'gold standard' of bacterial species delineation is the overall genome similarity as determined by DNA-DNA hybridization (DDH), a technically rigorous yet someti...

  10. A bioinformatics workflow for detecting signatures of selection in genomic data.

    PubMed

    Cadzow, Murray; Boocock, James; Nguyen, Hoang T; Wilcox, Phillip; Merriman, Tony R; Black, Michael A

    2014-01-01

    The detection of "signatures of selection" is now possible on a genome-wide scale in many plant and animal species, and can be performed in a population-specific manner due to the wealth of per-population genome-wide genotype data that is available. With genomic regions that exhibit evidence of having been under selection shown to also be enriched for genes associated with biologically important traits, detection of evidence of selective pressure is emerging as an additional approach for identifying novel gene-trait associations. While high-density genotype data is now relatively easy to obtain, for many researchers it is not immediately obvious how to go about identifying signatures of selection in these data sets. Here we describe a basic workflow, constructed from open source tools, for detecting and examining evidence of selection in genomic data. Code to install and implement the pipeline components, and instructions to run a basic analysis using the workflow described here, can be downloaded from our public GitHub repository: http://www.github.com/smilefreak/selectionTools/ PMID:25206364

  11. A bioinformatics workflow for detecting signatures of selection in genomic data

    PubMed Central

    Cadzow, Murray; Boocock, James; Nguyen, Hoang T.; Wilcox, Phillip; Merriman, Tony R.; Black, Michael A.

    2014-01-01

    The detection of “signatures of selection” is now possible on a genome-wide scale in many plant and animal species, and can be performed in a population-specific manner due to the wealth of per-population genome-wide genotype data that is available. With genomic regions that exhibit evidence of having been under selection shown to also be enriched for genes associated with biologically important traits, detection of evidence of selective pressure is emerging as an additional approach for identifying novel gene-trait associations. While high-density genotype data is now relatively easy to obtain, for many researchers it is not immediately obvious how to go about identifying signatures of selection in these data sets. Here we describe a basic workflow, constructed from open source tools, for detecting and examining evidence of selection in genomic data. Code to install and implement the pipeline components, and instructions to run a basic analysis using the workflow described here, can be downloaded from our public GitHub repository: http://www.github.com/smilefreak/selectionTools/ PMID:25206364

  12. Overview of personalized medicine in the disease genomic era.

    PubMed

    Hong, Kyung-Won; Oh, Bermseok

    2010-10-01

    Sir William Osler (1849-1919) recognized that "variability is the law of life, and as no two faces are the same, so no two bodies are alike, and no two individuals react alike and behave alike under the abnormal conditions we know as disease". Accordingly, the traditional methods of medicine are not always best for all patients. Over the last decade, the study of genomes and their derivatives (RNA, protein and metabolite) has rapidly advanced to the point that genomic research now serves as the basis for many medical decisions and public health initiatives. Genomic tools such as sequence variation, transcription and, more recently, personal genome sequencing enable the precise prediction and treatment of disease. At present, DNA-based risk assessment for common complex diseases, application of molecular signatures for cancer diagnosis and prognosis, genome-guided therapy, and dose selection of therapeutic drugs are the important issues in personalized medicine. In order to make personalized medicine effective, these genomic techniques must be standardized and integrated into health systems and clinical workflow. In addition, full application of personalized or genomic medicine requires dramatic changes in regulatory and reimbursement policies as well as legislative protection related to privacy. This review aims to provide a general overview of these topics in the field of personalized medicine. PMID:21034525

  13. Atlas2 Cloud: a framework for personal genome analysis in the cloud

    PubMed Central

    2012-01-01

    Background: Until recently, sequencing has primarily been carried out in large genome centers which have invested heavily in developing the computational infrastructure that enables genomic sequence analysis. The recent advancements in next generation sequencing (NGS) have led to a wide dissemination of sequencing technologies and data to highly diverse research groups. It is expected that clinical sequencing will become part of diagnostic routines shortly. However, limited accessibility to computational infrastructure and high quality bioinformatic tools, and the demand for personnel skilled in data analysis and interpretation, remain a serious bottleneck. Cloud computing and Software-as-a-Service (SaaS) technologies can help address these issues. Results: We successfully enabled the Atlas2 Cloud pipeline for personal genome analysis on two different cloud service platforms: a community cloud via the Genboree Workbench, and a commercial cloud via Amazon Web Services using the Software-as-a-Service model. We report a case study of personal genome analysis using our Atlas2 Genboree pipeline. We also outline a detailed cost structure for running Atlas2 Amazon on whole exome capture data, providing cost projections in terms of storage, compute and I/O when running Atlas2 Amazon on a large data set. Conclusions: We find that providing a web interface and an optimized pipeline clearly facilitates usage of cloud computing for personal genome analysis, but for it to be routinely used for large scale projects there needs to be a paradigm shift in the way we develop tools, in standard operating procedures, and in funding mechanisms. PMID:23134663

  14. The personal genome browser: visualizing functions of genetic variants.

    PubMed

    Juan, Liran; Teng, Mingxiang; Zang, Tianyi; Hao, Yafeng; Wang, Zhenxing; Yan, Chengwu; Liu, Yongzhuang; Li, Jie; Zhang, Tianjiao; Wang, Yadong

    2014-07-01

    Advances in high-throughput sequencing technologies have brought us into the individual genome era. Projects such as the 1000 Genomes Project have made individual genome sequencing increasingly popular. How to visualize, analyse and annotate individual genomes with knowledge bases to support genome studies and personalized healthcare is still a big challenge. The Personal Genome Browser (PGB) was developed to provide comprehensive functional annotation and visualization for individual genomes based on the genetic-molecular-phenotypic model. Investigators can easily view individual genetic variants, such as single nucleotide variants (SNVs), INDELs and structural variations (SVs), as well as genomic features and phenotypes associated with the individual genetic variants. The PGB especially highlights potential functional variants using the PGB built-in method or SIFT/PolyPhen2 scores. Moreover, the functional risks of genes can be evaluated by scanning individual genetic variants on the whole genome, a chromosome, or a cytoband based on functional implications of the variants. Investigators can then navigate to high-risk genes on the scanned individual genome. The PGB accepts Variant Call Format (VCF) and Genetic Variation Format (GVF) files as input. The functional annotation of input individual genome variants can be visualized in real time by well-defined symbols and shapes. The PGB is available at http://www.pgbrowser.org/. PMID:24799434

  15. The personal genome browser: visualizing functions of genetic variants

    PubMed Central

    Juan, Liran; Teng, Mingxiang; Zang, Tianyi; Hao, Yafeng; Wang, Zhenxing; Yan, Chengwu; Liu, Yongzhuang; Li, Jie; Zhang, Tianjiao; Wang, Yadong

    2014-01-01

    Advances in high-throughput sequencing technologies have brought us into the individual genome era. Projects such as the 1000 Genomes Project have made individual genome sequencing increasingly popular. How to visualize, analyse and annotate individual genomes with knowledge bases to support genome studies and personalized healthcare is still a big challenge. The Personal Genome Browser (PGB) was developed to provide comprehensive functional annotation and visualization for individual genomes based on the genetic–molecular–phenotypic model. Investigators can easily view individual genetic variants, such as single nucleotide variants (SNVs), INDELs and structural variations (SVs), as well as genomic features and phenotypes associated with the individual genetic variants. The PGB especially highlights potential functional variants using the PGB built-in method or SIFT/PolyPhen2 scores. Moreover, the functional risks of genes can be evaluated by scanning individual genetic variants on the whole genome, a chromosome, or a cytoband based on functional implications of the variants. Investigators can then navigate to high-risk genes on the scanned individual genome. The PGB accepts Variant Call Format (VCF) and Genetic Variation Format (GVF) files as input. The functional annotation of input individual genome variants can be visualized in real time by well-defined symbols and shapes. The PGB is available at http://www.pgbrowser.org/. PMID:24799434
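    Since the PGB takes VCF files as input and distinguishes SNVs, INDELs and SVs, a small sketch may help show how those variant classes can be read off a VCF data line. The Python snippet below is not part of the PGB. It relies only on the fixed VCF columns (CHROM, POS, ID, REF, ALT), and the function name `classify_vcf_line` and the simplified labeling rules (symbolic or breakend alleles as "SV", equal-length multi-base substitutions as "MNV", "." and "*" as "OTHER") are assumptions made for illustration.

```python
def classify_vcf_line(line):
    """Parse one VCF data line and label each ALT allele by variant class.

    Hypothetical helper: uses only the fixed columns CHROM, POS, ID,
    REF and ALT. The labels are a simplification for illustration.
    """
    chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
    labels = []
    for allele in alt.split(","):          # ALT may list several alleles
        if allele in (".", "*"):
            kind = "OTHER"                 # missing or spanning-deletion allele
        elif allele.startswith("<") or "[" in allele or "]" in allele:
            kind = "SV"                    # symbolic or breakend notation
        elif len(ref) == 1 and len(allele) == 1:
            kind = "SNV"
        elif len(ref) == len(allele):
            kind = "MNV"                   # multi-nucleotide substitution
        else:
            kind = "INDEL"
        labels.append((chrom, int(pos), ref, allele, kind))
    return labels
```

    Header lines (those starting with `#`) are not handled here and would need to be skipped by the caller.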

  16. Teaching Synthetic Biology, Bioinformatics and Engineering to Undergraduates: The Interdisciplinary Build-a-Genome Course

    PubMed Central

    Dymond, Jessica S.; Scheifele, Lisa Z.; Richardson, Sarah; Lee, Pablo; Chandrasegaran, Srinivasan; Bader, Joel S.; Boeke, Jef D.

    2009-01-01

    A major challenge in undergraduate life science curricula is the continual evaluation and development of courses that reflect the constantly shifting face of contemporary biological research. Synthetic biology offers an excellent framework within which students may participate in cutting-edge interdisciplinary research and is therefore an attractive addition to the undergraduate biology curriculum. This new discipline offers the promise of a deeper understanding of gene function, gene order, and chromosome structure through the de novo synthesis of genetic information, much as synthetic approaches informed organic chemistry. While considerable progress has been achieved in the synthesis of entire viral and prokaryotic genomes, fabrication of eukaryotic genomes requires synthesis on a scale that is orders of magnitude higher. These high-throughput but labor-intensive projects serve as an ideal way to introduce undergraduates to hands-on synthetic biology research. We are pursuing synthesis of Saccharomyces cerevisiae chromosomes in an undergraduate laboratory setting, the Build-a-Genome course, thereby exposing students to the engineering of biology on a genomewide scale while focusing on a limited region of the genome. A synthetic chromosome III sequence was designed, ordered from commercial suppliers in the form of oligonucleotides, and subsequently assembled by students into ∼750-bp fragments. Once trained in assembly of such DNA “building blocks” by PCR, the students accomplish high-yield gene synthesis, becoming not only technically proficient but also constructively critical and capable of adapting their protocols as independent researchers. Regular “lab meeting” sessions help prepare them for future roles in laboratory science. PMID:19015540

  17. A bioinformatic analysis of ribonucleotide reductase genes in phage genomes and metagenomes

    PubMed Central

    2013-01-01

    Background Ribonucleotide reductase (RNR), the enzyme responsible for the formation of deoxyribonucleotides from ribonucleotides, is found in all domains of life and many viral genomes. RNRs are also amongst the most abundant genes identified in environmental metagenomes. This study focused on understanding the distribution, diversity, and evolution of RNRs in phages (viruses that infect bacteria). Hidden Markov Model profiles were used to analyze the proteins encoded by 685 completely sequenced double-stranded DNA phages and 22 environmental viral metagenomes to identify RNR homologs in cultured phages and uncultured viral communities, respectively. Results RNRs were identified in 128 phage genomes, nearly tripling the number of phages known to encode RNRs. Class I RNR was the most common RNR class observed in phages (70%), followed by class II (29%) and class III (28%). Twenty-eight percent of the phages contained genes belonging to multiple RNR classes. RNR class distribution varied according to phage type, isolation environment, and the host’s ability to utilize oxygen. The majority of the phages containing RNRs are Myoviridae (65%), followed by Siphoviridae (30%) and Podoviridae (3%). The phylogeny and genomic organization of phage and host RNRs reveal several distinct evolutionary scenarios involving horizontal gene transfer, co-evolution, and differential selection pressure. Several putative split RNR genes interrupted by self-splicing introns or inteins were identified, providing further evidence for the role of frequent genetic exchange. Finally, viral metagenomic data indicate that RNRs are prevalent and highly dynamic in uncultured viral communities, necessitating future research to determine the environmental conditions under which RNRs provide a selective advantage. Conclusions This comprehensive study describes the distribution, diversity, and evolution of RNRs in phage genomes and environmental viral metagenomes. The distinct distributions of

  18. Genome mining of mycosporine-like amino acid (MAA) synthesizing and non-synthesizing cyanobacteria: A bioinformatics study.

    PubMed

    Singh, Shailendra P; Klisch, Manfred; Sinha, Rajeshwar P; Häder, Donat-P

    2010-02-01

    Mycosporine-like amino acids (MAAs) are a family of more than 20 compounds having absorption maxima between 310 and 362 nm. These compounds are well known for their UV-absorbing/screening role in various organisms and seem to have evolutionary significance. In the present investigation we tested four cyanobacteria, namely Anabaena variabilis PCC 7937, Anabaena sp. PCC 7120, Synechocystis sp. PCC 6803 and Synechococcus sp. PCC 6301, for their ability to synthesize MAA and conducted genomic and phylogenetic analysis to identify the possible set of genes that might be involved in the biosynthesis of these compounds. Out of the four investigated species, only A. variabilis PCC 7937 was able to synthesize MAA. Genome mining identified a combination of genes, YP_324358 (predicted DHQ synthase) and YP_324357 (O-methyltransferase), which were present only in A. variabilis PCC 7937 and missing in the other studied cyanobacteria. Phylogenetic analysis revealed that these two genes were transferred from a cyanobacterial donor to dinoflagellates and finally to metazoa by a lateral gene transfer event. All other cyanobacteria that have these two genes also had another copy of the DHQ synthase gene. The predicted protein structure for YP_324358 also suggested that this product is different from the chemically characterized DHQ synthase of Aspergillus nidulans, in contrast to YP_324879, which was predicted to be similar to the DHQ synthase. The present study provides a first insight into the genes of cyanobacteria involved in MAA biosynthesis and thus widens the field of research for molecular, bioinformatics and phylogenetic analysis of these evolutionarily and industrially important compounds. Based on the results we propose that YP_324358 and YP_324357 gene products are involved in the biosynthesis of the common core (deoxygadusol) of all MAAs. PMID:19879348

  19. Challenges of web-based personal genomic data sharing.

    PubMed

    Shabani, Mahsa; Borry, Pascal

    2015-01-01

    In order to study the relationship between genes and diseases, the increasing availability and sharing of phenotypic and genotypic data have been promoted as an imperative within the scientific community. In parallel with data sharing practices by clinicians and researchers, recent initiatives have been observed in which individuals are sharing personal genomic data. The involvement of individuals in such initiatives is facilitated by the increased accessibility of personal genomic data, offered by private test providers along with availability of online networks. Personal webpages and on-line data sharing platforms such as Consent to Research (Portable Legal Consent), Free the Data, and Genomes Unzipped are being utilized to host and share genotypes, electronic health records and family history uploaded by individuals. Although personal genomic data sharing initiatives vary in nature, the emphasis on the individuals' control over their data in order to benefit research and ultimately health care has been seen as a key theme across these initiatives. In line with the growing practice of personal genomic data sharing, this paper aims to shed light on the potential challenges surrounding these initiatives. As in the course of these initiatives individuals are solicited to individually balance the risks and benefits of sharing their genomic data, their awareness of the implications of personal genomic data sharing for themselves and their family members is a necessity. Furthermore, given the sensitivity of genomic data and the controversies around their complete de-identifiability, potential privacy risks and harms originating from unintended uses of data have to be taken into consideration. PMID:26085313

  20. Genomic and Bioinformatics Analysis of HAdV-4, a Human Adenovirus Causing Acute Respiratory Disease: Implications for Gene Therapy and Vaccine Vector Development

    PubMed Central

    Purkayastha, Anjan; Ditty, Susan E.; Su, Jing; McGraw, John; Hadfield, Ted L.; Tibbetts, Clark; Seto, Donald

    2005-01-01

    Human adenovirus serotype 4 (HAdV-4) is a reemerging viral pathogenic agent implicated in epidemic outbreaks of acute respiratory disease (ARD). This report presents a genomic and bioinformatics analysis of the prototype 35,990-nucleotide genome (GenBank accession no. AY594253). Intriguingly, the genome analysis suggests a closer phylogenetic relationship with the chimpanzee adenoviruses (simian adenoviruses) rather than with other human adenoviruses, suggesting a recent origin of HAdV-4, and therefore species E, through a zoonotic event from chimpanzees to humans. Bioinformatics analysis also suggests a pre-zoonotic recombination event, as well, between species B-like and species C-like simian adenoviruses. These observations may have implications for the current interest in using chimpanzee adenoviruses in the development of vectors for human gene therapy and for DNA-based vaccines. Also, the reemergence, surveillance, and treatment of HAdV-4 as an ARD pathogen is an opportunity to demonstrate the use of genome determination as a tool for viral infectious disease characterization and epidemic outbreak surveillance: for example, rapid and accurate low-pass sequencing and analysis of the genome. In particular, this approach allows the rapid identification and development of unique probes for the differentiation of family, species, serotype, and strain (e.g., pathogen genome signatures) for monitoring epidemic outbreaks of ARD. PMID:15681456

  1. Genomic and bioinformatics analysis of HAdV-4, a human adenovirus causing acute respiratory disease: implications for gene therapy and vaccine vector development.

    PubMed

    Purkayastha, Anjan; Ditty, Susan E; Su, Jing; McGraw, John; Hadfield, Ted L; Tibbetts, Clark; Seto, Donald

    2005-02-01

    Human adenovirus serotype 4 (HAdV-4) is a reemerging viral pathogenic agent implicated in epidemic outbreaks of acute respiratory disease (ARD). This report presents a genomic and bioinformatics analysis of the prototype 35,990-nucleotide genome (GenBank accession no. AY594253). Intriguingly, the genome analysis suggests a closer phylogenetic relationship with the chimpanzee adenoviruses (simian adenoviruses) rather than with other human adenoviruses, suggesting a recent origin of HAdV-4, and therefore species E, through a zoonotic event from chimpanzees to humans. Bioinformatics analysis also suggests a pre-zoonotic recombination event, as well, between species B-like and species C-like simian adenoviruses. These observations may have implications for the current interest in using chimpanzee adenoviruses in the development of vectors for human gene therapy and for DNA-based vaccines. Also, the reemergence, surveillance, and treatment of HAdV-4 as an ARD pathogen is an opportunity to demonstrate the use of genome determination as a tool for viral infectious disease characterization and epidemic outbreak surveillance: for example, rapid and accurate low-pass sequencing and analysis of the genome. In particular, this approach allows the rapid identification and development of unique probes for the differentiation of family, species, serotype, and strain (e.g., pathogen genome signatures) for monitoring epidemic outbreaks of ARD. PMID:15681456

  2. AnnoTALE: bioinformatics tools for identification, annotation, and nomenclature of TALEs from Xanthomonas genomic sequences

    PubMed Central

    Grau, Jan; Reschke, Maik; Erkes, Annett; Streubel, Jana; Morgan, Richard D.; Wilson, Geoffrey G.; Koebnik, Ralf; Boch, Jens

    2016-01-01

    Transcription activator-like effectors (TALEs) are virulence factors, produced by the bacterial plant-pathogen Xanthomonas, that function as gene activators inside plant cells. Although the contribution of individual TALEs to infectivity has been shown, the specific roles of most TALEs and the overall TALE diversity in Xanthomonas spp. are not known. TALEs possess a highly repetitive DNA-binding domain, which is notoriously difficult to sequence. Here, we describe an improved method for characterizing TALE genes by the use of PacBio sequencing. We present ‘AnnoTALE’, a suite of applications for the analysis and annotation of TALE genes from Xanthomonas genomes, and for grouping similar TALEs into classes. Based on these classes, we propose a unified nomenclature for Xanthomonas TALEs that reveals similarities pointing to related functionalities. This new classification enables us to compare related TALEs and to identify base substitutions responsible for the evolution of TALE specificities. PMID:26876161

  3. Integrative bioinformatics for functional genome annotation: trawling for G protein-coupled receptors.

    PubMed

    Flower, Darren R; Attwood, Teresa K

    2004-12-01

    G protein-coupled receptors (GPCRs) are amongst the best studied and most functionally diverse types of cell-surface protein. The importance of GPCRs as mediators of cell function and organismal development underlies their involvement in key physiological roles and their prominence as targets for pharmacological therapeutics. In this review, we highlight the requirement for integrated protocols which underline the different perspectives offered by different sequence analysis methods. BLAST and FastA offer broad brush strokes. Motif-based search methods add the fine detail. Structural modelling offers another perspective which allows us to elucidate the physicochemical properties that underlie ligand binding. Together, these different views provide a more informative and a more detailed picture of GPCR structure and function. Many GPCRs remain orphan receptors with no identified ligand, yet as computer-driven functional genomics starts to elaborate their functions, a new understanding of their roles in cell and developmental biology will follow. PMID:15561589

  4. Clinical evaluation incorporating a personal genome

    PubMed Central

    Ashley, Euan A.; Butte, Atul J.; Wheeler, Matthew T.; Chen, Rong; Klein, Teri E.; Dewey, Frederick E.; Dudley, Joel T.; Ormond, Kelly E.; Pavlovic, Aleksandra; Hudgins, Louanne; Gong, Li; Hodges, Laura M.; Berlin, Dorit S.; Thorn, Caroline F.; Sangkuhl, Katrin; Hebert, Joan M.; Woon, Mark; Sagreiya, Hersh; Whaley, Ryan; Morgan, Alexander A.; Pushkarev, Dmitry; Neff, Norma F; Knowles, Joshua W.; Chou, Mike; Thakuria, Joseph; Rosenbaum, Abraham; Zaranek, Alexander Wait; Church, George; Greely, Henry T.; Quake, Stephen R.; Altman, Russ B.

    2010-01-01

    Background The cost of genomic information has fallen steeply but the path to clinical translation of risk estimates for common variants found in genome wide association studies remains unclear. Since the speed and cost of sequencing complete genomes is rapidly declining, more comprehensive means of analyzing these data in concert with rare variants for genetic risk assessment and individualisation of therapy are required. Here, we present the first integrated analysis of a complete human genome in a clinical context. Methods An individual with a family history of vascular disease and early sudden death was evaluated. Clinical assessment included risk prediction for coronary artery disease, screening for causes of sudden cardiac death, and genetic counselling. Genetic analysis included the development of novel methods for the integration of whole genome sequence data including 2.6 million single nucleotide polymorphisms and 752 copy number variations. The algorithm focused on predicting genetic risk of genes associated with known Mendelian disease, recognised drug responses, and pathogenicity for novel variants. In addition, since integration of risk ratios derived from case control studies is challenging, we estimated posterior probabilities from age and sex appropriate prior probability and likelihood ratios derived for each genotype. In addition, we developed a visualisation approach to account for gene-environment interactions and conditionally dependent risks. Findings We found increased genetic risk for myocardial infarction, type II diabetes and certain cancers. Rare variants in LPA are consistent with the family history of coronary artery disease. Pharmacogenomic analysis suggested a positive response to lipid lowering therapy, likely clopidogrel resistance, and a low initial dosing requirement for warfarin. Many variants of uncertain significance were reported. 
    Interpretation Although challenges remain, our results suggest that whole genome sequencing can
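
    The Methods of the record above mention estimating posterior probabilities from an age- and sex-appropriate prior probability and per-genotype likelihood ratios. A minimal sketch of that kind of calculation follows: convert the prior to odds, multiply by each likelihood ratio, and convert back to a probability. The function name and all numbers are hypothetical, and multiplying the ratios assumes the genotypes contribute independently, which is an assumption of this example and not a statement about the authors' algorithm.

```python
def posterior_probability(prior, likelihood_ratios):
    """Combine a prior probability with likelihood ratios via Bayes' rule on odds.

    prior: pre-test probability of disease, strictly between 0 and 1.
    likelihood_ratios: one ratio per genotype, assumed independent (illustrative).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    odds = prior / (1.0 - prior)          # probability -> odds
    for lr in likelihood_ratios:
        if lr <= 0.0:
            raise ValueError("likelihood ratios must be positive")
        odds *= lr                        # each genotype rescales the odds
    return odds / (1.0 + odds)            # odds -> probability


if __name__ == "__main__":
    # Hypothetical inputs: a 20% prior and two risk genotypes.
    print(round(posterior_probability(0.20, [1.5, 1.2]), 3))
```

    Working in odds keeps the update a simple product; a ratio above 1 raises the estimate, a ratio below 1 lowers it, and ratios that cancel leave the prior unchanged.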

  5. Cancer Genome Sequencing and Its Implications for Personalized Cancer Vaccines

    PubMed Central

    Li, Lijin; Goedegebuure, Peter; Mardis, Elaine R.; Ellis, Matthew J.C.; Zhang, Xiuli; Herndon, John M.; Fleming, Timothy P.; Carreno, Beatriz M.; Hansen, Ted H.; Gillanders, William E.

    2011-01-01

    New DNA sequencing platforms have revolutionized human genome sequencing. The dramatic advances in genome sequencing technologies predict that the $1,000 genome will become a reality within the next few years. Applied to cancer, the availability of cancer genome sequences permits real-time decision-making with the potential to affect diagnosis, prognosis, and treatment, and has opened the door towards personalized medicine. A promising strategy is the identification of mutated tumor antigens, and the design of personalized cancer vaccines. Supporting this notion are preliminary analyses of the epitope landscape in breast cancer suggesting that individual tumors express significant numbers of novel antigens to the immune system that can be specifically targeted through cancer vaccines. PMID:24213133

  6. Assessing Student Understanding of the "New Biology": Development and Evaluation of a Criterion-Referenced Genomics and Bioinformatics Assessment

    NASA Astrophysics Data System (ADS)

    Campbell, Chad Edward

    Over the past decade, hundreds of studies have introduced genomics and bioinformatics (GB) curricula and laboratory activities at the undergraduate level. While these publications have facilitated the teaching and learning of cutting-edge content, there has yet to be an evaluation of these assessment tools to determine if they are meeting the quality control benchmarks set forth by the educational research community. An analysis of these assessment tools indicated that <10% referenced any quality control criteria and that none of the assessments met more than one of the quality control benchmarks. In the absence of evidence that these benchmarks had been met, it is unclear whether these assessment tools are capable of generating valid and reliable inferences about student learning. To remedy this situation the development of a robust GB assessment aligned with the quality control benchmarks was undertaken in order to ensure evidence-based evaluation of student learning outcomes. Content validity is a central piece of construct validity, and it must be used to guide instrument and item development. This study reports on: (1) the correspondence of content validity evidence gathered from independent sources; (2) the process of item development using this evidence; (3) the results from a pilot administration of the assessment; (4) the subsequent modification of the assessment based on the pilot administration results; and (5) the results from the second administration of the assessment. Twenty-nine different subtopics within GB (Appendix B: Genomics and Bioinformatics Expert Survey) were developed based on preliminary GB textbook analyses. These subtopics were analyzed using two methods designed to gather content validity evidence: (1) a survey of GB experts (n=61) and (2) a detailed content analysis of GB textbooks (n=6). By including only the subtopics that were shown to have robust support across these sources, 22 GB subtopics were established for inclusion in the

  7. Chemogenomics: a discipline at the crossroad of high throughput technologies, biomarker research, combinatorial chemistry, genomics, cheminformatics, bioinformatics and artificial intelligence.

    PubMed

    Maréchal, Eric

    2008-09-01

    Chemogenomics is the study of the interaction of functional biological systems with exogenous small molecules, or in a broader sense the study of the intersection of biological and chemical spaces. Chemogenomics requires expertise in biology, chemistry and computational sciences (bioinformatics, cheminformatics, large scale statistics and machine learning methods) but it is more than the simple apposition of each of these disciplines. Biological entities interacting with small molecules can be isolated proteins or more elaborate systems, from single cells to complete organisms. The biological space is therefore analyzed at various postgenomic levels (genomic, transcriptomic, proteomic or any phenotypic level). The space of small molecules is partially real, corresponding to commercial and academic collections of compounds, and partially virtual, corresponding to the chemical space possibly synthesizable. Synthetic chemistry has developed novel strategies allowing a physical exploration of this universe of possibilities. A major challenge of cheminformatics is to chart the virtual space of small molecules using realistic biological constraints (bioavailability, druggability, structural biological information). Chemogenomics is a descendant of conventional pharmaceutical approaches, since it involves the screening of chemolibraries for their effect on biological targets, and benefits from the advances in the corresponding enabling technologies and the introduction of new biological markers. Screening was originally motivated by the rigorous discovery of new drugs, neglecting and throwing away any molecule that would fail to meet the standards required for a therapeutic treatment. It is now the basis for the discovery of small molecules that might or might not be directly used as drugs, but which have an immense potential for basic research, as probes to explore an increasing number of biological phenomena. Concerns about the environmental impact of chemical industry

  8. Genome-wide bioinformatics analysis of steroid metabolism-associated genes in Nocardioides simplex VKM Ac-2033D.

    PubMed

    Shtratnikova, Victoria Y; Schelkunov, Mikhail I; Fokina, Victoria V; Pekov, Yury A; Ivashina, Tanya; Donova, Marina V

    2016-08-01

    Actinobacteria comprise diverse groups of bacteria capable of full degradation or modification of different steroid compounds. Steroid catabolism has been characterized best for the representatives of suborder Corynebacterineae, such as Mycobacteria, Rhodococcus and Gordonia, with high content of mycolic acids in the cell envelope, while it is poorly understood for other steroid-transforming actinobacteria, such as representatives of Nocardioides genus belonging to suborder Propionibacterineae. Nocardioides simplex VKM Ac-2033D is an important biotechnological strain which is known for its ability to introduce ∆(1)-double bond in various 1(2)-saturated 3-ketosteroids, and to perform conversion of 3β-hydroxy-5-ene steroids to 3-oxo-4-ene steroids, hydrolysis of acetylated steroids, and reduction of carbonyl groups at C-17 and C-20 of androstanes and pregnanes, respectively. The strain is also capable of utilizing cholesterol and phytosterol as carbon and energy sources. In this study, a comprehensive bioinformatics genome-wide screening was carried out to predict genes related to steroid metabolism in this organism, their clustering and possible regulation. The predicted operon structure and number of candidate gene copies (paralogs) have been estimated. Binding sites of steroid catabolism regulators KstR and KstR2 specified for N. simplex VKM Ac-2033D have been calculated de novo. Most of the candidate genes grouped within three main clusters, one of the predicted clusters having no analogs in other actinobacteria studied so far. The results offer a base for further functional studies, expand the understanding of steroid catabolism by actinobacteria, and will contribute to the modification of metabolic pathways in order to generate effective biocatalysts capable of producing valuable bioactive steroids. PMID:26832142

  9. Genome Science and Personalized Cancer Treatment

    SciTech Connect

    Gray, Joe

    2009-08-07

    August 4, 2009 Berkeley Lab lecture: Results from the Human Genome Project are enabling scientists to understand how individual cancers form and progress. This information, when combined with newly developed drugs, can optimize the treatment of individual cancers. Joe Gray, director of Berkeley Lab's Life Sciences Division and Associate Laboratory Director for Life and Environmental Sciences, will focus on this approach, its promise, and its current roadblocks, particularly with regard to breast cancer.

  10. Genome Science and Personalized Cancer Treatment

    SciTech Connect

    Gray, Joe

    2009-08-04

    Summer Lecture Series 2009: Results from the Human Genome Project are enabling scientists to understand how individual cancers form and progress. This information, when combined with newly developed drugs, can optimize the treatment of individual cancers. Joe Gray, director of Berkeley Lab's Life Sciences Division and Associate Laboratory Director for Life and Environmental Sciences, will focus on this approach, its promise, and its current roadblocks, particularly with regard to breast cancer.

  11. Genome Science and Personalized Cancer Treatment

    ScienceCinema

    Gray, Joe

    2010-01-08

    August 4, 2009 Berkeley Lab lecture: Results from the Human Genome Project are enabling scientists to understand how individual cancers form and progress. This information, when combined with newly developed drugs, can optimize the treatment of individual cancers. Joe Gray, director of Berkeley Lab's Life Sciences Division and Associate Laboratory Director for Life and Environmental Sciences, will focus on this approach, its promise, and its current roadblocks, particularly with regard to breast cancer.

  12. The predictive capacity of personal genome sequencing.

    PubMed

    Roberts, Nicholas J; Vogelstein, Joshua T; Parmigiani, Giovanni; Kinzler, Kenneth W; Vogelstein, Bert; Velculescu, Victor E

    2012-05-01

    New DNA sequencing methods will soon make it possible to identify all germline variants in any individual at a reasonable cost. However, the ability of whole-genome sequencing to predict predisposition to common diseases in the general population is unknown. To estimate this predictive capacity, we use the concept of a "genometype." A specific genometype represents the genomes in the population conferring a specific level of genetic risk for a specified disease. Using this concept, we estimated the maximum capacity of whole-genome sequencing to identify individuals at clinically significant risk for 24 different diseases. Our estimates were derived from the analysis of large numbers of monozygotic twin pairs; twins of a pair share the same genometype and therefore identical genetic risk factors. Our analyses indicate that (i) for 23 of the 24 diseases, most of the individuals will receive negative test results; (ii) these negative test results will, in general, not be very informative, because the risk of developing 19 of the 24 diseases in those who test negative will still be, at minimum, 50 to 80% of that in the general population; and (iii) on the positive side, in the best-case scenario, more than 90% of tested individuals might be alerted to a clinically significant predisposition to at least one disease. These results have important implications for the valuation of genetic testing by industry, health insurance companies, public policy-makers, and consumers. PMID:22472521

  13. Genome Paths: A Way to Personalized and Predictive Medicine

    PubMed Central

    2009-01-01

    The review is devoted to the impact of human genome research on progress in modern medicine. Basic achievements in genome research have resulted in the deciphering of the human genome and creation of a molecular landmarks map of the human haploid genome (HapMap Project), which has made a tremendous contribution to our understanding of common genetic and multifactorial (complex) disorders. Current genome studies mainly focus on genetic testing and gene association studies of multifactorial (complex) diseases, with the purpose of their efficient diagnostics and prevention. Identification of candidate ("predisposition") genes participating in the functional genetic modules underlying each common disorder and the use of this genetic background to elaborate sophisticated measures to efficiently prevent them constitutes a major goal in personalized molecular medicine. The concept of a genetic pass as an individual DNA databank reflecting inherited human predisposition to different complex and monogenic disorders, with special emphasis on its present state, and the numerous difficulties related to the practical implementation of personalized medicine are outlined. The problems related to the uncertainty of the results of genetic testing could be overcome at least partly by means of new technological achievements in genome research methods, such as genome-wide association studies (GWAS), massively parallel DNA sequencing, and genetic and epigenetic profiling. The basic task of genomics today can be defined as the need to properly estimate the clinical value of genetic testing and its applicability in clinical practice. Feasible ways towards the gradual implementation of personal genetic data, in line with routine laboratory tests, for the benefit of clinical practice are discussed. PMID:22649616

  14. Anticipation of Personal Genomics Data Enhances Interest and Learning Environment in Genomics and Molecular Biology Undergraduate Courses.

    PubMed

    Weber, K Scott; Jensen, Jamie L; Johnson, Steven M

    2015-01-01

    An important discussion at colleges is centered on determining more effective models for teaching undergraduates. As personalized genomics has become more common, we hypothesized it could be a valuable tool to make science education more hands on, personal, and engaging for college undergraduates. We hypothesized that providing students with personal genome testing kits would enhance the learning experience of students in two undergraduate courses at Brigham Young University: Advanced Molecular Biology and Genomics. These courses have an emphasis on personal genomics the last two weeks of the semester. Students taking these courses were given the option to receive personal genomics kits in 2014, whereas in 2015 they were not. Students sent their personal genomics samples in on their own and received the data after the course ended. We surveyed students in these courses before and after the two-week emphasis on personal genomics to collect data on whether anticipation of obtaining their own personal genomic data impacted undergraduate student learning. We also tested to see if specific personal genomic assignments improved the learning experience by analyzing the data from the undergraduate students who completed both the pre- and post-course surveys. Anticipation of personal genomic data significantly enhanced student interest and the learning environment based on the time students spent researching personal genomic material and their self-reported attitudes compared to those who did not anticipate getting their own data. Personal genomics homework assignments significantly enhanced the undergraduate student interest and learning based on the same criteria and a personal genomics quiz. We found that for the undergraduate students in both molecular biology and genomics courses, incorporation of personal genomic testing can be an effective educational tool in undergraduate science education. PMID:26241308

  15. Getting up close and personal with your genome.

    PubMed

    Bonetta, Laura

    2008-05-30

    A new type of company is offering to scan a person's genome and reveal the information it holds for as little as $1000. Are these services fun novelty items or do they provide valuable information that will help people take better care of their health? PMID:18510915

  16. Re-Examining the Gene in Personalized Genomics

    ERIC Educational Resources Information Center

    Bartol, Jordan

    2013-01-01

    Personalized genomics companies (PG; also called "direct-to-consumer genetics") are businesses marketing genetic testing to consumers over the Internet. While much has been written about these new businesses, little attention has been given to their roles in science communication. This paper provides an analysis of the gene concept…

  17. Making Personalized Health Care Even More Personalized: Insights From Activities of the IOM Genomics Roundtable

    PubMed Central

    David, Sean P.; Johnson, Samuel G.; Berger, Adam C.; Feero, W. Gregory; Terry, Sharon F.; Green, Larry A.; Phillips, Robert L.; Ginsburg, Geoffrey S.

    2015-01-01

    Genomic research has generated much new knowledge into mechanisms of human disease, with the potential to catalyze novel drug discovery and development, prenatal and neonatal screening, clinical pharmacogenomics, more sensitive risk prediction, and enhanced diagnostics. Genomic medicine, however, has been limited by critical evidence gaps, especially those related to clinical utility and applicability to diverse populations. Genomic medicine may have the greatest impact on health care if it is integrated into primary care, where most health care is received and where evidence supports the value of personalized medicine grounded in continuous healing relationships. Redesigned primary care is the most relevant setting for clinically useful genomic medicine research. Taking insights gained from the activities of the Institute of Medicine (IOM) Roundtable on Translating Genomic-Based Research for Health, we apply lessons learned from the patient-centered medical home national experience to implement genomic medicine in a patient-centered, learning health care system. PMID:26195686

  18. Bioinformatics for Genome Analysis

    SciTech Connect

    Gary J. Olsen

    2005-06-30

    Nesbo, Boucher and Doolittle (2001) used phylogenetic trees of four taxa to assess whether euryarchaeal genes share a common history. They have suggested that of the 521 genes examined, each of the three possible tree topologies relating the four taxa was supported an essentially equal number of times. They suggest that this might be the result of numerous horizontal gene transfer events, essentially randomizing the relationships between gene histories (as inferred in the 521 gene trees) and organismal relationships (which would be a single underlying tree). Motivated by the fact that the order in which sequences are added to a multiple sequence alignment influences the alignment, and ultimately the inferred tree, the investigators were interested in the extent to which the variations among inferred trees might be due to variations in the alignment order. This bears directly on their efforts to evaluate and improve upon methods of multiple sequence alignment. They set out to analyze the influence of alignment order on the tree inferred for 43 genes shared among these same 4 taxa. Because alignments produced by CLUSTALW are directed by a rooted guide tree (the dendrogram), there are 15 possible alignment orders of 4 taxa. For each gene they tested all 15 alignment orders, and as a 16th option, allowed CLUSTALW to generate its own guide tree. They expected that, if all 15 possible rooted guide trees were supplied, at least one of them would be as good as CLUSTAL's own guide tree, but most of the time the results differed (sometimes being better than CLUSTAL's default tree and sometimes being worse). The difference seems to be that the user-supplied tree is not given meaningful branch lengths, which affect the assumed probability of amino acid changes. They examined the practicality of modifying CLUSTALW to improve its treatment of user-supplied guide trees. This work became increasingly bogged down in finding and repairing minor bugs in the CLUSTALW code. This effort was put on hold because the investigators feel that their other proposed approaches will ultimately be better.
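
    The two tree counts quoted in the abstract above (three possible topologies relating four taxa, and 15 possible rooted guide trees) are consistent with the standard double-factorial formulas for leaf-labeled bifurcating trees. The following is a minimal Python sketch of that arithmetic only; the function names are chosen here for illustration and do not come from the report or from CLUSTALW.

```python
def odd_double_factorial(m: int) -> int:
    """Return m!! for odd m (product of odd integers from 1 to m); 1 if m < 1."""
    result = 1
    for k in range(3, m + 1, 2):
        result *= k
    return result


def unrooted_tree_count(n_taxa: int) -> int:
    """Unrooted bifurcating trees on n labeled taxa: (2n - 5)!!, for n >= 3."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa for an unrooted bifurcating tree")
    return odd_double_factorial(2 * n_taxa - 5)


def rooted_tree_count(n_taxa: int) -> int:
    """Rooted bifurcating trees on n labeled taxa: (2n - 3)!!, for n >= 2."""
    if n_taxa < 2:
        raise ValueError("need at least 2 taxa for a rooted bifurcating tree")
    return odd_double_factorial(2 * n_taxa - 3)


if __name__ == "__main__":
    print(unrooted_tree_count(4))  # topologies relating four taxa
    print(rooted_tree_count(4))    # rooted guide trees for four taxa
```

    For four taxa the sketch gives 3 unrooted and 15 rooted trees, matching the figures in the abstract.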

  19. Bayesian predictive modeling for genomic based personalized treatment selection.

    PubMed

    Ma, Junsheng; Stingo, Francesco C; Hobbs, Brian P

    2016-06-01

    Efforts to personalize medicine in oncology have been limited by reductive characterizations of the intrinsically complex underlying biological phenomena. Future advances in personalized medicine will rely on molecular signatures that derive from synthesis of multifarious interdependent molecular quantities requiring robust quantitative methods. However, highly parameterized statistical models when applied in these settings often require a prohibitively large database and are sensitive to proper characterizations of the treatment-by-covariate interactions, which in practice are difficult to specify and may be limited by generalized linear models. In this article, we present a Bayesian predictive framework that enables the integration of a high-dimensional set of genomic features with clinical responses and treatment histories of historical patients, providing a probabilistic basis for using the clinical and molecular information to personalize therapy for future patients. Our work represents one of the first attempts to define personalized treatment assignment rules based on large-scale genomic data. We use actual gene expression data acquired from The Cancer Genome Atlas in the settings of leukemia and glioma to explore the statistical properties of our proposed Bayesian approach for personalizing treatment selection. The method is shown to yield considerable improvements in predictive accuracy when compared to penalized regression approaches. PMID:26575856

  20. Education and personalized genomics: deciphering the public's genetic health report

    PubMed Central

    Lamb, Neil E; Myers, Richard M; Gunter, Chris

    2010-01-01

    Where do members of the public turn to understand what genetic tests mean in terms of their own health? Now that genome-wide association studies and complete genome sequencing are widely available, the importance of education in personalized genomics cannot be overstated. Although some media have introduced the concept of genetic testing to better understand health and disease, the public's understanding of the scope and impact of genetic variation has not kept up with the pace of the science or technology. Unfortunately, the likely sources to which the public turn for guidance – their physician and the media – are often no better prepared. We examine several venues for information, including print and online guides for both lay and health-oriented audiences, and summarize selected resources in multiple formats. We also note the roadblocks to progress and discuss ways to remove them, as urgent action is needed to connect people with their genomes in a meaningful way. PMID:20161675

  1. Identification of conserved and polymorphic STRs for personal genomes

    PubMed Central

    2014-01-01

    Background Short tandem repeats (STRs) are abundant in human genomes. Numerous STRs have been shown to be associated with genetic diseases and gene regulatory functions, and have been selected as genetic markers for evolutionary and forensic analyses. High-throughput next generation sequencers have fostered new cutting-edge computing techniques for genome-scale analyses, and cross-genome comparisons have facilitated the efficient identification of polymorphic STR markers for various applications. Results An automated and efficient system for detecting human polymorphic STRs at the genome scale is proposed in this study. Assembled contigs from next generation sequencing data were aligned and calibrated according to selected reference sequences. To verify identified polymorphic STRs, human genomes from the 1000 Genomes Project were employed for comprehensive analyses, and STR markers from the Combined DNA Index System (CODIS) and disease-related STR motifs were also applied as cases for evaluation. In addition, we analyzed STR variations for highly conserved homologous genes and human-unique genes. In total 477 polymorphic STRs were identified from 492 human-unique genes, among which 26 STRs were retrieved and clustered into three different groups for efficient comparison. Conclusions We have developed an online system that efficiently identifies polymorphic STRs and provides novel distinguishable STR biomarkers for different levels of specificity. Candidate polymorphic STRs within a personal genome could be easily retrieved and compared to the constructed STR profile through query keywords, gene names, or assembled contigs. PMID:25560225

  2. SNPedia: a wiki supporting personal genome annotation, interpretation and analysis.

    PubMed

    Cariaso, Michael; Lennon, Greg

    2012-01-01

    SNPedia (http://www.SNPedia.com) is a wiki resource of the functional consequences of human genetic variation as published in peer-reviewed studies. Online since 2006 and freely available for personal use, SNPedia has focused on the medical, phenotypic and genealogical associations of single nucleotide polymorphisms. Entries are formatted to allow associations to be assigned to single genotypes as well as sets of genotypes (genosets). In this article, we discuss the growth of this resource and its use by affiliated software to create personal genome reports. PMID:22140107

  3. Personalized Genomic Medicine and the Rhetoric of Empowerment

    PubMed Central

    Juengst, Eric T.; Flatt, Michael A.; Settersten, Richard A.

    2013-01-01

    Advocates of “personalized” genomic medicine maintain that it is revolutionary not just in what it can reveal to us, but in how it will enable us to take control of our health. But we should not assume that patient empowerment always yields positive outcomes. To assess the social impact of personalized medicine, we must anticipate how the virtue might go awry in practice. PMID:22976411

  4. Mitochondrial and nuclear genomics and the emergence of personalized medicine

    PubMed Central

    2012-01-01

    Developing early detection biosensors for disease has been the long-held goal of the Human Genome Project, but with little success. Conversely, the biological properties of the mitochondrion coupled with the relative simplicity of the mitochondrial genome give this organelle extraordinary functionality as a biosensor and place the field of mitochondrial genomics in a position of strategic advantage to launch significant advances in personalized medicine. Numerous factors make the mitochondrion uniquely suited to be an early detection biosensor with applications in oncology as well as many other aspects of human health and disease. Early detection of disease translates into more effective, less expensive treatments for disease and overall better prognoses for those at greater risk for developing diseases. PMID:23244780

  5. Systematic evaluation of personal genome services for Japanese individuals.

    PubMed

    Kido, Takashi; Kawashima, Minae; Nishino, Seiji; Swan, Melanie; Kamatani, Naoyuki; Butte, Atul J

    2013-11-01

    Disease risk prediction (DRP) is one of the most important challenges in personal genome research. Although many direct-to-consumer genetic test (DTC) companies have begun to offer personal genome services for DRP, there is still no consensus on what constitutes a gold-standard service. Here, we systematically evaluated the distributions of DRPs from three DTC companies, that is, 23andMe, Navigenics and deCODEme, for 22 diseases using three Japanese samples. We systematically quantified and analyzed the differences between each DTC company's DRPs. Our independency test showed that the overall prediction results were correlated with each other, but not perfectly matched; less than one-third mismatching of the opposite direction occurred in eight diseases. Moreover, we found that the differences could mainly be attributed to four factors: (1) single nucleotide polymorphism (SNP) selection, (2) average risk estimation, (3) the disease risk calculation algorithm and (4) ethnicity adjustment. In particular, only 7.1% of SNPs over 22 diseases were reviewed by all three companies. Therefore, development of a universal core SNPs list for non-Caucasian samples will be important for achieving better prediction capacity for Japanese samples. This systematic methodology provides useful insights for improving the capacity of DRPs in future personal genome services. PMID:24067293

  6. Biology in 'silico': The Bioinformatics Revolution.

    ERIC Educational Resources Information Center

    Bloom, Mark

    2001-01-01

    Explains the Human Genome Project (HGP) and efforts to sequence the human genome. Describes the role of bioinformatics in the project and considers it the genetics Swiss Army Knife, which has many different uses, for use in forensic science, medicine, agriculture, and environmental sciences. Discusses the use of bioinformatics in the high school…

  7. Attitudes regarding privacy of genomic information in personalized cancer therapy

    PubMed Central

    Rogith, Deevakar; Yusuf, Rafeek A; Hovick, Shelley R; Peterson, Susan K; Burton-Chase, Allison M; Li, Yisheng; Meric-Bernstam, Funda; Bernstam, Elmer V

    2014-01-01

    Objective To evaluate attitudes regarding privacy of genomic data in a sample of patients with breast cancer. Methods Female patients with breast cancer (n=100) completed a questionnaire assessing attitudes regarding concerns about privacy of genomic data. Results Most patients (83%) indicated that genomic data should be protected. However, only 13% had significant concerns regarding privacy of such data. Patients expressed more concern about insurance discrimination than employment discrimination (43% vs 28%, p<0.001). They expressed less concern about research institutions protecting the security of their molecular data than government agencies or drug companies (20% vs 38% vs 44%; p<0.001). Most did not express concern regarding the association of their genomic data with their name and personal identity (49% concerned), billing and insurance information (44% concerned), or clinical data (27% concerned). Significantly fewer patients were concerned about the association with clinical data than other data types (p<0.001). In the absence of direct benefit, patients were more willing to consent to sharing of deidentified than identified data with researchers not involved in their care (76% vs 60%; p<0.001). Most (85%) patients were willing to consent to DNA banking. Discussion While patients are opposed to indiscriminate release of genomic data, privacy does not appear to be their primary concern. Furthermore, we did not find any specific predictors of privacy concerns. Conclusions Patients generally expressed low levels of concern regarding privacy of genomic data, and many expressed willingness to consent to sharing their genomic data with researchers. PMID:24737606

  8. SpeedSeq: Ultra-fast personal genome analysis and interpretation

    PubMed Central

    Chiang, Colby; Layer, Ryan M.; Faust, Gregory G.; Lindberg, Michael R.; Rose, David B.; Garrison, Erik P.; Marth, Gabor T.; Quinlan, Aaron R.; Hall, Ira M.

    2015-01-01

    SpeedSeq is an open-source genome analysis platform that accomplishes alignment, variant detection and functional annotation of a 50× human genome in 13 hours on a low-cost server, alleviating a bioinformatics bottleneck that typically demands weeks of computation with extensive hands-on expert involvement. SpeedSeq offers competitive or superior performance to current methods for detecting germline and somatic single nucleotide variants, indels, and structural variants, and includes novel functionality for streamlined interpretation. PMID:26258291

  9. Illuminating the Black Box of Genome Sequence Assembly: A Free Online Tool to Introduce Students to Bioinformatics

    ERIC Educational Resources Information Center

    Taylor, D. Leland; Campbell, A. Malcolm; Heyer, Laurie J.

    2013-01-01

    Next-generation sequencing technologies have greatly reduced the cost of sequencing genomes. With the current sequencing technology, a genome is broken into fragments and sequenced, producing millions of "reads." A computer algorithm pieces these reads together in the genome assembly process. PHAST is a set of online modules…

  10. Diagnosis of an imprinted-gene syndrome by a novel bioinformatics analysis of whole-genome sequences from a family trio.

    PubMed

    Bodian, Dale L; Solomon, Benjamin D; Khromykh, Alina; Thach, Dzung C; Iyer, Ramaswamy K; Link, Kathleen; Baker, Robin L; Baveja, Rajiv; Vockley, Joseph G; Niederhuber, John E

    2014-11-01

    Whole-genome sequencing and whole-exome sequencing are becoming more widely applied in clinical medicine to help diagnose rare genetic diseases. Identification of the underlying causative mutations by genome-wide sequencing is greatly facilitated by concurrent analysis of multiple family members, most often the mother-father-proband trio, using bioinformatics pipelines that filter genetic variants by mode of inheritance. However, current pipelines are limited to Mendelian inheritance patterns and do not specifically address disorders caused by mutations in imprinted genes, such as forms of Angelman syndrome and Beckwith-Wiedemann syndrome. Using publicly available tools, we implemented a genetic inheritance search mode to identify imprinted-gene mutations. Application of this search mode to whole-genome sequences from a family trio led to a diagnosis for a proband for whom extensive clinical testing and Mendelian inheritance-based sequence analysis were nondiagnostic. The condition in this patient, IMAGe syndrome, is likely caused by the heterozygous mutation c.832A>G (p.Lys278Glu) in the imprinted gene CDKN1C. The genotypes and disease status of six members of the family are consistent with maternal expression of the gene, and allele-biased expression was confirmed by RNA-Seq for the heterozygotes. This analysis demonstrates that an imprinted-gene search mode is a valuable addition to genome sequence analysis pipelines for identifying disease-causative variants. PMID:25614875
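
    The abstract above describes filtering trio variants by mode of inheritance, extended with a search mode for imprinted genes. The following is a rough Python sketch of what such a filter could look like, not the authors' pipeline: the `TrioGenotype` record, the alternate-allele-count encoding, the function names, and the example genotypes are all assumptions made here for illustration. It treats a heterozygous proband variant as a candidate only when the variant allele can be traced to the parent whose copy of the gene is expressed (maternal for CDKN1C, per the abstract), and it deliberately ignores de novo and phase-ambiguous cases.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class TrioGenotype:
    """Hypothetical record: alternate-allele counts (0, 1 or 2) per family member."""
    gene: str
    proband: int
    mother: int
    father: int


def alt_parent_of_origin(g: TrioGenotype) -> Optional[str]:
    """Infer which parent transmitted the alternate allele to a heterozygous proband.

    Returns None when the proband is not heterozygous or the origin is ambiguous
    (both or neither parent carry the allele).
    """
    if g.proband != 1:
        return None
    if g.mother >= 1 and g.father == 0:
        return "maternal"
    if g.father >= 1 and g.mother == 0:
        return "paternal"
    return None


def imprinted_candidates(
    genotypes: List[TrioGenotype], expressed_allele: Dict[str, str]
) -> List[TrioGenotype]:
    """Keep variants in imprinted genes that sit on the expressed parental allele."""
    return [
        g
        for g in genotypes
        if g.gene in expressed_allele
        and alt_parent_of_origin(g) == expressed_allele[g.gene]
    ]


if __name__ == "__main__":
    # Illustrative genotypes only; these are not data from the paper.
    trio = [
        TrioGenotype("CDKN1C", proband=1, mother=1, father=0),
        TrioGenotype("CDKN1C", proband=1, mother=0, father=1),
        TrioGenotype("GENE_X", proband=1, mother=1, father=0),
    ]
    for hit in imprinted_candidates(trio, {"CDKN1C": "maternal"}):
        print(hit)
```

    A Mendelian dominant filter would treat the first two example records alike; the parent-of-origin check is what separates them in this sketch.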

  11. Diagnosis of an imprinted-gene syndrome by a novel bioinformatics analysis of whole-genome sequences from a family trio

    PubMed Central

    Bodian, Dale L; Solomon, Benjamin D; Khromykh, Alina; Thach, Dzung C; Iyer, Ramaswamy K; Link, Kathleen; Baker, Robin L; Baveja, Rajiv; Vockley, Joseph G; Niederhuber, John E

    2014-01-01

    Whole-genome sequencing and whole-exome sequencing are becoming more widely applied in clinical medicine to help diagnose rare genetic diseases. Identification of the underlying causative mutations by genome-wide sequencing is greatly facilitated by concurrent analysis of multiple family members, most often the mother–father–proband trio, using bioinformatics pipelines that filter genetic variants by mode of inheritance. However, current pipelines are limited to Mendelian inheritance patterns and do not specifically address disorders caused by mutations in imprinted genes, such as forms of Angelman syndrome and Beckwith–Wiedemann syndrome. Using publicly available tools, we implemented a genetic inheritance search mode to identify imprinted-gene mutations. Application of this search mode to whole-genome sequences from a family trio led to a diagnosis for a proband for whom extensive clinical testing and Mendelian inheritance-based sequence analysis were nondiagnostic. The condition in this patient, IMAGe syndrome, is likely caused by the heterozygous mutation c.832A>G (p.Lys278Glu) in the imprinted gene CDKN1C. The genotypes and disease status of six members of the family are consistent with maternal expression of the gene, and allele-biased expression was confirmed by RNA-Seq for the heterozygotes. This analysis demonstrates that an imprinted-gene search mode is a valuable addition to genome sequence analysis pipelines for identifying disease-causative variants. PMID:25614875

  12. Bioinformatic tools for using whole genome sequencing as a rapid high resolution diagnostic typing tool when tracing bioterror organisms in the food and feed chain.

    PubMed

    Segerman, Bo; De Medici, Dario; Ehling Schulz, Monika; Fach, Patrick; Fenicia, Lucia; Fricker, Martina; Wielinga, Peter; Van Rotterdam, Bart; Knutsson, Rickard

    2011-03-01

    The rapid technological development in the field of parallel sequencing offers new opportunities when tracing and tracking microorganisms in the food and feed chain. If a bioterror organism is deliberately spread, it is of crucial importance to get as much information as possible regarding the strain as fast as possible to aid the decision process and select suitable controls, tracing and tracking tools. Considerable effort has been made to sequence multiple strains of potential bioterror organisms, so there is a relatively large set of reference genomes available. This study is focused on how to use parallel sequencing for rapid phylogenomic analysis and to screen for genetic modifications. A bioinformatic methodology has been developed to rapidly analyze sequence data with minimal post-processing. Instead of assembling the genome, defining genes, defining orthologous relations and calculating distances, the present method can achieve a similar high resolution directly from the raw sequence data. The method defines orthologous sequence reads instead of orthologous genes and the average similarity of the core genome (ASC) is calculated. The sequence reads from the core and from the non-conserved genomic regions can also be separated for further analysis. Finally, the comparison algorithm is used to visualize the phylogenomic diversity of the bacterial bioterror organisms Bacillus anthracis and Clostridium botulinum using heat plot diagrams. PMID:20826036

  13. MISIS-2: A bioinformatics tool for in-depth analysis of small RNAs and representation of consensus master genome in viral quasispecies.

    PubMed

    Seguin, Jonathan; Otten, Patricia; Baerlocher, Loïc; Farinelli, Laurent; Pooggin, Mikhail M

    2016-07-01

    In most eukaryotes, small RNA (sRNA) molecules such as miRNAs, siRNAs and piRNAs regulate gene expression and repress transposons and viruses. AGO/PIWI family proteins sort functional sRNAs based on size, 5'-nucleotide and other sequence features. In plants and some animals, viral sRNAs are extremely diverse and cover the entire viral genome sequences, which allows for de novo reconstruction of a complete viral genome by deep sequencing and bioinformatics analysis of viral sRNAs. Previously, we have developed a tool, MISIS, to view and analyze sRNA maps of viruses and cellular genome regions which spawn multiple sRNAs. Here we describe a new release of MISIS, MISIS-2, which makes it possible to determine and visualize a consensus sequence and count sRNAs of any chosen sizes and 5'-terminal nucleotide identities. Furthermore, we demonstrate the utility of MISIS-2 for identification of single nucleotide polymorphisms (SNPs) at each position of a reference sequence and reconstruction of a consensus master genome in evolving viral quasispecies. MISIS-2 is a Java standalone program. It is freely available along with the source code at the website http://www.fasteris.com/apps. PMID:26994965

  14. Bioinformatics Pipelines for Targeted Resequencing and Whole-Exome Sequencing of Human and Mouse Genomes: A Virtual Appliance Approach for Instant Deployment

    PubMed Central

    Saeed, Isaam; Wong, Stephen Q.; Mar, Victoria; Goode, David L.; Caramia, Franco; Doig, Ken; Ryland, Georgina L.; Thompson, Ella R.; Hunter, Sally M.; Halgamuge, Saman K.; Ellul, Jason; Dobrovic, Alexander; Campbell, Ian G.; Papenfuss, Anthony T.; McArthur, Grant A.; Tothill, Richard W.

    2014-01-01

    Targeted resequencing by massively parallel sequencing has become an effective and affordable way to survey small to large portions of the genome for genetic variation. Despite the rapid development in open source software for analysis of such data, the practical implementation of these tools through construction of sequencing analysis pipelines still remains a challenging and laborious activity, and a major hurdle for many small research and clinical laboratories. We developed TREVA (Targeted REsequencing Virtual Appliance), making pre-built pipelines immediately available as a virtual appliance. Based on virtual machine technologies, TREVA is a solution for rapid and efficient deployment of complex bioinformatics pipelines to laboratories of all sizes, enabling reproducible results. The analyses that are supported in TREVA include: somatic and germline single-nucleotide and insertion/deletion variant calling, copy number analysis, and cohort-based analyses such as pathway and significantly mutated genes analyses. TREVA is flexible and easy to use, and can be customised by Linux-based extensions if required. TREVA can also be deployed on the cloud (cloud computing), enabling instant access without investment overheads for additional hardware. TREVA is available at http://bioinformatics.petermac.org/treva/. PMID:24752294

  15. Re-examining the Gene in Personalized Genomics

    NASA Astrophysics Data System (ADS)

    Bartol, Jordan

    2013-10-01

    Personalized genomics companies (PG; also called "direct-to-consumer genetics") are businesses marketing genetic testing to consumers over the Internet. While much has been written about these new businesses, little attention has been given to their roles in science communication. This paper provides an analysis of the gene concept presented to customers and the relation between the information given and the science behind PG. Two quite different gene concepts are present in company rhetoric, but only one features in the science. To explain this, we must appreciate the delicate tension between PG, academic science, public expectation, and market forces.

  16. Genomics and Bioinformatics in Undergraduate Curricula: Contexts for Hybrid Laboratory/Lecture Courses for Entering and Advanced Science Students

    ERIC Educational Resources Information Center

    Temple, Louise; Cresawn, Steven G.; Monroe, Jonathan D.

    2010-01-01

    Emerging interest in genomics in the scientific community prompted biologists at James Madison University to create two courses at different levels to modernize the biology curriculum. The courses are hybrids of classroom and laboratory experiences. An upper level class uses raw sequence of a genome (plasmid or virus) as the subject on which to…

  17. Incidentalome from Genomic Sequencing: A Barrier to Personalized Medicine?

    PubMed Central

    Jamuar, Saumya Shekhar; Kuan, Jyn Ling; Brett, Maggie; Tiang, Zenia; Tan, Wilson Lek Wen; Lim, Jiin Ying; Liew, Wendy Kein Meng; Javed, Asif; Liew, Woei Kang; Law, Hai Yang; Tan, Ee Shien; Lai, Angeline; Ng, Ivy; Teo, Yik Ying; Venkatesh, Byrappa; Reversade, Bruno; Tan, Ene Choo; Foo, Roger

    2016-01-01

    Background In Western cohorts, the prevalence of incidental findings (IFs) or incidentalome, referring to variants in genes that are unrelated to the patient's primary condition, is between 0.86% and 8.8%. However, data on prevalence and type of IFs in Asian population is lacking. Methods In 2 cohorts of individuals with genomic sequencing performed in Singapore (total n = 377), we extracted and annotated variants in the 56 ACMG-recommended genes and filtered these variants based on the level of pathogenicity. We then analyzed the precise distribution of IFs, class of genes, related medical conditions, and potential clinical impact. Results We found a total of 41,607 variants in the 56 genes in our cohort of 377 individuals. After filtering for rare and coding variants, we identified 14 potential variants. After reviewing primary literature, only 4 out of the 14 variants were classified to be pathogenic, while an additional two variants were classified as likely pathogenic. Overall, the cumulative prevalence of IFs (pathogenic and likely pathogenic variants) in our cohort was 1.6%. Conclusion The cumulative prevalence of IFs through genomic sequencing is low and the incidentalome may not be a significant barrier to implementation of genomics for personalized medicine. PMID:27077130
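
    The 1.6% cumulative prevalence reported above can be reproduced from the counts in the abstract. The sketch below is a small worked calculation in Python; it assumes, for illustration only, that each of the six pathogenic or likely pathogenic variants was found in a different individual, which the abstract does not state explicitly. The function name is hypothetical.

```python
def cumulative_prevalence(n_carriers: int, cohort_size: int) -> float:
    """Fraction of the cohort carrying at least one reportable incidental finding."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    return n_carriers / cohort_size


if __name__ == "__main__":
    pathogenic = 4           # variants classified as pathogenic (from the abstract)
    likely_pathogenic = 2    # variants classified as likely pathogenic (from the abstract)
    cohort_size = 377        # individuals sequenced (from the abstract)

    # Assumption: one carrier per variant, so carriers = 4 + 2 = 6.
    carriers = pathogenic + likely_pathogenic
    print(f"{cumulative_prevalence(carriers, cohort_size):.1%}")
```

    Under that assumption, 6 carriers out of 377 individuals is about 1.6%, which sits within the 0.86% to 8.8% range the abstract quotes for Western cohorts.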

  18. The Genome Sequencer FLX System--longer reads, more applications, straight forward bioinformatics and more complete data sets.

    PubMed

    Droege, Marcus; Hill, Brendon

    2008-08-31

    The Genome Sequencer FLX System (GS FLX), powered by 454 Sequencing, is a next-generation DNA sequencing technology featuring a unique mix of long reads, exceptional accuracy, and ultra-high throughput. It has been proven to be the most versatile of all currently available next-generation sequencing technologies, supporting many high-profile studies in over seven application categories. GS FLX users have pursued innovative research in de novo sequencing, re-sequencing of whole genomes and target DNA regions, metagenomics, and RNA analysis. 454 Sequencing is a powerful tool for human genetics research, having recently re-sequenced the genome of an individual human, currently re-sequencing the complete human exome and targeted genomic regions using the NimbleGen sequence capture process, and detected low-frequency somatic mutations linked to cancer. PMID:18616967

  19. Playing with heart and soul…and genomes: sports implications and applications of personal genomics

    PubMed Central

    2013-01-01

    Whether the integration of genetic/omic technologies in sports contexts will facilitate player success, promote player safety, or spur genetic discrimination depends largely upon the game rules established by those currently designing genomic sports medicine programs. The integration has already begun, but there is not yet a playbook for best practices. Thus far discussions have focused largely on whether the integration would occur and how to prevent the integration from occurring, rather than how it could occur in such a way that maximizes benefits, minimizes risks, and avoids the exacerbation of racial disparities. Previous empirical research has identified members of the personal genomics industry offering sports-related DNA tests, and previous legal research has explored the impact of collective bargaining in professional sports as it relates to the employment protections of the Genetic Information Nondiscrimination Act (GINA). Building upon that research and upon participant observations with specific sports-related DNA tests purchased from four direct-to-consumer companies in 2011 and broader personal genomics (PGx) services, this anthropological, legal, and ethical (ALE) discussion highlights fundamental issues that must be addressed by those developing personal genomic sports medicine programs, either independently or through collaborations with commercial providers. For example, the vulnerability of student-athletes creates a number of issues that require careful, deliberate consideration. More broadly, however, this ALE discussion highlights potential sports-related implications (that ultimately might mitigate or, conversely, exacerbate racial disparities among athletes) of whole exome/genome sequencing conducted by biomedical researchers and clinicians for non-sports purposes. For example, the possibility that exome/genome sequencing of individuals who are considered to be non-patients, asymptomatic, normal, etc. will reveal the presence of variants of

  20. [Genome sequencing and personalized medicine: perspectives and limitations].

    PubMed

    Le Gall, Jean-Yves; Debré, Patrice

    2014-01-01

    (e.g. imatinib and Bcr/Abl rearrangement; vemurafenib and the BRAF V600E mutation). Systematic sequencing of all the genes involved in drug metabolism and responsiveness will lead to individualized pharmacogenetics. Finally, sequencing of the tumoral and constitutional genomes, identification of somatic mutations, and detection of pharmacogenetic variants will open up the era of personalized medicine. The first results of these targeted therapeutic indications show a gain in the duration of remission and survival, although the cost-effectiveness of these approaches remains to be determined. Finally, this huge capacity for genome sequencing raises a number of regulatory and ethical issues. PMID:26259290

  1. Web-based Gene Pathogenicity Analysis (WGPA): a web platform to interpret gene pathogenicity from personal genome data

    PubMed Central

    Diaz-Montana, Juan J.; Rackham, Owen J.L.; Diaz-Diaz, Norberto; Petretto, Enrico

    2016-01-01

    Summary: As the volume of patient-specific genome sequences increases, the focus of biomedical research is switching from the detection of disease mutations to their interpretation. To this end, a number of techniques have been developed that use mutation data collected within a population to predict whether individual genes are likely to be disease-causing or not. As both sequence data and associated analysis tools proliferate, it becomes increasingly difficult for the community to make sense of these data and their implications. Moreover, no single analysis tool is likely to capture all relevant genomic features that contribute to a gene’s pathogenicity. Here, we introduce Web-based Gene Pathogenicity Analysis (WGPA), a web-based tool to analyze genes impacted by mutations and rank them through the integration of existing prioritization tools, which assess different aspects of gene pathogenicity using population-level sequence data. Additionally, to explore the polygenic contribution of mutations to disease, WGPA implements gene set enrichment analysis to prioritize disease-causing genes and gene interaction networks, thereby providing a comprehensive annotation of personal genome data in disease. Availability and implementation: wgpa.systems-genetics.net Contact: enrico.petretto@duke-nus.edu.sg Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26490503
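
    The abstract does not say which enrichment statistic WGPA uses. As a rough illustration only, the sketch below shows a common over-representation test (a one-sided hypergeometric test) that asks whether a list of mutated genes overlaps a gene set more than chance would predict. The function name and the example numbers are assumptions, not part of WGPA.

```python
from math import comb


def hypergeom_enrichment_p(population: int, set_size: int, hits: int, overlap: int) -> float:
    """One-sided hypergeometric p-value for gene set over-representation.

    population: number of genes in the background
    set_size:   number of background genes that belong to the gene set
    hits:       number of genes in the personal (mutated) gene list
    overlap:    number of listed genes that fall in the gene set
    Returns P(X >= overlap) under random sampling without replacement.
    """
    upper = min(set_size, hits)
    total = comb(population, hits)
    tail = sum(
        comb(set_size, k) * comb(population - set_size, hits - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 20,000 background genes, a 150-gene pathway,
    # 300 mutated genes, 9 of which fall in the pathway.
    print(hypergeom_enrichment_p(20_000, 150, 300, 9))
```

    A small p-value suggests the gene set is over-represented among the mutated genes. With many gene sets, the p-values would still need correction for multiple testing.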

  2. Personal genomics and individual identities: motivations and moral imperatives of early users

    PubMed Central

    McGowan, Michelle L.; Fishman, Jennifer R.; Lambrix, Marcie A.

    2010-01-01

    Since 2007, consumer genomics companies have marketed personal genome scanning services to assess users’ genetic predispositions to a variety of complex diseases and traits. This study investigates early users’ reasons for utilizing personal genome services, their evaluation of the technology, how they interpret the results, and how they incorporate the results into health-related decision-making. The analysis contextualizes early users’ relationships to the technology, the knowledge generated by it, and how it mediates their relationship to their own health and to biomedicine more broadly. The results reveal that early users approach personal genome scanning with both optimism for genomic research and scepticism about the technology’s current capabilities, which runs contrary to concerns that consumers may be ill equipped to interpret and understand genome scan results. These findings provide important qualitative insight into early users’ conceptualizations of personal genomic risk assessment and illuminate their involvement in configuring this technology in the making. PMID:21076647

  3. Rapid Development of Bioinformatics Education in China

    ERIC Educational Resources Information Center

    Zhong, Yang; Zhang, Xiaoyan; Ma, Jian; Zhang, Liang

    2003-01-01

    As the Human Genome Project experiences remarkable success and a flood of biological data is produced, bioinformatics becomes a very "hot" cross-disciplinary field, yet experienced bioinformaticians are urgently needed worldwide. This paper summarises the rapid development of bioinformatics education in China, especially related undergraduate…

  4. Fuzzy Logic in Medicine and Bioinformatics

    PubMed Central

    Torres, Angela; Nieto, Juan J.

    2006-01-01

    The purpose of this paper is to present a general view of the current applications of fuzzy logic in medicine and bioinformatics. We particularly review the medical literature using fuzzy logic. We then recall the geometrical interpretation of fuzzy sets as points in a fuzzy hypercube and present two concrete illustrations in medicine (drug addictions) and in bioinformatics (comparison of genomes). PMID:16883057

  5. Bioinformatics of prokaryotic RNAs

    PubMed Central

    Backofen, Rolf; Amman, Fabian; Costa, Fabrizio; Findeiß, Sven; Richter, Andreas S; Stadler, Peter F

    2014-01-01

    The genome of most prokaryotes gives rise to surprisingly complex transcriptomes, comprising not only protein-coding mRNAs, often organized as operons, but also dozens or even hundreds of highly structured small regulatory RNAs and unexpectedly large levels of anti-sense transcripts. Comprehensive surveys of prokaryotic transcriptomes, and the characterization of their non-coding components, depend heavily on computational methods and workflows, many of which have been developed or at least adapted specifically for use with bacterial and archaeal data. This review provides an overview of the state of the art of RNA bioinformatics, focusing on applications to prokaryotes. PMID:24755880

  6. Bioinformatics of prokaryotic RNAs.

    PubMed

    Backofen, Rolf; Amman, Fabian; Costa, Fabrizio; Findeiß, Sven; Richter, Andreas S; Stadler, Peter F

    2014-01-01

    The genome of most prokaryotes gives rise to surprisingly complex transcriptomes, comprising not only protein-coding mRNAs, often organized as operons, but also dozens or even hundreds of highly structured small regulatory RNAs and unexpectedly large levels of anti-sense transcripts. Comprehensive surveys of prokaryotic transcriptomes, and the characterization of their non-coding components, depend heavily on computational methods and workflows, many of which have been developed or at least adapted specifically for use with bacterial and archaeal data. This review provides an overview of the state of the art of RNA bioinformatics, focusing on applications to prokaryotes. PMID:24755880

  7. Genome-wide expression profiling and bioinformatics analysis of diurnally regulated genes in the mouse prefrontal cortex

    PubMed Central

    Yang, Shuzhang; Wang, Kai; Valladares, Otto; Hannenhalli, Sridhar; Bucan, Maja

    2007-01-01

    Background The prefrontal cortex is important in regulating sleep and mood. Diurnally regulated genes in the prefrontal cortex may be controlled by the circadian system, by sleep:wake states, or by cellular metabolism or environmental responses. Bioinformatics analysis of these genes will provide insights into a wide-range of pathways that are involved in the pathophysiology of sleep disorders and psychiatric disorders with sleep disturbances. Results We examined gene expression in the mouse prefrontal cortex at four time points during a 24 hour (12 hour light:12 hour dark) cycle using microarrays, and identified 3,890 transcripts corresponding to 2,927 genes with diurnally regulated expression patterns. We show that 16% of the genes identified in our study are orthologs of identified clock, clock controlled or sleep/wakefulness induced genes in the mouse liver and suprachiasmatic nucleus, rat cortex and cerebellum, or Drosophila head. The diurnal expression patterns were confirmed for 16 out of 18 genes in an independent set of RNA samples. The diurnal genes fall into eight temporal categories with distinct functional attributes, as assessed by Gene Ontology classification and analysis of enriched transcription factor binding sites. Conclusion Our analysis demonstrates that approximately 10% of transcripts have diurnally regulated expression patterns in the mouse prefrontal cortex. Functional annotation of these genes will be important for the selection of candidate genes for behavioral mutants in the mouse and for genetic studies of disorders associated with anomalies in the sleep:wake cycle and circadian rhythm. PMID:18028544

  8. Crowdsourcing for bioinformatics

    PubMed Central

    Good, Benjamin M.; Su, Andrew I.

    2013-01-01

    Motivation: Bioinformatics is faced with a variety of problems that require human involvement. Tasks like genome annotation, image analysis, knowledge-base population and protein structure determination all benefit from human input. In some cases, people are needed in vast quantities, whereas in others, we need just a few with rare abilities. Crowdsourcing encompasses an emerging collection of approaches for harnessing such distributed human intelligence. Recently, the bioinformatics community has begun to apply crowdsourcing in a variety of contexts, yet few resources are available that describe how these human-powered systems work and how to use them effectively in scientific domains. Results: Here, we provide a framework for understanding and applying several different types of crowdsourcing. The framework considers two broad classes: systems for solving large-volume ‘microtasks’ and systems for solving high-difficulty ‘megatasks’. Within these classes, we discuss system types, including volunteer labor, games with a purpose, microtask markets and open innovation contests. We illustrate each system type with successful examples in bioinformatics and conclude with a guide for matching problems to crowdsourcing solutions that highlights the positives and negatives of different approaches. Contact: bgood@scripps.edu PMID:23782614

  9. Evaluation of next generation mtGenome sequencing using the Ion Torrent Personal Genome Machine (PGM).

    PubMed

    Parson, Walther; Strobl, Christina; Huber, Gabriela; Zimmermann, Bettina; Gomes, Sibylle M; Souto, Luis; Fendt, Liane; Delport, Rhena; Langit, Reina; Wootton, Sharon; Lagacé, Robert; Irwin, Jodi

    2013-09-01

    Insights into the human mitochondrial phylogeny have been primarily achieved by sequencing full mitochondrial genomes (mtGenomes). In forensic genetics (partial) mtGenome information can be used to assign haplotypes to their phylogenetic backgrounds, which may, in turn, have characteristic geographic distributions that would offer useful information in a forensic case. In addition and perhaps even more relevant in the forensic context, haplogroup-specific patterns of mutations form the basis for quality control of mtDNA sequences. The current method for establishing (partial) mtDNA haplotypes is Sanger-type sequencing (STS), which is laborious, time-consuming, and expensive. With the emergence of Next Generation Sequencing (NGS) technologies, the body of available mtDNA data can potentially be extended much more quickly and cost-efficiently. Customized chemistries, laboratory workflows and data analysis packages could support the community and increase the utility of mtDNA analysis in forensics. We have evaluated the performance of mtGenome sequencing using the Personal Genome Machine (PGM) and compared the resulting haplotypes directly with conventional Sanger-type sequencing. A total of 64 mtGenomes (>1 million bases) were established that yielded high concordance with the corresponding STS haplotypes (<0.02% differences). About two-thirds of the differences were observed in or around homopolymeric sequence stretches. In addition, the sequence alignment algorithm employed to align NGS reads played a significant role in the analysis of the data and the resulting mtDNA haplotypes. Further development of alignment software would be desirable to facilitate the application of NGS in mtDNA forensic genetics. PMID:23948325
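
    The concordance check described above can be pictured with a small sketch. It assumes two sequences that are already aligned to the same length, which sidesteps the alignment step the authors identify as critical. The function names and the minimum run length of 4 are illustrative assumptions, not the study's pipeline.

```python
import re


def homopolymer_runs(seq: str, min_len: int = 4) -> list[tuple[int, int, str]]:
    """Return (start, end, base) for every run of a single base that is at
    least min_len long. Coordinates are 0-based and end is exclusive."""
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.end(), m.group(1)) for m in pattern.finditer(seq.upper())]


def difference_rate(a: str, b: str) -> float:
    """Fraction of positions that differ between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


def differences_near_homopolymers(a: str, b: str, min_len: int = 4, flank: int = 1) -> int:
    """Count mismatches that fall inside, or within `flank` bases of,
    a homopolymer run in sequence a."""
    runs = homopolymer_runs(a, min_len)
    count = 0
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y and any(start - flank <= i < end + flank for start, end, _ in runs):
            count += 1
    return count
```

    Comparing `differences_near_homopolymers` with the total mismatch count gives the kind of proportion the abstract reports for homopolymeric stretches.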

  10. BIOINFORMATIC INTEGRATION OF STRUCTURAL AND FUNCTIONAL GENOMICS DATA ACROSS SPECIES TO DEVELOP PORCINE INFLAMMATORY GENE REGULATORY PATHWAY INFORMATION

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Integration of structural and functional genomic data across species holds great promise in finding genes controlling disease resistance. We are investigating the porcine gut immune response to infection through gene expression profiling. We have collected porcine Affymetrix GeneChip data from RNA ...

  11. Genome-wide association study of antisocial personality disorder.

    PubMed

    Rautiainen, M-R; Paunio, T; Repo-Tiihonen, E; Virkkunen, M; Ollila, H M; Sulkava, S; Jolanki, O; Palotie, A; Tiihonen, J

    2016-01-01

    The pathophysiology of antisocial personality disorder (ASPD) remains unclear. Although the most consistent biological finding is reduced grey matter volume in the frontal cortex, about 50% of the total liability to developing ASPD has been attributed to genetic factors. The contributing genes remain largely unknown. Therefore, we sought to study the genetic background of ASPD. We conducted a genome-wide association study (GWAS) and a replication analysis of Finnish criminal offenders fulfilling DSM-IV criteria for ASPD (N=370, N=5850 for controls, GWAS; N=173, N=3766 for controls and replication sample). The GWAS resulted in suggestive associations of two clusters of single-nucleotide polymorphisms at 6p21.2 and at 6p21.32 at the human leukocyte antigen (HLA) region. Imputation of HLA alleles revealed an independent association with DRB1*01:01 (odds ratio (OR)=2.19 (1.53-3.14), P=1.9 × 10^-5). Two polymorphisms at 6p21.2 LINC00951-LRFN2 gene region were replicated in a separate data set, and rs4714329 reached genome-wide significance (OR=1.59 (1.37-1.85), P=1.6 × 10^-9) in the meta-analysis. The risk allele also associated with antisocial features in the general population conditioned for severe problems in childhood family (β=0.68, P=0.012). Functional analysis in brain tissue in open access GTEx and Braineac databases revealed eQTL associations of rs4714329 with LINC00951 and LRFN2 in cerebellum. In humans, LINC00951 and LRFN2 are both expressed in the brain, especially in the frontal cortex, which is intriguing considering the role of the frontal cortex in behavior and the neuroanatomical findings of reduced gray matter volume in ASPD. To our knowledge, this is the first study showing genome-wide significant and replicable findings on genetic variants associated with any personality disorder. PMID:27598967
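
    The reported odds ratios, intervals, and P values can be checked against one another. The sketch below assumes the bracketed ranges are 95% Wald confidence intervals computed on the log-odds scale, which the abstract does not state, and recovers an approximate two-sided P value from each odds ratio and its interval. The function name is hypothetical.

```python
from math import erfc, log, sqrt

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def p_from_or_ci(odds_ratio: float, ci_low: float, ci_high: float) -> float:
    """Approximate two-sided P value implied by an odds ratio and its 95% CI.

    Assumes a Wald interval on the log scale:
        log(OR) +/- Z_95 * SE
    so SE can be backed out from the interval width.
    """
    se = (log(ci_high) - log(ci_low)) / (2 * Z_95)
    z = log(odds_ratio) / se
    return erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))


if __name__ == "__main__":
    # Values quoted in the abstract above.
    print(p_from_or_ci(2.19, 1.53, 3.14))  # DRB1*01:01
    print(p_from_or_ci(1.59, 1.37, 1.85))  # rs4714329, meta-analysis
```

    If the Wald assumption holds, the results should be of the same order of magnitude as the P values in the abstract. Small discrepancies are expected because the published figures are rounded.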

  12. Genome-wide identification and evolutionary analysis of algal LPAT genes involved in TAG biosynthesis using bioinformatic approaches.

    PubMed

    Misra, Namrata; Panda, Prasanna Kumar; Parida, Bikram Kumar

    2014-12-01

    Lysophosphatidyl acyltransferase (LPAT) is one of the major triacylglycerol synthesis enzymes, controlling the metabolic flow of lysophosphatidic acid to phosphatidic acid. Experimental studies in Arabidopsis have shown that LPAT activity is exhibited primarily by three distinct isoforms, namely the plastid-located LPAT1, the endoplasmic reticulum-located LPAT2, and the soluble isoform of LPAT (solLPAT). In this study, 24 putative genes representing all LPAT isoforms were identified from the analysis of 11 complete genomes including green algae, red algae, diatoms and higher plants. We observed LPAT1 and solLPAT genes to be ubiquitously present in nearly all genomes examined, whereas LPAT2 genes to have evolved more recently in the plant lineage. Phylogenetic analysis indicated that LPAT1, LPAT2 and solLPAT have convergently evolved through separate evolutionary paths and belong to three different gene families, which was further evidenced by their wide divergence at gene structure and sequence level. The genome distribution supports the hypothesis that each gene encoding a LPAT is not duplicated. Mapping of exon-intron structure of LPAT genes to the domain structure of proteins across different algal and plant species indicates that exon shuffling plays no role in the evolution of LPAT genes. Besides the previously defined motifs, several conserved consensus sequences were discovered which could be useful to distinguish different LPAT isoforms. Taken together, this study will enable the generation of experimental approximations to better understand the functional role of algal LPAT in lipid accumulation. PMID:25280541

  13. [Construction and application of bioinformatic analysis platform for aquatic pathogen based on the MilkyWay-2 supercomputer].

    PubMed

    Xiang, Fang; Ningqiu, Li; Xiaozhe, Fu; Kaibin, Li; Qiang, Lin; Lihui, Liu; Cunbin, Shi; Shuqin, Wu

    2015-07-01

    As a key component of life science, bioinformatics has been widely applied in genomics, transcriptomics, and proteomics. However, the need for high-performance computers rather than common personal computers to construct a bioinformatics platform has significantly limited the application of bioinformatics in aquatic science. In this study, we constructed a bioinformatic analysis platform for aquatic pathogens based on the MilkyWay-2 supercomputer. The platform consisted of three functional modules: genomic and transcriptomic sequencing data analysis, protein structure prediction, and molecular dynamics simulations. To validate the practicability of the platform, we performed bioinformatic analysis on aquatic pathogenic organisms. For example, genes of Flavobacterium johnsoniae M168 were identified and annotated via Blast searches, GO and InterPro annotations. Protein structural models for five small segments of grass carp reovirus HZ-08 were constructed by homology modeling. Molecular dynamics simulations were performed on outer membrane protein A of Aeromonas hydrophila, and the changes of system temperature, total energy, root mean square deviation and conformation of the loops during equilibration were also observed. These results showed that the bioinformatic analysis platform for aquatic pathogens has been successfully built on the MilkyWay-2 supercomputer. This study will provide insights into the construction of bioinformatic analysis platforms for other subjects. PMID:26351170

  14. Genome-Wide Profiling of RNA from Dried Blood Spots: Convergence with Bioinformatic Results Derived from Whole Venous Blood and Peripheral Blood Mononuclear Cells.

    PubMed

    McDade, Thomas W; M Ross, Kharah; L Fried, Ruby; Arevalo, Jesusa M G; Ma, Jeffrey; Miller, Gregory E; Cole, Steve W

    2016-01-01

    Genome-wide transcriptional profiling has emerged as a powerful tool for analyzing biological mechanisms underlying social gradients in health, but utilization in population-based studies has been hampered by logistical constraints and costs associated with venipuncture blood sampling. Dried blood spots (DBS) provide a minimally invasive, low-cost alternative to venipuncture, and in this article we evaluate how closely the substantive results from DBS transcriptional profiling correspond to those derived from parallel analyses of gold-standard venous blood samples (PAXgene whole blood and peripheral blood mononuclear cells [PBMC]). Analyses focused on differences in gene expression between African-Americans and Caucasians in a community sample of 82 healthy adults (age 18-70 years; mean 35). Across 19,679 named gene transcripts, DBS-derived values correlated r = .85 with both PAXgene and PBMC values. Results from bioinformatics analyses of gene expression derived from DBS samples were concordant with PAXgene and PBMC samples in identifying increased Type I interferon signaling and up-regulated activity of monocytes and natural killer (NK) cells in African-Americans compared to Caucasian participants. These findings demonstrate the feasibility of DBS in field-based studies of gene expression and encourage future studies of human transcriptome dynamics in larger, more representative samples than are possible with clinic- or lab-based research designs. PMID:27337553

  15. iHOPerator: user-scripting a personalized bioinformatics Web, starting with the iHOP website

    PubMed Central

    Good, Benjamin M; Kawas, Edward A; Kuo, Byron Yu-Lin; Wilkinson, Mark D

    2006-01-01

    Background User-scripts are programs stored in Web browsers that can manipulate the content of websites prior to display in the browser. They provide a novel mechanism by which users can conveniently gain increased control over the content and the display of the information presented to them on the Web. As the Web is the primary medium by which scientists retrieve biological information, any improvements in the mechanisms that govern the utility or accessibility of this information may have profound effects. GreaseMonkey is a Mozilla Firefox extension that facilitates the development and deployment of user-scripts for the Firefox web-browser. We utilize this to enhance the content and the presentation of the iHOP (information Hyperlinked Over Proteins) website. Results The iHOPerator is a GreaseMonkey user-script that augments the gene-centred pages on iHOP by providing a compact, configurable visualization of the defining information for each gene and by enabling additional data, such as biochemical pathway diagrams, to be collected automatically from third party resources and displayed in the same browsing context. Conclusion This open-source script provides an extension to the iHOP website, demonstrating how user-scripts can personalize and enhance the Web browsing experience in a relevant biological setting. The novel, user-driven controls over the content and the display of Web resources made possible by user-scripts, such as the iHOPerator, herald the beginning of a transition from a resource-centric to a user-centric Web experience. We believe that this transition is a necessary step in the development of Web technology that will eventually result in profound improvements in the way life scientists interact with information. PMID:17173692

  16. Genomic insights into ayurvedic and western approaches to personalized medicine.

    PubMed

    Prasher, Bhavana; Gibson, Greg; Mukerji, Mitali

    2016-03-01

    Ayurveda, an ancient Indian system of medicine documented and practised since 1500 B.C., follows a systems approach that has interesting parallels with contemporary personalized genomic medicine approaches to the understanding and management of health and disease. It is based on the trisutra, which are the three aspects of causes, features and therapeutics that are interconnected through a common organizing principle termed 'tridosha'. Tridosha comprise three ascertainable physiological entities; vata (kinetic), pitta (metabolic) and kapha (potential) that are pervasive across systems, work in conjunction with each other, respond to the external environment and maintain homeostasis. Each individual is born with a specific proportion of tridosha that are not only genetically determined but also influenced by the environment during foetal development. Jointly they determine a person's basic constitution, which is termed their 'prakriti'. Development and progression of different diseases with their subtypes are thought to depend on the origin and mechanism of perturbation of the doshas, and the aim of therapeutic practice is to ensure that the doshas retain their homeostatic state. Similarly, western systems biology epitomized by translational P4 medicine envisages the integration of multiscalar genetic, cellular, physiological and environmental networks to predict phenotypic outcomes of perturbations. In this perspective article, we aim to outline the shape of a unifying scaffold that may allow the two intellectual traditions to enhance one another. Specifically, we illustrate how a unique integrative 'Ayurgenomics' approach can be used to integrate the trisutra concept of Ayurveda with genomics. We observe biochemical and molecular correlates of prakriti and show how these differ significantly in processes that are linked to intermediate patho-phenotypes, known to take different course in diseases. We also observe a significant enrichment of the highly connected

  17. Bioinformatics Analysis of the Complete Genome Sequence of the Mango Tree Pathogen Pseudomonas syringae pv. syringae UMAF0158 Reveals Traits Relevant to Virulence and Epiphytic Lifestyle.

    PubMed

    Martínez-García, Pedro Manuel; Rodríguez-Palenzuela, Pablo; Arrebola, Eva; Carrión, Víctor J; Gutiérrez-Barranquero, José Antonio; Pérez-García, Alejandro; Ramos, Cayo; Cazorla, Francisco M; de Vicente, Antonio

    2015-01-01

    The genomes of more than 100 Pseudomonas syringae strains have been sequenced to date; however, only a few of them have been fully assembled, including P. syringae pv. syringae B728a. Different strains of pv. syringae cause different diseases and have different host specificities; UMAF0158 is a P. syringae pv. syringae strain related to B728a, but instead of being a bean pathogen it causes apical necrosis of mango trees, and the two strains belong to different phylotypes of pv. syringae and clades of P. syringae. In this study we report the complete sequence and annotation of the P. syringae pv. syringae UMAF0158 chromosome and plasmid pPSS158. A comparative analysis with the available sequenced genomes of 25 other P. syringae strains, both closed (the reference genomes DC3000, 1448A and B728a) and draft genomes, was performed. The 5.8 Mb UMAF0158 chromosome has 59.3% GC content and comprises 5017 predicted protein-coding genes. Bioinformatics analysis revealed the presence of genes potentially implicated in the virulence and epiphytic fitness of this strain. We identified several genetic features, which are absent in B728a, that may explain the ability of UMAF0158 to colonize and infect mango trees: the mangotoxin biosynthetic operon mbo, a gene cluster for cellulose production, two different type III and two type VI secretion systems, and a particular T3SS effector repertoire. A mutant strain defective in the rhizobial-like T3SS Rhc showed no differences compared to wild-type during its interaction with host and non-host plants and worms. Here we report the first complete sequence of the chromosome of a pv. syringae strain pathogenic to a woody plant host. Our data also shed light on the genetic factors that possibly determine the pathogenic and epiphytic lifestyle of UMAF0158. This work provides the basis for further analysis on specific mechanisms that enable this strain to infect woody plants and for the functional analysis of host specificity in the P
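
    Summary statistics such as the 59.3% GC content quoted above are straightforward to recompute from an assembled sequence. The sketch below is a generic illustration, not the authors' annotation pipeline. It assumes the sequence is already loaded as a string and ignores ambiguous bases such as N.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq.

    Returns 0.0 if the sequence contains no unambiguous bases.
    """
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


if __name__ == "__main__":
    # Toy sequence for illustration, not taken from the UMAF0158 genome.
    print(f"{gc_content('GGCCAATTNN'):.1%}")
```

    For a genome of several megabases, the sequence would normally be read from a FASTA file record by record, with the counts accumulated before dividing.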

  18. Multifunctionality and diversity of GDSL esterase/lipase gene family in rice (Oryza sativa L. japonica) genome: new insights from bioinformatics analysis

    PubMed Central

    2012-01-01

    Background GDSL esterases/lipases are a newly discovered subclass of lipolytic enzymes that are very important and attractive research subjects because of their multifunctional properties, such as broad substrate specificity and regiospecificity. Compared with the current knowledge regarding these enzymes in bacteria, our understanding of the plant GDSL enzymes is very limited, although the GDSL gene family in plant species includes numerous members in many fully sequenced plant genomes. Only two genes from a large rice GDSL esterase/lipase gene family were previously characterised, and the majority of the members remain unknown. In the present study, we describe the rice OsGELP (Oryza sativa GDSL esterase/lipase protein) gene family at the genomic and proteomic levels, and use this knowledge to provide insights into the multifunctionality of the rice OsGELP enzymes. Results In this study, an extensive bioinformatics analysis identified 114 genes in the rice OsGELP gene family. A complete overview of this family in rice is presented, including the chromosome locations, gene structures, phylogeny, and protein motifs. Among the OsGELPs and the plant GDSL esterase/lipase proteins of known functions, 41 motifs were found that represent the core secondary structure elements or appear specifically in different phylogenetic subclades. The specification and distribution of identified putative conserved clade-common and -specific peptide motifs, and their location on the predicted protein three-dimensional structure, may signify their functional roles. Potentially important regions for substrate specificity are highlighted, in accordance with the protein three-dimensional model and the location of the phylogenetic specific conserved motifs. The differential expression of some representative genes was confirmed by quantitative real-time PCR. The phylogenetic analysis, together with protein motif architectures, and the expression profiling were analysed to predict the

  19. Bioinformatics Analysis of the Complete Genome Sequence of the Mango Tree Pathogen Pseudomonas syringae pv. syringae UMAF0158 Reveals Traits Relevant to Virulence and Epiphytic Lifestyle

    PubMed Central

    Arrebola, Eva; Carrión, Víctor J.; Gutiérrez-Barranquero, José Antonio; Pérez-García, Alejandro; Ramos, Cayo; Cazorla, Francisco M.; de Vicente, Antonio

    2015-01-01

    The genomes of more than 100 Pseudomonas syringae strains have been sequenced to date; however, only a few of them have been fully assembled, including P. syringae pv. syringae B728a. Different strains of pv. syringae cause different diseases and have different host specificities; UMAF0158 is a P. syringae pv. syringae strain related to B728a, but instead of being a bean pathogen it causes apical necrosis of mango trees, and the two strains belong to different phylotypes of pv. syringae and clades of P. syringae. In this study we report the complete sequence and annotation of the P. syringae pv. syringae UMAF0158 chromosome and plasmid pPSS158. A comparative analysis with the available sequenced genomes of 25 other P. syringae strains, both closed (the reference genomes DC3000, 1448A and B728a) and draft genomes, was performed. The 5.8 Mb UMAF0158 chromosome has 59.3% GC content and comprises 5017 predicted protein-coding genes. Bioinformatics analysis revealed the presence of genes potentially implicated in the virulence and epiphytic fitness of this strain. We identified several genetic features, which are absent in B728a, that may explain the ability of UMAF0158 to colonize and infect mango trees: the mangotoxin biosynthetic operon mbo, a gene cluster for cellulose production, two different type III and two type VI secretion systems, and a particular T3SS effector repertoire. A mutant strain defective in the rhizobial-like T3SS Rhc showed no differences compared to wild-type during its interaction with host and non-host plants and worms. Here we report the first complete sequence of the chromosome of a pv. syringae strain pathogenic to a woody plant host. Our data also shed light on the genetic factors that possibly determine the pathogenic and epiphytic lifestyle of UMAF0158. This work provides the basis for further analysis on specific mechanisms that enable this strain to infect woody plants and for the functional analysis of host specificity in the P

  20. Interpretation of personal genome sequencing data in terms of disease ranks based on mutual information

    PubMed Central

    2015-01-01

    Background The rapid advances in genome sequencing technologies have resulted in an unprecedented number of genome variations being discovered in humans. However, there has been very limited coverage of interpretation of the personal genome sequencing data in terms of diseases. Methods In this paper we present the first computational analysis scheme for interpreting personal genome data by simultaneously considering the functional impact of damaging variants and curated disease-gene association data. This method is based on mutual information as a measure of the relative closeness between the personal genome and diseases. We hypothesize that a higher mutual information score implies that the personal genome is more susceptible to a particular disease than other diseases. Results The method was applied to the sequencing data of 50 acute myeloid leukemia (AML) patients in The Cancer Genome Atlas. The utility of associations between a disease and the personal genome was explored using data of healthy (control) people obtained from the 1000 Genomes Project. The ranks of the disease terms in the AML patient group were compared with those in the healthy control group using "Leukemia, Myeloid, Acute" (C04.557.337.539.550) as the corresponding MeSH disease term. The mutual information rank of the disease term was substantially higher in the AML patient group than in the healthy control group, which demonstrates that the proposed methodology can be successfully applied to infer associations between the personal genome and diseases. Conclusions Overall, the area under the receiver operating characteristics curve was significantly larger for the AML patient data than for the healthy controls. This methodology could contribute to consequential discoveries and explanations for mining personal genome sequencing data in terms of diseases, and have versatility with respect to genomic-based knowledge such as drug-gene and environmental-factor-gene interactions. PMID:26045178
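
    The abstract does not give the exact formulation of the mutual information score. As a minimal sketch under stated assumptions, the code below treats each gene as one observation with two binary labels: whether the personal genome carries a damaging variant in it, and whether it is associated with a given disease. Diseases are then ranked by the mutual information between the two labels. All function names and gene and disease identifiers are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs: list, ys: list) -> float:
    """Mutual information (in bits) between two equal-length label sequences."""
    if len(xs) != len(ys):
        raise ValueError("label sequences must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # sum over observed pairs of p(x,y) * log2( p(x,y) / (p(x) p(y)) )
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def rank_diseases(damaged_genes: set, disease_genes: dict, all_genes: set) -> list:
    """Rank disease names by mutual information between 'gene is damaged'
    and 'gene is associated with the disease', highest score first."""
    genes = sorted(all_genes)
    damaged = [g in damaged_genes for g in genes]
    scores = {
        disease: mutual_information(damaged, [g in members for g in genes])
        for disease, members in disease_genes.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical toy data for illustration only.
    background = {"G1", "G2", "G3", "G4"}
    personal = {"G1", "G2"}
    diseases = {"disease_A": {"G1", "G2"}, "disease_B": {"G1", "G3"}}
    print(rank_diseases(personal, diseases, background))
```

    Mutual information is symmetric, so a disease whose genes are exactly the undamaged ones would score as highly as one whose genes are exactly the damaged ones. A practical scheme would need to account for the direction of the association, which this sketch does not attempt.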

  1. Genomic and bioinformatics analysis of HAdV-7, a human adenovirus of species B1 that causes acute respiratory disease: implications for vector development in human gene therapy.

    PubMed

    Purkayastha, Anjan; Su, Jing; Carlisle, Steve; Tibbetts, Clark; Seto, Donald

    2005-02-01

    Human adenovirus serotype 7 (HAdV-7) is a reemerging pathogen identified in acute respiratory disease (ARD), particularly in epidemics affecting basic military trainee populations of otherwise healthy young adults. The genome has been sequenced and annotated (GenBank accession no. ). Comparative genomics and bioinformatics analyses of the HAdV-7 genome sequence provide insight into its natural history and phylogenetic relationships. A putative origin of HAdV-7 from a chimpanzee host is observed. This has implications within the current biotechnological interest of using chimpanzee adenoviruses as vectors for human gene therapy and DNA vaccine delivery. Rapid genome sequencing and analyses of this species B1 member provide an example of exploiting accurate low-pass DNA sequencing technology in pathogen characterization and epidemic outbreak surveillance through the identification, validation, and application of unique pathogen genome signatures. PMID:15661145

  2. "Personalizing" academic medicine: opportunities and challenges in implementing genomic profiling.

    PubMed

    Tweardy, David J; Belmont, John W

    2009-12-01

    BCM faculty members spearheaded the development of a first-generation Personal Genome Profile (Baylor PGP) assay to assist physicians in diagnosing and managing patients in this new era of medicine. The principles that guided the design and implementation of the Baylor PGP were high quality, robustness, low expense, flexibility, practical clinical utility, and the ability to facilitate broad areas of clinical research. The most distinctive feature of the approach taken is an emphasis on extensive screening for rare disease-causing mutations rather than common risk-increasing polymorphisms. Because these variants have large direct effects, the ability to screen for them inexpensively could have a major immediate clinical impact in disease diagnosis, carrier detection, presymptomatic detection of late onset disease, and even prenatal diagnosis. In addition to creating a counseling tool for individual "consumers," this system will fit into the established medical record and be used by physicians involved in direct patient care. This article describes an overall framework for clinical diagnostic array genotyping and the available technologies, as well as highlights the opportunities and challenges for implementation. PMID:19931194

  3. Generations of interdisciplinarity in bioinformatics

    PubMed Central

    Bartlett, Andrew; Lewis, Jamie; Williams, Matthew L.

    2016-01-01

    Bioinformatics, a specialism propelled into relevance by the Human Genome Project and the subsequent -omic turn in the life sciences, is an interdisciplinary field of research. Qualitative work on the disciplinary identities of bioinformaticians has revealed the tensions involved in work in this “borderland.” As part of our ongoing work on the emergence of bioinformatics, between 2010 and 2011, we conducted a survey of United Kingdom-based academic bioinformaticians. Building on insights drawn from our fieldwork over the past decade, we present results from this survey relevant to a discussion of disciplinary generation and stabilization. Not only is there evidence of an attitudinal divide between the different disciplinary cultures that make up bioinformatics, but there are distinctions between the forerunners, founders and the followers; as inter/disciplines mature, they face challenges that are both inter-disciplinary and inter-generational in nature. PMID:27453689

  4. Bioinformatic Analyses of Integral Membrane Transport Proteins Encoded Within the Genome of the Planctomycetes species, Rhodopirellula baltica

    PubMed Central

    Paparoditis, Philipp; Vastermark, Ake; Le, Andrew J.; Fuerst, John A.; Saier, Milton H.

    2013-01-01

    Rhodopirellula baltica (R. baltica) is a Planctomycete, known to have intracellular membranes. Because of its unusual cell structure and ecological significance, we have conducted comprehensive analyses of its transmembrane transport proteins. The complete proteome of R. baltica was screened against the Transporter Classification Database (TCDB) to identify recognizable integral membrane transport proteins. 342 proteins were identified with a high degree of confidence, and these fell into several different classes. R. baltica encodes in its genome channels (12%), secondary carriers (33%), and primary active transport proteins (41%) in addition to classes represented in smaller numbers. Relative to most non-marine bacteria, R. baltica possesses a larger number of sodium-dependent symporters but fewer proton-dependent symporters, and it has dimethylsulfoxide (DMSO) and trimethyl-amine-oxide (TMAO) reductases, consistent with its Na+-rich marine environment. R. baltica also possesses a Na+-translocating NADH:quinone dehydrogenase (Na+-NDH), a Na+ efflux decarboxylase, two Na+-exporting ABC pumps, two Na+-translocating F-type ATPases, two Na+:H+ antiporters and two K+:H+ antiporters. Flagellar motility probably depends on the sodium electrochemical gradient. Surprisingly, R. baltica also has a complete set of H+-translocating electron transport complexes similar to those present in β-proteobacteria and eukaryotic mitochondria. The transport proteins identified proved to be typical of the bacterial domain with little or no indication of the presence of eukaryotic-type transporters. However, novel functionally uncharacterized multispanning membrane proteins were identified, some of which are found only in Rhodopirellula species, but others of which are widely distributed in bacteria. The analyses lead to predictions regarding the physiology, ecology and evolution of R. baltica. PMID:23969110

  5. CGAT: a model for immersive personalized training in computational genomics.

    PubMed

    Sims, David; Ponting, Chris P; Heger, Andreas

    2016-01-01

    How should the next generation of genomics scientists be trained while simultaneously pursuing high quality and diverse research? CGAT, the Computational Genomics Analysis and Training programme, was set up in 2010 by the UK Medical Research Council to complement its investment in next-generation sequencing capacity. CGAT was conceived around the twin goals of training future leaders in genome biology and medicine, and providing much needed capacity to UK science for analysing genome scale data sets. Here we outline the training programme employed by CGAT and describe how it dovetails with collaborative research projects to launch scientists on the road towards independent research careers in genomics. PMID:25981124

  6. CGAT: a model for immersive personalized training in computational genomics

    PubMed Central

    Sims, David; Ponting, Chris P.

    2016-01-01

    How should the next generation of genomics scientists be trained while simultaneously pursuing high quality and diverse research? CGAT, the Computational Genomics Analysis and Training programme, was set up in 2010 by the UK Medical Research Council to complement its investment in next-generation sequencing capacity. CGAT was conceived around the twin goals of training future leaders in genome biology and medicine, and providing much needed capacity to UK science for analysing genome scale data sets. Here we outline the training programme employed by CGAT and describe how it dovetails with collaborative research projects to launch scientists on the road towards independent research careers in genomics. PMID:25981124

  7. Interpretome: A Freely Available, Modular, and Secure Personal Genome Interpretation Engine

    PubMed Central

    Cordero, Pablo; Tatonetti, Nicholas P.; Dudley, Joel T.; Salari, Keyan; Snyder, Michael; Altman, Russ B.; Kim, Stuart K.

    2016-01-01

    The decreasing cost of genotyping and genome sequencing has ushered in an era of genomic personalized medicine. More than 100,000 individuals have been genotyped by direct-to-consumer genetic testing services, which offer a glimpse into the interpretation and exploration of a personal genome. However, these interpretations, which require extensive manual curation, are subject to the preferences of the company and are not customizable by the individual. Academic institutions teaching personalized medicine, as well as genetic hobbyists, may prefer to customize their analysis and have full control over the content and method of interpretation. We present the Interpretome, a system for private genome interpretation, which contains all genotype information in client-side interpretation scripts, supported by server-side databases. We provide state-of-the-art analyses for teaching clinical implications of personal genomics, including disease risk assessment and pharmacogenomics. Additionally, we have implemented client-side algorithms for ancestry inference, demonstrating the power of these methods without excessive computation. Finally, the modular nature of the system allows for plugin capabilities for custom analyses. This system will allow for personal genome exploration without compromising privacy, facilitating hands-on courses in genomics and personalized medicine. PMID:22174289

  8. Microbial bioinformatics 2020.

    PubMed

    Pallen, Mark J

    2016-09-01

    Microbial bioinformatics in 2020 will remain a vibrant, creative discipline, adding value to the ever-growing flood of new sequence data, while embracing novel technologies and fresh approaches. Databases and search strategies will struggle to cope and manual curation will not be sustainable during the scale-up to the million-microbial-genome era. Microbial taxonomy will have to adapt to a situation in which most microorganisms are discovered and characterised through the analysis of sequences. Genome sequencing will become a routine approach in clinical and research laboratories, with fresh demands for interpretable user-friendly outputs. The "internet of things" will penetrate healthcare systems, so that even a piece of hospital plumbing might have its own IP address that can be integrated with pathogen genome sequences. Microbiome mania will continue, but the tide will turn from molecular barcoding towards metagenomics. Crowd-sourced analyses will collide with cloud computing, but eternal vigilance will be the price of preventing the misinterpretation and overselling of microbial sequence data. Output from hand-held sequencers will be analysed on mobile devices. Open-source training materials will address the need for the development of a skilled labour force. As we boldly go into the third decade of the twenty-first century, microbial sequence space will remain the final frontier! PMID:27471065

  9. Personal Genome Sequencing in Ostensibly Healthy Individuals and the PeopleSeq Consortium

    PubMed Central

    Linderman, Michael D.; Nielsen, Daiva E.; Green, Robert C.

    2016-01-01

    Thousands of ostensibly healthy individuals have had their exome or genome sequenced, but a much smaller number of these individuals have received any personal genomic results from that sequencing. We term those projects in which ostensibly healthy participants can receive sequencing-derived genetic findings and may also have access to their genomic data as participatory predispositional personal genome sequencing (PPGS). Here we are focused on genome sequencing applied in a pre-symptomatic context and so define PPGS to exclude diagnostic genome sequencing intended to identify the molecular cause of suspected or diagnosed genetic disease. In this report we describe the design of completed and underway PPGS projects, briefly summarize the results reported to date and introduce the PeopleSeq Consortium, a newly formed collaboration of PPGS projects designed to collect much-needed longitudinal outcome data. PMID:27023617

  10. Informing the Design of Direct-to-Consumer Interactive Personal Genomics Reports

    PubMed Central

    Shaer, Orit; Okerlund, Johanna; Balestra, Martina; Stowell, Elizabeth; Ascher, Laura; Bi, Joanna; Schlenker, Claire; Ball, Madeleine

    2015-01-01

    Background In recent years, people who sought direct-to-consumer genetic testing services have been increasingly confronted with an unprecedented amount of personal genomic information, which influences their decisions, emotional state, and well-being. However, these users of direct-to-consumer genetic services, who vary in their education and interests, frequently have little relevant experience or tools for understanding, reasoning about, and interacting with their personal genomic data. Online interactive techniques can play a central role in making personal genomic data useful for these users. Objective We sought to (1) identify the needs of diverse users as they make sense of their personal genomic data, (2) consequently develop effective interactive visualizations of genomic trait data to address these users’ needs, and (3) evaluate the effectiveness of the developed visualizations in facilitating comprehension. Methods The first two user studies, conducted with 63 volunteers in the Personal Genome Project and with 36 personal genomic users who participated in a design workshop, respectively, employed surveys and interviews to identify the needs and expectations of diverse users. Building on the two initial studies, the third study was conducted with 730 Amazon Mechanical Turk users and employed a controlled experimental design to examine the effectiveness of different design interventions on user comprehension. Results The first two studies identified searching, comparing, sharing, and organizing data as fundamental to users’ understanding of personal genomic data. The third study demonstrated that interactive and visual design interventions could improve the understandability of personal genomic reports for consumers. In particular, results showed that a new interactive bubble chart visualization designed for the study resulted in the highest comprehension scores, as well as the highest perceived comprehension scores. These scores were significantly

  11. Privacy Preserving PCA on Distributed Bioinformatics Datasets

    ERIC Educational Resources Information Center

    Li, Xin

    2011-01-01

    In recent years, new bioinformatics technologies, such as gene expression microarray, genome-wide association study, proteomics, and metabolomics, have been widely used to simultaneously identify a huge number of human genomic/genetic biomarkers, generate a tremendously large amount of data, and dramatically increase the knowledge on human…

  12. Motivations and Perceptions of Early Adopters of Personalized Genomics: Perspectives from Research Participants

    PubMed Central

    Gollust, S.E.; Gordon, E.S.; Zayac, C.; Griffin, G.; Christman, M.F.; Pyeritz, R.E.; Wawak, L.; Bernhardt, B.A.

    2011-01-01

    Background/Aims: To predict the potential public health impact of personal genomics, empirical research on public perceptions of these services is needed. In this study, ‘early adopters’ of personal genomics were surveyed to assess their motivations, perceptions and intentions. Methods: Participants were recruited from everyone who registered to attend an enrollment event for the Coriell Personalized Medicine Collaborative, a United States-based (Camden, N.J.) research study of the utility of personalized medicine, between March 31, 2009 and April 1, 2010 (n = 369). Participants completed an Internet-based survey about their motivations, awareness of personalized medicine, perceptions of study risks and benefits, and intentions to share results with health care providers. Results: Respondents were motivated to participate for their own curiosity and to find out their disease risk to improve their health. Fewer than 10% expressed deterministic perspectives about genetic risk, but 32% had misperceptions about the research study or personal genomic testing. Most respondents perceived the study to have health-related benefits. Nearly all (92%) intended to share their results with physicians, primarily to request specific medical recommendations. Conclusion: Early adopters of personal genomics are prospectively enthusiastic about using genomic profiling information to improve their health, in close consultation with their physicians. This suggests that early users (i.e. through direct-to-consumer companies or research) may follow up with the health care system. Further research should address whether intentions to seek care match actual behaviors. PMID:21654153

  13. Integrating Genomics into Clinical Oncology: Ethical and Social Challenges from Proponents of Personalized Medicine

    PubMed Central

    Settersten, Richard A.; Juengst, Eric T.; Fishman, Jennifer R.

    2013-01-01

    Summary The use of molecular tools to individualize health care, predict appropriate therapies and prevent adverse health outcomes has gained significant traction in the field of oncology, under the banner of “personalized medicine.” Enthusiasm for personalized medicine in oncology has been fueled by success stories of targeted treatments for a variety of cancers based on their molecular profiles. Though these are clear indications of optimism for personalized medicine, little is known about the ethical and social implications of personalized approaches in clinical oncology. The objective of this study is to assess how a range of stakeholders engaged in promoting, monitoring, and providing personalized medicine understand the challenges of integrating genomic testing and targeted therapies into clinical oncology. The study involved the analysis of in-depth interviews with 117 basic scientists, clinician-researchers, clinicians in private practice, health professional educators, representatives of funding agencies, medical journal editors, entrepreneurs, and insurers whose experiences and perspectives on personalized medicine span a wide variety of institutional and professional settings. Despite considerable enthusiasm for this shift, promoters, monitors and providers of personalized medicine identified four domains which will still provoke heightened ethical and social concerns: (1) informed consent for cancer genomic testing, (2) privacy, confidentiality, and disclosure of genomic test results, (3) access to genomic testing and targeted therapies in oncology, and (4) the costs of scaling up pharmacogenomic testing and targeted cancer therapies. These specific concerns are not unique to oncology, or even genomics. However, those most invested in the success of personalized medicine view oncologists’ responses to these challenges as precedent-setting because oncology is farther along the path of clinical integration of genomic technologies than other fields

  14. What can whole genome expression data tell us about the ecology and evolution of personality?

    PubMed Central

    Bell, Alison M.; Aubin-Horth, Nadia

    2010-01-01

    Consistent individual differences in behaviour, aka personality, pose several evolutionary questions. For example, it is difficult to explain within-individual consistency in behaviour because behavioural plasticity is often advantageous. In addition, selection erodes heritable behavioural variation that is related to fitness, therefore we wish to know the mechanisms that can maintain between-individual variation in behaviour. In this paper, we argue that whole genome expression data can reveal new insights into the proximate mechanisms underlying personality, as well as its evolutionary consequences. After introducing the basics of whole genome expression analysis, we show how whole genome expression data can be used to understand whether behaviours in different contexts are affected by the same molecular mechanisms. We suggest strategies for using the power of genomics to understand what maintains behavioural variation, to study the evolution of behavioural correlations and to compare personality traits across diverse organisms. PMID:21078652

  15. Bioinformatics and Moonlighting Proteins

    PubMed Central

    Hernández, Sergio; Franco, Luís; Calvo, Alejandra; Ferragut, Gabriela; Hermoso, Antoni; Amela, Isaac; Gómez, Antonio; Querol, Enrique; Cedano, Juan

    2015-01-01

    Multitasking or moonlighting is the capability of some proteins to execute two or more biochemical functions. Usually, moonlighting proteins are experimentally revealed by serendipity. For this reason, it would be helpful if bioinformatics could predict this multifunctionality, especially because of the large amounts of sequences from genome projects. In the present work, we analyze and describe several approaches that use sequences, structures, interactomics, and current bioinformatics algorithms and programs to try to overcome this problem. Among these approaches are (a) remote homology searches using Psi-Blast, (b) detection of functional motifs and domains, (c) analysis of data from protein–protein interaction databases (PPIs), (d) matching the query protein sequence to 3D databases (e.g., with algorithms such as PISITE), and (e) mutation correlation analysis between amino acids by algorithms such as MISTIC. Programs designed to identify functional motifs/domains detect mainly the canonical function but usually fail in the detection of the moonlighting one, Pfam and ProDom being the best methods. Remote homology search by Psi-Blast combined with data from interactomics databases (PPIs) has the best performance. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can only be used in very specific situations – it requires the existence of multialigned family protein sequences – but can suggest how the evolutionary process of second function acquisition took place. The multitasking protein database MultitaskProtDB (http://wallace.uab.es/multitask/), previously published by our group, has been used as a benchmark for all of the analyses. PMID:26157797

  16. Bioinformatics and Moonlighting Proteins.

    PubMed

    Hernández, Sergio; Franco, Luís; Calvo, Alejandra; Ferragut, Gabriela; Hermoso, Antoni; Amela, Isaac; Gómez, Antonio; Querol, Enrique; Cedano, Juan

    2015-01-01

    Multitasking or moonlighting is the capability of some proteins to execute two or more biochemical functions. Usually, moonlighting proteins are experimentally revealed by serendipity. For this reason, it would be helpful if bioinformatics could predict this multifunctionality, especially because of the large amounts of sequences from genome projects. In the present work, we analyze and describe several approaches that use sequences, structures, interactomics, and current bioinformatics algorithms and programs to try to overcome this problem. Among these approaches are (a) remote homology searches using Psi-Blast, (b) detection of functional motifs and domains, (c) analysis of data from protein-protein interaction databases (PPIs), (d) matching the query protein sequence to 3D databases (e.g., with algorithms such as PISITE), and (e) mutation correlation analysis between amino acids by algorithms such as MISTIC. Programs designed to identify functional motifs/domains detect mainly the canonical function but usually fail in the detection of the moonlighting one, Pfam and ProDom being the best methods. Remote homology search by Psi-Blast combined with data from interactomics databases (PPIs) has the best performance. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can only be used in very specific situations - it requires the existence of multialigned family protein sequences - but can suggest how the evolutionary process of second function acquisition took place. The multitasking protein database MultitaskProtDB (http://wallace.uab.es/multitask/), previously published by our group, has been used as a benchmark for all of the analyses. PMID:26157797

  17. Basic principles of yeast genomics, a personal recollection.

    PubMed

    Dujon, Bernard

    2015-08-01

    The genomes of many yeast species or strain isolates have now been sequenced with an accelerating momentum that quickly relegates initial data to history, albeit that they are less than two decades old. Today, novel yeast genomes are entirely sequenced for a variety of reasons, often only to identify a few expected genes of specific interest, thus providing a wealth of data, heterogeneous in quality and completion but informative about the origin and evolution of this heterogeneous collection of unicellular modern fungi. However, how many scientists fully appreciate the important conceptual and technological roles played by yeasts in the extraordinary development of today's genomics? Novel notions of general significance emerged from the very first eukaryote sequenced, Saccharomyces cerevisiae, and were successively refined and extended over time. Tools with general applications were originally developed with this yeast, and surprises emerged from the results. Here, I have tried to recollect the gradual building up of knowledge as yeast genomics developed, and then briefly summarize our present views about the basic nature of yeast genomes, based on the most recent data. PMID:26071597

  18. Simultaneous Whole Mitochondrial Genome Sequencing with Short Overlapping Amplicons Suitable for Degraded DNA Using the Ion Torrent Personal Genome Machine.

    PubMed

    Chaitanya, Lakshmi; Ralf, Arwin; van Oven, Mannis; Kupiec, Tomasz; Chang, Joseph; Lagacé, Robert; Kayser, Manfred

    2015-12-01

    Whole mitochondrial (mt) genome analysis enables a considerable increase in analysis throughput, and improves the discriminatory power to the maximum possible phylogenetic resolution. Most established protocols on the different massively parallel sequencing (MPS) platforms, however, invariably involve the PCR amplification of large fragments, typically several kilobases in size, which may fail due to mtDNA fragmentation in the available degraded materials. We introduce an MPS tiling approach for simultaneous whole human mt genome sequencing using 161 short overlapping amplicons (average 200 bp) with the Ion Torrent Personal Genome Machine. We illustrate the performance of this new method by sequencing 20 DNA samples belonging to different worldwide mtDNA haplogroups. Additional quality control, particularly regarding the potential detection of nuclear insertions of mtDNA (NUMTs), was performed by comparative MPS analysis using the conventional long-range amplification method. Preliminary sensitivity testing revealed that detailed haplogroup inference was feasible with 100 pg genomic input DNA. Complete mt genome coverage was achieved from DNA samples experimentally degraded down to genomic fragment sizes of about 220 bp, and up to 90% coverage from naturally degraded samples. Overall, we introduce a new approach for whole mt genome MPS analysis from degraded and nondegraded materials relevant for resolving and inferring maternal genetic ancestry at complete resolution in anthropological, evolutionary, medical, and forensic applications. PMID:26387877
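
    To give a feel for the tiling arithmetic, the sketch below lays 161 amplicons of 200 bp around a circular genome and reports the implied mean overlap between neighbours. It is an illustration under stated assumptions, not the published primer design: the abstract gives only the amplicon count and average length, so the even spacing, the uniform 200 bp length, and the 16,569 bp genome length (that of the revised Cambridge Reference Sequence) are assumptions, and the function names are hypothetical.

```python
MT_GENOME_LENGTH = 16569  # rCRS length in bp; assumed, not stated in the abstract
N_AMPLICONS = 161         # from the abstract
AMPLICON_LENGTH = 200     # abstract gives this as an average; used as uniform here


def tile_circular_genome(genome_length, n_amplicons, amplicon_length):
    """Evenly spaced amplicons on a circular genome.

    Returns (start, end) pairs with 0-based starts and exclusive ends,
    both taken modulo the genome length so the last tiles wrap around.
    """
    starts = [round(i * genome_length / n_amplicons) for i in range(n_amplicons)]
    return [(s, (s + amplicon_length) % genome_length) for s in starts]


def covered_positions(amplicons, genome_length, amplicon_length):
    """Set of genome positions covered by at least one amplicon."""
    covered = set()
    for start, _end in amplicons:
        covered.update(
            (start + offset) % genome_length for offset in range(amplicon_length)
        )
    return covered


amplicons = tile_circular_genome(MT_GENOME_LENGTH, N_AMPLICONS, AMPLICON_LENGTH)
covered = covered_positions(amplicons, MT_GENOME_LENGTH, AMPLICON_LENGTH)

# Mean spacing between starts is genome_length / n_amplicons (about 103 bp),
# so neighbouring 200 bp amplicons overlap by roughly 97 bp on average.
mean_overlap = AMPLICON_LENGTH - MT_GENOME_LENGTH / N_AMPLICONS
```

    Under these assumptions each position is covered by about two amplicons, which is consistent with the stated aim of tolerating fragmented template DNA. Real designs will deviate from even spacing because of primer placement constraints.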

  19. [Ethical issues of personal genome: a legal perspective--ethical and legal ramifications of personal genome research].

    PubMed

    Maruyama, Eiji

    2009-06-01

    Whole-genome research projects, especially those involving whole-genome sequencing, tend to raise intractable ethical and legal challenges. In this kind of research, genetic and genomic data obtained by typing or sequencing are usually put in open or limited access scientific databases on the Internet to promote studies by many researchers. Once data become available on the Internet, it will be virtually meaningless to withdraw the information, effectively nullifying participants' right to revoke consent. Although the author favors the governance system that will assure research subjects of the right to withdraw their participation, considering these characteristics of whole-genome research, he finds those recommendations offered in Caulfield T, et al: Research ethics recommendations for whole-genome research: Consensus statement. PLoS Biol 6(3): e73(2008), especially to the effect that the consent process should include information about data security and the governance structure and, in particular, the mechanism for considering future research protocols, well reasoned and acceptable. PMID:19507516

  20. An integrated framework of personalized medicine: from individual genomes to participatory health care.

    PubMed

    Evers, Andrea W M; Rovers, Maroeska M; Kremer, Jan A M; Veltman, Joris A; Schalken, Jack A; Bloem, Bas R; van Gool, Alain J

    2012-08-01

    Promising research developments in both basic and applied sciences, such as genomics and participatory health care approaches, have generated widespread interest in personalized medicine among almost all scientific areas and clinicians. The term personalized medicine is, however, frequently used without defining a clear theoretical and methodological background. In addition, to date most personalized medicine approaches still lack convincing empirical evidence regarding their contribution and advantages in comparison to traditional models. Here, we propose that personalized medicine can only fulfill the promise of optimizing our health care system by an interdisciplinary and translational view that extends beyond traditional diagnostic and classification systems. PMID:22911520

  1. Computational Biology and Bioinformatics in Nigeria

    PubMed Central

    Fatumo, Segun A.; Adoga, Moses P.; Ojo, Opeolu O.; Oluwagbemi, Olugbenga; Adeoye, Tolulope; Ewejobi, Itunuoluwa; Adebiyi, Marion; Adebiyi, Ezekiel; Bewaji, Clement; Nashiru, Oyekanmi

    2014-01-01

    Over the past few decades, major advances in the field of molecular biology, coupled with advances in genomic technologies, have led to an explosive growth in the biological data generated by the scientific community. The critical need to process and analyze such a deluge of data and turn it into useful knowledge has caused bioinformatics to gain prominence and importance. Bioinformatics is an interdisciplinary research area that applies techniques, methodologies, and tools in computer and information science to solve biological problems. In Nigeria, bioinformatics has recently played a vital role in the advancement of biological sciences. In this developing country, the importance of bioinformatics is rapidly gaining acceptance, and bioinformatics groups comprised of biologists, computer scientists, and computer engineers are being constituted at Nigerian universities and research institutes. In this article, we present an overview of bioinformatics education and research in Nigeria. We also discuss professional societies and academic and research institutions that play central roles in advancing the discipline in Nigeria. Finally, we propose strategies that can bolster bioinformatics education and support from policy makers in Nigeria, with potential positive implications for other developing countries. PMID:24763310

  2. Getting Personal: Head and Neck Cancer Management in the Era of Genomic Medicine

    PubMed Central

    Birkeland, Andrew C.; Uhlmann, Wendy R.; Brenner, J. Chad; Shuman, Andrew G.

    2015-01-01

    Background Genetic testing is rapidly becoming an important tool in the management of patients with head and neck cancer. As we enter the era of genomics and personalized medicine, providers should be aware of testing options, counseling resources, and the benefits, limitations and future of personalized therapy. Methods This manuscript offers a primer to assist clinicians treating patients in anticipating and managing the inherent practical and ethical challenges of cancer care in the genomic era. Results Clinical applications of genomics for head and neck cancer are emerging. We discuss the indications for genetic testing, types of testing available, implications for care, privacy/disclosure concerns and ethical considerations. Hereditary genetic syndromes associated with head and neck neoplasms are reviewed, and online genetics resources are provided. Conclusions This article summarizes and contextualizes the evolving diagnostic and therapeutic options that impact the care of patients with head and neck cancer in the genomic era. PMID:25995036

  3. Genome Science and Personalized Cancer Treatment (LBNL Summer Lecture Series)

    SciTech Connect

    Gray, Joe

    2009-08-04

    Summer Lecture Series 2009: Results from the Human Genome Project are enabling scientists to understand how individual cancers form and progress. This information, when combined with newly developed drugs, can optimize the treatment of individual cancers. Joe Gray, director of Berkeley Lab's Life Sciences Division and Associate Laboratory Director for Life and Environmental Sciences, will focus on this approach, its promise, and its current roadblocks, particularly with regard to breast cancer.

  4. Genome Science and Personalized Cancer Treatment (LBNL Summer Lecture Series)

    ScienceCinema

    Gray, Joe

    2011-04-28

    Summer Lecture Series 2009: Results from the Human Genome Project are enabling scientists to understand how individual cancers form and progress. This information, when combined with newly developed drugs, can optimize the treatment of individual cancers. Joe Gray, director of Berkeley Lab's Life Sciences Division and Associate Laboratory Director for Life and Environmental Sciences, will focus on this approach, its promise, and its current roadblocks, particularly with regard to breast cancer.

  5. Attitudes towards personal genomics among older Swiss adults: An exploratory study

    PubMed Central

    Mählmann, Laura; Röcke, Christina; Brand, Angela; Hafen, Ernst; Vayena, Effy

    2016-01-01

    Objectives To explore attitudes of Swiss older adults towards personal genomics (PG). Methods Using an anonymized voluntary paper-and-pencil survey, data were collected from 151 men and women aged 60–89 years attending the Seniorenuniversität Zurich, Switzerland (Seniors' University). Analyses were conducted using descriptive and inferential statistics. Results One third of the respondents were aware of PG, and more than half indicated interest in undergoing PG testing. The primary motivation provided was respondents' interest in finding out about their own disease risk, followed by willingness to contribute to scientific research. Forty-four percent were not interested in undergoing testing because results might be worrisome, or due to concerns about the validity of the results. Only a minority of respondents mentioned privacy-related concerns. Further, 66% were interested in undergoing clinic-based PG motivated by the opportunity to contribute to scientific research (78%) and 75% of all study participants indicated strong preferences to donate genomic data to public research institutions. Conclusion This study indicates a relatively positive overall attitude towards personal genomic testing among older Swiss adults, a group not typically represented in surveys about personal genomics. Genomic data of older adults can be highly relevant to late life health and maintenance of quality of life. In addition they can be an invaluable source for better understanding of longevity, health and disease. Understanding the attitudes of this population towards genomic analyses, although important, remains under-examined. PMID:27047754

  6. Personal genome testing: Test characteristics to clarify the discourse on ethical, legal and societal issues

    PubMed Central

    2011-01-01

    Background As genetics technology proceeds, practices of genetic testing have become more heterogeneous: many different types of tests are finding their way to the public in different settings and for a variety of purposes. This diversification is relevant to the discourse on ethical, legal and societal issues (ELSI) surrounding genetic testing, which must evolve to encompass these differences. One important development is the rise of personal genome testing on the basis of genetic profiling: the testing of multiple genetic variants simultaneously for the prediction of common multifactorial diseases. Currently, an increasing number of companies are offering personal genome tests directly to consumers and are spurring ELSI-discussions, which stand in need of clarification. This paper presents a systematic approach to the ELSI-evaluation of personal genome testing for multifactorial diseases along the lines of its test characteristics. Discussion This paper addresses four test characteristics of personal genome testing: its being a non-targeted type of testing, its high analytical validity, low clinical validity and problematic clinical utility. These characteristics raise their own specific ELSI, for example: non-targeted genetic profiling poses serious problems for information provision and informed consent. Questions about the quantity and quality of the necessary information, as well as about moral responsibilities with regard to the provision of information are therefore becoming central themes within ELSI-discussions of personal genome testing. Further, the current low level of clinical validity of genetic profiles raises questions concerning societal risks and regulatory requirements, whereas simultaneously it causes traditional ELSI-issues of clinical genetics, such as psychological and health risks, discrimination, and stigmatization, to lose part of their relevance. Also, classic notions of clinical utility are challenged by the newer notion of 'personal

  7. Population genetic inference from personal genome data: impact of ancestry and admixture on human genomic variation.

    PubMed

    Kidd, Jeffrey M; Gravel, Simon; Byrnes, Jake; Moreno-Estrada, Andres; Musharoff, Shaila; Bryc, Katarzyna; Degenhardt, Jeremiah D; Brisbin, Abra; Sheth, Vrunda; Chen, Rong; McLaughlin, Stephen F; Peckham, Heather E; Omberg, Larsson; Bormann Chung, Christina A; Stanley, Sarah; Pearlstein, Kevin; Levandowsky, Elizabeth; Acevedo-Acevedo, Suehelay; Auton, Adam; Keinan, Alon; Acuña-Alonzo, Victor; Barquera-Lozano, Rodrigo; Canizales-Quinteros, Samuel; Eng, Celeste; Burchard, Esteban G; Russell, Archie; Reynolds, Andy; Clark, Andrew G; Reese, Martin G; Lincoln, Stephen E; Butte, Atul J; De La Vega, Francisco M; Bustamante, Carlos D

    2012-10-01

    Full sequencing of individual human genomes has greatly expanded our understanding of human genetic variation and population history. Here, we present a systematic analysis of 50 human genomes from 11 diverse global populations sequenced at high coverage. Our sample includes 12 individuals who have admixed ancestry and who have varying degrees of recent (within the last 500 years) African, Native American, and European ancestry. We found over 21 million single-nucleotide variants that contribute to a 1.75-fold range in nucleotide heterozygosity across diverse human genomes. This heterozygosity ranged from a high of one heterozygous site per kilobase in west African genomes to a low of 0.57 heterozygous sites per kilobase in segments inferred to have diploid Native American ancestry from the genomes of Mexican and Puerto Rican individuals. We show evidence of all three continental ancestries in the genomes of Mexican, Puerto Rican, and African American populations, and the genome-wide statistics are highly consistent across individuals from a population once ancestry proportions have been accounted for. Using a generalized linear model, we identified subtle variations across populations in the proportion of neutral versus deleterious variation and found that genome-wide statistics vary in admixed populations even once ancestry proportions have been factored in. We further infer that multiple periods of gene flow shaped the diversity of admixed populations in the Americas: 70% of the European ancestry in today's African Americans dates back to European gene flow happening only 7-8 generations ago. PMID:23040495

  8. Population Genetic Inference from Personal Genome Data: Impact of Ancestry and Admixture on Human Genomic Variation

    PubMed Central

    Kidd, Jeffrey M.; Gravel, Simon; Byrnes, Jake; Moreno-Estrada, Andres; Musharoff, Shaila; Bryc, Katarzyna; Degenhardt, Jeremiah D.; Brisbin, Abra; Sheth, Vrunda; Chen, Rong; McLaughlin, Stephen F.; Peckham, Heather E.; Omberg, Larsson; Bormann Chung, Christina A.; Stanley, Sarah; Pearlstein, Kevin; Levandowsky, Elizabeth; Acevedo-Acevedo, Suehelay; Auton, Adam; Keinan, Alon; Acuña-Alonzo, Victor; Barquera-Lozano, Rodrigo; Canizales-Quinteros, Samuel; Eng, Celeste; Burchard, Esteban G.; Russell, Archie; Reynolds, Andy; Clark, Andrew G.; Reese, Martin G.; Lincoln, Stephen E.; Butte, Atul J.; De La Vega, Francisco M.; Bustamante, Carlos D.

    2012-01-01

    Full sequencing of individual human genomes has greatly expanded our understanding of human genetic variation and population history. Here, we present a systematic analysis of 50 human genomes from 11 diverse global populations sequenced at high coverage. Our sample includes 12 individuals who have admixed ancestry and who have varying degrees of recent (within the last 500 years) African, Native American, and European ancestry. We found over 21 million single-nucleotide variants that contribute to a 1.75-fold range in nucleotide heterozygosity across diverse human genomes. This heterozygosity ranged from a high of one heterozygous site per kilobase in west African genomes to a low of 0.57 heterozygous sites per kilobase in segments inferred to have diploid Native American ancestry from the genomes of Mexican and Puerto Rican individuals. We show evidence of all three continental ancestries in the genomes of Mexican, Puerto Rican, and African American populations, and the genome-wide statistics are highly consistent across individuals from a population once ancestry proportions have been accounted for. Using a generalized linear model, we identified subtle variations across populations in the proportion of neutral versus deleterious variation and found that genome-wide statistics vary in admixed populations even once ancestry proportions have been factored in. We further infer that multiple periods of gene flow shaped the diversity of admixed populations in the Americas—70% of the European ancestry in today’s African Americans dates back to European gene flow happening only 7–8 generations ago. PMID:23040495
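
    The 1.75-fold range quoted in the abstract above follows directly from the two heterozygosity figures it reports. A minimal arithmetic sketch, using only those two numbers; the function and constant names are illustrative and do not come from the paper:

```python
# Heterozygosity figures quoted in the abstract (heterozygous sites per kilobase).
HIGH_PER_KB = 1.0   # west African genomes
LOW_PER_KB = 0.57   # segments inferred to have diploid Native American ancestry


def fold_range(high: float, low: float) -> float:
    """Return the ratio of the highest to the lowest heterozygosity."""
    if low <= 0:
        raise ValueError("low must be positive")
    return high / low


def mean_spacing_bp(sites_per_kb: float) -> float:
    """Return the average distance in base pairs between heterozygous sites."""
    if sites_per_kb <= 0:
        raise ValueError("sites_per_kb must be positive")
    return 1000.0 / sites_per_kb


ratio = fold_range(HIGH_PER_KB, LOW_PER_KB)
print(f"fold range: {ratio:.2f}")
print(f"mean spacing at the low end: {mean_spacing_bp(LOW_PER_KB):.0f} bp")
```

    Expressed as spacing, one heterozygous site per kilobase corresponds to one site every 1,000 bp on average, and 0.57 sites per kilobase to roughly one site every 1,750 bp.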

  9. Bioinformatics in protein analysis.

    PubMed

    Persson, B

    2000-01-01

    The chapter gives an overview of bioinformatic techniques of importance in protein analysis. These include database searches, sequence comparisons and structural predictions. Links to useful World Wide Web (WWW) pages are given in relation to each topic. Databases with biological information are reviewed with emphasis on databases for nucleotide sequences (EMBL, GenBank, DDBJ), genomes, amino acid sequences (Swissprot, PIR, TrEMBL, GenePept), and three-dimensional structures (PDB). Integrated user interfaces for databases (SRS and Entrez) are described. An introduction to databases of sequence patterns and protein families is also given (Prosite, Pfam, Blocks). Furthermore, the chapter describes the widespread methods for sequence comparisons, FASTA and BLAST, and the corresponding WWW services. The techniques involving multiple sequence alignments are also reviewed: alignment creation with the Clustal programs, phylogenetic tree calculation with the Clustal or Phylip packages and tree display using Drawtree, njplot or phylo_win. Finally, the chapter also treats the issue of structural prediction. Different methods for secondary structure predictions are described (Chou-Fasman, Garnier-Osguthorpe-Robson, Predator, PHD). Techniques for predicting membrane proteins, antigenic sites and post-translational modifications are also reviewed. PMID:10803381

  10. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    PubMed Central

    Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the potential advancement of research and development in complex biomedical systems has created a need for an educated workforce in bioinformatics. However, effectively integrating bioinformatics education through formal and informal educational settings has been a challenge due in part to its cross-disciplinary nature. In this article, we seek to provide an overview of the state of bioinformatics education. This article identifies: 1) current approaches of bioinformatics education at the undergraduate and graduate levels; 2) the most common concepts and skills being taught in bioinformatics education; 3) pedagogical approaches and methods of delivery for conveying bioinformatics concepts and skills; and 4) assessment results on the impact of these programs, approaches, and methods in students’ attitudes or learning. Based on these findings, it is our goal to describe the landscape of scholarly work in this area and, as a result, identify opportunities and challenges in bioinformatics education. PMID:25452484

  11. A survey of scholarly literature describing the field of bioinformatics education and bioinformatics educational research.

    PubMed

    Magana, Alejandra J; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the potential advancement of research and development in complex biomedical systems has created a need for an educated workforce in bioinformatics. However, effectively integrating bioinformatics education through formal and informal educational settings has been a challenge due in part to its cross-disciplinary nature. In this article, we seek to provide an overview of the state of bioinformatics education. This article identifies: 1) current approaches of bioinformatics education at the undergraduate and graduate levels; 2) the most common concepts and skills being taught in bioinformatics education; 3) pedagogical approaches and methods of delivery for conveying bioinformatics concepts and skills; and 4) assessment results on the impact of these programs, approaches, and methods in students' attitudes or learning. Based on these findings, it is our goal to describe the landscape of scholarly work in this area and, as a result, identify opportunities and challenges in bioinformatics education. PMID:25452484

  12. openSNP–A Crowdsourced Web Resource for Personal Genomics

    PubMed Central

    Greshake, Bastian; Bayer, Philipp E.; Rausch, Helge; Reda, Julia

    2014-01-01

    Genome-Wide Association Studies are widely used to correlate phenotypic traits with genetic variants. These studies usually compare the genetic variation between two groups to single out certain Single Nucleotide Polymorphisms (SNPs) that are linked to a phenotypic variation in one of the groups. However, it is necessary to have a large enough sample size to find statistically significant correlations. Direct-To-Consumer (DTC) genetic testing can supply additional data: DTC companies offer the analysis of a large number of SNPs for an individual at low cost without the need to consult a physician or geneticist. Over 100,000 people have already been genotyped through Direct-To-Consumer genetic testing companies. However, this data is not public for a variety of reasons and thus cannot be used in research. It seems reasonable to create a central open data repository for such data. Here we present the web platform openSNP, an open database which allows participants of Direct-To-Consumer genetic testing to publish their genetic data at no cost along with phenotypic information. Through this crowdsourced effort of collecting genetic and phenotypic information, openSNP has become a resource for a wide range of studies, including Genome-Wide Association Studies. openSNP is hosted at http://www.opensnp.org, and the code is released under the MIT license at http://github.com/gedankenstuecke/snpr. PMID:24647222

  13. Sequencing and analysis of a South Asian-Indian personal genome

    PubMed Central

    2012-01-01

    Background With over 1.3 billion people, India is estimated to contain three times more genetic diversity than does Europe. Next-generation sequencing technologies have facilitated the understanding of diversity by enabling whole genome sequencing at greater speed and lower cost. While genomes from people of European and Asian descent have been sequenced, only recently has a single male genome from the Indian subcontinent been published at sufficient depth and coverage. In this study we have sequenced and analyzed the genome of a South Asian Indian female (SAIF) from the Indian state of Kerala. Results We identified over 3.4 million SNPs in this genome including over 89,873 private variations. Comparison of the SAIF genome with several published personal genomes revealed that this individual shared ~50% of the SNPs with each of these genomes. Analysis of the SAIF mitochondrial genome showed that it was closely related to the U1 haplogroup which has been previously observed in Kerala. We assessed the SAIF genome for SNPs with health and disease consequences and found that the individual was at a higher risk for multiple sclerosis and a few other diseases. In analyzing SNPs that modulate drug response, we found a variation that predicts a favorable response to metformin, a drug used to treat diabetes. SNPs predictive of adverse reaction to warfarin indicated that the SAIF individual is not at risk for bleeding if treated with typical doses of warfarin. In addition, we report the presence of several additional SNPs of medical relevance. Conclusions This is the first study to report the complete whole genome sequence of a female from the state of Kerala in India. The availability of this complete genome and variants will further aid studies aimed at understanding genetic diversity, identifying clinically relevant changes and assessing disease burden in the Indian population. PMID:22938532

  14. Bioinformatics in the information age

    SciTech Connect

    Spengler, Sylvia J.

    2000-02-01

    There is a well-known story about the blind man examining the elephant: the part of the elephant examined determines his perception of the whole beast. Perhaps bioinformatics--the shotgun marriage between biology and mathematics, computer science, and engineering--is like an elephant that occupies a large chair in the scientific living room. Given the demand for and shortage of researchers with the computer skills to handle large volumes of biological data, where exactly does the bioinformatics elephant sit? There are probably many biologists who feel that a major product of this bioinformatics elephant is large piles of waste material. If you have tried to plow through Web sites and software packages in search of a specific tool for analyzing and collating large amounts of research data, you may well feel the same way. But there has been progress with major initiatives to develop more computing power, educate biologists about computers, increase funding, and set standards. For our purposes, bioinformatics is not simply a biologically inclined rehash of information theory (1) nor is it a hodgepodge of computer science techniques for building, updating, and accessing biological data. Rather bioinformatics incorporates both of these capabilities into a broad interdisciplinary science that involves both conceptual and practical tools for the understanding, generation, processing, and propagation of biological information. As such, bioinformatics is the sine qua non of 21st-century biology. Analyzing gene expression using cDNA microarrays immobilized on slides or other solid supports (gene chips) is set to revolutionize biology and medicine and, in so doing, generate vast quantities of data that have to be accurately interpreted (Fig. 1). As discussed at a meeting a few months ago (Microarray Algorithms and Statistical Analysis: Methods and Standards; Tahoe City, California; 9-12 November 1999), experiments with cDNA arrays must be subjected to quality control

  15. The origins of bioinformatics.

    PubMed

    Hagen, J B

    2000-12-01

    Bioinformatics is often described as being in its infancy, but computers emerged as important tools in molecular biology during the early 1960s. A decade before DNA sequencing became feasible, computational biologists focused on the rapidly accumulating data from protein biochemistry. Without the benefits of super computers or computer networks, these scientists laid important conceptual and technical foundations for bioinformatics today. PMID:11252753

  16. BreCAN-DB: a repository cum browser of personalized DNA breakpoint profiles of cancer genomes.

    PubMed

    Narang, Pankaj; Dhapola, Parashar; Chowdhury, Shantanu

    2016-01-01

    BreCAN-DB (http://brecandb.igib.res.in) is a repository cum browser of whole genome somatic DNA breakpoint profiles of cancer genomes, mapped at single nucleotide resolution using deep sequencing data. These breakpoints are associated with deletions, insertions, inversions, tandem duplications, translocations and a combination of these structural genomic alterations. The current release of BreCAN-DB features breakpoint profiles from 99 cancer-normal pairs, comprising five cancer types. We identified DNA breakpoints across genomes using high-coverage next-generation sequencing data obtained from TCGA and dbGaP. Further, in these cancer genomes, we methodically identified breakpoint hotspots which were significantly enriched with somatic structural alterations. To visualize the breakpoint profiles, a next-generation genome browser was integrated with BreCAN-DB. Moreover, we also included in the browser previously reported breakpoint profiles from 138 cancer-normal pairs, spanning 10 cancer types. Additionally, BreCAN-DB allows one to identify breakpoint hotspots in user-uploaded data sets. We have also included functionality to query the overlap of any breakpoint profile with regions of the user's interest. Users can download breakpoint profiles from the database or may submit their data to be integrated in BreCAN-DB. We believe that BreCAN-DB will be a useful resource for the genomics scientific community and is a step towards personalized cancer genomics. PMID:26586806

  17. BreCAN-DB: a repository cum browser of personalized DNA breakpoint profiles of cancer genomes

    PubMed Central

    Narang, Pankaj; Dhapola, Parashar; Chowdhury, Shantanu

    2016-01-01

    BreCAN-DB (http://brecandb.igib.res.in) is a repository cum browser of whole genome somatic DNA breakpoint profiles of cancer genomes, mapped at single nucleotide resolution using deep sequencing data. These breakpoints are associated with deletions, insertions, inversions, tandem duplications, translocations and a combination of these structural genomic alterations. The current release of BreCAN-DB features breakpoint profiles from 99 cancer-normal pairs, comprising five cancer types. We identified DNA breakpoints across genomes using high-coverage next-generation sequencing data obtained from TCGA and dbGaP. Further, in these cancer genomes, we methodically identified breakpoint hotspots which were significantly enriched with somatic structural alterations. To visualize the breakpoint profiles, a next-generation genome browser was integrated with BreCAN-DB. Moreover, we also included in the browser previously reported breakpoint profiles from 138 cancer-normal pairs, spanning 10 cancer types. Additionally, BreCAN-DB allows one to identify breakpoint hotspots in user-uploaded data sets. We have also included functionality to query the overlap of any breakpoint profile with regions of the user's interest. Users can download breakpoint profiles from the database or may submit their data to be integrated in BreCAN-DB. We believe that BreCAN-DB will be a useful resource for the genomics scientific community and is a step towards personalized cancer genomics. PMID:26586806

  18. Informed consent in direct-to-consumer personal genome testing: the outline of a model between specific and generic consent.

    PubMed

    Bunnik, Eline M; Janssens, A Cecile J W; Schermer, Maartje H N

    2014-09-01

    Broad genome-wide testing is increasingly finding its way to the public through the online direct-to-consumer marketing of so-called personal genome tests. Personal genome tests estimate genetic susceptibilities to multiple diseases and other phenotypic traits simultaneously. Providers commonly make use of Terms of Service agreements rather than informed consent procedures. However, to protect consumers from the potential physical, psychological and social harms associated with personal genome testing and to promote autonomous decision-making with regard to the testing offer, we argue that current practices of information provision are insufficient and that there is a place--and a need--for informed consent in personal genome testing, also when it is offered commercially. The increasing quantity, complexity and diversity of most testing offers, however, pose challenges for information provision and informed consent. Both specific and generic models for informed consent fail to meet its moral aims when applied to personal genome testing. Consumers should be enabled to know the limitations, risks and implications of personal genome testing and should be given control over the genetic information they do or do not wish to obtain. We present the outline of a new model for informed consent which can meet both the norm of providing sufficient information and the norm of providing understandable information. The model can be used for personal genome testing, but will also be applicable to other, future forms of broad genetic testing or screening in commercial and clinical settings. PMID:23137034

  19. Attitudes towards Social Networking and Sharing Behaviors among Consumers of Direct-to-Consumer Personal Genomics

    PubMed Central

    Lee, Sandra Soo-Jin; Vernez, Simone L.; Ormond, K.E.; Granovetter, Mark

    2013-01-01

    Little is known about how consumers of direct-to-consumer personal genetic services share personal genetic risk information. In an age of ubiquitous online networking and rapid development of social networking tools, understanding how consumers share personal genetic risk assessments is critical in the development of appropriate and effective policies. This exploratory study investigates how consumers share personal genetic information and attitudes towards social networking behaviors. Methods: Adult participants aged 23 to 72 years old who purchased direct-to-consumer genetic testing from a personal genomics company were administered a web-based survey regarding their sharing activities and social networking behaviors related to their personal genetic test results. Results: 80 participants completed the survey; of those, 45% shared results on Facebook and 50.9% reported meeting or reconnecting with more than 10 other individuals through the sharing of their personal genetic information. For help interpreting test results, 70.4% turned to Internet websites and online sources, compared to 22.7% who consulted their healthcare providers. Amongst participants, 51.8% reported that they believe the privacy of their personal genetic information would be breached in the future. Conclusion: Consumers actively utilize online social networking tools to help them share and interpret their personal genetic information. These findings suggest a need for careful consideration of policy recommendations in light of the current ambiguity of regulation and oversight of consumer initiated sharing activities. PMID:25562728

  20. Evaluation of next generation mtGenome sequencing using the Ion Torrent Personal Genome Machine (PGM)

    PubMed Central

    Parson, Walther; Strobl, Christina; Huber, Gabriela; Zimmermann, Bettina; Gomes, Sibylle M.; Souto, Luis; Fendt, Liane; Delport, Rhena; Langit, Reina; Wootton, Sharon; Lagacé, Robert; Irwin, Jodi

    2013-01-01

    Insights into the human mitochondrial phylogeny have been primarily achieved by sequencing full mitochondrial genomes (mtGenomes). In forensic genetics (partial) mtGenome information can be used to assign haplotypes to their phylogenetic backgrounds, which may, in turn, have characteristic geographic distributions that would offer useful information in a forensic case. In addition and perhaps even more relevant in the forensic context, haplogroup-specific patterns of mutations form the basis for quality control of mtDNA sequences. The current method for establishing (partial) mtDNA haplotypes is Sanger-type sequencing (STS), which is laborious, time-consuming, and expensive. With the emergence of Next Generation Sequencing (NGS) technologies, the body of available mtDNA data can potentially be extended much more quickly and cost-efficiently. Customized chemistries, laboratory workflows and data analysis packages could support the community and increase the utility of mtDNA analysis in forensics. We have evaluated the performance of mtGenome sequencing using the Personal Genome Machine (PGM) and compared the resulting haplotypes directly with conventional Sanger-type sequencing. A total of 64 mtGenomes (>1 million bases) were established that yielded high concordance with the corresponding STS haplotypes (<0.02% differences). About two-thirds of the differences were observed in or around homopolymeric sequence stretches. In addition, the sequence alignment algorithm employed to align NGS reads played a significant role in the analysis of the data and the resulting mtDNA haplotypes. Further development of alignment software would be desirable to facilitate the application of NGS in mtDNA forensic genetics. PMID:23948325

  1. Integrating Sequencing Technologies in Personal Genomics: Optimal Low Cost Reconstruction of Structural Variants

    PubMed Central

    Du, Jiang; Bjornson, Robert D.; Zhang, Zhengdong D.; Kong, Yong; Snyder, Michael; Gerstein, Mark B.

    2009-01-01

    The goal of human genome re-sequencing is obtaining an accurate assembly of an individual's genome. Recently, there has been great excitement in the development of many technologies for this (e.g. medium and short read sequencing from companies such as 454 and SOLiD, and high-density oligo-arrays from Affymetrix and NimbleGen), with even more expected to appear. The costs and sensitivities of these technologies differ considerably from each other. As an important goal of personal genomics is to reduce the cost of re-sequencing to an affordable point, it is worthwhile to consider optimally integrating technologies. Here, we build a simulation toolbox that will help us optimally combine different technologies for genome re-sequencing, especially in reconstructing large structural variants (SVs). SV reconstruction is considered the most challenging step in human genome re-sequencing. (It is sometimes even harder than de novo assembly of small genomes because of the duplications and repetitive sequences in the human genome.) To this end, we formulate canonical problems that are representative of issues in reconstruction and are of small enough scale to be computationally tractable and simulatable. Using semi-realistic simulations, we show how we can combine different technologies to optimally solve the assembly at low cost. With mappability maps, our simulations efficiently handle the inhomogeneous repeat-containing structure of the human genome and the computational complexity of practical assembly algorithms. They quantitatively show how combining different read lengths is more cost-effective than using one length, how an optimal mixed sequencing strategy for reconstructing large novel SVs usually also gives accurate detection of SNPs/indels, how paired-end reads can improve reconstruction efficiency, and how adding in arrays is more efficient than just sequencing for disentangling some complex SVs. Our strategy should facilitate the sequencing of human genomes at

  2. The potential of translational bioinformatics approaches for pharmacology research.

    PubMed

    Li, Lang

    2015-10-01

    The field of bioinformatics has allowed the interpretation of massive amounts of biological data, ushering in the era of 'omics' to biomedical research. Its potential impact on pharmacology research is enormous and it has shown some emerging successes. A full realization of this potential, however, requires standardized data annotation for large health record databases and molecular data resources. Improved standardization will further stimulate the development of system pharmacology models, using translational bioinformatics methods. This new translational bioinformatics paradigm is highly complementary to current pharmacological research fields, such as personalized medicine, pharmacoepidemiology and drug discovery. In this review, I illustrate the application of translational bioinformatics to research in numerous pharmacology subdisciplines. PMID:25753093

  3. Public health genomics and personalized prevention: lessons from the COGS project

    PubMed Central

    Pashayan, N; Hall, A; Chowdhury, S; Dent, T; Pharoah, P D P; Burton, H

    2013-01-01

    Using the principles of public health genomics, we examined the opportunities and challenges of implementing personalized prevention programmes for cancer at the population level. Our model-based estimates indicate that polygenic risk stratification can potentially improve the effectiveness and cost-effectiveness of screening programmes. However, compared with ‘one-size-fits-all’ screening programmes, personalized screening adds further layers of complexity to the organization of screening services and raises ethical, legal and social challenges. Before polygenic inheritance is translated into population screening strategy, evidence from empirical research and engagement with and education of the public and the health professionals are needed. PMID:24127941

  4. Pathway analysis of genome-wide association datasets of personality traits.

    PubMed

    Kim, H-N; Kim, B-H; Cho, J; Ryu, S; Shin, H; Sung, J; Shin, C; Cho, N H; Sung, Y A; Choi, B-O; Kim, H-L

    2015-04-01

    Although several genome-wide association (GWA) studies of human personality have been recently published, genetic variants that are highly associated with certain personality traits remain unknown, due to difficulty reproducing results. To further investigate these genetic variants, we assessed biological pathways using GWA datasets. Pathway analysis using GWA data was performed on 1089 Korean women whose personality traits were measured with the Revised NEO Personality Inventory for the 5-factor model of personality. A total of 1042 pathways containing 8297 genes were included in our study. Of these, 14 pathways were highly enriched with association signals that were validated in 1490 independent samples. These pathways include association of: Neuroticism with axon guidance [L1 cell adhesion molecule (L1CAM) interactions]; Extraversion with neuronal system and voltage-gated potassium channels; Agreeableness with L1CAM interaction, neurotransmitter receptor binding and downstream transmission in postsynaptic cells; and Conscientiousness with the interferon-gamma and platelet-derived growth factor receptor beta polypeptide pathways. Several genes that contribute to top-ranked pathways in this study were previously identified in GWA studies or by pathway analysis in schizophrenia or other neuropsychiatric disorders. Here we report the first pathway analysis of all five personality traits. Importantly, our analysis identified novel pathways that contribute to understanding the etiology of personality traits. PMID:25809424

  5. From prenatal genomic diagnosis to fetal personalized medicine: progress and challenges

    PubMed Central

    Bianchi, Diana W

    2015-01-01

    Thus far, the focus of personalized medicine has been the prevention and treatment of conditions that affect adults. Although advances in genetic technology have been applied more frequently to prenatal diagnosis than to fetal treatment, genetic and genomic information is beginning to influence pregnancy management. Recent developments in sequencing the fetal genome combined with progress in understanding fetal physiology using gene expression arrays indicate that we could have the technical capabilities to apply an individualized medicine approach to the fetus. Here I review recent advances in prenatal genetic diagnostics, the challenges associated with these new technologies and how the information derived from them can be used to advance fetal care. Historically, the goal of prenatal diagnosis has been to provide an informed choice to prospective parents. We are now at a point where that goal can and should be expanded to incorporate genetic, genomic and transcriptomic data to develop new approaches to fetal treatment. PMID:22772565

  6. Eyes wide open: the personal genome project, citizen science and veracity in informed consent

    PubMed Central

    Angrist, Misha

    2012-01-01

    I am a close observer of the Personal Genome Project (PGP) and one of the original ten participants. The PGP was originally conceived as a way to test novel DNA sequencing technologies on human samples and to begin to build a database of human genomes and traits. However, its founder, Harvard geneticist George Church, was concerned about the fact that DNA is the ultimate digital identifier – individuals and many of their traits can be identified. Therefore, he believed that promising participants privacy and confidentiality would be impractical and disingenuous. Moreover, deidentification of samples would impoverish both genotypic and phenotypic data. As a result, the PGP has arguably become best known for its unprecedented approach to informed consent. All participants must pass an exam testing their knowledge of genomic science and privacy issues and agree to forgo the privacy and confidentiality of their genomic data and personal health records. Church aims to scale up to 100,000 participants. This special report discusses the impetus for the project, its early history and its potential to have a lasting impact on the treatment of human subjects in biomedical research. PMID:22328898

  7. Should direct-to-consumer personalized genomic medicine remain unregulated?: a rebuttal of the defenses.

    PubMed

    Valles, Sean A

    2012-01-01

    Direct-to-consumer personalized genomic medicine has recently grown into a small industry that sells mail-order DNA sample kits and then provides disease risk assessments, typically based upon results from genome-trait association studies. The companies selling these services have been largely exempted from FDA regulation in the United States. Testing kit companies and their supporters have defended the industry's unregulated status using two arguments. First, defenders have argued that mere absence of harm is all that must be proved for mail-order tests to be acceptable. Second, defenders of mail-order testing have argued that there is an individual right to the tests' information. This article rebuts these arguments. The article demonstrates that the direct-to-consumer market has resulted in the sidelining of clinical utility (medical value to patients), leading to the development of certain mail-order tests that do not promote customers' interests and to defenders' downplaying of a potentially damaging empirical study of mail-order genomic testing's effects on consumers. The article also shows that the notion of an individual right to these tests rests on a flawed reading of the key service provided by mail-order companies, which is the provision of medical interpretations, not simply genetic information. Absent these two justifications, there is no reason to exempt direct-to-consumer personalized genomic medicine from stringent federal oversight. PMID:22643762

  8. Perceptions of genetic counseling services in direct-to-consumer personal genomic testing.

    PubMed

    Darst, B F; Madlensky, L; Schork, N J; Topol, E J; Bloss, C S

    2013-10-01

    The purpose of this research is to describe consumers' perceptions of genetic counseling services in the context of direct-to-consumer personal genomic testing. Utilizing data from the Scripps Genomic Health Initiative, we assessed direct-to-consumer genomic test consumers' utilization and perceptions of genetic counseling services. At long-term follow-up, approximately 14 months post-testing, participants were asked to respond to several items gauging their interactions, if any, with a Navigenics genetic counselor, and their perceptions of those interactions. Out of 1325 individuals who completed long-term follow-up, 187 (14.1%) indicated that they had spoken with a genetic counselor. The most commonly given reason for not utilizing the counseling service was a lack of need due to the perception of already understanding one's results (55.6%). The most common reasons for utilizing the service included wanting to take advantage of a free service (43.9%) and wanting more information on risk calculations (42.2%). Among those who utilized the service, a large fraction reported that counseling improved their understanding of their results (54.5%) and genetics in general (43.9%). A relatively small proportion of participants utilized genetic counseling after direct-to-consumer personal genomic testing. Among those individuals who did utilize the service, however, a large fraction perceived it to be informative, and thus presumably beneficial. PMID:23590221

  9. A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

    ERIC Educational Resources Information Center

    Magana, Alejandra J.; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari

    2014-01-01

    Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the…

  10. Translational Bioinformatics and Healthcare Informatics: Computational and Ethical Challenges

    PubMed Central

    Sethi, Prerna; Theodos, Kimberly

    2009-01-01

    Exponentially growing biological and bioinformatics data sets present a challenge and an opportunity for researchers to contribute to the understanding of the genetic basis of phenotypes. Due to breakthroughs in microarray technology, it is possible to simultaneously monitor the expressions of thousands of genes, and it is imperative that researchers have access to the clinical data to understand the genetics and proteomics of the diseased tissue. This technology could be a landmark in personalized medicine, which will provide storage for clinical and genetic data in electronic health records (EHRs). In this paper, we explore the computational and ethical challenges that emanate from the intersection of bioinformatics and healthcare informatics research. We describe the current situation of the EHR and its capabilities to store clinical and genetic data and then discuss the Genetic Information Nondiscrimination Act. Finally, we posit that the synergy obtained from the collaborative efforts between the genomics, clinical, and healthcare disciplines has potential to enhance and promote faster and more advanced breakthroughs in healthcare. PMID:20169020

  11. A tiered-layered-staged model for informed consent in personal genome testing.

    PubMed

    Bunnik, Eline M; Janssens, A Cecile J W; Schermer, Maartje H N

    2013-06-01

    In recent years, developments in genomics technologies have led to the rise of commercial personal genome testing (PGT): broad genome-wide testing for multiple diseases simultaneously. While some commercial providers require physicians to order a personal genome test, others can be accessed directly. All providers advertise directly to consumers and offer genetic risk information about dozens of diseases in one single purchase. The quantity and the complexity of risk information pose challenges to adequate pre-test and post-test information provision and informed consent. There are currently no guidelines for what should constitute informed consent in PGT or how adequate informed consent can be achieved. In this paper, we propose a tiered-layered-staged model for informed consent. First, the proposed model is tiered as it offers choices between categories of diseases that are associated with distinct ethical, personal or societal issues. Second, the model distinguishes layers of information with a first layer offering minimal, indispensable information that is material to all consumers, and additional layers offering more detailed information made available upon request. Finally, the model stages informed consent as a process by feeding information to consumers in each subsequent stage of the process of undergoing a test, and by accommodating renewed consent for test result updates, resulting from the ongoing development of the science underlying PGT. A tiered-layered-staged model for informed consent with a focus on the consumer perspective can help overcome the ethical problems of information provision and informed consent in direct-to-consumer PGT. PMID:23169494

  12. An Online Bioinformatics Curriculum

    PubMed Central

    Searls, David B.

    2012-01-01

    Online learning initiatives over the past decade have become increasingly comprehensive in their selection of courses and sophisticated in their presentation, culminating in the recent announcement of a number of consortium and startup activities that promise to make a university education on the internet, free of charge, a real possibility. At this pivotal moment it is appropriate to explore the potential for obtaining comprehensive bioinformatics training with currently existing free video resources. This article presents such a bioinformatics curriculum in the form of a virtual course catalog, together with editorial commentary, and an assessment of strengths, weaknesses, and likely future directions for open online learning in this field. PMID:23028269

  13. Towards personalized agriculture: what chemical genomics can bring to plant biotechnology

    PubMed Central

    Stokes, Michael E.; McCourt, Peter

    2014-01-01

    In contrast to the dominant drug paradigm in which compounds were developed to “fit all,” new models focused around personalized medicine are appearing in which treatments are developed and customized for individual patients. The agricultural biotechnology industry (Ag-biotech) should also think about these new personalized models. For example, most common herbicides are generic in action, which led to the development of genetically modified crops to add specificity. The ease and accessibility of modern genomic analysis, when wedded to accessible large chemical space, should facilitate the discovery of chemicals that are more selective in their utility. Is it possible to develop species-selective herbicides and growth regulators? More generally put, is plant research at a stage where chemicals can be developed that streamline plant development and growth to various environments? We believe the advent of chemical genomics now opens up these and other opportunities to “personalize” agriculture. Furthermore, chemical genomics does not necessarily require genetically tractable plant models, which in principle should allow quick translation to practical applications. For this to happen, however, will require collaboration between the Ag-biotech industry and academic labs for early stage research and development, a situation that has proven very fruitful for Big Pharma. PMID:25183965

  14. Gene Variant Databases and Sharing: Creating a Global Genomic Variant Database for Personalized Medicine.

    PubMed

    Bean, Lora J H; Hegde, Madhuri R

    2016-06-01

    Revolutionary changes in sequencing technology and the desire to develop therapeutics for rare diseases have led to the generation of an enormous amount of genomic data in the last 5 years. Large-scale sequencing done in both research and diagnostic laboratories has linked many new genes to rare diseases, but has also generated a number of variants that we cannot interpret today. It is clear that we remain a long way from a complete understanding of the genomic variation in the human genome and its association with human health and disease. Recent studies identified susceptibility markers to infectious diseases and also the contribution of rare variants to complex diseases in different populations. The sequencing revolution has also led to the creation of a large number of databases that act as "keepers" of data, and in many cases give an interpretation of the effect of the variant. This interpretation is based on reports in the literature, prediction models, and in some cases is accompanied by functional evidence. As we move toward the practice of genomic medicine, and consider its place in "personalized medicine," it is time to ask ourselves how we can aggregate this wealth of data into a single database for multiple users with different goals. PMID:26931283

  15. Bioinformatics and School Biology

    ERIC Educational Resources Information Center

    Dalpech, Roger

    2006-01-01

    The rapidly changing field of bioinformatics is fuelling the need for suitably trained personnel with skills in relevant biological "sub-disciplines" such as proteomics, transcriptomics and metabolomics, etc. But because of the complexity--and sheer weight of data--associated with these new areas of biology, many school teachers feel…

  16. Bioinformatics Methods and Tools to Advance Clinical Care

    PubMed Central

    Lecroq, T.

    2015-01-01

    Summary Objectives To summarize excellent current research in the field of Bioinformatics and Translational Informatics with application in the health domain and clinical care. Method We provide a synopsis of the articles selected for the IMIA Yearbook 2015, from which we attempt to derive a synthetic overview of current and future activities in the field. As last year, a first step of selection was performed by querying MEDLINE with a list of MeSH descriptors completed by a list of terms adapted to the section. Each section editor separately evaluated the set of 1,594 articles, and the evaluation results were merged to retain 15 articles for peer review. Results The selection and evaluation process of this Yearbook’s section on Bioinformatics and Translational Informatics yielded four excellent articles regarding data management and genome medicine that are mainly tool-based papers. In the first article, the authors present PPISURV, a tool for uncovering the role of specific genes in cancer survival outcome. The second article describes the classifier PredictSNP, which combines six performing tools for predicting disease-related mutations. In the third article, by presenting a high-coverage map of the human proteome using high resolution mass spectrometry, the authors highlight the need for using mass spectrometry to complement genome annotation. The fourth article is also related to patient survival and decision support. The authors present data mining methods for large-scale datasets of past transplants. The objective is to identify chances of survival. Conclusions The current research activities still attest to the continuous convergence of Bioinformatics and Medical Informatics, with a focus this year on dedicated tools and methods to advance clinical care. Indeed, there is a need for powerful tools for managing and interpreting complex, large-scale genomic and biological datasets, but also a need for user-friendly tools developed for the clinicians in their

  17. Online Tools for Bioinformatics Analyses in Nutrition Sciences

    PubMed Central

    Malkaram, Sridhar A.; Hassan, Yousef I.; Zempleni, Janos

    2012-01-01

    Recent advances in “omics” research have resulted in the creation of large datasets that were generated by consortiums and centers, small datasets that were generated by individual investigators, and bioinformatics tools for mining these datasets. It is important for nutrition laboratories to take full advantage of the analysis tools to interrogate datasets for information relevant to genomics, epigenomics, transcriptomics, proteomics, and metabolomics. This review provides guidance regarding bioinformatics resources that are currently available in the public domain, with the intent to provide a starting point for investigators who want to take advantage of the opportunities provided by the bioinformatics field. PMID:22983844

  18. Towards a career in bioinformatics

    PubMed Central

    2009-01-01

    The 2009 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation from 1998, was organized as the 8th International Conference on Bioinformatics (InCoB), Sept. 9-11, 2009 at Biopolis, Singapore. InCoB has actively engaged researchers from the area of life sciences, systems biology and clinicians, to facilitate greater synergy between these groups. To encourage bioinformatics students and new researchers, tutorials and a student symposium, the Singapore Symposium on Computational Biology (SYMBIO), were organized, along with the Workshop on Education in Bioinformatics and Computational Biology (WEBCB) and the Clinical Bioinformatics (CBAS) Symposium. However, to many students and young researchers, pursuing a career in a multi-disciplinary area such as bioinformatics poses a Himalayan challenge. A collection of tips is presented here to provide signposts on the road to a career in bioinformatics. An overview of the application of bioinformatics to traditional and emerging areas, published in this supplement, is also presented to provide possible future avenues of bioinformatics investigation. A case study on the application of e-learning tools in undergraduate bioinformatics curriculum provides information on how to impart targeted education, to sustain bioinformatics in the Asia-Pacific region. The next InCoB is scheduled to be held in Tokyo, Japan, Sept. 26-28, 2010. PMID:19958508

  19. Meta-analysis of genome-wide association studies for personality

    PubMed Central

    de Moor, Marleen H.M.; Costa, Paul T.; Terracciano, Antonio; Krueger, Robert F.; de Geus, Eco J.C.; Toshiko, Tanaka; Penninx, Brenda W.J.H.; Esko, Tõnu; Madden, Pamela A F; Derringer, Jaime; Amin, Najaf; Willemsen, Gonneke; Hottenga, Jouke-Jan; Distel, Marijn A.; Uda, Manuela; Sanna, Serena; Spinhoven, Philip; Hartman, Catharina A.; Sullivan, Patrick; Realo, Anu; Allik, Jüri; Heath, Andrew C; Pergadia, Michele L; Agrawal, Arpana; Lin, Peng; Grucza, Richard; Nutile, Teresa; Ciullo, Marina; Rujescu, Dan; Giegling, Ina; Konte, Bettina; Widen, Elisabeth; Cousminer, Diana L; Eriksson, Johan G.; Palotie, Aarno; Luciano, Michelle; Tenesa, Albert; Davies, Gail; Lopez, Lorna M.; Hansell, Narelle K.; Medland, Sarah E.; Ferrucci, Luigi; Schlessinger, David; Montgomery, Grant W.; Wright, Margaret J.; Aulchenko, Yurii S.; Janssens, A.Cecile J.W.; Oostra, Ben A.; Metspalu, Andres; Abecasis, Gonçalo R.; Deary, Ian J.; Räikkönen, Katri; Bierut, Laura J.; Martin, Nicholas G.; van Duijn, Cornelia M.; Boomsma, Dorret I.

    2013-01-01

    Personality can be thought of as a set of characteristics that influence people’s thoughts, feelings, and behaviour across a variety of settings. Variation in personality is predictive of many outcomes in life, including mental health. Here we report on a meta-analysis of genome-wide association (GWA) data for personality in ten discovery samples (17 375 adults) and five in-silico replication samples (3 294 adults). All participants were of European ancestry. Personality scores for Neuroticism, Extraversion, Openness to Experience, Agreeableness, and Conscientiousness were based on the NEO Five-Factor Inventory. Genotype data were available of ~2.4M Single Nucleotide Polymorphisms (SNPs; directly typed and imputed using HAPMAP data). In the discovery samples, classical association analyses were performed under an additive model followed by meta-analysis using the weighted inverse variance method. Results showed genome-wide significance for Openness to Experience near the RASA1 gene on 5q14.3 (rs1477268 and rs2032794, P = 2.8 × 10⁻⁸ and 3.1 × 10⁻⁸) and for Conscientiousness in the brain-expressed KATNAL2 gene on 18q21.1 (rs2576037, P = 4.9 × 10⁻⁸). We further conducted a gene-based test that confirmed the association of KATNAL2 to Conscientiousness. In-silico replication did not, however, show significant associations of the top SNPs with Openness and Conscientiousness, although the direction of effect of the KATNAL2 SNP on Conscientiousness was consistent in all replication samples. Larger scale GWA studies and alternative approaches are required for confirmation of KATNAL2 as a novel gene affecting Conscientiousness. PMID:21173776
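
    The weighted inverse variance method named in the abstract above can be illustrated with a minimal sketch. It assumes a fixed-effect model in which each cohort reports a per-SNP effect estimate and its standard error; the function name `inverse_variance_meta` and the example numbers are hypothetical and are not taken from the study.

```python
import math


def inverse_variance_meta(betas, ses):
    """Fixed-effect, inverse-variance-weighted meta-analysis for one SNP.

    betas: per-cohort effect estimates (hypothetical inputs)
    ses:   per-cohort standard errors of those estimates
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and equal in length")
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se ** 2) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


if __name__ == "__main__":
    # Two made-up cohorts with equal precision.
    beta, se, z, p = inverse_variance_meta([0.1, 0.2], [0.1, 0.1])
    print(f"beta={beta:.3f} se={se:.4f} z={z:.3f} p={p:.4f}")
```

    Cohorts with smaller standard errors receive larger weights, so the pooled estimate is pulled toward the more precise studies. A random-effects model would be a different technique and is not shown here.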

  20. From Molecules to Patients: The Clinical Applications of Translational Bioinformatics

    PubMed Central

    Regan, K.

    2015-01-01

    Summary Objective In order to realize the promise of personalized medicine, Translational Bioinformatics (TBI) research will need to continue to address implementation issues across the clinical spectrum. In this review, we aim to evaluate the expanding field of TBI towards clinical applications, and define common themes and current gaps in order to motivate future research. Methods Here we present the state-of-the-art of clinical implementation of TBI-based tools and resources. Our thematic analyses of a targeted literature search of recent TBI-related articles ranged across topics in genomics, data management, hypothesis generation, molecular epidemiology, diagnostics, therapeutics and personalized medicine. Results Open areas of clinically-relevant TBI research identified in this review include developing data standards and best practices, publicly available resources, integrative systems-level approaches, user-friendly tools for clinical support, cloud computing solutions, emerging technologies and means to address pressing legal, ethical and social issues. Conclusions There is a need for further research bridging the gap from foundational TBI-based theories and methodologies to clinical implementation. We have organized the topic themes presented in this review into four conceptual foci – domain analyses, knowledge engineering, computational architectures and computation methods alongside three stages of knowledge development in order to orient future TBI efforts to accelerate the goals of personalized medicine. PMID:26293863

  1. Bioinformatics for Exploration

    NASA Technical Reports Server (NTRS)

    Johnson, Kathy A.

    2006-01-01

    For the purpose of this paper, bioinformatics is defined as the application of computer technology to the management of biological information. It can be thought of as the science of developing computer databases and algorithms to facilitate and expedite biological research. This is a crosscutting capability that supports nearly all human health areas ranging from computational modeling, to pharmacodynamics research projects, to decision support systems within autonomous medical care. Bioinformatics serves to increase the efficiency and effectiveness of the life sciences research program. It provides data, information, and knowledge capture which further supports management of the bioastronautics research roadmap - identifying gaps that still remain and enabling the determination of which risks have been addressed.

  2. How Well Do Customers of Direct-to-Consumer Personal Genomic Testing Services Comprehend Genetic Test Results? Findings from the Impact of Personal Genomics Study

    PubMed Central

    Ostergren, Jenny E.; Gornick, Michele C.; Carere, Deanna Alexis; Kalia, Sarah S.; Uhlmann, Wendy R.; Ruffin, Mack T.; Mountain, Joanna L.; Green, Robert C.; Roberts, J. Scott

    2016-01-01

    Aim To assess customer comprehension of health-related personal genomic testing (PGT) results. Methods We presented sample reports of genetic results and examined responses to comprehension questions in 1,030 PGT customers (mean age: 46.7 years; 59.9% female; 79.0% college graduates; 14.9% non-White; 4.7% of Hispanic/Latino ethnicity). Sample reports presented a genetic risk for Alzheimer’s disease and type 2 diabetes, carrier screening summary results for >30 conditions, results for phenylketonuria and cystic fibrosis, and drug response results for a statin drug. Logistic regression was used to identify correlates of participant comprehension. Results Participants exhibited high overall comprehension (mean score: 79.1% correct). The highest comprehension (range: 81.1–97.4% correct) was observed in the statin drug response and carrier screening summary results, and lower comprehension (range: 63.6–74.8% correct) on specific carrier screening results. Higher levels of numeracy, genetic knowledge, and education were significantly associated with greater comprehension. Older age (≥ 60 years) was associated with lower comprehension scores. Conclusions Most customers accurately interpreted the health implications of PGT results; however, comprehension varied by demographic characteristics, numeracy and genetic knowledge, and types and format of the genetic information presented. Results suggest a need to tailor the presentation of PGT results by test type and customer characteristics. PMID:26087778

  3. Phylogenetic trees in bioinformatics

    SciTech Connect

    Burr, Tom L

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use of probabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that finding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.
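
    The "huge number of possible trees" mentioned in the abstract above can be made concrete with a short sketch. It relies on the standard counting result for strictly bifurcating, leaf-labeled topologies: (2n − 5)!! unrooted and (2n − 3)!! rooted trees for n OTUs. The function names are hypothetical, and the abstract itself does not state which tree class it has in mind.

```python
def double_factorial(k):
    """Return k!! = k * (k - 2) * (k - 4) * ... down to 1 or 2 (1 if k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def count_unrooted_trees(n_otus):
    """Number of unrooted bifurcating topologies on n labeled OTUs (n >= 3)."""
    if n_otus < 3:
        raise ValueError("need at least 3 OTUs for an unrooted bifurcating tree")
    return double_factorial(2 * n_otus - 5)


def count_rooted_trees(n_otus):
    """Number of rooted bifurcating topologies on n labeled OTUs (n >= 2)."""
    if n_otus < 2:
        raise ValueError("need at least 2 OTUs for a rooted bifurcating tree")
    return double_factorial(2 * n_otus - 3)


if __name__ == "__main__":
    for n in (4, 10, 20):
        print(n, count_unrooted_trees(n), count_rooted_trees(n))
```

    Even 10 OTUs already admit 2,027,025 unrooted topologies, which is why exhaustive search is impractical for all but the smallest samples and heuristic search is generally used instead.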

  4. Genetics, Genomics and Cancer Risk Assessment: State of the art and future directions in the era of personalized medicine

    PubMed Central

    Weitzel, Jeffrey N.; Blazer, Kathleen R.; MacDonald, Deborah J.; Culver, Julie O.; Offit, Kenneth

    2012-01-01

    Scientific and technologic advances are revolutionizing our approach to genetic cancer risk assessment, cancer screening and prevention, and targeted therapy, fulfilling the promise of personalized medicine. In this monograph we review the evolution of scientific discovery in cancer genetics and genomics, and describe current approaches, benefits and barriers to the translation of this information to the practice of preventive medicine. Summaries of known hereditary cancer syndromes and highly penetrant genes are provided and contrasted with recently-discovered genomic variants associated with modest increases in cancer risk. We describe the scope of knowledge, tools, and expertise required for the translation of complex genetic and genomic test information into clinical practice. The challenges of genomic counseling include the need for genetics and genomics professional education and multidisciplinary team training, the need for evidence-based information regarding the clinical utility of testing for genomic variants, the potential dangers posed by premature marketing of first-generation genomic profiles, and the need for new clinical models to improve access to and responsible communication of complex disease-risk information. We conclude that given the experiences and lessons learned in the genetics era, the multidisciplinary model of genetic cancer risk assessment and management will serve as a solid foundation to support the integration of personalized genomic information into the practice of cancer medicine. PMID:21858794

  5. Genome-wide association scan for five major dimensions of personality

    PubMed Central

    Terracciano, Antonio; Sanna, Serena; Uda, Manuela; Deiana, Barbara; Usala, Gianluca; Busonero, Fabio; Maschio, Andrea; Scally, Matthew; Patriciu, Nicholas; Chen, Wei-Min; Distel, Marijn A; Slagboom, Eline P; Boomsma, Dorret I; Villafuerte, Sandra; Śliwerska, Elżbieta; Burmeister, Margit; Amin, Najaf; Janssens, A. Cecile J.W.; van Duijn, Cornelia M.; Schlessinger, David; Abecasis, Gonçalo R.; Costa, Paul T.

    2008-01-01

    Personality traits are summarized by five broad dimensions with pervasive influences on major life outcomes, strong links to psychiatric disorders, and clear heritable components. To identify genetic variants associated with each of the five dimensions of personality we performed a genome wide association (GWA) scan of 3,972 individuals from a genetically isolated population within Sardinia, Italy. Based on analyses of 362,129 single nucleotide polymorphisms (SNPs) we found several strong signals within or near genes previously implicated in psychiatric disorders. They include the association of Neuroticism with SNAP25 (rs362584, P = 5 × 10^−5), Extraversion with BDNF and two cadherin genes (CDH13 and CDH23; Ps < 5 × 10^−5), Openness with CNTNAP2 (rs10251794, P = 3 × 10^−5), Agreeableness with CLOCK (rs6832769, P = 9 × 10^−6), and Conscientiousness with DYRK1A (rs2835731, P = 3 × 10^−5). Effect sizes were small (less than 1% of variance), and most failed to replicate in the follow-up independent samples (N up to 3,903), though the association between Agreeableness and CLOCK was supported in two of three replication samples (overall P = 2 × 10^−5). We infer that a large number of loci may influence personality traits and disorders, requiring larger sample sizes for the GWA approach to identify significant genetic variants. PMID:18957941
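
    As a rough illustration of why this abstract calls for larger samples, the reported P values can be compared with a multiple-testing threshold for the 362,129 SNPs analyzed. The Bonferroni correction and the family-wise alpha of 0.05 used below are assumptions made for this sketch; the abstract does not state which significance criterion the authors applied.

```python
# Assumption: Bonferroni correction at a family-wise alpha of 0.05.
ALPHA = 0.05
N_SNPS = 362_129  # number of SNPs analyzed, from the abstract

# Strongest discovery-stage P values quoted in the abstract.
reported_p = {
    "Neuroticism / SNAP25": 5e-5,
    "Openness / CNTNAP2": 3e-5,
    "Agreeableness / CLOCK": 9e-6,
    "Conscientiousness / DYRK1A": 3e-5,
}


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


threshold = bonferroni_threshold(ALPHA, N_SNPS)

# Keep only associations that would survive the corrected threshold.
passing = [name for name, p in reported_p.items() if p < threshold]

if __name__ == "__main__":
    print(f"Bonferroni threshold: {threshold:.2e}")
    print("Associations passing:", passing or "none")
```

    Under these assumptions the per-SNP threshold is on the order of 10^-7, so none of the quoted signals would pass it, which is consistent with the abstract's description of small effects and limited replication.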

  6. STORMSeq: an open-source, user-friendly pipeline for processing personal genomics data in the cloud.

    PubMed

    Karczewski, Konrad J; Fernald, Guy Haskin; Martin, Alicia R; Snyder, Michael; Tatonetti, Nicholas P; Dudley, Joel T

    2014-01-01

    The increasing public availability of personal complete genome sequencing data has ushered in an era of democratized genomics. However, read mapping and variant calling software is constantly improving and individuals with personal genomic data may prefer to customize and update their variant calls. Here, we describe STORMSeq (Scalable Tools for Open-Source Read Mapping), a graphical interface cloud computing solution that does not require a parallel computing environment or extensive technical experience. This customizable and modular system performs read mapping, read cleaning, and variant calling and annotation. At present, STORMSeq costs approximately $2 and 5-10 hours to process a full exome sequence and $30 and 3-8 days to process a whole genome sequence. We provide this open-access and open-source resource as a user-friendly interface in Amazon EC2. PMID:24454756

  7. Agile parallel bioinformatics workflow management using Pwrake

    PubMed Central

    2011-01-01

    Background In bioinformatics projects, scientific workflow systems are widely used to manage computational procedures. Full-featured workflow systems have been proposed to fulfil the demand for workflow management. However, such systems tend to be over-weighted for actual bioinformatics practices. We realize that quick deployment of cutting-edge software implementing advanced algorithms and data formats, and continuous adaptation to changes in computational resources and the environment are often prioritized in scientific workflow management. These features have a greater affinity with the agile software development method through iterative development phases after trial and error. Here, we show the application of a scientific workflow system Pwrake to bioinformatics workflows. Pwrake is a parallel workflow extension of Ruby's standard build tool Rake, the flexibility of which has been demonstrated in the astronomy domain. Therefore, we hypothesize that Pwrake also has advantages in actual bioinformatics workflows. Findings We implemented the Pwrake workflows to process next generation sequencing data using the Genomic Analysis Toolkit (GATK) and Dindel. GATK and Dindel workflows are typical examples of sequential and parallel workflows, respectively. We found that in practice, actual scientific workflow development iterates over two phases, the workflow definition phase and the parameter adjustment phase. We introduced separate workflow definitions to help focus on each of the two developmental phases, as well as helper methods to simplify the descriptions. This approach increased iterative development efficiency. Moreover, we implemented combined workflows to demonstrate modularity of the GATK and Dindel workflows. Conclusions Pwrake enables agile management of scientific workflows in the bioinformatics domain. The internal domain specific language design built on Ruby gives the flexibility of rakefiles for writing scientific workflows. 
Furthermore, readability

  8. Novel bioinformatic developments for exome sequencing.

    PubMed

    Lelieveld, Stefan H; Veltman, Joris A; Gilissen, Christian

    2016-06-01

    With the widespread adoption of next generation sequencing technologies by the genetics community and the rapid decrease in costs per base, exome sequencing has become a standard within the repertoire of genetic experiments for both research and diagnostics. Although bioinformatics now offers standard solutions for the analysis of exome sequencing data, many challenges still remain; especially the increasing scale at which exome data are now being generated has given rise to novel challenges in how to efficiently store, analyze and interpret exome data of this magnitude. In this review we discuss some of the recent developments in bioinformatics for exome sequencing and the directions that this is taking us to. With these developments, exome sequencing is paving the way for the next big challenge, the application of whole genome sequencing. PMID:27075447

  9. Bioinformatic analysis of expression data to identify effector candidates.

    PubMed

    Reid, Adam J; Jones, John T

    2014-01-01

    Pathogens produce effectors that manipulate the host to the benefit of the pathogen. These effectors are often secreted proteins that are upregulated during the early phases of infection. These properties can be used to identify candidate effectors from genomes and transcriptomes of pathogens. Here we describe commonly used bioinformatic approaches that (1) allow identification of genes encoding predicted secreted proteins within a genome and (2) allow the identification of genes encoding predicted secreted proteins that are upregulated at important stages of the life cycle. Other approaches for bioinformatic identification of effector candidates, including OrthoMCL analysis to identify expanded gene families, are also described. PMID:24643549

  10. Intrageneric primer design: Bringing bioinformatics tools to the class.

    PubMed

    Lima, André O S; Garcês, Sérgio P S

    2006-09-01

    Bioinformatics is one of the fastest growing scientific areas over the last decade. It focuses on the use of informatics tools for the organization and analysis of biological data. An example of their importance is the availability nowadays of dozens of software programs for genomic and proteomic studies. Thus, there is a growing field (private and academic) with a need for bachelor of science students with bioinformatics skills. In consideration of this need, described here is a problem-based class in which students are asked to design a set of intrageneric primers for PCR. The exercise is divided into five classes of 1 h each, in which students use freeware bioinformatics tools and data bases available through the Internet. Besides designing the set of primers, the students will consequently learn the significance and use of the major bioinformatics procedures, such as searching a data base, conducting and analyzing sequence multialignment, comparing sequences with a data base, and selecting primers. PMID:21638710

  11. After the revolution? Ethical and social challenges in ‘personalized genomic medicine’

    PubMed Central

    Juengst, Eric T; Settersten, Richard A; Fishman, Jennifer R; McGowan, Michelle L

    2013-01-01

    Personalized genomic medicine (PGM) is a goal that currently unites a wide array of biomedical initiatives, and is promoted as a ‘new paradigm for healthcare’ by its champions. Its promissory virtues include individualized diagnosis and risk prediction, more effective prevention and health promotion, and patient empowerment. Beyond overcoming scientific and technological hurdles to realizing PGM, proponents may interpret and rank these promises differently, which carries ethical and social implications for the realization of PGM as an approach to healthcare. We examine competing visions of PGM’s virtues and the directions in which they could take the field, in order to anticipate policy choices that may lie ahead for researchers, healthcare providers and the public. PMID:23662108

  12. Consumers report lower confidence in their genetics knowledge following direct-to-consumer personal genomic testing

    PubMed Central

    Carere, Deanna Alexis; Kraft, Peter; Kaphingst, Kimberly A.; Roberts, J. Scott; Green, Robert C.

    2015-01-01

    Purpose To measure changes to genetics knowledge and self-efficacy following personal genomic testing (PGT). Methods New customers of 23andMe and Pathway Genomics completed a series of online surveys. Prior to receipt of results, and 6 months post-results, we measured genetics knowledge (9 true/false items) and genetics self-efficacy (5 Likert-scale items) and used paired methods to evaluate change over time. Correlates of change (e.g., decision regret) were identified using linear regression. Results 998 PGT customers (59.9% female; 85.8% White; mean age 46.9±15.5 years) were included in our analyses. Mean genetics knowledge score out of 9 was 8.15±0.95 at baseline and 8.25±0.92 at 6 months (p = .0024). Mean self-efficacy score out of 35 was 29.06±5.59 at baseline and 27.7±5.46 at 6 months (p < .0001); on each item, 30–45% of participants reported lower self-efficacy following PGT. Change in self-efficacy was positively associated with health care provider consultation (p = .0042), impact of PGT on perceived control over one’s health (p < .0001), and perceived value of PGT (p < .0001), and negatively associated with decision regret (p < .0001). Conclusion Lowered genetics self-efficacy following PGT may reflect an appropriate reevaluation by consumers in response to receiving complex genetic information. PMID:25812042

  13. Genomic classification of the RAS network identifies a personalized treatment strategy for lung cancer

    PubMed Central

    El-Chaar, Nader N.; Piccolo, Stephen R.; Boucher, Kenneth M.; Cohen, Adam L.; Chang, Jeffrey T.; Moos, Philip J.; Bild, Andrea H.

    2014-01-01

    Better approaches are needed to evaluate a single patient's drug response at the genomic level. Targeted therapy for signaling pathways in cancer has met limited success in part due to the exceedingly interwoven nature of the pathways. In particular, the highly complex RAS network has been challenging to target. Effectively targeting the pathway requires development of techniques that measure global network activity to account for pathway complexity. For this purpose, we used a gene-expression-based biomarker for RAS network activity in non-small cell lung cancer (NSCLC) cells, and screened for drugs whose efficacy was significantly and highly correlated with RAS network activity. Results identified EGFR and MEK co-inhibition as the most effective treatment for RAS-active NSCLC amongst a panel of over 360 compounds and fractions. RAS activity was identified in both RAS-mutant and wild-type lines, indicating broad characterization of RAS signaling inclusive of multiple mechanisms of RAS activity, and not solely based on mutation status. Mechanistic studies demonstrated that co-inhibition of EGFR and MEK induced apoptosis and blocked both EGFR-RAS-RAF-MEK-ERK and EGFR-PI3K-AKT-RPS6 nodes simultaneously in RAS-active, but not RAS-inactive NSCLC. These results provide a comprehensive strategy to personalize treatment of NSCLC based on RAS network dysregulation and provide proof-of-concept of a genomic approach to classify and target complex signaling networks. PMID:24908424

  14. [Genome-cohort studies for the development of personalized cancer prevention programs in Japan].

    PubMed

    Tanaka, Hideo

    2015-05-01

    One of the most important roles of molecular epidemiology is to investigate gene-environment interactions in order to provide data for personalized risk modification. A case-control study conducted in Aichi showed that an aldehyde dehydrogenase-2 (ALDH2) polymorphism together with cigarette smoking significantly affects the risk of lung cancer. The main purpose of this large-scale genome-cohort study of healthy individuals is to confirm that these factors are associated with the development of diseases and to set optimal thresholds for the environmental factors. The Japan Multi-Institutional Collaborative Cohort (J-MICC) Study was launched in 2005. It has recruited 100,600 healthy participants up to the end of 2014, and plans to follow them until 2025. Although Japanese genome-cohort studies, including the J-MICC Study, the Japan Public Health Center-based Prospective (JPHC) Study, and the Tohoku Medical Megabank Organization Study, consist of different research teams with different financial resources, collaboration to standardize the data collection format for successful pooled analysis is being discussed. PMID:25981648

  15. Pattern recognition in bioinformatics.

    PubMed

    de Ridder, Dick; de Ridder, Jeroen; Reinders, Marcel J T

    2013-09-01

    Pattern recognition is concerned with the development of systems that learn to solve a given problem using a set of example instances, each represented by a number of features. These problems include clustering, the grouping of similar instances; classification, the task of assigning a discrete label to a given instance; and dimensionality reduction, combining or selecting features to arrive at a more useful representation. The use of statistical pattern recognition algorithms in bioinformatics is pervasive. Classification and clustering are often applied to high-throughput measurement data arising from microarray, mass spectrometry and next-generation sequencing experiments for selecting markers, predicting phenotype and grouping objects or genes. Less explicitly, classification is at the core of a wide range of tools such as predictors of genes, protein function, functional or genetic interactions, etc., and used extensively in systems biology. A course on pattern recognition (or machine learning) should therefore be at the core of any bioinformatics education program. In this review, we discuss the main elements of a pattern recognition course, based on material developed for courses taught at the BSc, MSc and PhD levels to an audience of bioinformaticians, computer scientists and life scientists. We pay attention to common problems and pitfalls encountered in applications and in interpretation of the results obtained. PMID:23559637

  16. Bioinformatics-Aided Venomics

    PubMed Central

    Kaas, Quentin; Craik, David J.

    2015-01-01

    Venomics is a modern approach that combines transcriptomics and proteomics to explore the toxin content of venoms. This review will give an overview of computational approaches that have been created to classify and consolidate venomics data, as well as algorithms that have helped discovery and analysis of toxin nucleic acid and protein sequences, toxin three-dimensional structures and toxin functions. Bioinformatics is used to tackle specific challenges associated with the identification and annotations of toxins. Recognizing toxin transcript sequences among second generation sequencing data cannot rely only on basic sequence similarity because toxins are highly divergent. Mass spectrometry sequencing of mature toxins is challenging because toxins can display a large number of post-translational modifications. Identifying the mature toxin region in toxin precursor sequences requires the prediction of the cleavage sites of proprotein convertases, most of which are unknown or not well characterized. Tracing the evolutionary relationships between toxins should consider specific mechanisms of rapid evolution as well as interactions between predatory animals and prey. Rapidly determining the activity of toxins is the main bottleneck in venomics discovery, but some recent bioinformatics and molecular modeling approaches give hope that accurate predictions of toxin specificity could be made in the near future. PMID:26110505

  17. Identification of anticancer drugs for hepatocellular carcinoma through personalized genome-scale metabolic modeling.

    PubMed

    Agren, Rasmus; Mardinoglu, Adil; Asplund, Anna; Kampf, Caroline; Uhlen, Mathias; Nielsen, Jens

    2014-01-01

    Genome-scale metabolic models (GEMs) have proven useful as scaffolds for the integration of omics data for understanding the genotype-phenotype relationship in a mechanistic manner. Here, we evaluated the presence/absence of proteins encoded by 15,841 genes in 27 hepatocellular carcinoma (HCC) patients using immunohistochemistry. We used this information to reconstruct personalized GEMs for six HCC patients based on the proteomics data, HMR 2.0, and a task-driven model reconstruction algorithm (tINIT). The personalized GEMs were employed to identify anticancer drugs using the concept of antimetabolites; i.e., drugs that are structural analogs to metabolites. The toxicity of each antimetabolite was predicted by assessing the in silico functionality of 83 healthy cell type-specific GEMs, which were also reconstructed with the tINIT algorithm. We predicted 101 antimetabolites that could be effective in preventing tumor growth in all HCC patients, and 46 antimetabolites which were specific to individual patients. Twenty-two of the 101 predicted antimetabolites have already been used in different cancer treatment strategies, while the remaining antimetabolites represent new potential drugs. Finally, one of the identified targets was validated experimentally, and it was confirmed to attenuate growth of the HepG2 cell line. PMID:24646661

  18. THE PATIENT AS PERSON IN AN INCREASINGLY GENE-CENTRIC UNIVERSE: HOW HEALTHCARE PROFESSIONALS SHOULD THINK ABOUT GENOMICS AND EVOLUTION

    PubMed Central

    Jackson, Timothy P.

    2009-01-01

    In the past, the primary threat to the patient as person was a medical utilitarianism that would sacrifice the individual for the collective, that would coercively (ab)use a person for the sake of an in-group’s health or happiness. Today, the threat is not only from vainglorious social groups but also from valorized genes and genomes. An over-valuation of genes risks making persons seem epiphenomenal. A central thesis of this paper is that religious healthcare professionals have unique resources to combat this. PMID:19170083

  19. The patient as person in an increasingly gene-centric universe: how healthcare professionals should think about genomics and evolution.

    PubMed

    Jackson, Timothy P

    2009-02-15

    In the past, the primary threat to the patient as person was a medical utilitarianism that would sacrifice the individual for the collective, that would coercively (ab)use a person for the sake of an in-group's health or happiness. Today, the threat is not only from vainglorious social groups but also from valorized genes and genomes. An over-valuation of genes risks making persons seem epiphenomenal. A central thesis of this article is that religious healthcare professionals have unique resources to combat this. PMID:19170083

  20. Genome-Wide Association Analysis of Eating Disorder-Related Symptoms, Behaviors, and Personality Traits

    PubMed Central

    Boraska, Vesna; Davis, Oliver SP; Cherkas, Lynn F; Helder, Sietske G; Harris, Juliette; Krug, Isabel; Pei-Chi Liao, Thomas; Treasure, Janet; Ntalla, Ioanna; Karhunen, Leila; Keski-Rahkonen, Anna; Christakopoulou, Danai; Raevuori, Anu; Shin, So-Youn; Dedoussis, George V; Kaprio, Jaakko; Soranzo, Nicole; Spector, Tim D; Collier, David A; Zeggini, Eleftheria

    2012-01-01

    Eating disorders (EDs) are common, complex psychiatric disorders thought to be caused by both genetic and environmental factors. They share many symptoms, behaviors, and personality traits, which may have overlapping heritability. The aim of the present study is to perform a genome-wide association scan (GWAS) of six ED phenotypes comprising three symptom traits from the Eating Disorders Inventory 2 [Drive for Thinness (DT), Body Dissatisfaction (BD), and Bulimia], Weight Fluctuation symptom, Breakfast Skipping behavior and Childhood Obsessive-Compulsive Personality Disorder trait (CHIRP). Investigated traits were derived from standardized self-report questionnaires completed by the TwinsUK population-based cohort. We tested 283,744 directly typed SNPs across six phenotypes of interest in the TwinsUK discovery dataset and followed-up signals from various strata using a two-stage replication strategy in two independent cohorts of European ancestry. We meta-analyzed a total of 2,698 individuals for DT, 2,680 for BD, 2,789 (821 cases/1,968 controls) for Bulimia, 1,360 (633 cases/727 controls) for Childhood Obsessive-Compulsive Personality Disorder trait, 2,773 (761 cases/2,012 controls) for Breakfast Skipping, and 2,967 (798 cases/2,169 controls) for Weight Fluctuation symptom. In this GWAS analysis of six ED-related phenotypes, we detected association of eight genetic variants with P < 10^−5. Genetic variants that showed suggestive evidence of association were previously associated with several psychiatric disorders and ED-related phenotypes. Our study indicates that larger-scale collaborative studies will be needed to achieve the necessary power to detect loci underlying ED-related traits. © 2012 Wiley Periodicals, Inc. PMID:22911880

  1. Integration of bioinformatics into an undergraduate biology curriculum and the impact on development of mathematical skills.

    PubMed

    Wightman, Bruce; Hark, Amy T

    2012-01-01

    The development of fields such as bioinformatics and genomics has created new challenges and opportunities for undergraduate biology curricula. Students preparing for careers in science, technology, and medicine need more intensive study of bioinformatics and more sophisticated training in the mathematics on which this field is based. In this study, we deliberately integrated bioinformatics instruction at multiple course levels into an existing biology curriculum. Students in an introductory biology course, intermediate lab courses, and advanced project-oriented courses all participated in new course components designed to sequentially introduce bioinformatics skills and knowledge, as well as computational approaches that are common to many bioinformatics applications. In each course, bioinformatics learning was embedded in an existing disciplinary instructional sequence, as opposed to having a single course where all bioinformatics learning occurs. We designed direct and indirect assessment tools to follow student progress through the course sequence. Our data show significant gains in both student confidence and ability in bioinformatics during individual courses and as course level increases. Despite evidence of substantial student learning in both bioinformatics and mathematics, students were skeptical about the link between learning bioinformatics and learning mathematics. While our approach resulted in substantial learning gains, student "buy-in" and engagement might be better in longer project-based activities that demand application of skills to research problems. Nevertheless, in situations where a concentrated focus on project-oriented bioinformatics is not possible or desirable, our approach of integrating multiple smaller components into an existing curriculum provides an alternative. PMID:22987552

  2. Integration of Bioinformatics into an Undergraduate Biology Curriculum and the Impact on Development of Mathematical Skills

    ERIC Educational Resources Information Center

    Wightman, Bruce; Hark, Amy T.

    2012-01-01

    The development of fields such as bioinformatics and genomics has created new challenges and opportunities for undergraduate biology curricula. Students preparing for careers in science, technology, and medicine need more intensive study of bioinformatics and more sophisticated training in the mathematics on which this field is based. In this…

  3. Visualizing and Sharing Results in Bioinformatics Projects: GBrowse and GenBank Exports

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Effective tools for presenting and sharing data are necessary for collaborative projects, typical for bioinformatics. In order to facilitate sharing our data with other genomics, molecular biology, and bioinformatics researchers, we have developed software to export our data to GenBank and combined ...

  4. Making Bioinformatics Projects a Meaningful Experience in an Undergraduate Biotechnology or Biomedical Science Programme

    ERIC Educational Resources Information Center

    Sutcliffe, Iain C.; Cummings, Stephen P.

    2007-01-01

    Bioinformatics has emerged as an important discipline within the biological sciences that allows scientists to decipher and manage the vast quantities of data (such as genome sequences) that are now available. Consequently, there is an obvious need to provide graduates in biosciences with generic, transferable skills in bioinformatics. We present…

  5. Virtual Bioinformatics Distance Learning Suite

    ERIC Educational Resources Information Center

    Tolvanen, Martti; Vihinen, Mauno

    2004-01-01

    Distance learning as a computer-aided concept allows students to take courses from anywhere at any time. In bioinformatics, computers are needed to collect, store, process, and analyze massive amounts of biological and biomedical data. We have applied the concept of distance learning in virtual bioinformatics to provide university course material…

  6. Channelrhodopsins: a bioinformatics perspective.

    PubMed

    Del Val, Coral; Royuela-Flor, José; Milenkovic, Stefan; Bondar, Ana-Nicoleta

    2014-05-01

    Channelrhodopsins are microbial-type rhodopsins that function as light-gated cation channels. Understanding how the detailed architecture of the protein governs its dynamics and specificity for ions is important, because it has the potential to assist in designing site-directed channelrhodopsin mutants for specific neurobiology applications. Here we use bioinformatics methods to derive accurate alignments of channelrhodopsin sequences, assess the sequence conservation patterns and find conserved motifs in channelrhodopsins, and use homology modeling to construct three-dimensional structural models of channelrhodopsins. The analyses reveal that helices C and D of channelrhodopsins contain Cys, Ser, and Thr groups that can engage in both intra- and inter-helical hydrogen bonds. We propose that these polar groups participate in inter-helical hydrogen-bonding clusters important for the protein conformational dynamics and for the local water interactions. This article is part of a Special Issue entitled: Retinal Proteins - You can teach an old dog new tricks. PMID:24252597

  7. Intrageneric Primer Design: Bringing Bioinformatics Tools to the Class

    ERIC Educational Resources Information Center

    Lima, Andre O. S.; Garces, Sergio P. S.

    2006-01-01

    Bioinformatics is one of the fastest growing scientific areas over the last decade. It focuses on the use of informatics tools for the organization and analysis of biological data. An example of their importance is the availability nowadays of dozens of software programs for genomic and proteomic studies. Thus, there is a growing field (private…

  8. Bioinformatics in high school biology curricula: a study of state science standards.

    PubMed

    Wefer, Stephen H; Sheppard, Keith

    2008-01-01

    The proliferation of bioinformatics in modern biology marks a modern revolution in science that promises to influence science education at all levels. This study analyzed secondary school science standards of 49 U.S. states (Iowa has no science framework) and the District of Columbia for content related to bioinformatics. The bioinformatics content of each state's biology standards was analyzed and categorized into nine areas: Human Genome Project/genomics, forensics, evolution, classification, nucleotide variations, medicine, computer use, agriculture/food technology, and science technology and society/socioscientific issues. Findings indicated a generally low representation of bioinformatics-related content, which varied substantially across the different areas, with Human Genome Project/genomics and computer use being the lowest (8%), and evolution being the highest (64%) among states' science frameworks. This essay concludes with recommendations for reworking/rewording existing standards to facilitate the goal of promoting science literacy among secondary school students. PMID:18316818

  9. Bioinformatics in High School Biology Curricula: A Study of State Science Standards

    PubMed Central

    Sheppard, Keith

    2008-01-01

    The proliferation of bioinformatics in modern biology marks a modern revolution in science that promises to influence science education at all levels. This study analyzed secondary school science standards of 49 U.S. states (Iowa has no science framework) and the District of Columbia for content related to bioinformatics. The bioinformatics content of each state's biology standards was analyzed and categorized into nine areas: Human Genome Project/genomics, forensics, evolution, classification, nucleotide variations, medicine, computer use, agriculture/food technology, and science technology and society/socioscientific issues. Findings indicated a generally low representation of bioinformatics-related content, which varied substantially across the different areas, with Human Genome Project/genomics and computer use being the lowest (8%), and evolution being the highest (64%) among states' science frameworks. This essay concludes with recommendations for reworking/rewording existing standards to facilitate the goal of promoting science literacy among secondary school students. PMID:18316818

  10. Translational bioinformatics in psychoneuroimmunology: methods and applications.

    PubMed

    Yan, Qing

    2012-01-01

    Translational bioinformatics plays an indispensable role in transforming psychoneuroimmunology (PNI) into personalized medicine. It provides a powerful method to bridge the gaps between various knowledge domains in PNI and systems biology. Translational bioinformatics methods at various systems levels can facilitate pattern recognition, and expedite and validate the discovery of systemic biomarkers to allow their incorporation into clinical trials and outcome assessments. Analysis of the correlations between genotypes and phenotypes including the behavioral-based profiles will contribute to the transition from the disease-based medicine to human-centered medicine. Translational bioinformatics would also enable the establishment of predictive models for patient responses to diseases, vaccines, and drugs. In PNI research, the development of systems biology models such as those of the neurons would play a critical role. Methods based on data integration, data mining, and knowledge representation are essential elements in building health information systems such as electronic health records and computerized decision support systems. Data integration of genes, pathophysiology, and behaviors are needed for a broad range of PNI studies. Knowledge discovery approaches such as network-based systems biology methods are valuable in studying the cross-talks among pathways in various brain regions involved in disorders such as Alzheimer's disease. PMID:22933157

  11. Meta-analysis of Genome-Wide Association Studies for Extraversion: Findings from the Genetics of Personality Consortium.

    PubMed

    van den Berg, Stéphanie M; de Moor, Marleen H M; Verweij, Karin J H; Krueger, Robert F; Luciano, Michelle; Arias Vasquez, Alejandro; Matteson, Lindsay K; Derringer, Jaime; Esko, Tõnu; Amin, Najaf; Gordon, Scott D; Hansell, Narelle K; Hart, Amy B; Seppälä, Ilkka; Huffman, Jennifer E; Konte, Bettina; Lahti, Jari; Lee, Minyoung; Miller, Mike; Nutile, Teresa; Tanaka, Toshiko; Teumer, Alexander; Viktorin, Alexander; Wedenoja, Juho; Abdellaoui, Abdel; Abecasis, Goncalo R; Adkins, Daniel E; Agrawal, Arpana; Allik, Jüri; Appel, Katja; Bigdeli, Timothy B; Busonero, Fabio; Campbell, Harry; Costa, Paul T; Smith, George Davey; Davies, Gail; de Wit, Harriet; Ding, Jun; Engelhardt, Barbara E; Eriksson, Johan G; Fedko, Iryna O; Ferrucci, Luigi; Franke, Barbara; Giegling, Ina; Grucza, Richard; Hartmann, Annette M; Heath, Andrew C; Heinonen, Kati; Henders, Anjali K; Homuth, Georg; Hottenga, Jouke-Jan; Iacono, William G; Janzing, Joost; Jokela, Markus; Karlsson, Robert; Kemp, John P; Kirkpatrick, Matthew G; Latvala, Antti; Lehtimäki, Terho; Liewald, David C; Madden, Pamela A F; Magri, Chiara; Magnusson, Patrik K E; Marten, Jonathan; Maschio, Andrea; Mbarek, Hamdi; Medland, Sarah E; Mihailov, Evelin; Milaneschi, Yuri; Montgomery, Grant W; Nauck, Matthias; Nivard, Michel G; Ouwens, Klaasjan G; Palotie, Aarno; Pettersson, Erik; Polasek, Ozren; Qian, Yong; Pulkki-Råback, Laura; Raitakari, Olli T; Realo, Anu; Rose, Richard J; Ruggiero, Daniela; Schmidt, Carsten O; Slutske, Wendy S; Sorice, Rossella; Starr, John M; St Pourcain, Beate; Sutin, Angelina R; Timpson, Nicholas J; Trochet, Holly; Vermeulen, Sita; Vuoksimaa, Eero; Widen, Elisabeth; Wouda, Jasper; Wright, Margaret J; Zgaga, Lina; Porteous, David; Minelli, Alessandra; Palmer, Abraham A; Rujescu, Dan; Ciullo, Marina; Hayward, Caroline; Rudan, Igor; Metspalu, Andres; Kaprio, Jaakko; Deary, Ian J; Räikkönen, Katri; Wilson, James F; Keltikangas-Järvinen, Liisa; Bierut, Laura J; Hettema, John M; Grabe, Hans J; Penninx, Brenda W J H; van Duijn, Cornelia M; Evans, David M; Schlessinger, David; Pedersen, Nancy L; Terracciano, Antonio; McGue, Matt; Martin, Nicholas G; Boomsma, Dorret I

    2016-03-01

    Extraversion is a relatively stable and heritable personality trait associated with numerous psychosocial, lifestyle and health outcomes. Despite its substantial heritability, no genetic variants have been detected in previous genome-wide association (GWA) studies, which may be due to relatively small sample sizes of those studies. Here, we report on a large meta-analysis of GWA studies for extraversion in 63,030 subjects in 29 cohorts. Extraversion item data from multiple personality inventories were harmonized across inventories and cohorts. No genome-wide significant associations were found at the single nucleotide polymorphism (SNP) level but there was one significant hit at the gene level for a long non-coding RNA site (LOC101928162). Genome-wide complex trait analysis in two large cohorts showed that the additive variance explained by common SNPs was not significantly different from zero, but polygenic risk scores, weighted using linkage information, significantly predicted extraversion scores in an independent cohort. These results show that extraversion is a highly polygenic personality trait, with an architecture possibly different from other complex human traits, including other personality traits. Future studies are required to further determine which genetic variants, by what modes of gene action, constitute the heritable nature of extraversion. PMID:26362575
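
    The polygenic risk scores mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the simplest form of such a score, a weighted sum of effect-allele dosages; the abstract's linkage-based weighting is not reproduced here, and the SNP identifiers and weights in the usage note are hypothetical.

```python
def polygenic_score(dosages: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted sum of effect-allele dosages (0 to 2) over SNPs present in both maps.

    SNPs that have a weight but no genotype are skipped rather than imputed,
    which is a simplifying assumption of this sketch.
    """
    return sum(weights[snp] * dosages[snp] for snp in weights if snp in dosages)
```

    For example, with hypothetical weights {"rs_a": 0.5, "rs_b": -0.25} and dosages {"rs_a": 2, "rs_b": 1}, the score is 0.5 * 2 - 0.25 * 1 = 0.75.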

  12. CattleTickBase: An integrated Internet-based bioinformatics resource for Rhipicephalus (Boophilus) microplus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The Rhipicephalus microplus genome is large and complex in structure, making the genome sequence difficult to assemble and the required bioinformatics costly to resource. In light of this, a consortium of international collaborators was formed to pool resources to begin sequencing this genome. We have...

  13. Postgenomics: Proteomics and Bioinformatics in Cancer Research

    PubMed Central

    2003-01-01

    Now that the human genome is completed, the characterization of the proteins encoded by the sequence remains a challenging task. The study of the complete protein complement of the genome, the “proteome,” referred to as proteomics, will be essential if new therapeutic drugs and new disease biomarkers for early diagnosis are to be developed. Research efforts are already underway to develop the technology necessary to compare the specific protein profiles of diseased versus nondiseased states. These technologies provide a wealth of information and rapidly generate large quantities of data. Processing the large amounts of data will lead to useful predictive mathematical descriptions of biological systems which will permit rapid identification of novel therapeutic targets and identification of metabolic disorders. Here, we present an overview of the current status and future research approaches in defining the cancer cell's proteome in combination with different bioinformatics and computational biology tools toward a better understanding of health and disease. PMID:14615629

  14. Ion Torrent Personal Genome Machine Sequencing for Genomic Typing of Neisseria meningitidis for Rapid Determination of Multiple Layers of Typing Information

    PubMed Central

    Szczepanowski, Rafael; Claus, Heike; Jünemann, Sebastian; Prior, Karola; Harmsen, Dag

    2012-01-01

    Neisseria meningitidis causes invasive meningococcal disease in infants, toddlers, and adolescents worldwide. DNA sequence-based typing, including multilocus sequence typing, analysis of genetic determinants of antibiotic resistance, and sequence typing of vaccine antigens, has become the standard for molecular epidemiology of the organism. However, PCR of multiple targets and consecutive Sanger sequencing impose logistical constraints on reference laboratories. Taking advantage of the recent development of benchtop next-generation sequencers (NGSs) and of BIGSdb, a database accommodating and analyzing genome sequence data, we therefore explored the feasibility and accuracy of Ion Torrent Personal Genome Machine (PGM) sequencing for genomic typing of meningococci. Three strains from a previous meningococcus serogroup B community outbreak were selected to compare conventional typing results with data generated by semiconductor chip-based sequencing. In addition, sequencing of the meningococcal type strain MC58 provided information about the general performance of the technology. The PGM technology generated sequence information for all target genes addressed. The results were 100% concordant with conventional typing results, with no further editing being necessary. In addition, the amount of typing information, i.e., nucleotides and target genes analyzed, could be substantially increased by the combined use of genome sequencing and BIGSdb compared to conventional methods. In the near future, affordable and fast benchtop NGS machines like the PGM might enable reference laboratories to switch to genomic typing on a routine basis. This will reduce workloads and rapidly provide information for laboratory surveillance, outbreak investigation, assessment of vaccine preventability, and antibiotic resistance gene monitoring. PMID:22461678

  15. Advancing Pharmacogenomics Education in the Core PharmD Curriculum through Student Personal Genomic Testing

    PubMed Central

    Adams, Solomon M.; Anderson, Kacey B.; Coons, James C.; Smith, Randall B.; Meyer, Susan M.; Parker, Lisa S.

    2016-01-01

    Objective. To develop, implement, and evaluate “Test2Learn,” a program to enhance pharmacogenomics education through the use of personal genomic testing (PGT) and real genetic data. Design. One hundred twenty-two second-year doctor of pharmacy (PharmD) students in a required course were offered PGT as part of a larger program approach to teach pharmacogenomics within a robust ethical framework. The program added novel learning objectives, lecture materials, analysis tools, and exercises using individual-level and population-level genetic data. Outcomes were assessed with objective measures and pre/post survey instruments. Assessment. One hundred students (82%) underwent PGT. Knowledge significantly improved on multiple assessments. Genotyped students reported a greater increase in confidence in understanding test results by the end of the course. Similarly, undergoing PGT improved students’ self-perceived ability to empathize with patients compared to those not genotyped. Most students (71%) reported feeling PGT was an important part of the course, and 60% reported they had a better understanding of pharmacogenomics specifically because of the opportunity. Conclusion. Implementation of PGT in the core pharmacy curriculum was feasible, well-received, and enhanced student learning of pharmacogenomics. PMID:26941429

  16. Multiplex Y-STRs analysis using the ion torrent personal genome machine (PGM).

    PubMed

    Zhao, Xueying; Ma, Ke; Li, Hui; Cao, Yu; Liu, Wenbin; Zhou, Huaigu; Ping, Yuan

    2015-11-01

    Massively parallel sequencing (MPS) technologies allow parallel sequencing analyses of many targeted regions of multiple samples at desirable depth of coverage. Routine use of MPS for forensic genetics is on the horizon. In this study, we explore the application of MPS technology in forensic Y-STR analysis. We designed a multiplex assay with 13 Y-STR loci (DYS19, DYS389 I, DYS389 II, DYS390, DYS391, DYS392, DYS437, DYS438, DYS439, DYS448, DYS456, DYS635, GATA-H4) for the purpose of MPS. The multiplex Y-STR assay was amplified in 42 unrelated male individuals and amplicons were sequenced simultaneously using the Ion Torrent Personal Genome Machine (PGM) system. All loci were detected successfully, except for DYS389 II, which exhibited a failure rate of 1.8% due to its relatively long amplicon sizes. We observed 7, 3, 2, 6 and 5 new alleles, respectively, in DYS389 II, DYS390, DYS437, DYS448 and DYS635 due to the presence of sub-repeat composition differences, and a new allele in DYS438 because of nucleotide substitution. One allele of DYS390 was inconsistent with the allele call from conventional capillary electrophoresis (CE) because of 4 bp deletions upstream of the core repeat unit. This study demonstrates that Y-STR typing by MPS can provide more genetic information, holding the promise for high discriminatory power. PMID:26247785

  17. Direct-to-Consumer Genetic Testing and Personal Genomics Services: A Review of Recent Empirical Studies

    PubMed Central

    Ostergren, Jenny

    2013-01-01

    Direct-to-consumer genetic testing (DTC-GT) has sparked much controversy and undergone dramatic changes in its brief history. Debates over appropriate health policies regarding DTC-GT would benefit from empirical research on its benefits, harms, and limitations. We review the recent literature (2011-present) and summarize findings across (1) content analyses of DTC-GT websites, (2) studies of consumer perspectives and experiences, and (3) surveys of relevant health care providers. Findings suggest that neither the health benefits envisioned by DTC-GT proponents (e.g., significant improvements in positive health behaviors) nor the worst fears expressed by its critics (e.g., catastrophic psychological distress and misunderstanding of test results, undue burden on the health care system) have materialized to date. However, research in this area is in its early stages and possesses numerous key limitations. We note needs for future studies to illuminate the impact of DTC-GT and thereby guide practice and policy regarding this rapidly evolving approach to personal genomics. PMID:24058877

  18. Erosion of Conserved Binding Sites in Personal Genomes Points to Medical Histories.

    PubMed

    Guturu, Harendra; Chinchali, Sandeep; Clarke, Shoa L; Bejerano, Gill

    2016-02-01

    Although many human diseases have a genetic component involving many loci, the majority of studies are statistically underpowered to isolate the many contributing variants, raising the question of the existence of alternate processes to identify disease mutations. To address this question, we collect ancestral transcription factor binding sites disrupted by an individual's variants and then look for their most significant congregation next to a group of functionally related genes. Strikingly, when the method is applied to five different full human genomes, the top enriched function for each is invariably reflective of their very different medical histories. For example, our method implicates "abnormal cardiac output" for a patient with a longstanding family history of heart disease, "decreased circulating sodium level" for an individual with hypertension, and other biologically appealing links for medical histories spanning narcolepsy to axonal neuropathy. Our results suggest that erosion of gene regulation by mutation load significantly contributes to observed heritable phenotypes that manifest in the medical history. The test we developed exposes a hitherto hidden layer of personal variants that promise to shed new light on human disease penetrance, expressivity and the sensitivity with which we can detect them. PMID:26845687

  19. Social Networkers’ Attitudes Toward Direct-to-Consumer Personal Genome Testing

    PubMed Central

    McGuire, Amy L.; Diaz, Christina M.; Wang, Tao; Hilsenbeck, Susan G.

    2009-01-01

    Purpose: This study explores social networkers’ interest in and attitudes toward personal genome testing (PGT), focusing on expectations related to the clinical integration of PGT results. Methods: An online survey of 1,087 social networking users was conducted to assess 1) use and interest in PGT; 2) attitudes toward PGT companies and test results; and 3) expectations for the clinical integration of PGT. Descriptive statistics were calculated to summarize respondents’ characteristics and responses. Results: Six percent of respondents have used PGT, 64% would consider using PGT, and 30% would not use PGT. Of those who would consider using PGT, 74% would use it to gain knowledge about disease in their family. Of all respondents, 34% consider the information obtained from PGT to be a medical diagnosis. Of those who would consider PGT, 78% would ask their physician for help interpreting test results, and 61% of all respondents believe that physicians have a professional obligation to help individuals interpret PGT results. Conclusion: Respondents express interest in using PGT services, primarily for purposes related to their medical care, and expect physicians to help interpret PGT results. Physicians should therefore be prepared for patient demands for information and counsel on the basis of PGT results. PMID:19998099

  20. The Portable Dictionary of the Mouse Genome: a personal database for gene mapping and molecular biology.

    PubMed

    Williams, R W

    1994-06-01

    The Portable Dictionary of the Mouse Genome is a database for personal computers that contains information on approximately 10,000 loci in the mouse, along with data on homologs in several other mammalian species, including human, rat, cat, cow, and pig. Key features of the dictionary are its compact size, its network independence, and the ability to convert the entire dictionary to a wide variety of common application programs. Another significant feature is the integration of DNA sequence accession data. Loci in the dictionary can be rapidly re-sorted by chromosomal position, by type, by human homology, or by gene effect. The dictionary provides an accessible, easily manipulated set of data that has many uses--from a quick review of loci and gene nomenclature to the design of experiments and analysis of results. The Portable Dictionary is available in several formats suitable for conversion to different programs and computer systems. It can be obtained on disk or from Internet Gopher servers (mickey.utmem.edu or anat4.utmem.edu), an anonymous FTP site (nb.utmem.edu in the directory pub/genedict), and a World Wide Web server (http://mickey.utmem.edu/front.html). PMID:8043953

  1. Advancing Pharmacogenomics Education in the Core PharmD Curriculum through Student Personal Genomic Testing.

    PubMed

    Adams, Solomon M; Anderson, Kacey B; Coons, James C; Smith, Randall B; Meyer, Susan M; Parker, Lisa S; Empey, Philip E

    2016-02-25

    Objective. To develop, implement, and evaluate "Test2Learn," a program to enhance pharmacogenomics education through the use of personal genomic testing (PGT) and real genetic data. Design. One hundred twenty-two second-year doctor of pharmacy (PharmD) students in a required course were offered PGT as part of a larger program approach to teach pharmacogenomics within a robust ethical framework. The program added novel learning objectives, lecture materials, analysis tools, and exercises using individual-level and population-level genetic data. Outcomes were assessed with objective measures and pre/post survey instruments. Assessment. One hundred students (82%) underwent PGT. Knowledge significantly improved on multiple assessments. Genotyped students reported a greater increase in confidence in understanding test results by the end of the course. Similarly, undergoing PGT improved students' self-perceived ability to empathize with patients compared to those not genotyped. Most students (71%) reported feeling PGT was an important part of the course, and 60% reported they had a better understanding of pharmacogenomics specifically because of the opportunity. Conclusion. Implementation of PGT in the core pharmacy curriculum was feasible, well-received, and enhanced student learning of pharmacogenomics. PMID:26941429

  2. Erosion of Conserved Binding Sites in Personal Genomes Points to Medical Histories

    PubMed Central

    Guturu, Harendra; Chinchali, Sandeep; Clarke, Shoa L.; Bejerano, Gill

    2016-01-01

    Although many human diseases have a genetic component involving many loci, the majority of studies are statistically underpowered to isolate the many contributing variants, raising the question of the existence of alternate processes to identify disease mutations. To address this question, we collect ancestral transcription factor binding sites disrupted by an individual’s variants and then look for their most significant congregation next to a group of functionally related genes. Strikingly, when the method is applied to five different full human genomes, the top enriched function for each is invariably reflective of their very different medical histories. For example, our method implicates “abnormal cardiac output” for a patient with a longstanding family history of heart disease, “decreased circulating sodium level” for an individual with hypertension, and other biologically appealing links for medical histories spanning narcolepsy to axonal neuropathy. Our results suggest that erosion of gene regulation by mutation load significantly contributes to observed heritable phenotypes that manifest in the medical history. The test we developed exposes a hitherto hidden layer of personal variants that promise to shed new light on human disease penetrance, expressivity and the sensitivity with which we can detect them. PMID:26845687

  3. Incorporating a Collaborative Web-Based Virtual Laboratory in an Undergraduate Bioinformatics Course

    ERIC Educational Resources Information Center

    Weisman, David

    2010-01-01

    Face-to-face bioinformatics courses commonly include a weekly, in-person computer lab to facilitate active learning, reinforce conceptual material, and teach practical skills. Similarly, fully-online bioinformatics courses employ hands-on exercises to achieve these outcomes, although students typically perform this work offsite. Combining a…

  4. Global computing for bioinformatics.

    PubMed

    Loewe, Laurence

    2002-12-01

    Global computing, the collaboration of idle PCs via the Internet in a SETI@home style, emerges as a new way of massive parallel multiprocessing with potentially enormous CPU power. Its relations to the broader, fast-moving field of Grid computing are discussed without attempting a review of the latter. This review (i) includes a short table of milestones in global computing history, (ii) lists opportunities global computing offers for bioinformatics, (iii) describes the structure of problems well suited for such an approach, (iv) analyses the anatomy of successful projects and (v) points to existing software frameworks. Finally, an evaluation of the various costs shows that global computing indeed has merit, if the problem to be solved is already coded appropriately and a suitable global computing framework can be found. Then, either significant amounts of computing power can be recruited from the general public, or--if employed in an enterprise-wide Intranet for security reasons--idle desktop PCs can substitute for an expensive dedicated cluster. PMID:12511066

  5. String Mining in Bioinformatics

    NASA Astrophysics Data System (ADS)

    Abouelhoda, Mohamed; Ghanem, Moustafa

    Sequence analysis is a major area in bioinformatics encompassing the methods and techniques for studying the biological sequences, DNA, RNA, and proteins, on the linear structure level. The focus of this area is generally on the identification of intra- and inter-molecular similarities. Identifying intra-molecular similarities boils down to detecting repeated segments within a given sequence, while identifying inter-molecular similarities amounts to spotting common segments among two or multiple sequences. From a data mining point of view, sequence analysis is nothing but string or pattern mining specific to biological strings. For a long time, however, this point of view has been explicitly embraced neither in the data mining nor in the sequence analysis textbooks, which may be attributed to the co-evolution of the two apparently independent fields. In other words, although the word “data-mining” is almost missing in the sequence analysis literature, its basic concepts have been implicitly applied. Interestingly, recent research in biological sequence analysis introduced efficient solutions to many problems in data mining, such as querying and analyzing time series [49,53], extracting information from web pages [20], fighting spam mails [50], detecting plagiarism [22], and spotting duplications in software systems [14].
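
    The two tasks named in the abstract above, detecting repeated segments within one sequence and spotting common segments between sequences, can be sketched with a simple k-mer index. This is only an illustration under stated assumptions: exact matches of a fixed length k, and function names that are hypothetical rather than taken from the chapter, which is likely to rely on more efficient index structures.

```python
from collections import defaultdict


def kmer_positions(seq: str, k: int) -> dict[str, list[int]]:
    """Map every length-k substring of seq to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return dict(index)


def repeated_segments(seq: str, k: int) -> dict[str, list[int]]:
    """Intra-molecular similarity: k-mers that occur more than once in seq."""
    return {kmer: pos for kmer, pos in kmer_positions(seq, k).items() if len(pos) > 1}


def common_segments(a: str, b: str, k: int) -> set[str]:
    """Inter-molecular similarity: k-mers present in both sequences."""
    return set(kmer_positions(a, k)) & set(kmer_positions(b, k))


if __name__ == "__main__":
    print(repeated_segments("ACGTACGTTT", 4))
    print(common_segments("ACGTACGTTT", "TTACGTAA", 4))
```

    The dictionary index keeps the sketch short but stores every k-mer explicitly, so it is suited to small inputs only.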

  6. String Mining in Bioinformatics

    NASA Astrophysics Data System (ADS)

    Abouelhoda, Mohamed; Ghanem, Moustafa

    Sequence analysis is a major area in bioinformatics encompassing the methods and techniques for studying the biological sequences, DNA, RNA, and proteins, on the linear structure level. The focus of this area is generally on the identification of intra- and inter-molecular similarities. Identifying intra-molecular similarities boils down to detecting repeated segments within a given sequence, while identifying inter-molecular similarities amounts to spotting common segments among two or multiple sequences. From a data mining point of view, sequence analysis is nothing but string or pattern mining specific to biological strings. For a long time, however, this point of view has been explicitly embraced neither in the data mining nor in the sequence analysis textbooks, which may be attributed to the co-evolution of the two apparently independent fields. In other words, although the word "data-mining" is almost missing in the sequence analysis literature, its basic concepts have been implicitly applied. Interestingly, recent research in biological sequence analysis introduced efficient solutions to many problems in data mining, such as querying and analyzing time series [49,53], extracting information from web pages [20], fighting spam mails [50], detecting plagiarism [22], and spotting duplications in software systems [14].

  7. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software

    PubMed Central

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians. PMID:25996054

  8. Engineering bioinformatics: building reliability, performance and productivity into bioinformatics software.

    PubMed

    Lawlor, Brendan; Walsh, Paul

    2015-01-01

    There is a lack of software engineering skills in bioinformatic contexts. We discuss the consequences of this lack, examine existing explanations and remedies to the problem, point out their shortcomings, and propose alternatives. Previous analyses of the problem have tended to treat the use of software in scientific contexts as categorically different from the general application of software engineering in commercial settings. In contrast, we describe bioinformatic software engineering as a specialization of general software engineering, and examine how it should be practiced. Specifically, we highlight the difference between programming and software engineering, list elements of the latter and present the results of a survey of bioinformatic practitioners which quantifies the extent to which those elements are employed in bioinformatics. We propose that the ideal way to bring engineering values into research projects is to bring engineers themselves. We identify the role of Bioinformatic Engineer and describe how such a role would work within bioinformatic research teams. We conclude by recommending an educational emphasis on cross-training software engineers into life sciences, and propose research on Domain Specific Languages to facilitate collaboration between engineers and bioinformaticians. PMID:25996054

  9. The Information Technology Infrastructure for the Translational Genomics Core and the Partners Biobank at Partners Personalized Medicine.

    PubMed

    Boutin, Natalie; Holzbach, Ana; Mahanta, Lisa; Aldama, Jackie; Cerretani, Xander; Embree, Kevin; Leon, Irene; Rathi, Neeta; Vickers, Matilde

    2016-01-01

    The Biobank and Translational Genomics core at Partners Personalized Medicine requires robust software and hardware. This Information Technology (IT) infrastructure enables the storage and transfer of large amounts of data, drives efficiencies in the laboratory, maintains data integrity from the time of consent to the time that genomic data is distributed for research, and enables the management of complex genetic data. Here, we describe the functional components of the research IT infrastructure at Partners Personalized Medicine and how they integrate with existing clinical and research systems, review some of the ways in which this IT infrastructure maintains data integrity and security, and discuss some of the challenges inherent to building and maintaining such infrastructure. PMID:26805892

  10. The Information Technology Infrastructure for the Translational Genomics Core and the Partners Biobank at Partners Personalized Medicine

    PubMed Central

    Boutin, Natalie; Holzbach, Ana; Mahanta, Lisa; Aldama, Jackie; Cerretani, Xander; Embree, Kevin; Leon, Irene; Rathi, Neeta; Vickers, Matilde

    2016-01-01

    The Biobank and Translational Genomics core at Partners Personalized Medicine requires robust software and hardware. This Information Technology (IT) infrastructure enables the storage and transfer of large amounts of data, drives efficiencies in the laboratory, maintains data integrity from the time of consent to the time that genomic data is distributed for research, and enables the management of complex genetic data. Here, we describe the functional components of the research IT infrastructure at Partners Personalized Medicine and how they integrate with existing clinical and research systems, review some of the ways in which this IT infrastructure maintains data integrity and security, and discuss some of the challenges inherent to building and maintaining such infrastructure. PMID:26805892

  11. Exploring the immunogenome with bioinformatics.

    PubMed

    de Bono, Bernard; Trowsdale, John

    2003-08-01

    A better description of the immune system can be afforded if the latest developments in bioinformatics are applied to integrate sequence with structure and function. Clear guidelines for the upgrade of the bioinformatic capability of the immunogenetics laboratory are discussed in the light of more powerful methods to detect homology, combined approaches to predict the three dimensional properties of a protein and a robust strategy to represent the biological role of a gene. PMID:14690048

  12. Progress in bioinformatics and the importance of being earnest.

    PubMed

    Attwood, T K; Miller, C J

    2002-01-01

    In silico biology has gathered momentum as, worldwide, scientists have united in a common quest to sequence, store and analyse complete genomes. This year, a pivotal achievement of this cooperative endeavour was realised in the release of a public draft of the human genome, and with it the promises to improve our understanding of diverse aspects of biology and to yield a healthier future with safe personalized medicines. Key to these goals will be the need to elucidate and characterise the genes and gene products encoded not just in the human genome, but in many genomes. These tasks are underpinned by the concepts and processes of genome and gene/protein evolution, regulation of gene expression, mechanisms of protein folding, the manifestation of protein function, and so on, all of which must be understood in the context of complex, dynamic biological systems. Our use of computers to model such concepts and systems must be placed in the context of the current limits of our understanding of them: it is important to recognise, for example, that we don't have a common understanding either of what constitutes a gene or a protein function; we can't invariably say that a particular sequence or fold has arisen via divergent or convergent evolution; and we don't fully understand the rules of protein folding. Accepting what we can't do in silico is essential in appreciating what we can do. Without this understanding, it is easy to be misled, as notions of what particular computational approaches can achieve are sometimes rather optimistic. There are valuable lessons to be learned here from the field of Artificial Intelligence, principal among which is the realisation that capturing and representing complex knowledge is time consuming, expensive and hard. Thus, we argue here that if bioinformatics is to tackle biological complexity in earnest, it would be wise to absorb the experience distilled from decades of artificial intelligence research, and to approach the road ahead…

  13. Provenance in bioinformatics workflows

    PubMed Central

    2013-01-01

    In this work, we used the PROV-DM model to manage data provenance in workflows of genome projects. This provenance model allows the storage of details of one workflow execution, e.g., raw and produced data and computational tools, their versions and parameters. Using this model, biologists can access details of one particular execution of a workflow, compare results produced by different executions, and plan new experiments more efficiently. In addition, a provenance simulator was created, which facilitates the inclusion of provenance data of one genome project workflow execution. Finally, we discuss one case study, which aims to identify genes involved in specific metabolic pathways of Bacillus cereus, as well as to compare this isolate with other phylogenetically related bacteria from the Bacillus group. B. cereus is an extremophilic bacterium, collected in warm water in the Midwestern Region of Brazil, whose DNA samples were sequenced with an NGS machine. PMID:24564294
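    The idea of recording tools, versions and parameters per execution, and then comparing executions, can be sketched as follows. This is a minimal illustration only, not the authors' system and not a full PROV-DM implementation: the class names `ToolRun` and `Execution`, the helper `diff_executions`, and all tool names, versions and file names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolRun:
    """One PROV-style activity: a tool invocation inside a workflow execution."""
    tool: str
    version: str
    parameters: tuple  # sorted (name, value) pairs
    inputs: tuple      # identifiers of the data entities used
    outputs: tuple     # identifiers of the data entities generated


@dataclass
class Execution:
    """One workflow execution, recorded as an ordered list of tool runs."""
    label: str
    runs: list = field(default_factory=list)

    def record(self, tool, version, parameters, inputs, outputs):
        self.runs.append(ToolRun(tool, version,
                                 tuple(sorted(parameters.items())),
                                 tuple(inputs), tuple(outputs)))


def diff_executions(a, b):
    """Return, per tool, the (version, parameters) settings that differ."""
    settings_a = {r.tool: (r.version, r.parameters) for r in a.runs}
    settings_b = {r.tool: (r.version, r.parameters) for r in b.runs}
    return {tool: (settings_a.get(tool), settings_b.get(tool))
            for tool in sorted(set(settings_a) | set(settings_b))
            if settings_a.get(tool) != settings_b.get(tool)}


# Hypothetical tool names, versions and file names, for illustration only.
run1 = Execution("run-1")
run1.record("assembler", "1.0", {"k": 31}, ["reads.fastq"], ["contigs.fa"])
run1.record("annotator", "2.3", {"evalue": 1e-5}, ["contigs.fa"], ["genes.gff"])

run2 = Execution("run-2")
run2.record("assembler", "1.0", {"k": 51}, ["reads.fastq"], ["contigs.fa"])
run2.record("annotator", "2.3", {"evalue": 1e-5}, ["contigs.fa"], ["genes.gff"])

changed = diff_executions(run1, run2)  # only the assembler settings differ
```

    Keeping each run immutable and keyed by tool makes a comparison between two executions a plain dictionary diff, which is the kind of question ("what changed between these runs?") the abstract says biologists want to answer.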

  14. Bioinformatic Identification of Conserved Cis-Sequences in Coregulated Genes.

    PubMed

    Bülow, Lorenz; Hehl, Reinhard

    2016-01-01

    Bioinformatics tools can be employed to identify conserved cis-sequences in sets of coregulated plant genes as more and more gene expression and genomic sequence data become available. Knowledge of the specific cis-sequences, their enrichment and arrangement within promoters, facilitates the design of functional synthetic plant promoters that are responsive to specific stresses. The present chapter illustrates an example of the bioinformatic identification of conserved Arabidopsis thaliana cis-sequences enriched in drought stress-responsive genes. This workflow can be applied for the identification of cis-sequences in any set of coregulated genes. The workflow includes detailed protocols to determine sets of coregulated genes, to extract the corresponding promoter sequences, and to install and run a software package to identify overrepresented motifs. Further bioinformatic analyses that can be performed with the results are discussed. PMID:27557771
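    One common way to score "overrepresented motifs" is a hypergeometric test on k-mer occurrence, sketched below. The abstract does not name the software package or the statistic it uses, so the test, the function names, the k-mer length and the toy promoter sequences are all assumptions made for illustration.

```python
from math import comb


def count_promoters_with(kmer, promoters):
    """Number of promoter sequences that contain the k-mer at least once."""
    return sum(kmer in seq for seq in promoters)


def hypergeom_tail(k, K, n, N):
    """P(X >= k) when drawing n promoters from N, of which K contain the motif."""
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1)) / comb(N, n)


def enriched_kmers(coregulated, background, k=6, alpha=0.05):
    """Score every k-mer seen in the coregulated promoters against all promoters."""
    universe = coregulated + background
    kmers = {seq[i:i + k] for seq in coregulated for i in range(len(seq) - k + 1)}
    hits = []
    for kmer in sorted(kmers):
        in_set = count_promoters_with(kmer, coregulated)
        in_all = count_promoters_with(kmer, universe)
        p = hypergeom_tail(in_set, in_all, len(coregulated), len(universe))
        if p < alpha:
            hits.append((kmer, in_set, in_all, p))
    return sorted(hits, key=lambda h: h[3])


# Made-up toy promoters; real promoters are hundreds of bases long.
coregulated = ["TTACGTGTCA", "GGACGTGTAA", "CAACGTGTGG", "ATACGTGTTC"]
background = ["TTTTTTTTTT", "GGGGGGGGGG", "CCCCCCCCCC", "AAAAAAAAAA",
              "TATATATATA", "GCGCGCGCGC", "CACACACACA", "AGAGAGAGAG"]

hits = enriched_kmers(coregulated, background)
```

    A real analysis would also scan the reverse strand and correct for the number of k-mers tested; both are omitted here to keep the sketch short.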

  15. Borderline personality disorder and childhood maltreatment: a genome-wide methylation analysis.

    PubMed

    Prados, J; Stenz, L; Courtet, P; Prada, P; Nicastro, R; Adouan, W; Guillaume, S; Olié, E; Aubry, J-M; Dayer, A; Perroud, N

    2015-02-01

    Early life adversity plays a critical role in the emergence of borderline personality disorder (BPD) and this could occur through epigenetic programming. In this perspective, we aimed to determine whether childhood maltreatment could durably modify epigenetic processes by means of a whole-genome methylation scan of BPD subjects. Using the Illumina Infinium® HumanMethylation450 BeadChip, global methylation status of DNA extracted from peripheral blood leucocytes was correlated to the severity of childhood maltreatment in 96 BPD subjects suffering from a high level of child adversity and 93 subjects suffering from major depressive disorder (MDD) and reporting a low rate of child maltreatment. Several CpGs within or near the following genes (IL17RA, miR124-3, KCNQ2, EFNB1, OCA2, MFAP2, RPH3AL, WDR60, CST9L, EP400, A2ML1, NT5DC2, FAM163A and SPSB2) were found to be differentially methylated, either in BPD compared with MDD or in relation to the severity of childhood maltreatment. A highly relevant biological result was observed for cg04927004 close to miR124-3 that was significantly associated with BPD and severity of childhood maltreatment. miR124-3 codes for a microRNA (miRNA) targeting several genes previously found to be associated with BPD such as NR3C1. Our results highlight the potentially important role played by miRNAs in the etiology of neuropsychiatric disorders such as BPD and the usefulness of using methylome-wide association studies to uncover such candidate genes. Moreover, they offer new understanding of the impact of maltreatment on biological processes leading to diseases and may ultimately result in the identification of relevant biomarkers. PMID:25612291

  16. CAPweb: a bioinformatics CGH array Analysis Platform.

    PubMed

    Liva, Stéphane; Hupé, Philippe; Neuvial, Pierre; Brito, Isabel; Viara, Eric; La Rosa, Philippe; Barillot, Emmanuel

    2006-07-01

    Assessing variations in DNA copy number is crucial for understanding constitutional or somatic diseases, particularly cancers. The recently developed array-CGH (comparative genomic hybridization) technology allows this to be investigated at the genomic level. We report the availability of a web tool for analysing array-CGH data. CAPweb (CGH array Analysis Platform on the Web) is intended as a user-friendly tool enabling biologists to completely analyse CGH arrays from the raw data to the visualization and biological interpretation. The user typically performs the following bioinformatics steps of a CGH array project within CAPweb: the secure upload of the results of CGH array image analysis and of the array annotation (genomic position of the probes); first level analysis of each array, including automatic normalization of the data (for correcting experimental biases), breakpoint detection and status assignment (gain, loss or normal); validation or deletion of the analysis based on a summary report and quality criteria; visualization and biological analysis of the genomic profiles and results through a user-friendly interface. CAPweb is accessible at http://bioinfo.curie.fr/CAPweb. PMID:16845053
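    The first-level analysis steps listed above (normalization, breakpoint detection, status assignment) can be illustrated with a deliberately simplified sketch. The abstract does not describe CAPweb's actual algorithms, so median centering, the fixed ±0.25 thresholds, the function names and the log2 ratios below are all assumptions chosen for illustration.

```python
from statistics import median


def normalize(log2_ratios):
    """Median-center the log2 ratios of one array."""
    centre = median(log2_ratios)
    return [r - centre for r in log2_ratios]


def assign_status(ratio, gain=0.25, loss=-0.25):
    """Label one probe as gain, loss or normal using fixed thresholds."""
    if ratio >= gain:
        return "gain"
    if ratio <= loss:
        return "loss"
    return "normal"


def breakpoints(statuses):
    """Indices i where probe i has a different status from probe i - 1."""
    return [i for i in range(1, len(statuses)) if statuses[i] != statuses[i - 1]]


# Made-up log2 ratios for probes ordered along one chromosome.
raw = [0.1, 0.0, 0.2, 0.7, 0.8, 0.6, 0.1, -0.5, -0.6, 0.1, 0.0]
normalized = normalize(raw)
statuses = [assign_status(r) for r in normalized]
cuts = breakpoints(statuses)
```

    Production tools usually segment the profile first and assign a status per segment, which is more robust to noisy single probes than the per-probe thresholds used here.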

  17. Personalization.

    ERIC Educational Resources Information Center

    Shore, Rebecca Martin

    1996-01-01

    Describes how a typical high school in Huntington Beach, California, curbed disruptive student behavior by personalizing the school experience for "problem" students. Through mostly volunteer efforts, an adopt-a-kid program was initiated that matched kids' learning styles to adults' personality styles and resulted in fewer suspensions and numerous…

  18. Translational Bioinformatics Approaches to Drug Development

    PubMed Central

    Readhead, Ben; Dudley, Joel

    2013-01-01

    Significance: A majority of therapeutic interventions occur late in the pathological process, when treatment outcome can be less predictable and effective, highlighting the need for new precise and preventive therapeutic development strategies that consider genomic and environmental context. Translational bioinformatics is well positioned to contribute to the many challenges inherent in bridging this gap between our current reactive methods of healthcare delivery and the intent of precision medicine, particularly in the areas of drug development, which forms the focus of this review. Recent Advances: A variety of powerful informatics methods for organizing and leveraging the vast wealth of molecular measurements available for a broad range of disease contexts have recently emerged. These include methods for data driven disease classification, drug repositioning, identification of disease biomarkers, and the creation of disease network models, each with significant impacts on drug development approaches. Critical Issues: An important bottleneck in the application of bioinformatics methods in translational research is the lack of investigators who are versed in both biomedical domains and informatics. Efforts to nurture both sets of competencies within individuals and to increase interfield visibility will help to accelerate the adoption and increased application of bioinformatics in translational research. Future Directions: It is possible to construct predictive, multiscale network models of disease by integrating genotype, gene expression, clinical traits, and other multiscale measures using causal network inference methods. This can enable the identification of the “key drivers” of pathology, which may represent novel therapeutic targets or biomarker candidates that play a more direct role in the etiology of disease. PMID:24527359

  19. Bioinformatics education--perspectives and challenges out of Africa.

    PubMed

    Tastan Bishop, Özlem; Adebiyi, Ezekiel F; Alzohairy, Ahmed M; Everett, Dean; Ghedira, Kais; Ghouila, Amel; Kumuthini, Judit; Mulder, Nicola J; Panji, Sumir; Patterton, Hugh-G

    2015-03-01

    The discipline of bioinformatics has developed rapidly since the complete sequencing of the first genomes in the 1990s. The development of many high-throughput techniques during the last decades has ensured that bioinformatics has grown into a discipline that overlaps with, and is required for, the modern practice of virtually every field in the life sciences. This has placed a scientific premium on the availability of skilled bioinformaticians, a qualification that is extremely scarce on the African continent. The reasons for this are numerous, although the absence of a skilled bioinformatician at academic institutions to initiate a training process and build sustained capacity seems to be a common African shortcoming. This dearth of bioinformatics expertise has had a knock-on effect on the establishment of many modern high-throughput projects at African institutes, including the comprehensive and systematic analysis of genomes from African populations, which are among the most genetically diverse anywhere on the planet. Recent funding initiatives from the National Institutes of Health and the Wellcome Trust are aimed at ameliorating this shortcoming. In this paper, we discuss the problems that have limited the establishment of the bioinformatics field in Africa, as well as propose specific actions that will help with the education and training of bioinformaticians on the continent. This is an absolute requirement in anticipation of a boom in high-throughput approaches to human health issues unique to data from African populations. PMID:24990350

  20. Bioinformatics Education—Perspectives and Challenges out of Africa

    PubMed Central

    Adebiyi, Ezekiel F.; Alzohairy, Ahmed M.; Everett, Dean; Ghedira, Kais; Ghouila, Amel; Kumuthini, Judit; Mulder, Nicola J.; Panji, Sumir; Patterton, Hugh-G.

    2015-01-01

    The discipline of bioinformatics has developed rapidly since the complete sequencing of the first genomes in the 1990s. The development of many high-throughput techniques during the last decades has ensured that bioinformatics has grown into a discipline that overlaps with, and is required for, the modern practice of virtually every field in the life sciences. This has placed a scientific premium on the availability of skilled bioinformaticians, a qualification that is extremely scarce on the African continent. The reasons for this are numerous, although the absence of a skilled bioinformatician at academic institutions to initiate a training process and build sustained capacity seems to be a common African shortcoming. This dearth of bioinformatics expertise has had a knock-on effect on the establishment of many modern high-throughput projects at African institutes, including the comprehensive and systematic analysis of genomes from African populations, which are among the most genetically diverse anywhere on the planet. Recent funding initiatives from the National Institutes of Health and the Wellcome Trust are aimed at ameliorating this shortcoming. In this paper, we discuss the problems that have limited the establishment of the bioinformatics field in Africa, as well as propose specific actions that will help with the education and training of bioinformaticians on the continent. This is an absolute requirement in anticipation of a boom in high-throughput approaches to human health issues unique to data from African populations. PMID:24990350

  1. ExPASy: SIB bioinformatics resource portal

    PubMed Central

    Artimo, Panu; Jonnalagedda, Manohar; Arnold, Konstantin; Baratin, Delphine; Csardi, Gabor; de Castro, Edouard; Duvaud, Séverine; Flegel, Volker; Fortier, Arnaud; Gasteiger, Elisabeth; Grosdidier, Aurélien; Hernandez, Céline; Ioannidis, Vassilios; Kuznetsov, Dmitry; Liechti, Robin; Moretti, Sébastien; Mostaguir, Khaled; Redaschi, Nicole; Rossier, Grégoire; Xenarios, Ioannis; Stockinger, Heinz

    2012-01-01

    ExPASy (http://www.expasy.org) has a worldwide reputation as one of the main bioinformatics resources for proteomics. It has now evolved, becoming an extensible and integrative portal accessing many scientific resources, databases and software tools in different areas of life sciences. Scientists can henceforth access seamlessly a wide range of resources in many different domains, such as proteomics, genomics, phylogeny/evolution, systems biology, population genetics, transcriptomics, etc. The individual resources (databases, web-based and downloadable software tools) are hosted in a ‘decentralized’ way by different groups of the SIB Swiss Institute of Bioinformatics and partner institutions. Specifically, a single web portal provides a common entry point to a wide range of resources developed and operated by different SIB groups and external institutions. The portal features a search function across ‘selected’ resources. Additionally, the availability and usage of resources are monitored. The portal is aimed at both expert users and people who are not familiar with a specific domain in life sciences. The new web interface provides, in particular, visual guidance for newcomers to ExPASy. PMID:22661580

  2. The OAuth 2.0 Web Authorization Protocol for the Internet Addiction Bioinformatics (IABio) Database.

    PubMed

    Choi, Jeongseok; Kim, Jaekwon; Lee, Dong Kyun; Jang, Kwang Soo; Kim, Dai-Jin; Choi, In Young

    2016-03-01

    Internet addiction (IA) has become a widespread and problematic phenomenon as smart devices pervade society. Moreover, internet gaming disorder leads to increases in social expenditures for both individuals and nations alike. Although the prevention and treatment of IA are getting more important, the diagnosis of IA remains problematic. Understanding the neurobiological mechanism of behavioral addictions is essential for the development of specific and effective treatments. Although there are many databases related to other addictions, a database for IA has not been developed yet. In addition, bioinformatics databases, especially genetic databases, require a high level of security and should be designed based on medical information standards. In this respect, our study proposes the OAuth standard protocol for database access authorization. The proposed IA Bioinformatics (IABio) database system is based on internet user authentication, which is a guideline for medical information standards, and uses OAuth 2.0 for access control technology. This study designed and developed the system requirements and configuration. The OAuth 2.0 protocol is expected to establish the security of personal medical information and be applied to genomic research on IA. PMID:27103887
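    The access-control idea behind OAuth 2.0 (a client presents a bearer token, and the resource server checks that it is valid and carries the required scope) can be sketched as below. This is not the IABio implementation and it skips the OAuth 2.0 grant flows entirely: the in-memory token store, the function names and the client and scope names are hypothetical.

```python
import secrets
import time

# In-memory stand-in for the authorization server's token store (assumption).
_TOKENS = {}


def issue_token(client_id, scopes, ttl_seconds=3600, now=None):
    """Issue an opaque bearer token bound to a client and a set of scopes."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    _TOKENS[token] = {"client_id": client_id,
                      "scopes": frozenset(scopes),
                      "expires_at": now + ttl_seconds}
    return token


def authorize(token, required_scope, now=None):
    """Resource-server check: token must exist, be unexpired and carry the scope."""
    now = time.time() if now is None else now
    grant = _TOKENS.get(token)
    if grant is None or now >= grant["expires_at"]:
        return False
    return required_scope in grant["scopes"]


# Hypothetical client and scope names.
token = issue_token("research-app", {"phenotype:read"}, ttl_seconds=60, now=1000.0)

can_read_phenotypes = authorize(token, "phenotype:read", now=1010.0)  # scope granted
can_read_genotypes = authorize(token, "genotype:read", now=1010.0)    # scope missing
```

    Separating scopes for phenotype and genotype data is one way to limit what a given client can see, which matches the abstract's point that genetic databases need a high level of security.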

  3. The OAuth 2.0 Web Authorization Protocol for the Internet Addiction Bioinformatics (IABio) Database

    PubMed Central

    Choi, Jeongseok; Kim, Jaekwon; Lee, Dong Kyun; Jang, Kwang Soo; Kim, Dai-Jin

    2016-01-01

    Internet addiction (IA) has become a widespread and problematic phenomenon as smart devices pervade society. Moreover, internet gaming disorder leads to increases in social expenditures for both individuals and nations alike. Although the prevention and treatment of IA are getting more important, the diagnosis of IA remains problematic. Understanding the neurobiological mechanism of behavioral addictions is essential for the development of specific and effective treatments. Although there are many databases related to other addictions, a database for IA has not been developed yet. In addition, bioinformatics databases, especially genetic databases, require a high level of security and should be designed based on medical information standards. In this respect, our study proposes the OAuth standard protocol for database access authorization. The proposed IA Bioinformatics (IABio) database system is based on internet user authentication, which is a guideline for medical information standards, and uses OAuth 2.0 for access control technology. This study designed and developed the system requirements and configuration. The OAuth 2.0 protocol is expected to establish the security of personal medical information and be applied to genomic research on IA. PMID:27103887

  4. Motivations, concerns and preferences of personal genome sequencing research participants: Baseline findings from the HealthSeq project.

    PubMed

    Sanderson, Saskia C; Linderman, Michael D; Suckiel, Sabrina A; Diaz, George A; Zinberg, Randi E; Ferryman, Kadija; Wasserstein, Melissa; Kasarskis, Andrew; Schadt, Eric E

    2016-01-01

    Whole exome/genome sequencing (WES/WGS) is increasingly offered to ostensibly healthy individuals. Understanding the motivations and concerns of research participants seeking out personal WGS and their preferences regarding return-of-results and data sharing will help optimize protocols for WES/WGS. Baseline interviews including both qualitative and quantitative components were conducted with research participants (n=35) in the HealthSeq project, a longitudinal cohort study of individuals receiving personal WGS results. Data sharing preferences were recorded during informed consent. In the qualitative interview component, the dominant motivations that emerged were obtaining personal disease risk information, satisfying curiosity, contributing to research, self-exploration and interest in ancestry, and the dominant concern was the potential psychological impact of the results. In the quantitative component, 57% endorsed concerns about privacy. Most wanted to receive all personal WGS results (94%) and their raw data (89%); a third (37%) consented to having their data shared to the Database of Genotypes and Phenotypes (dbGaP). Early adopters of personal WGS in the HealthSeq project express a variety of health- and non-health-related motivations. Almost all want all available findings, while also expressing concerns about the psychological impact and privacy of their results. PMID:26036856

  5. Motivations, concerns and preferences of personal genome sequencing research participants: Baseline findings from the HealthSeq project

    PubMed Central

    Sanderson, Saskia C; Linderman, Michael D; Suckiel, Sabrina A; Diaz, George A; Zinberg, Randi E; Ferryman, Kadija; Wasserstein, Melissa; Kasarskis, Andrew; Schadt, Eric E

    2016-01-01

    Whole exome/genome sequencing (WES/WGS) is increasingly offered to ostensibly healthy individuals. Understanding the motivations and concerns of research participants seeking out personal WGS and their preferences regarding return-of-results and data sharing will help optimize protocols for WES/WGS. Baseline interviews including both qualitative and quantitative components were conducted with research participants (n=35) in the HealthSeq project, a longitudinal cohort study of individuals receiving personal WGS results. Data sharing preferences were recorded during informed consent. In the qualitative interview component, the dominant motivations that emerged were obtaining personal disease risk information, satisfying curiosity, contributing to research, self-exploration and interest in ancestry, and the dominant concern was the potential psychological impact of the results. In the quantitative component, 57% endorsed concerns about privacy. Most wanted to receive all personal WGS results (94%) and their raw data (89%); a third (37%) consented to having their data shared to the Database of Genotypes and Phenotypes (dbGaP). Early adopters of personal WGS in the HealthSeq project express a variety of health- and non-health-related motivations. Almost all want all available findings, while also expressing concerns about the psychological impact and privacy of their results. PMID:26036856

  6. The European Bioinformatics Institute's data resources 2014.

    PubMed

    Brooksbank, Catherine; Bergman, Mary Todd; Apweiler, Rolf; Birney, Ewan; Thornton, Janet

    2014-01-01

    Molecular Biology has been at the heart of the 'big data' revolution from its very beginning, and the need for access to biological data is a common thread running from the 1965 publication of Dayhoff's 'Atlas of Protein Sequence and Structure' through the Human Genome Project in the late 1990s and early 2000s to today's population-scale sequencing initiatives. The European Bioinformatics Institute (EMBL-EBI; http://www.ebi.ac.uk) is one of three organizations worldwide that provide free access to comprehensive, integrated molecular data sets. Here, we summarize the principles underpinning the development of these public resources and provide an overview of EMBL-EBI's database collection to complement the reviews of individual databases provided elsewhere in this issue. PMID:24271396

  7. Bioinformatics by Example: From Sequence to Target

    NASA Astrophysics Data System (ADS)

    Kossida, Sophia; Tahri, Nadia; Daizadeh, Iraj

    2002-12-01

    With the completion of the human genome, and the imminent completion of other large-scale sequencing and structure-determination projects, computer-assisted bioscience is poised to become the new paradigm for conducting basic and applied research. The presence of these additional bioinformatics tools stirs great anxiety for experimental researchers (as well as for pedagogues), since they are now faced with a wider and deeper knowledge of differing disciplines (biology, chemistry, physics, mathematics, and computer science). This review targets those individuals who are interested in using computational methods in their teaching or research. By analyzing a real-life, pharmaceutical, multicomponent, target-based example the reader will experience this fascinating new discipline.

  8. Bioinformatics for Next Generation Sequencing Data

    PubMed Central

    Magi, Alberto; Benelli, Matteo; Gozzini, Alessia; Girolami, Francesca; Torricelli, Francesca; Brandi, Maria Luisa

    2010-01-01

    The emergence of next-generation sequencing (NGS) platforms imposes increasing demands on statistical methods and bioinformatic tools for the analysis and the management of the huge amounts of data generated by these technologies. Even at the early stages of their commercial availability, a large number of software tools already exist for analyzing NGS data. These tools fall into several general categories including alignment of sequence reads to a reference, base-calling and/or polymorphism detection, de novo assembly from paired or unpaired reads, structural variant detection and genome browsing. This manuscript aims to guide readers in the choice of the available computational tools that can be used for the several steps of the data analysis workflow. PMID:24710047

  9. Personalized medicine and genomics: challenges and opportunities in assessing effectiveness, cost-effectiveness, and future research priorities.

    PubMed

    Conti, Rena; Veenstra, David L; Armstrong, Katrina; Lesko, Lawrence J; Grosse, Scott D

    2010-01-01

    Personalized medicine is health care that tailors interventions to individual variation in risk and treatment response. Although medicine has long strived to achieve this goal, advances in genomics promise to facilitate this process. Relevant to present-day practice is the use of genomic information to classify individuals according to disease susceptibility or expected responsiveness to a pharmacologic treatment and to provide targeted interventions. A symposium at the annual meeting of the Society for Medical Decision Making on 23 October 2007 highlighted the challenges and opportunities posed in translating advances in molecular medicine into clinical practice. A panel of US experts in medical practice, regulatory policy, technology assessment, and the financing and organization of medical innovation was asked to discuss the current state of practice and research on personalized medicine as it relates to their own field. This article reports on the issues raised, discusses potential approaches to meet these challenges, and proposes directions for future work. The case of genetic testing to inform dosing with warfarin, an anticoagulant, is used to illustrate differing perspectives on evidence and decision making for personalized medicine. PMID:20086232

  10. Current trends in antimicrobial agent research: chemo- and bioinformatics approaches.

    PubMed

    Hammami, Riadh; Fliss, Ismail

    2010-07-01

    Databases and chemo- and bioinformatics tools that contain genomic, proteomic and functional information have become indispensable for antimicrobial drug research. The combination of chemoinformatics tools, bioinformatics tools and relational databases provides means of analyzing, linking and comparing online search results. The development of computational tools feeds on a diversity of disciplines, including mathematics, statistics, computer science, information technology and molecular biology. The computational approach to antimicrobial agent discovery and design encompasses genomics, molecular simulation and dynamics, molecular docking, structural and/or functional class prediction, and quantitative structure-activity relationships. This article reviews progress in the development of computational methods, tools and databases used for organizing and extracting biological meaning from antimicrobial research. PMID:20546918

  11. Using the Coriell Personalized Medicine Collaborative Data to conduct a genome-wide association study of sleep duration.

    PubMed

    Scheinfeldt, Laura B; Gharani, Neda; Kasper, Rachel S; Schmidlen, Tara J; Gordon, Erynn S; Jarvis, Joseph P; Delaney, Susan; Kronenthal, Courtney J; Gerry, Norman P; Christman, Michael F

    2015-12-01

    Sleep is critical to health and functionality, and several studies have investigated the inherited component of insomnia and other sleep disorders using genome-wide association studies (GWAS). However, genome-wide studies focused on sleep duration are less common. Here, we used data from participants in the Coriell Personalized Medicine Collaborative (CPMC) (n = 4,401) to examine putative associations between self-reported sleep duration, demographic and lifestyle variables, and genome-wide single nucleotide polymorphism (SNP) data to better understand genetic contributions to variation in sleep duration. We employed stepwise ordered logistic regression to select our model and retained the following predictive variables: age, gender, weight, physical activity, physical activity at work, smoking status, alcohol consumption, ethnicity, and ancestry (as measured by principal components analysis) in our association testing. Several of our strongest candidate genes were previously identified in GWAS related to sleep duration (TSHZ2, ABCC9, FBXO15) and narcolepsy (NFATC2, SALL4). In addition, we have identified novel candidate genes for involvement in sleep duration including SORCS1 and ELOVL2. Our results demonstrate that the self-reported data collected through the CPMC are robust, and our genome-wide association analysis has identified novel candidate genes involved in sleep duration. More generally, this study contributes to a better understanding of the complexity of human sleep. PMID:26333835

  12. Evolving Strategies for the Incorporation of Bioinformatics Within the Undergraduate Cell Biology Curriculum

    PubMed Central

    Honts, Jerry E.

    2003-01-01

    Recent advances in genomics and structural biology have resulted in an unprecedented increase in biological data available from Internet-accessible databases. In order to help students effectively use this vast repository of information, undergraduate biology students at Drake University were introduced to bioinformatics software and databases in three courses, beginning with an introductory course in cell biology. The exercises and projects that were used to help students develop literacy in bioinformatics are described. In a recently offered course in bioinformatics, students developed their own simple sequence analysis tool using the Perl programming language. These experiences are described from the point of view of the instructor as well as the students. A preliminary assessment has been made of the degree to which students had developed a working knowledge of bioinformatics concepts and methods. Finally, some conclusions have been drawn from these courses that may be helpful to instructors wishing to introduce bioinformatics within the undergraduate biology curriculum. PMID:14673489

  13. The 20th anniversary of EMBnet: 20 years of bioinformatics for the Life Sciences community

    PubMed Central

    D'Elia, Domenica; Gisel, Andreas; Eriksson, Nils-Einar; Kossida, Sophia; Mattila, Kimmo; Klucar, Lubos; Bongcam-Rudloff, Erik

    2009-01-01

    The EMBnet Conference 2008, focusing on 'Leading Applications and Technologies in Bioinformatics', was organized by the European Molecular Biology network (EMBnet) to celebrate its 20th anniversary. Since its foundation in 1988, EMBnet has been working to promote collaborative development of bioinformatics services and tools to serve the European community of molecular biology laboratories. This conference was the first meeting organized by the network that was open to the international scientific community outside EMBnet. The conference covered a broad range of research topics in bioinformatics with a main focus on new achievements and trends in emerging technologies supporting genomics, transcriptomics and proteomics analyses such as high-throughput sequencing and data managing, text and data-mining, ontologies and Grid technologies. Papers selected for publication, in this supplement to BMC Bioinformatics, cover a broad range of the topics treated, providing also an overview of the main bioinformatics research fields that the EMBnet community is involved in. PMID:19534734

  14. Refining genome-wide linkage intervals using a meta-analysis of genome-wide association studies identifies loci influencing personality dimensions.

    PubMed

    Amin, Najaf; Hottenga, Jouke-Jan; Hansell, Narelle K; Janssens, A Cecile J W; de Moor, Marleen H M; Madden, Pamela A F; Zorkoltseva, Irina V; Penninx, Brenda W; Terracciano, Antonio; Uda, Manuela; Tanaka, Toshiko; Esko, Tonu; Realo, Anu; Ferrucci, Luigi; Luciano, Michelle; Davies, Gail; Metspalu, Andres; Abecasis, Goncalo R; Deary, Ian J; Raikkonen, Katri; Bierut, Laura J; Costa, Paul T; Saviouk, Viatcheslav; Zhu, Gu; Kirichenko, Anatoly V; Isaacs, Aaron; Aulchenko, Yurii S; Willemsen, Gonneke; Heath, Andrew C; Pergadia, Michele L; Medland, Sarah E; Axenovich, Tatiana I; de Geus, Eco; Montgomery, Grant W; Wright, Margaret J; Oostra, Ben A; Martin, Nicholas G; Boomsma, Dorret I; van Duijn, Cornelia M

    2013-08-01

    Personality traits are complex phenotypes related to psychosomatic health. Individually, various gene finding methods have not achieved much success in finding genetic variants associated with personality traits. We performed a meta-analysis of four genome-wide linkage scans (N=6149 subjects) of five basic personality traits assessed with the NEO Five-Factor Inventory. We compared the significant regions from the meta-analysis of linkage scans with the results of a meta-analysis of genome-wide association studies (GWAS) (N∼17 000). We found significant evidence of linkage of neuroticism to chromosome 3p14 (rs1490265, LOD=4.67) and to chromosome 19q13 (rs628604, LOD=3.55); of extraversion to 14q32 (ATGG002, LOD=3.3); and of agreeableness to 3p25 (rs709160, LOD=3.67) and to two adjacent regions on chromosome 15, including 15q13 (rs970408, LOD=4.07) and 15q14 (rs1055356, LOD=3.52) in the individual scans. In the meta-analysis, we found strong evidence of linkage of extraversion to 4q34, 9q34, 10q24 and 11q22, openness to 2p25, 3q26, 9p21, 11q24, 15q26 and 19q13 and agreeableness to 4q34 and 19p13. Significant evidence of association in the GWAS was detected between openness and rs677035 at 11q24 (P-value=2.6 × 10^-6, KCNJ1). The findings of our linkage meta-analysis and those of the GWAS suggest that 11q24 is a susceptible locus for openness, with KCNJ1 as the possible candidate gene. PMID:23211697

  15. A critical appraisal of the scientific basis of commercial genomic profiles used to assess health risks and personalize health interventions.

    PubMed

    Janssens, A Cecile J W; Gwinn, Marta; Bradley, Linda A; Oostra, Ben A; van Duijn, Cornelia M; Khoury, Muin J

    2008-03-01

    Predictive genomic profiling used to produce personalized nutrition and other lifestyle health recommendations is currently offered directly to consumers. By examining previous meta-analyses and HuGE reviews, we assessed the scientific evidence supporting the purported gene-disease associations for genes included in genomic profiles offered online. We identified seven companies that offer predictive genomic profiling. We searched PubMed for meta-analyses and HuGE reviews of studies of gene-disease associations published from 2000 through June 2007 in which the genotypes of people with a disease were compared with those of a healthy or general-population control group. The seven companies tested at least 69 different polymorphisms in 56 genes. Of the 56 genes tested, 24 (43%) were not reviewed in meta-analyses. For the remaining 32 genes, we found 260 meta-analyses that examined 160 unique polymorphism-disease associations, of which only 60 (38%) were found to be statistically significant. Even the 60 significant associations, which involved 29 different polymorphisms and 28 different diseases, were generally modest, with synthetic odds ratios ranging from 0.54 to 0.88 for protective variants and from 1.04 to 3.2 for risk variants. Furthermore, genes in cardiogenomic profiles were more frequently associated with noncardiovascular diseases than with cardiovascular diseases, and though two of the five genes of the osteogenomic profiles did show significant associations with disease, the associations were not with bone diseases. There is insufficient scientific evidence to conclude that genomic profiles are useful in measuring genetic risk for common diseases or in developing personalized diet and lifestyle recommendations for disease prevention. PMID:18319070
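The percentages quoted in the abstract above can be re-derived from its own counts. The following is a minimal arithmetic check, not part of the original study; the variable and function names are illustrative assumptions.

```python
# Counts as reported in the abstract above.
genes_tested = 56
genes_not_reviewed = 24
associations_examined = 160
associations_significant = 60


def percent(part, whole):
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Genes that were covered by at least one meta-analysis.
genes_reviewed = genes_tested - genes_not_reviewed

print(genes_reviewed)                                            # 32
print(percent(genes_not_reviewed, genes_tested))                 # 42.9
print(percent(associations_significant, associations_examined))  # 37.5
```

Rounded to whole numbers, these match the 43% and 38% reported in the abstract.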

  16. A Critical Appraisal of the Scientific Basis of Commercial Genomic Profiles Used to Assess Health Risks and Personalize Health Interventions

    PubMed Central

    Janssens, A. Cecile J.W.; Gwinn, Marta; Bradley, Linda A.; Oostra, Ben A.; van Duijn, Cornelia M.; Khoury, Muin J.

    2008-01-01

    Predictive genomic profiling used to produce personalized nutrition and other lifestyle health recommendations is currently offered directly to consumers. By examining previous meta-analyses and HuGE reviews, we assessed the scientific evidence supporting the purported gene-disease associations for genes included in genomic profiles offered online. We identified seven companies that offer predictive genomic profiling. We searched PubMed for meta-analyses and HuGE reviews of studies of gene-disease associations published from 2000 through June 2007 in which the genotypes of people with a disease were compared with those of a healthy or general-population control group. The seven companies tested at least 69 different polymorphisms in 56 genes. Of the 56 genes tested, 24 (43%) were not reviewed in meta-analyses. For the remaining 32 genes, we found 260 meta-analyses that examined 160 unique polymorphism-disease associations, of which only 60 (38%) were found to be statistically significant. Even the 60 significant associations, which involved 29 different polymorphisms and 28 different diseases, were generally modest, with synthetic odds ratios ranging from 0.54 to 0.88 for protective variants and from 1.04 to 3.2 for risk variants. Furthermore, genes in cardiogenomic profiles were more frequently associated with noncardiovascular diseases than with cardiovascular diseases, and though two of the five genes of the osteogenomic profiles did show significant associations with disease, the associations were not with bone diseases. There is insufficient scientific evidence to conclude that genomic profiles are useful in measuring genetic risk for common diseases or in developing personalized diet and lifestyle recommendations for disease prevention. PMID:18319070

  17. Extending information retrieval methods to personalized genomic-based studies of disease.

    PubMed

    Ye, Shuyun; Dawson, John A; Kendziorski, Christina

    2014-01-01

    Genomic-based studies of disease now involve diverse types of data collected on large groups of patients. A major challenge facing statistical scientists is how best to combine the data, extract important features, and comprehensively characterize the ways in which they affect an individual's disease course and likelihood of response to treatment. We have developed a survival-supervised latent Dirichlet allocation (survLDA) modeling framework to address these challenges. Latent Dirichlet allocation (LDA) models have proven extremely effective at identifying themes common across large collections of text, but applications to genomics have been limited. Our framework extends LDA to the genome by considering each patient as a "document" with "text" detailing his/her clinical events and genomic state. We then further extend the framework to allow for supervision by a time-to-event response. The model enables the efficient identification of collections of clinical and genomic features that co-occur within patient subgroups, and then characterizes each patient by those features. An application of survLDA to The Cancer Genome Atlas ovarian project identifies informative patient subgroups showing differential response to treatment, and validation in an independent cohort demonstrates the potential for patient-specific inference. PMID:25733795
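The "patient as document" idea in the abstract above can be illustrated with a small sketch. This is not the authors' survLDA implementation and does no topic modeling or survival supervision; it only shows, under assumed token names and an assumed input layout, how clinical events and genomic features might be turned into the word counts that an LDA-style model takes as input.

```python
from collections import Counter


def patient_to_document(clinical_events, genomic_features):
    """Flatten one patient's record into a list of 'word' tokens.

    The 'clin:' and 'gen:' prefixes are a hypothetical convention used
    here to keep the two vocabularies apart.
    """
    tokens = [f"clin:{event}" for event in clinical_events]
    tokens += [f"gen:{feature}" for feature in genomic_features]
    return tokens


def build_term_counts(patients):
    """Return (vocabulary, counts), where counts maps each patient ID to a
    count vector aligned with the sorted vocabulary."""
    docs = {pid: patient_to_document(*record) for pid, record in patients.items()}
    vocab = sorted({token for doc in docs.values() for token in doc})
    counts = {}
    for pid, doc in docs.items():
        tally = Counter(doc)
        counts[pid] = [tally[token] for token in vocab]
    return vocab, counts


# Toy, made-up records: (clinical events, genomic features) per patient.
example_patients = {
    "p1": (["relapse"], ["TP53_mut"]),
    "p2": (["relapse", "relapse"], []),
}
vocab, counts = build_term_counts(example_patients)
print(vocab)
print(counts)
```

The resulting patient-by-term count matrix is the kind of input a standard LDA model expects; the supervision by a time-to-event response described in the abstract would sit on top of this and is not sketched here.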

  18. Extending Information Retrieval Methods to Personalized Genomic-Based Studies of Disease

    PubMed Central

    Ye, Shuyun; Dawson, John A; Kendziorski, Christina

    2014-01-01

    Genomic-based studies of disease now involve diverse types of data collected on large groups of patients. A major challenge facing statistical scientists is how best to combine the data, extract important features, and comprehensively characterize the ways in which they affect an individual’s disease course and likelihood of response to treatment. We have developed a survival-supervised latent Dirichlet allocation (survLDA) modeling framework to address these challenges. Latent Dirichlet allocation (LDA) models have proven extremely effective at identifying themes common across large collections of text, but applications to genomics have been limited. Our framework extends LDA to the genome by considering each patient as a “document” with “text” detailing his/her clinical events and genomic state. We then further extend the framework to allow for supervision by a time-to-event response. The model enables the efficient identification of collections of clinical and genomic features that co-occur within patient subgroups, and then characterizes each patient by those features. An application of survLDA to The Cancer Genome Atlas ovarian project identifies informative patient subgroups showing differential response to treatment, and validation in an independent cohort demonstrates the potential for patient-specific inference. PMID:25733795

  19. Bioinformatics and the Undergraduate Curriculum

    ERIC Educational Resources Information Center

    Maloney, Mark; Parker, Jeffrey; LeBlanc, Mark; Woodard, Craig T.; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of…

  20. Reproducible Bioinformatics Research for Biologists

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This book chapter describes the current Big Data problem in Bioinformatics and the resulting issues with performing reproducible computational research. The core of the chapter provides guidelines and summaries of current tools/techniques that a noncomputational researcher would need to learn to pe...

  1. Visualising "Junk" DNA through Bioinformatics

    ERIC Educational Resources Information Center

    Elwess, Nancy L.; Latourelle, Sandra M.; Cauthorn, Olivia

    2005-01-01

    One of the hottest areas of science today is the field in which biology, information technology, and computer science are merged into a single discipline called bioinformatics. This field enables the discovery and analysis of biological data, including nucleotide and amino acid sequences that are easily accessed through the use of computers. As…

  2. Clinical Bioinformatics: challenges and opportunities

    PubMed Central

    2012-01-01

    Background Network Tools and Applications in Biology (NETTAB) Workshops are a series of meetings focused on the most promising and innovative ICT tools and on their usefulness in Bioinformatics. The NETTAB 2011 workshop, held in Pavia, Italy, in October 2011 was aimed at presenting some of the most relevant methods, tools and infrastructures that are nowadays available for Clinical Bioinformatics (CBI), the research field that deals with clinical applications of bioinformatics. Methods In this editorial, the viewpoints and opinions of three world CBI leaders, who have been invited to participate in a panel discussion of the NETTAB workshop on the next challenges and future opportunities of this field, are reported. These include the development of data warehouses and ICT infrastructures for data sharing, the definition of standards for sharing phenotypic data and the implementation of novel tools to implement efficient search computing solutions. Results Some of the most important design features of a CBI-ICT infrastructure are presented, including data warehousing, modularity and flexibility, open-source development, semantic interoperability, integrated search and retrieval of -omics information. Conclusions Clinical Bioinformatics goals are ambitious. Many factors, including the availability of high-throughput "-omics" technologies and equipment, the widespread availability of clinical data warehouses and the noteworthy increase in data storage and computational power of the most recent ICT systems, justify research and efforts in this domain, which promises to be a crucial leveraging factor for biomedical research. PMID:23095472

  3. Computational intelligence techniques in bioinformatics.

    PubMed

    Hassanien, Aboul Ella; Al-Shammari, Eiman Tamah; Ghali, Neveen I

    2013-12-01

    Computational intelligence (CI) is a well-established paradigm with current systems having many of the characteristics of biological computers and capable of performing a variety of tasks that are difficult to do using conventional techniques. It is a methodology involving adaptive mechanisms and/or an ability to learn that facilitate intelligent behavior in complex and changing environments, such that the system is perceived to possess one or more attributes of reason, such as generalization, discovery, association and abstraction. The objective of this article is to present to the CI and bioinformatics research communities some of the state-of-the-art in CI applications to bioinformatics and motivate research in new trend-setting directions. In this article, we present an overview of the CI techniques in bioinformatics. We will show how CI techniques including neural networks, restricted Boltzmann machine, deep belief network, fuzzy logic, rough sets, evolutionary algorithms (EA), genetic algorithms (GA), swarm intelligence, artificial immune systems and support vector machines, could be successfully employed to tackle various problems such as gene expression clustering and classification, protein sequence classification, gene selection, DNA fragment assembly, multiple sequence alignment, and protein function prediction and its structure. We discuss some representative methods to provide inspiring examples to illustrate how CI can be utilized to address these problems and how bioinformatics data can be characterized by CI. Challenges to be addressed and future directions of research are also presented and an extensive bibliography is included. PMID:23891719

  4. H3ABioNet, a sustainable pan-African bioinformatics network for human heredity and health in Africa

    PubMed Central

    Mulder, Nicola J.; Adebiyi, Ezekiel; Alami, Raouf; Benkahla, Alia; Brandful, James; Doumbia, Seydou; Everett, Dean; Fadlelmola, Faisal M.; Gaboun, Fatima; Gaseitsiwe, Simani; Ghazal, Hassan; Hazelhurst, Scott; Hide, Winston; Ibrahimi, Azeddine; Jaufeerally Fakim, Yasmina; Jongeneel, C. Victor; Joubert, Fourie; Kassim, Samar; Kayondo, Jonathan; Kumuthini, Judit; Lyantagaye, Sylvester; Makani, Julie; Mansour Alzohairy, Ahmed; Masiga, Daniel; Moussa, Ahmed; Nash, Oyekanmi; Ouwe Missi Oukem-Boyer, Odile; Owusu-Dabo, Ellis; Panji, Sumir; Patterton, Hugh; Radouani, Fouzia; Sadki, Khalid; Seghrouchni, Fouad; Tastan Bishop, Özlem; Tiffin, Nicki; Ulenga, Nzovu

    2016-01-01

    The application of genomics technologies to medicine and biomedical research is increasing in popularity, made possible by new high-throughput genotyping and sequencing technologies and improved data analysis capabilities. Some of the greatest genetic diversity among humans, animals, plants, and microbiota occurs in Africa, yet genomic research outputs from the continent are limited. The Human Heredity and Health in Africa (H3Africa) initiative was established to drive the development of genomic research for human health in Africa, and through recognition of the critical role of bioinformatics in this process, spurred the establishment of H3ABioNet, a pan-African bioinformatics network for H3Africa. The limitations in bioinformatics capacity on the continent have been a major contributory factor to the lack of notable outputs in high-throughput biology research. Although pockets of high-quality bioinformatics teams have existed previously, the majority of research institutions lack experienced faculty who can train and supervise bioinformatics students. H3ABioNet aims to address this dire need, specifically in the area of human genetics and genomics, but knock-on effects are ensuring this extends to other areas of bioinformatics. Here, we describe the emergence of genomics research and the development of bioinformatics in Africa through H3ABioNet. PMID:26627985

  5. H3ABioNet, a sustainable pan-African bioinformatics network for human heredity and health in Africa.

    PubMed

    Mulder, Nicola J; Adebiyi, Ezekiel; Alami, Raouf; Benkahla, Alia; Brandful, James; Doumbia, Seydou; Everett, Dean; Fadlelmola, Faisal M; Gaboun, Fatima; Gaseitsiwe, Simani; Ghazal, Hassan; Hazelhurst, Scott; Hide, Winston; Ibrahimi, Azeddine; Jaufeerally Fakim, Yasmina; Jongeneel, C Victor; Joubert, Fourie; Kassim, Samar; Kayondo, Jonathan; Kumuthini, Judit; Lyantagaye, Sylvester; Makani, Julie; Mansour Alzohairy, Ahmed; Masiga, Daniel; Moussa, Ahmed; Nash, Oyekanmi; Ouwe Missi Oukem-Boyer, Odile; Owusu-Dabo, Ellis; Panji, Sumir; Patterton, Hugh; Radouani, Fouzia; Sadki, Khalid; Seghrouchni, Fouad; Tastan Bishop, Özlem; Tiffin, Nicki; Ulenga, Nzovu

    2016-02-01

    The application of genomics technologies to medicine and biomedical research is increasing in popularity, made possible by new high-throughput genotyping and sequencing technologies and improved data analysis capabilities. Some of the greatest genetic diversity among humans, animals, plants, and microbiota occurs in Africa, yet genomic research outputs from the continent are limited. The Human Heredity and Health in Africa (H3Africa) initiative was established to drive the development of genomic research for human health in Africa, and through recognition of the critical role of bioinformatics in this process, spurred the establishment of H3ABioNet, a pan-African bioinformatics network for H3Africa. The limitations in bioinformatics capacity on the continent have been a major contributory factor to the lack of notable outputs in high-throughput biology research. Although pockets of high-quality bioinformatics teams have existed previously, the majority of research institutions lack experienced faculty who can train and supervise bioinformatics students. H3ABioNet aims to address this dire need, specifically in the area of human genetics and genomics, but knock-on effects are ensuring this extends to other areas of bioinformatics. Here, we describe the emergence of genomics research and the development of bioinformatics in Africa through H3ABioNet. PMID:26627985

  6. Bioinformatics and the allergy assessment of agricultural biotechnology products: industry practices and recommendations.

    PubMed

    Ladics, Gregory S; Cressman, Robert F; Herouet-Guicheney, Corinne; Herman, Rod A; Privalle, Laura; Song, Ping; Ward, Jason M; McClain, Scott

    2011-06-01

    Bioinformatic tools are being increasingly utilized to evaluate the degree of similarity between a novel protein and known allergens within the context of a larger allergy safety assessment process. Importantly, bioinformatics is not a predictive analysis that can determine if a novel protein will "become" an allergen, but rather a tool to assess whether the protein is a known allergen or is potentially cross-reactive with an existing allergen. Bioinformatic tools are key components of the 2009 Codex Alimentarius Commission's weight-of-evidence approach, which encompasses a variety of experimental approaches for an overall assessment of the allergenic potential of a novel protein. Bioinformatic search comparisons between novel protein sequences, as well as potential novel fusion sequences derived from the genome and transgene, and known allergens are required by all regulatory agencies that assess the safety of genetically modified (GM) products. The objective of this paper is to identify opportunities for consensus in the methods of applying bioinformatics and to outline differences that impact a consistent and reliable allergy safety assessment. The bioinformatic comparison process has some critical features, which are outlined in this paper. One of them is a curated, publicly available and well-managed database with known allergenic sequences. In this paper, the best practices, scientific value, and food safety implications of bioinformatic analyses, as they are applied to GM food crops are discussed. Recommendations for conducting bioinformatic analysis on novel food proteins for potential cross-reactivity to known allergens are also put forth. PMID:21320564

  7. Genomic and molecular aberrations in malignant peripheral nerve sheath tumor and their roles in personalized target therapy.

    PubMed

    Yang, Jilong; Du, Xiaoling

    2013-09-01

    Malignant peripheral nerve sheath tumors (MPNSTs) are malignant tumors with a high rate of local recurrence and a significant tendency to metastasize. Their dismal outcome points to the urgent need to establish better therapeutic strategies for patients harboring MPNSTs. Investigations of genomic and molecular aberrations in MPNSTs, which detect many chromosomal aberrations, pathway abnormalities, and specific molecular aberrant events, could supply multiple potential therapy targets and contribute to the achievement of personalized medicine. The genes involved in the significant gain aberrations include BIRC5, CCNE2, DAB2, DDX15, EGFR, MSH2, CDK6, HGF, ITGB4, KCNK12, LAMA3, LOXL2, MET, and PDGFRA. The genes involved in the significant deletion aberrations include CDH1, GLTSCR2, EGR1, CTSB, GATA3, SULT2A1, HMMR/RHAMM, LICAM2, MMP13, p16/INK4a, RASSF2, NM-23H1, and TP53. These genetic aberrations are involved in several important signaling pathways, such as the TFF, EGFR, ARF, and IGF1R signaling pathways. The genomic and molecular aberrations of EGFR, IGF1R, SOX9, EYA4, TOP2A, ETV4, and BIRC5 exhibit great promise as personalized therapeutic targets for MPNST patients. PMID:23830351

  8. Molecular pathology of prostate cancer revealed by next-generation sequencing: opportunities for genome-based personalized therapy

    PubMed Central

    Huang, Jiaoti; Wang, Jason K.; Sun, Yin

    2014-01-01

    Purpose of review This article reviews recently identified genomic mutations in prostate cancer. Recent findings Advanced sequencing technologies have made it possible to obtain large amounts of data on genomes and transcriptomes of cancers. Such technologies have been used to sequence prostate cancer of different stages, from treatment-naive cancers, to advanced, castration-resistant cancers to the aggressive small cell neuroendocrine carcinomas. For each category of prostate cancer, distinct and overlapping DNA sequence alterations were discovered, including point mutations, small insertions or deletions, copy number changes and chromosomal rearrangements. There appears to be a stepwise increase in genomic alterations from low risk to high risk to advanced cancers. Summary These novel findings have significantly increased our knowledge of the genetic basis of human prostate cancer and the molecular mechanisms responsible for disease progression and treatment resistance. Some of the lesions are potential therapeutic targets. Studies along this direction will eventually make it possible to design personalized management plans for individual patients. PMID:23385974

  9. Prioritization of anticancer drugs against a cancer using genomic features of cancer cells: A step towards personalized medicine

    PubMed Central

    Gupta, Sudheer; Chaudhary, Kumardeep; Kumar, Rahul; Gautam, Ankur; Nanda, Jagpreet Singh; Dhanda, Sandeep Kumar; Brahmachari, Samir Kumar; Raghava, Gajendra P. S.

    2016-01-01

    In this study, we investigated the drug profiles of 24 anticancer drugs tested against a large number of cell lines in order to understand the relation between drug resistance and altered genomic features of a cancer cell line. We detected frequent mutations, high expression and high copy number variations of certain genes in both drug-resistant cell lines and sensitive cell lines. It was observed that a few drugs, like Panobinostat, are effective against almost all types of cell lines, whereas certain drugs are effective against only a limited type of cell lines. Tissue-specific preference of drugs was also seen, where a drug is more effective against cell lines belonging to a specific tissue. Genomic feature-based models were developed for each anticancer drug and achieved an average correlation between predicted and actual growth inhibition of cell lines in the range of 0.43 to 0.78. We hope our study will shed light on the field of personalized medicine, particularly in designing patient-specific anticancer drugs. In order to serve the scientific community, a webserver, CancerDP, has been developed for predicting priority/potency of an anticancer drug against a cancer cell line using its genomic features (http://crdd.osdd.net/raghava/cancerdp/). PMID:27030518

  10. Omics-bioinformatics in the context of clinical data.

    PubMed

    Mayer, Gert; Heinze, Georg; Mischak, Harald; Hellemons, Merel E; Heerspink, Hiddo J Lambers; Bakker, Stephan J L; de Zeeuw, Dick; Haiduk, Martin; Rossing, Peter; Oberbauer, Rainer

    2011-01-01

    The Omics revolution has provided the researcher with tools and methodologies for qualitative and quantitative assessment of a wide spectrum of molecular players spanning from the genome to the metabolome level. As a consequence, explorative analysis (in contrast to purely hypothesis driven research procedures) has become applicable. However, numerous issues have to be considered for deriving meaningful results from Omics, and bioinformatics has to respect these in data analysis and interpretation. Aspects include sample type and quality, concise definition of the (clinical) question, and selection of samples ideally coming from thoroughly defined sample and data repositories. Omics suffers from a principal shortcoming, namely an unbalanced sample-to-feature matrix denoted as the "curse of dimensionality", where a feature refers to a specific gene or protein among the many thousands assayed in parallel in an Omics experiment. This setting makes the identification of relevant features with respect to a phenotype under analysis error prone from a statistical perspective. For this reason, sample size calculation for screening studies and for verification of results from Omics bioinformatics is essential. Here we present key elements to be considered for embedding Omics bioinformatics in a quality controlled workflow for Omics screening, feature identification, and validation. Relevant items include sample and clinical data management, minimum sample quality requirements, sample size estimates, and statistical procedures for computing the significance of findings from Omics bioinformatics in validation studies. PMID:21370098
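To make the statistical concern in the abstract above concrete, the sketch below shows why testing many thousands of features in parallel is error prone. It uses the Bonferroni correction as a familiar stand-in; the abstract does not name a specific procedure. The figure of 20,000 features and the 0.05 significance level are illustrative assumptions.

```python
def expected_false_positives(n_features, alpha=0.05):
    """Expected number of features passing an uncorrected test at level
    alpha when none of them is truly associated with the phenotype."""
    return n_features * alpha


def bonferroni_threshold(n_features, alpha=0.05):
    """Per-feature p-value threshold that keeps the family-wise error
    rate at or below alpha (Bonferroni correction)."""
    return alpha / n_features


# Hypothetical Omics screen assaying 20,000 features in parallel.
n_features = 20_000
print(expected_false_positives(n_features))
print(bonferroni_threshold(n_features))
```

With an uncorrected threshold, about a thousand features would be expected to look significant by chance alone in this hypothetical screen, which is one reason sample size estimates and explicit significance procedures matter in validation studies.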

  11. Genomic profiling of murine mammary tumors identifies potential personalized drug targets for p53-deficient mammary cancers

    PubMed Central

    Agrawal, Yash N.; Koboldt, Daniel C.; Kanchi, Krishna L.; Herschkowitz, Jason I.; Mardis, Elaine R.; Rosen, Jeffrey M.; Perou, Charles M.

    2016-01-01

    ABSTRACT Targeted therapies against basal-like breast tumors, which are typically ‘triple-negative breast cancers (TNBCs)’, remain an important unmet clinical need. Somatic TP53 mutations are the most common genetic event in basal-like breast tumors and TNBC. To identify additional drivers and possible drug targets of this subtype, a comparative study between human and murine tumors was performed by utilizing a murine Trp53-null mammary transplant tumor model. We show that two subsets of murine Trp53-null mammary transplant tumors resemble aspects of the human basal-like subtype. DNA-microarray, whole-genome and exome-based sequencing approaches were used to interrogate the secondary genetic aberrations of these tumors, which were then compared to human basal-like tumors to identify conserved somatic genetic features. DNA copy-number variation produced the largest number of conserved candidate personalized drug targets. These candidates were filtered using a DNA-RNA Pearson correlation cut-off and a requirement that the gene was deemed essential in at least 5% of human breast cancer cell lines from an RNA-mediated interference screen database. Five potential personalized drug target genes, which were spontaneously amplified loci in both murine and human basal-like tumors, were identified: Cul4a, Lamp1, Met, Pnpla6 and Tubgcp3. As a proof of concept, inhibition of Met using crizotinib caused Met-amplified murine tumors to initially undergo complete regression. This study identifies Met as a promising drug target in a subset of murine Trp53-null tumors, thus identifying a potential shared driver with a subset of human basal-like breast cancers. Our results also highlight the importance of comparative genomic studies for discovering personalized drug targets and for providing a preclinical model for further investigations of key tumor signaling pathways. PMID:27149990
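The two-step candidate filter described in the abstract above (a DNA-RNA Pearson correlation cut-off, plus essentiality in at least 5% of cell lines) can be sketched as follows. This is a hedged illustration, not the authors' pipeline: the abstract does not state the correlation cut-off, so `min_r` is left to the caller, and the input layout, field names, and gene names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.
    Returns 0.0 if either sequence has zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def filter_candidates(genes, min_r, min_essential_fraction=0.05):
    """Keep genes whose copy number correlates with expression (r >= min_r)
    and that are essential in at least the given fraction of cell lines."""
    kept = []
    for name, data in genes.items():
        r = pearson(data["copy_number"], data["expression"])
        if r >= min_r and data["essential_fraction"] >= min_essential_fraction:
            kept.append(name)
    return kept


# Toy, made-up data for three hypothetical genes across four tumors.
toy_genes = {
    "GENE_A": {"copy_number": [1, 2, 3, 4], "expression": [2, 4, 6, 8],
               "essential_fraction": 0.10},
    "GENE_B": {"copy_number": [1, 2, 3, 4], "expression": [4, 1, 3, 2],
               "essential_fraction": 0.10},
    "GENE_C": {"copy_number": [1, 2, 3, 4], "expression": [2, 4, 6, 8],
               "essential_fraction": 0.01},
}
print(filter_candidates(toy_genes, min_r=0.5))
```

In this toy input only GENE_A passes: GENE_B fails the correlation cut-off and GENE_C falls below the 5% essentiality requirement.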

  12. A Bioinformatics Facility for NASA

    NASA Technical Reports Server (NTRS)

    Schweighofer, Karl; Pohorille, Andrew

    2006-01-01

    Building on an existing prototype, we have fielded a facility with bioinformatics technologies that will help NASA meet its unique requirements for biological research. This facility consists of a cluster of computers capable of performing computationally intensive tasks, software tools, databases and knowledge management systems. Novel computational technologies for analyzing and integrating new biological data and already existing knowledge have been developed. With continued development and support, the facility will fulfill NASA's strategic bioinformatics needs in astrobiology and space exploration. As a demonstration of these capabilities, we will present a detailed analysis of how spaceflight factors impact gene expression in the liver and kidney for mice flown aboard shuttle flight STS-108. We have found that many genes involved in signal transduction, cell cycle, and development respond to changes in microgravity, but that most metabolic pathways appear unchanged.

  13. An association-adjusted consensus deleterious scheme to classify homozygous Mis-sense mutations for personal genome interpretation

    PubMed Central

    2013-01-01

    Background Personal genome analysis is now being considered for evaluation of disease risk in healthy individuals, utilizing both rare and common variants. Multiple scores have been developed to predict the deleteriousness of amino acid substitutions, using information on the allele frequencies, level of evolutionary conservation, and averaged structural evidence. However, agreement among these scores is limited and they likely over-estimate the fraction of the genome that is deleterious. Method This study proposes an integrative approach to identify a subset of homozygous non-synonymous single nucleotide polymorphisms (nsSNPs). An 8-level classification scheme is constructed from the presence/absence of deleterious predictions combined with evidence of association with disease or complex traits. Detailed literature searches and structural validations are then performed for a subset of 826 homozygous mis-sense mutations in 575 proteins found in the genomes of 12 healthy adults. Results Implementation of the Association-Adjusted Consensus Deleterious Scheme (AACDS) classifies 11% of all predicted highly deleterious homozygous variants as most likely to influence disease risk. The number of such variants per genome ranges from 0 to 8 with no significant difference between African and Caucasian Americans. Detailed analysis of mutations affecting the APOE, MTMR2, THSB1, CHIA, αMyHC, and AMY2A proteins shows how the protein structure is likely to be disrupted, even though the associated phenotypes have not been documented in the corresponding individuals. Conclusions The classification system for homozygous nsSNPs provides an opportunity to systematically rank nsSNPs based on suggestive evidence from annotations and sequence-based predictions. The ranking scheme, in-depth literature searches, and structural validations of highly prioritized mis-sense mutations complement traditional sequence-based approaches and should have particular utility for the development of

  14. CAN I ACCESS MY PERSONAL GENOME? THE CURRENT LEGAL POSITION IN THE UK

    PubMed Central

    Kaye, Jane; Kanellopoulou, Nadja; Hawkins, Naomi; Gowans, Heather; Curren, Liam; Melham, Karen

    2014-01-01

    This paper discusses the nature of genomic information, and the moral arguments in support of an individual's right to access it. It analyses the legal avenues an individual might take to access their sequence information. The authors describe the policy implications in this area and conclude that, for now, the law appears to strike an appropriate balance, but new policy will need to be developed to address this issue. PMID:24136352

  15. A survey on evolutionary algorithm based hybrid intelligence in bioinformatics.

    PubMed

    Li, Shan; Kang, Liying; Zhao, Xing-Ming

    2014-01-01

    With the rapid advance in genomics, proteomics, metabolomics, and other types of omics technologies during the past decades, a tremendous amount of data related to molecular biology has been produced. It is becoming a big challenge for the bioinformatists to analyze and interpret these data with conventional intelligent techniques, for example, support vector machines. Recently, the hybrid intelligent methods, which integrate several standard intelligent approaches, are becoming more and more popular due to their robustness and efficiency. Specifically, the hybrid intelligent approaches based on evolutionary algorithms (EAs) are widely used in various fields due to the efficiency and robustness of EAs. In this review, we give an introduction about the applications of hybrid intelligent methods, in particular those based on evolutionary algorithm, in bioinformatics. In particular, we focus on their applications to three common problems that arise in bioinformatics, that is, feature selection, parameter estimation, and reconstruction of biological networks. PMID:24729969

  16. A Survey on Evolutionary Algorithm Based Hybrid Intelligence in Bioinformatics

    PubMed Central

    Li, Shan; Zhao, Xing-Ming

    2014-01-01

    With the rapid advance in genomics, proteomics, metabolomics, and other types of omics technologies during the past decades, a tremendous amount of data related to molecular biology has been produced. It is becoming a big challenge for bioinformaticians to analyze and interpret these data with conventional intelligent techniques, for example, support vector machines. Recently, hybrid intelligent methods, which integrate several standard intelligent approaches, have become increasingly popular due to their robustness and efficiency. Specifically, the hybrid intelligent approaches based on evolutionary algorithms (EAs) are widely used in various fields due to the efficiency and robustness of EAs. In this review, we introduce the applications of hybrid intelligent methods in bioinformatics, in particular those based on evolutionary algorithms. We focus on their applications to three common problems that arise in bioinformatics, that is, feature selection, parameter estimation, and reconstruction of biological networks. PMID:24729969

  17. Balancing Benefits and Risks of Immortal Data: Participants' Views of Open Consent in the Personal Genome Project.

    PubMed

    Zarate, Oscar A; Brody, Julia Green; Brown, Phil; Ramirez-Andreotta, Mónica D; Perovich, Laura; Matz, Jacob

    2016-01-01

    An individual's health, genetic, or environmental-exposure data, placed in an online repository, creates a valuable shared resource that can accelerate biomedical research and even open opportunities for crowd-sourcing discoveries by members of the public. But these data become "immortalized" in ways that may create lasting risk as well as benefit. Once shared on the Internet, the data are difficult or impossible to redact, and identities may be revealed by a process called data linkage, in which online data sets are matched to each other. Reidentification (re-ID), the process of associating an individual's name with data that were considered deidentified, poses risks such as insurance or employment discrimination, social stigma, and breach of the promises often made in informed-consent documents. At the same time, re-ID poses risks to researchers and indeed to the future of science, should re-ID end up undermining the trust and participation of potential research participants. The ethical challenges of online data sharing are heightened as so-called big data becomes an increasingly important research tool and driver of new research structures. Big data is shifting research to include large numbers of researchers and institutions as well as large numbers of participants providing diverse types of data, so the participants' consent relationship is no longer with a person or even a research institution. In addition, consent is further transformed because big data analysis often begins with descriptive inquiry and generation of a hypothesis, and the research questions cannot be clearly defined at the outset and may be unforeseeable over the long term. In this article, we consider how expanded data sharing poses new challenges, illustrated by genomics and the transition to new models of consent. 
    We draw on the experiences of participants in an open data platform, the Personal Genome Project, to allow study participants to contribute their voices to inform ethical consent

  18. Type 2 diabetes, genomics, and nursing: necessary next steps to advance the science into improved, personalized care.

    PubMed

    Underwood, Patricia C

    2011-01-01

    Type 2 diabetes mellitus (T2DM) is an inherited, chronic disorder with long-term complications, including cardiovascular disease, the leading cause of mortality in the United States. The prevalence of T2DM and its complications is on the rise in the United States, highlighting the need for improved individualized prevention and treatment strategies. Exciting advancements in the field of genomics have led to the recent discovery of numerous genetic markers for T2DM, completing a promising first step toward improved, individualized prevention and treatment strategies for T2DM. These genomic markers, identified using genome-wide association studies (GWAS), candidate gene, and rare variant methodology, identify new physiologic pathways underlying the development of T2DM. Much more work is needed to successfully translate the identification of genetic markers for T2DM into improved, individualized prevention and treatment strategies. As front line providers and leaders of prevention and treatment strategies for chronic disease, nurses, nurse practitioners, and nurse scientists must contribute to this translational effort. Thus, it is important for nurses at all levels to (a) be aware of the current science of genetics and T2DM and (b) participate in the translation of this genetic information into improved, personalized patient care. The aim of this review is to (a) provide an overview of the current state of the science of genetic markers and T2DM and (b) highlight essential next steps to successfully translate the identification of genetic markers for T2DM into improved prevention and treatment strategies, focusing particularly on the role of nursing in this process. PMID:22891509

  19. Genomic Analysis as the First Step toward Personalized Treatment in Renal Cell Carcinoma

    PubMed Central

    Bielecka, Zofia Felicja; Czarnecka, Anna Małgorzata; Szczylik, Cezary

    2014-01-01

    Drug resistance mechanisms in renal cell carcinoma (RCC) still remain elusive. Although most patients initially respond to targeted therapy, acquired resistance can still develop eventually. Most of the patients suffer from intrinsic (genetic) resistance as well, suggesting that there is substantial need to broaden our knowledge in the field of RCC genetics. As molecular abnormalities occur for various reasons, ranging from single nucleotide polymorphisms to large chromosomal defects, conducting whole-genome association studies using high-throughput techniques seems inevitable. In principle, data obtained via genome-wide research should be continued and performed on a large scale for the purposes of drug development and identification of biological pathways underlying cancerogenesis. Genetic alterations are mostly unique for each histological RCC subtype. According to recently published data, RCC is a highly heterogeneous tumor. In this paper, the authors discuss the following: (1) current state-of-the-art knowledge on the potential biomarkers of RCC subtypes; (2) significant obstacles encountered in the translational research on RCC; and (3) recent molecular findings that may have a crucial impact on future therapeutic approaches. PMID:25120953

  20. Bioinformatics in Africa: The Rise of Ghana?

    PubMed

    Karikari, Thomas K

    2015-09-01

    Until recently, bioinformatics, an important discipline in the biological sciences, was largely limited to countries with advanced scientific resources. Nonetheless, several developing countries have lately been making progress in bioinformatics training and applications. In Africa, leading countries in the discipline include South Africa, Nigeria, and Kenya. However, one country that is less known when it comes to bioinformatics is Ghana. Here, I provide a first description of the development of bioinformatics activities in Ghana and how these activities contribute to the overall development of the discipline in Africa. Over the past decade, scientists in Ghana have been involved in publications incorporating bioinformatics analyses, aimed at addressing research questions in biomedical science and agriculture. Scarce research funding and inadequate training opportunities are some of the challenges that need to be addressed for Ghanaian scientists to continue developing their expertise in bioinformatics. PMID:26378921

  1. Bioinformatics in Africa: The Rise of Ghana?

    PubMed Central

    Karikari, Thomas K.

    2015-01-01

    Until recently, bioinformatics, an important discipline in the biological sciences, was largely limited to countries with advanced scientific resources. Nonetheless, several developing countries have lately been making progress in bioinformatics training and applications. In Africa, leading countries in the discipline include South Africa, Nigeria, and Kenya. However, one country that is less known when it comes to bioinformatics is Ghana. Here, I provide a first description of the development of bioinformatics activities in Ghana and how these activities contribute to the overall development of the discipline in Africa. Over the past decade, scientists in Ghana have been involved in publications incorporating bioinformatics analyses, aimed at addressing research questions in biomedical science and agriculture. Scarce research funding and inadequate training opportunities are some of the challenges that need to be addressed for Ghanaian scientists to continue developing their expertise in bioinformatics. PMID:26378921

  2. UCSC genome browser tutorial.

    PubMed

    Zweig, Ann S; Karolchik, Donna; Kuhn, Robert M; Haussler, David; Kent, W James

    2008-08-01

    The University of California Santa Cruz (UCSC) Genome Bioinformatics website consists of a suite of free, open-source, on-line tools that can be used to browse, analyze, and query genomic data. These tools are available to anyone who has an Internet browser and an interest in genomics. The website provides a quick and easy-to-use visual display of genomic data. It places annotation tracks beneath genome coordinate positions, allowing rapid visual correlation of different types of information. Many of the annotation tracks are submitted by scientists worldwide; the others are computed by the UCSC Genome Bioinformatics group from publicly available sequence data. It also allows users to upload and display their own experimental results or annotation sets by creating a custom track. The suite of tools, downloadable data files, and links to documentation and other information can be found at http://genome.ucsc.edu/. PMID:18514479

  3. Bioinformatics Tools for the Discovery of New Nonribosomal Peptides.

    PubMed

    Leclère, Valérie; Weber, Tilmann; Jacques, Philippe; Pupin, Maude

    2016-01-01

    This chapter helps in the use of bioinformatics tools relevant to the discovery of new nonribosomal peptides (NRPs) produced by microorganisms. The strategy described can be applied to draft or fully assembled genome sequences. It relies on the identification of the synthetase genes and the deciphering of the domain architecture of the nonribosomal peptide synthetases (NRPSs). In the next step, candidate peptides synthesized by these NRPSs are predicted in silico, considering the specificity of incorporated monomers together with their isomery. To assess their novelty, the two-dimensional structure of the peptides can be compared with the structural patterns of all known NRPs. The presented workflow leads to an efficient and rapid screening of genomic data generated by high throughput technologies. The exploration of such sequenced genomes may lead to the discovery of new drugs (i.e., antibiotics against multi-resistant pathogens or anti-tumors). PMID:26831711

  4. Genomic Testing: a Genetic Counselor's Personal Reflection on Three Years of Consenting and Testing.

    PubMed

    Wynn, Julia

    2016-08-01

    Whole exome sequencing (WES) is increasingly used in research and clinical genetics as the cost of sequencing decreases and the interpretation improves. Genetic counselors need to be prepared to counsel a diverse patient population for this complex test. This commentary is a reflection of one genetic counselor's experiences in counseling, consenting, and returning results for clinical and research WES for over 120 participants and patients. She reflects on how she overcame the initial challenges and concerns of counseling for WES and how her counseling evolved from a teaching-based counseling model to an interactive patient-centered counseling model. Her insights are offered to prepare other genetic counselors for the growing use of genomic testing. PMID:26242468

  5. Testing personalized medicine: patient and physician expectations of next-generation genomic sequencing in late-stage cancer care

    PubMed Central

    Miller, Fiona A; Hayeems, Robin Z; Bytautas, Jessica P; Bedard, Philippe L; Ernst, Scott; Hirte, Hal; Hotte, Sebastien; Oza, Amit; Razak, Albiruni; Welch, Stephen; Winquist, Eric; Dancey, Janet; Siu, Lillian L

    2014-01-01

    Developments in genomics, including next-generation sequencing technologies, are expected to enable a more personalized approach to clinical care, with improved risk stratification and treatment selection. In oncology, personalized medicine is particularly advanced and increasingly used to identify oncogenic variants in tumor tissue that predict responsiveness to specific drugs. Yet, the translational research needed to validate these technologies will be conducted in patients with late-stage cancer and is expected to produce results of variable clinical significance and incidentally identify genetic risks. To explore the experiential context in which much of personalized cancer care will be developed and evaluated, we conducted a qualitative interview study alongside a pilot feasibility study of targeted DNA sequencing of metastatic tumor biopsies in adult patients with advanced solid malignancies. We recruited 29/73 patients and 14/17 physicians; transcripts from semi-structured interviews were analyzed for thematic patterns using an interpretive descriptive approach. Patient hopes of benefit from research participation were enhanced by the promise of novel and targeted treatment but challenged by non-findings or by limited access to relevant trials. Family obligations informed a willingness to receive genetic information, which was perceived as burdensome given disease stage or as inconsequential given faced challenges. Physicians were optimistic about long-term potential but conservative about immediate benefits and mindful of elevated patient expectations; consent and counseling processes were expected to mitigate challenges from incidental findings. These findings suggest the need for information and decision tools to support physicians in communicating realistic prospects of benefit, and for cautious approaches to the generation of incidental genetic information. PMID:23860039

  6. Testing personalized medicine: patient and physician expectations of next-generation genomic sequencing in late-stage cancer care.

    PubMed

    Miller, Fiona A; Hayeems, Robin Z; Bytautas, Jessica P; Bedard, Philippe L; Ernst, Scott; Hirte, Hal; Hotte, Sebastien; Oza, Amit; Razak, Albiruni; Welch, Stephen; Winquist, Eric; Dancey, Janet; Siu, Lillian L

    2014-03-01

    Developments in genomics, including next-generation sequencing technologies, are expected to enable a more personalized approach to clinical care, with improved risk stratification and treatment selection. In oncology, personalized medicine is particularly advanced and increasingly used to identify oncogenic variants in tumor tissue that predict responsiveness to specific drugs. Yet, the translational research needed to validate these technologies will be conducted in patients with late-stage cancer and is expected to produce results of variable clinical significance and incidentally identify genetic risks. To explore the experiential context in which much of personalized cancer care will be developed and evaluated, we conducted a qualitative interview study alongside a pilot feasibility study of targeted DNA sequencing of metastatic tumor biopsies in adult patients with advanced solid malignancies. We recruited 29/73 patients and 14/17 physicians; transcripts from semi-structured interviews were analyzed for thematic patterns using an interpretive descriptive approach. Patient hopes of benefit from research participation were enhanced by the promise of novel and targeted treatment but challenged by non-findings or by limited access to relevant trials. Family obligations informed a willingness to receive genetic information, which was perceived as burdensome given disease stage or as inconsequential given faced challenges. Physicians were optimistic about long-term potential but conservative about immediate benefits and mindful of elevated patient expectations; consent and counseling processes were expected to mitigate challenges from incidental findings. These findings suggest the need for information and decision tools to support physicians in communicating realistic prospects of benefit, and for cautious approaches to the generation of incidental genetic information. PMID:23860039

  7. Bioinformatics and Microarray Data Analysis on the Cloud.

    PubMed

    Calabrese, Barbara; Cannataro, Mario

    2016-01-01

    High-throughput platforms such as microarray, mass spectrometry, and next-generation sequencing are producing an increasing volume of omics data that needs large data storage and computing power. Cloud computing offers massive scalable computing and storage, data sharing, and on-demand anytime and anywhere access to resources and applications, and thus it may represent the key technology for facing those issues. In fact, in recent years it has been adopted for the deployment of different bioinformatics solutions and services both in academia and in industry. Despite this, cloud computing presents several issues regarding the security and privacy of data, which are particularly important when analyzing patient data, such as in personalized medicine. This chapter reviews the main academic and industrial cloud-based bioinformatics solutions, with a special focus on microarray data analysis solutions, and underlines the main issues and problems related to the use of such platforms for the storage and analysis of patient data. PMID:25863787

  8. Bioinformatics in the secondary science classroom: A study of state content standards and students' perceptions of, and performance in, bioinformatics lessons

    NASA Astrophysics Data System (ADS)

    Wefer, Stephen H.

    The proliferation of bioinformatics in modern Biology marks a new revolution in science, which promises to influence science education at all levels. This thesis examined state standards for content that articulated bioinformatics, and explored secondary students' affective and cognitive perceptions of, and performance in, a bioinformatics mini-unit. The results are presented as three studies. The first study analyzed secondary science standards of 49 U.S. states (Iowa has no science framework) and the District of Columbia for content related to bioinformatics at the introductory high school biology level. The bioinformatics content of each state's Biology standards was categorized into nine areas and the prevalence of each area documented. The nine areas were: The Human Genome Project, Forensics, Evolution, Classification, Nucleotide Variations, Medicine, Computer Use, Agriculture/Food Technology, and Science Technology and Society/Socioscientific Issues (STS/SSI). Findings indicated a generally low representation of bioinformatics related content, which varied substantially across the different areas. Recommendations are made for reworking existing standards to incorporate bioinformatics and to facilitate the goal of promoting science literacy in this emerging new field among secondary school students. The second study examined thirty-two students' affective responses to, and content mastery of, a two-week bioinformatics mini-unit. The findings indicate that the students generally were positive relative to their interest level, the usefulness of the lessons, the difficulty level of the lessons, likeliness to engage in additional bioinformatics, and were overall successful on the assessments. A discussion of the results and significance is followed by suggestions for future research and implementation for transferability. The third study presents a case study of individual differences among ten secondary school students, whose cognitive and affective percepts were

  9. Which craft is best in bioinformatics?

    PubMed

    Attwood, T K; Miller, C J

    2001-07-01

    'Silicon-based' biology has gathered momentum as the world-wide sequencing projects have made possible the investigation and comparative analysis of complete genomes. Central to the quest to elucidate and characterise the genes and gene products encoded within genomes are pivotal concepts concerning the processes of evolution, the mechanisms of protein folding, and, crucially, the manifestation of protein function. Our use of computers to model such concepts is limited by, and must be placed in the context of, the current limits of our understanding of these biological processes. It is important to recognise that we do not have a common understanding of what constitutes a gene; we cannot invariably say that a particular sequence or fold has arisen via divergence or convergence; we do not fully understand the rules of protein folding, so we cannot predict protein structure; and we cannot invariably diagnose protein function, given knowledge only of its sequence or structure in isolation. Accepting what we cannot do with computers plays an essential role in forming an appreciation of what we can do. Without this understanding, it is easy to be misled, as spurious arguments are often used to promote over-enthusiastic notions of what particular programs can achieve. There are valuable lessons to be learned here from the field of artificial intelligence, principal among which is the realisation that capturing and representing complex knowledge is time consuming, expensive and hard. If bioinformatics is to tackle biological complexity meaningfully, the road ahead must therefore be paved with caution, rigour and pragmatism. PMID:11459349

  10. Biophysics and bioinformatics of transcription regulation in bacteria and bacteriophages

    NASA Astrophysics Data System (ADS)

    Djordjevic, Marko

    2005-11-01

    Due to rapid accumulation of biological data, bioinformatics has become a very important branch of biological research. In this thesis, we develop novel bioinformatic approaches and aid design of biological experiments by using ideas and methods from statistical physics. Identification of transcription factor binding sites within the regulatory segments of genomic DNA is an important step towards understanding of the regulatory circuits that control expression of genes. We propose a novel, biophysics based algorithm, for the supervised detection of transcription factor (TF) binding sites. The method classifies potential binding sites by explicitly estimating the sequence-specific binding energy and the chemical potential of a given TF. In contrast with the widely used information theory based weight matrix method, our approach correctly incorporates saturation in the transcription factor/DNA binding probability. This results in a significant reduction in the number of expected false positives, and in the explicit appearance (and determination) of a binding threshold. The new method was used to identify likely genomic binding sites for the Escherichia coli TFs, and to examine the relationship between TF binding specificity and degree of pleiotropy (number of regulatory targets). We next address how parameters of protein-DNA interactions can be obtained from data on protein binding to random oligos under controlled conditions (SELEX experiment data). We show that 'robust' generation of an appropriate data set is achieved by a suitable modification of the standard SELEX procedure, and propose a novel bioinformatic algorithm for analysis of such data. Finally, we use quantitative data analysis, bioinformatic methods and kinetic modeling to analyze gene expression strategies of bacterial viruses. We study bacteriophage Xp10 that infects rice pathogen Xanthomonas oryzae. Xp10 is an unusual bacteriophage, which has morphology and genome organization that most closely

  11. High-throughput protein analysis integrating bioinformatics and experimental assays.

    PubMed

    del Val, Coral; Mehrle, Alexander; Falkenhahn, Mechthild; Seiler, Markus; Glatting, Karl-Heinz; Poustka, Annemarie; Suhai, Sandor; Wiemann, Stefan

    2004-01-01

    The wealth of transcript information that has been made publicly available in recent years requires the development of high-throughput functional genomics and proteomics approaches for its analysis. Such approaches need suitable data integration procedures and a high level of automation in order to gain maximum benefit from the results generated. We have designed an automatic pipeline to analyse annotated open reading frames (ORFs) stemming from full-length cDNAs produced mainly by the German cDNA Consortium. The ORFs are cloned into expression vectors for use in large-scale assays such as the determination of subcellular protein localization or kinase reaction specificity. Additionally, all identified ORFs undergo exhaustive bioinformatic analysis such as similarity searches, protein domain architecture determination and prediction of physicochemical characteristics and secondary structure, using a wide variety of bioinformatic methods in combination with the most up-to-date public databases (e.g. PRINTS, BLOCKS, INTERPRO, PROSITE, SWISSPROT). Data from experimental results and from the bioinformatic analysis are integrated and stored in a relational database (MS SQL-Server), which makes it possible for researchers to find answers to biological questions easily, thereby speeding up the selection of targets for further analysis. The designed pipeline constitutes a new automatic approach to obtaining and administrating relevant biological data from high-throughput investigations of cDNAs in order to systematically identify and characterize novel genes, as well as to comprehensively describe the function of the encoded proteins. PMID:14762202

  12. A Mathematical Optimization Problem in Bioinformatics

    ERIC Educational Resources Information Center

    Heyer, Laurie J.

    2008-01-01

    This article describes the sequence alignment problem in bioinformatics. Through examples, we formulate sequence alignment as an optimization problem and show how to compute the optimal alignment with dynamic programming. The examples and sample exercises have been used by the author in a specialized course in bioinformatics, but could be adapted…
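
    As a minimal sketch of the dynamic-programming formulation this record describes, the Python function below computes an optimal global alignment score in the style of Needleman-Wunsch. The function name and the scoring values (match +1, mismatch -1, gap -2) are illustrative assumptions, not taken from the article.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of strings a and b.

    Scoring values are assumed for illustration only.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty string costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            # Option 1: align a[i-1] with b[j-1] (match or mismatch).
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Option 2: align a[i-1] with a gap in b.
            up = score[i - 1][j] + gap
            # Option 3: align b[j-1] with a gap in a.
            left = score[i][j - 1] + gap
            score[i][j] = max(diag, up, left)

    return score[-1][-1]
```

    Each table cell depends only on its three neighbours (diagonal, above, left), so the table is filled in time proportional to the product of the two sequence lengths; recovering the alignment itself would additionally require a traceback, which this sketch omits.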

  13. Using "Arabidopsis" Genetic Sequences to Teach Bioinformatics

    ERIC Educational Resources Information Center

    Zhang, Xiaorong

    2009-01-01

    This article describes a new approach to teaching bioinformatics using "Arabidopsis" genetic sequences. Several open-ended and inquiry-based laboratory exercises have been designed to help students grasp key concepts and gain practical skills in bioinformatics, using "Arabidopsis" leucine-rich repeat receptor-like kinase (LRR RLK) genetic…

  14. Integrative bioinformatic analyses of an oncogenomic profile reveal the biology of endometrial cancer and guide drug discovery

    PubMed Central

    Wong, Henry Sung-Ching; Juan, Yung-Shun; Wu, Mei-Shin; Zhang, Yan-Feng; Hsu, Yu-Wen; Chen, Huang-Hui; Liu, Wei-Min; Chang, Wei-Chiao

    2016-01-01

    A major challenge in personalized cancer medicine is to establish a systematic approach to translate huge oncogenomic datasets to clinical situations and facilitate drug discovery for cancers such as endometrial carcinoma. We performed a genome-wide somatic mutation-expression association study in a total of 219 endometrial cancer patients from the TCGA database, by evaluating the correlation between ∼5,800 somatic mutations and ∼13,500 gene expression levels (in total, ∼78,500,000 pairs). A bioinformatics pipeline was devised to identify expression-associated single nucleotide variations (eSNVs) which are crucial for endometrial cancer progression and patient prognoses. We further prioritized 394 biologically risky mutational candidates which mapped to 275 gene loci and demonstrated that these genes collaborated with expression features were significantly enriched in targets of drugs approved for solid tumors, suggesting the plausibility of drug repurposing. Taken together, we integrated a fundamental endometrial cancer genomic profile into clinical circumstances, further shedding light for clinical implementation of genomic-based therapies and guidance for drug discovery. PMID:26716509

  15. Incorporating a New Bioinformatics Component into Genetics at a Historically Black College: Outcomes and Lessons

    ERIC Educational Resources Information Center

    Holtzclaw, J. David; Eisen, Arri; Whitney, Erika M.; Penumetcha, Meera; Hoey, J. Joseph; Kimbro, K. Sean

    2006-01-01

    Many students at minority-serving institutions are underexposed to Internet resources such as the human genome project, PubMed, NCBI databases, and other Web-based technologies because of a lack of financial resources. To change this, we designed and implemented a new bioinformatics component to supplement the undergraduate Genetics course at…

  16. Evolving Strategies for the Incorporation of Bioinformatics within the Undergraduate Cell Biology Curriculum

    ERIC Educational Resources Information Center

    Honts, Jerry E.

    2003-01-01

    Recent advances in genomics and structural biology have resulted in an unprecedented increase in biological data available from Internet-accessible databases. In order to help students effectively use this vast repository of information, undergraduate biology students at Drake University were introduced to bioinformatics software and databases in…

  17. Highlights of the 2nd Bioinformatics Student Symposium by ISCB RSG-UK

    PubMed Central

    White, Benjamen; Fatima, Vayani; Fatima, Nazeefa; Das, Sayoni; Rahman, Farzana; Hassan, Mehedi

    2016-01-01

    Following the success of the 1st Student Symposium by ISCB RSG-UK, a 2nd Student Symposium took place on 7th October 2015 at The Genome Analysis Centre, Norwich, UK. This short report summarizes the main highlights from the 2nd Bioinformatics Student Symposium. PMID:27239284

  18. An "in silico" Bioinformatics Laboratory Manual for Bioscience Departments: "Prediction of Glycosylation Sites in Phosphoethanolamine Transferases"

    ERIC Educational Resources Information Center

    Alyuruk, Hakan; Cavas, Levent

    2014-01-01

    Genomics and proteomics projects have produced a huge amount of raw biological data including DNA and protein sequences. Although these data have been stored in data banks, their evaluation is strictly dependent on bioinformatics tools. These tools have been developed by multidisciplinary experts for fast and robust analysis of biological data.…

  19. Strategies for Using Peer-Assisted Learning Effectively in an Undergraduate Bioinformatics Course

    ERIC Educational Resources Information Center

    Shapiro, Casey; Ayon, Carlos; Moberg-Parker, Jordan; Levis-Fitzgerald, Marc; Sanders, Erin R.

    2013-01-01

    This study used a mixed methods approach to evaluate hybrid peer-assisted learning approaches incorporated into a bioinformatics tutorial for a genome annotation research project. Quantitative and qualitative data were collected from undergraduates who enrolled in a research-based laboratory course during two different academic terms at UCLA.…

  20. Bioinformatics clouds for big data manipulation

    PubMed Central

    2012-01-01

    Abstract As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. Reviewers This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor. PMID:23190475

  1. A Genome-Wide Study of Cytogenetic Changes in Colorectal Cancer Using SNP Microarrays: Opportunities for Future Personalized Treatment

    PubMed Central

    Jasmine, Farzana; Rahaman, Ronald; Dodsworth, Charlotte; Roy, Shantanu; Paul, Rupash; Raza, Maruf; Paul-Brutus, Rachelle; Kamal, Mohammed; Ahsan, Habibul; Kibriya, Muhammad G.

    2012-01-01

    In colorectal cancer (CRC), chromosomal instability (CIN) is typically studied using comparative-genomic hybridization (CGH) arrays. We studied paired (tumor and surrounding healthy) fresh frozen tissue from 86 CRC patients using Illumina's Infinium-based SNP array. This method allowed us to study CIN in CRC, with simultaneous analysis of copy number (CN) and B-allele frequency (BAF) - a representation of allelic composition. These data helped us to detect mono-allelic and bi-allelic amplifications/deletions, copy neutral loss of heterozygosity, and levels of mosaicism for mixed cell populations, some of which cannot be assessed with other methods that do not measure BAF. We identified associations between CN abnormalities and different CRC phenotypes (histological diagnosis, location, tumor grade, stage, MSI and presence of lymph node metastasis). We showed commonalities between regions of CN change observed in CRC and the regions reported in previous studies of other solid cancers (e.g. amplifications of 20q, 13q, 8q, 5p and deletions of 18q, 17p and 8p). From the Therapeutic Target Database, we identified relevant drugs, targeted to the genes located in these regions with CN changes, approved or in trials for other cancers and common diseases. These drugs may be considered for future therapeutic trials in CRC, based on personalized cytogenetic diagnosis. We also found many regions harboring genes that are not currently targeted by any relevant drugs; these may be considered for future drug discovery studies. Our study shows the application of high density SNP arrays for cytogenetic study in CRC and its potential utility for personalized treatment. PMID:22363777
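
    The reason copy number and BAF are read together in the record above can be illustrated with a small calculation. This is a minimal sketch, not the authors' method: it assumes the idealized rule that a heterozygous SNP carrying `minor` and `major` copies of its two alleles shows BAF bands at minor/total and major/total, and that a mosaic sample is a simple mixture of aberrant cells and normal diploid cells. The function names are hypothetical.

```python
from fractions import Fraction
from typing import List


def heterozygous_baf(major: int, minor: int) -> List[Fraction]:
    """Idealized BAF bands for germline-heterozygous SNPs in a segment.

    `major` and `minor` are the allele-specific copy numbers of the segment.
    Either allele may be the B allele, so both fractions can appear.
    """
    total = major + minor
    if total == 0:
        return []  # homozygous deletion: no signal to place
    return sorted({Fraction(minor, total), Fraction(major, total)})


def mosaic_baf(b_copies: int, total_copies: int, fraction: float) -> float:
    """BAF of a heterozygous SNP when `fraction` of cells carry the aberration.

    Assumption: the remaining cells are normal diploid (one B copy out of two).
    """
    b_signal = fraction * b_copies + (1.0 - fraction) * 1.0
    total_signal = fraction * total_copies + (1.0 - fraction) * 2.0
    return b_signal / total_signal


if __name__ == "__main__":
    for label, (major, minor) in {
        "normal diploid": (1, 1),
        "copy-neutral LOH": (2, 0),
        "mono-allelic amplification": (2, 1),
        "bi-allelic amplification": (2, 2),
    }.items():
        bands = ", ".join(str(b) for b in heterozygous_baf(major, minor))
        print(f"{label}: CN={major + minor}, BAF bands: {bands}")
```

    Under these assumptions, normal diploid (1, 1) and bi-allelic amplification (2, 2) share the same BAF band but differ in copy number, while copy-neutral LOH (2, 0) keeps the normal copy number but loses the 0.5 band, so neither signal alone separates all four states.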

  2. Genomic Measures to Predict Adaptation to Novel Sensorimotor Environments and Improve Personalization of Countermeasure Design

    NASA Technical Reports Server (NTRS)

    Kreutzberg, G. A.; Zanello, S.; Seidler, R. D.; Peters, B.; De Dios, Y. E.; Gadd, N. E.; Bloomberg, J. J.; Mulavara, A. P.

    2016-01-01

    Introduction. Astronauts experience sensorimotor disturbances during their initial exposure to microgravity and during the re-adaptation phase following a return to an Earth-gravitational environment. These alterations may affect crewmembers' ability to perform mission-critical functional tasks. Interestingly, astronauts have shown significant inter-subject variation in adaptive capability during gravitational transitions. The ability to predict the manner and degree to which individual astronauts would be affected would improve the efficacy of personalized countermeasure training programs designed to enhance sensorimotor adaptability. The success of such an approach depends on the development of predictive measures of sensorimotor adaptation, which would ascertain each crewmember's adaptive capacity. The goal of this study is to determine whether specific genetic polymorphisms have significant influence on sensorimotor adaptability, which can help inform the design of personalized training countermeasures. Methods. Subjects (n=15) were tested on their ability to negotiate a complex obstacle course for ten test trials while wearing up-down vision-displacing goggles. This presented a visuomotor challenge while doing a full body task. The first test trial time and the recovery rate over the ten trials were used as adaptability performance metrics. Four single nucleotide polymorphisms (SNPs) were selected for their role in neural pathways underlying sensorimotor adaptation and were identified in subjects' DNA extracted from saliva samples: catechol-O-methyl transferase (COMT, rs4680), dopamine receptor D2 (DRD2, rs1076560), brain-derived neurotrophic factor genes (BDNF, rs6265), and the DraI polymorphism of the alpha-2 adrenergic receptor. The relationship between the SNPs and test performance was assessed by assigning subjects a rank score based on their adaptability performance metrics and comparing gene expression between the top half and bottom half performers
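
    The rank-score step mentioned in the record above can be sketched as follows. The abstract does not say how the two metrics were combined, so this sketch assumes a plain sum of per-metric ranks, with a shorter first-trial time and a higher recovery rate both counting as better. The function names and all numbers are made up for illustration and are not study data.

```python
from typing import List, Sequence, Tuple


def _ranks(values: Sequence[float], higher_is_better: bool) -> List[int]:
    """Rank 1 = best. Ties are broken by subject order (sorted() is stable)."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=higher_is_better)
    ranks = [0] * len(values)
    for position, subject in enumerate(order, start=1):
        ranks[subject] = position
    return ranks


def rank_scores(first_trial_time: Sequence[float],
                recovery_rate: Sequence[float]) -> List[int]:
    """Assumed scoring: sum of the two per-metric ranks (lower = more adaptable)."""
    time_rank = _ranks(first_trial_time, higher_is_better=False)
    rate_rank = _ranks(recovery_rate, higher_is_better=True)
    return [t + r for t, r in zip(time_rank, rate_rank)]


def split_halves(scores: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Return (top-half, bottom-half) subject indices.

    With an odd number of subjects the extra subject goes to the bottom half;
    that choice is arbitrary and is an assumption of this sketch.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    half = len(order) // 2
    return order[:half], order[half:]


if __name__ == "__main__":
    # Illustrative values only (seconds, arbitrary rate units).
    times = [30.0, 20.0, 40.0, 25.0]
    rates = [0.5, 0.9, 0.1, 0.7]
    scores = rank_scores(times, rates)
    top, bottom = split_halves(scores)
    print("rank scores:", scores)
    print("top half:", top, "bottom half:", bottom)
```

    Genotype frequencies at each SNP could then be compared between the two groups; the statistical test used for that comparison is not given in the truncated abstract.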

  3. Meeting Highlights: Genome Sequencing and Biology 2001

    PubMed Central

    2001-01-01

    We bring you a report from the CSHL Genome Sequencing and Biology Meeting, which has a long and prestigious history. This year there were sessions on large-scale sequencing and analysis, polymorphisms (covering discovery and technologies and mapping and analysis), comparative genomics of mammalian and model organism genomes, functional genomics and bioinformatics. PMID:18628920

  4. Incorporating a collaborative web-based virtual laboratory in an undergraduate bioinformatics course.

    PubMed

    Weisman, David

    2010-01-01

    Face-to-face bioinformatics courses commonly include a weekly, in-person computer lab to facilitate active learning, reinforce conceptual material, and teach practical skills. Similarly, fully-online bioinformatics courses employ hands-on exercises to achieve these outcomes, although students typically perform this work offsite. Combining a face-to-face lecture course with a web-based virtual laboratory presents new opportunities for collaborative learning of the conceptual material, and for fostering peer support of technical bioinformatics questions. To explore this combination, an in-person lecture-only undergraduate bioinformatics course was augmented with a remote web-based laboratory, and tested with a large class. This study hypothesized that the collaborative virtual lab would foster active learning and peer support, and tested this hypothesis by conducting a student survey near the end of the semester. Respondents broadly reported strong benefits from the online laboratory, and strong benefits from peer-provided technical support. In comparison with traditional in-person teaching labs, students preferred the virtual lab by a factor of two. Key aspects of the course architecture and design are described to encourage further experimentation in teaching collaborative online bioinformatics laboratories. PMID:21567782

  5. Bioinformatics for Diagnostics, Forensics, and Virulence Characterization and Detection

    SciTech Connect

    Gardner, S; Slezak, T

    2005-04-05

    We summarize four of our group's high-risk/high-payoff research projects funded by the Intelligence Technology Innovation Center (ITIC) in conjunction with our DHS-funded pathogen informatics activities. These are (1) quantitative assessment of genomic sequencing needs to predict high quality DNA and protein signatures for detection, and comparison of draft versus finished sequences for diagnostic signature prediction; (2) development of forensic software to identify SNP and PCR-RFLP variations from a large number of viral pathogen sequences and optimization of the selection of markers for maximum discrimination of those sequences; (3) prediction of signatures for the detection of virulence, antibiotic resistance, and toxin genes and genetic engineering markers in bacteria; (4) bioinformatic characterization of virulence factors to rapidly screen genomic data for potential genes with similar functions and to elucidate potential health threats in novel organisms. The results of (1) are being used by policy makers to set national sequencing priorities. Analyses from (2) are being used in collaborations with the CDC to genotype and characterize many variola strains, and reports from these collaborations have been made to the President. We also determined SNPs for serotype and strain discrimination of 126 foot and mouth disease virus (FMDV) genomes. For (3), currently >1000 probes have been predicted for the specific detection of >4000 virulence, antibiotic resistance, and genetic engineering vector sequences, and we expect to complete the bioinformatic design of a comprehensive "virulence detection chip" by August 2005. Results of (4) will be a system to rapidly predict potential virulence pathways and phenotypes in organisms based on their genomic sequences.

  6. PATRIC, the bacterial bioinformatics database and analysis resource

    PubMed Central

    Wattam, Alice R.; Abraham, David; Dalay, Oral; Disz, Terry L.; Driscoll, Timothy; Gabbard, Joseph L.; Gillespie, Joseph J.; Gough, Roger; Hix, Deborah; Kenyon, Ronald; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K.; Olson, Robert; Overbeek, Ross; Pusch, Gordon D.; Shukla, Maulik; Schulman, Julie; Stevens, Rick L.; Sullivan, Daniel E.; Vonstein, Veronika; Warren, Andrew; Will, Rebecca; Wilson, Meredith J.C.; Yoo, Hyun Seung; Zhang, Chengdong; Zhang, Yan; Sobral, Bruno W.

    2014-01-01

    The Pathosystems Resource Integration Center (PATRIC) is the all-bacterial Bioinformatics Resource Center (BRC) (http://www.patricbrc.org). A joint effort by two of the original National Institute of Allergy and Infectious Diseases-funded BRCs, PATRIC provides researchers with an online resource that stores and integrates a variety of data types [e.g. genomics, transcriptomics, protein–protein interactions (PPIs), three-dimensional protein structures and sequence typing data] and associated metadata. Datatypes are summarized for individual genomes and across taxonomic levels. All genomes in PATRIC, currently more than 10 000, are consistently annotated using RAST, the Rapid Annotations using Subsystems Technology. Summaries of different data types are also provided for individual genes, where comparisons of different annotations are available, and also include available transcriptomic data. PATRIC provides a variety of ways for researchers to find data of interest and a private workspace where they can store both genomic and gene associations, and their own private data. Both private and public data can be analyzed together using a suite of tools to perform comparative genomic or transcriptomic analysis. PATRIC also includes integrated information related to disease and PPIs. All the data and integrated analysis and visualization tools are freely available. This manuscript describes updates to the PATRIC since its initial report in the 2007 NAR Database Issue. PMID:24225323

  7. Bioinformatic challenges in targeted proteomics.

    PubMed

    Reker, Daniel; Malmström, Lars

    2012-09-01

    Selected reaction monitoring mass spectrometry is an emerging targeted proteomics technology that allows for the investigation of complex protein samples with high sensitivity and efficiency. It requires extensive knowledge about the sample for the many parameters needed to carry out the experiment to be set appropriately. Most studies today rely on parameter estimation from prior studies, public databases, or from measuring synthetic peptides. This is efficient and sound, but in absence of prior data, de novo parameter estimation is necessary. Computational methods can be used to create an automated framework to address this problem. However, the number of available applications is still small. This review aims at giving an orientation on the various bioinformatical challenges. To this end, we state the problems in classical machine learning and data mining terms, give examples of implemented solutions and provide some room for alternatives. This will hopefully lead to an increased momentum for the development of algorithms and serve the needs of the community for computational methods. We note that the combination of such methods in an assisted workflow will ease both the usage of targeted proteomics in experimental studies as well as the further development of computational approaches. PMID:22866949

  8. Bioinformatics and the undergraduate curriculum essay.

    PubMed

    Maloney, Mark; Parker, Jeffrey; Leblanc, Mark; Woodard, Craig T; Glackin, Mary; Hanrahan, Michael

    2010-01-01

    Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of bioinformatics as a new discipline has challenged many colleges and universities to keep current with their curricula, often in the face of static or dwindling resources. On the plus side, many bioinformatics modules and related databases and software programs are free and accessible online, and interdisciplinary partnerships between existing faculty members and their support staff have proved advantageous in such efforts. We present examples of strategies and methods that have been successfully used to incorporate bioinformatics content into undergraduate curricula. PMID:20810947

  9. Bioinformatics Visualisation Tools: An Unbalanced Picture.

    PubMed

    Broască, Laura; Ancuşa, Versavia; Ciocârlie, Horia

    2016-01-01

    Visualization tools represent a key element in triggering human creativity while being supported with the analysis power of the machine. This paper analyzes free network visualization tools for bioinformatics, frames them in domain specific requirements and compares them. PMID:27577488

  10. Bioinformatics in Italy: BITS2011, the Eighth Annual Meeting of the Italian Society of Bioinformatics

    PubMed Central

    2012-01-01

    The BITS2011 meeting, held in Pisa on June 20-22, 2011, brought together more than 120 Italian researchers working in the field of Bioinformatics, as well as students in Bioinformatics, Computational Biology, Biology, Computer Sciences, and Engineering, representing a landscape of Italian bioinformatics research. This preface provides a brief overview of the meeting and introduces the peer-reviewed manuscripts that were accepted for publication in this Supplement. PMID:22536954

  11. Will solid-state drives accelerate your bioinformatics? In-depth profiling, performance analysis and beyond.

    PubMed

    Lee, Sungmin; Min, Hyeyoung; Yoon, Sungroh

    2016-07-01

    A wide variety of large-scale data have been produced in bioinformatics. In response, the need for efficient handling of biomedical big data has been partly met by parallel computing. However, the time demand of many bioinformatics programs still remains high for large-scale practical uses because of factors that hinder acceleration by parallelization. Recently, new generations of storage devices have emerged, such as NAND flash-based solid-state drives (SSDs), and with the renewed interest in near-data processing, they are increasingly becoming acceleration methods that can accompany parallel processing. In certain cases, a simple drop-in replacement of hard disk drives by SSDs results in dramatic speedup. Despite the various advantages and continuous cost reduction of SSDs, there has been little review of SSD-based profiling and performance exploration of important but time-consuming bioinformatics programs. For an informative review, we perform in-depth profiling and analysis of 23 key bioinformatics programs using multiple types of devices. Based on the insight we obtain from this research, we further discuss how to design and optimize bioinformatics algorithms and pipelines to fully exploit SSDs. The programs we profile cover traditional and emerging areas of importance, such as alignment, assembly, mapping, expression analysis, variant calling and metagenomics. We explain how acceleration by parallelization can be combined with SSDs for improved performance and also how using SSDs can expedite important bioinformatics pipelines, such as variant calling by the Genome Analysis Toolkit and transcriptome analysis using RNA sequencing. We hope that this review can provide useful directions and tips to accompany future bioinformatics algorithm design procedures that properly consider new generations of powerful storage devices. PMID:26330577

  12. GIW and InCoB, two premier bioinformatics conferences in Asia with a combined 40 years of history

    PubMed Central

    2015-01-01

    Knowledge discovery in bioinformatics thrives on joint and inclusive efforts of stakeholders. Similarly, knowledge dissemination is expected to be more effective and scalable through joint efforts. Therefore, the International Conference on Bioinformatics (InCoB) and the International Conference on Genome Informatics (GIW) were organized as a joint conference for the first time in 13 years of coexistence. The Asia-Pacific Bioinformatics Network (APBioNet) and the Japanese Society for Bioinformatics (JSBi) collaborated to host GIW/InCoB2015 in Tokyo, September 9-11, 2015. The joint endeavour yielded 51 research articles published in seven journals, 78 poster and 89 oral presentations, showcasing bioinformatics research in the Asia-Pacific region. Encouraged by the results and reduced organizational overheads, APBioNet will collaborate with other bioinformatics societies in organizing co-located bioinformatics research and training meetings in the future. InCoB2016 will be hosted in Singapore, September 21-23, 2016. PMID:26679412

  13. Challenges of the next decade for the Asia Pacific region: 2010 International Conference in Bioinformatics (InCoB 2010).

    PubMed

    Ranganathan, Shoba; Schönbach, Christian; Nakai, Kenta; Tan, Tin Wee

    2010-01-01

    The 2010 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia's oldest bioinformatics organisation formed in 1998, was organized as the 9th International Conference on Bioinformatics (InCoB), Sept. 26-28, 2010 in Tokyo, Japan. Initially, APBioNet created InCoB as a forum to foster bioinformatics in the Asia Pacific region. Given the growing importance of interdisciplinary research, InCoB2010 included topics targeting scientists in the fields of genomic medicine, immunology and chemoinformatics, supporting translational research. Peer-reviewed manuscripts that were accepted for publication in this supplement represent key areas of research interests that have emerged in our region. We also highlight some of the current challenges bioinformatics is facing in the Asia Pacific region and conclude our report with the announcement of APBioNet's 100 BioDatabases (BioDB100) initiative. BioDB100 will comply with the database criteria set out earlier in our proposal for Minimum Information about a Bioinformatics Investigation (MIABi), setting the standards for biocuration and bioinformatics research, on which we will report at the next InCoB, Nov. 27 - Dec. 2, 2011 at Kuala Lumpur, Malaysia. PMID:21143792

  14. Challenges of the next decade for the Asia Pacific region: 2010 International Conference in Bioinformatics (InCoB 2010)

    PubMed Central

    2010-01-01

    The 2010 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia’s oldest bioinformatics organisation formed in 1998, was organized as the 9th International Conference on Bioinformatics (InCoB), Sept. 26-28, 2010 in Tokyo, Japan. Initially, APBioNet created InCoB as a forum to foster bioinformatics in the Asia Pacific region. Given the growing importance of interdisciplinary research, InCoB2010 included topics targeting scientists in the fields of genomic medicine, immunology and chemoinformatics, supporting translational research. Peer-reviewed manuscripts that were accepted for publication in this supplement represent key areas of research interests that have emerged in our region. We also highlight some of the current challenges bioinformatics is facing in the Asia Pacific region and conclude our report with the announcement of APBioNet’s 100 BioDatabases (BioDB100) initiative. BioDB100 will comply with the database criteria set out earlier in our proposal for Minimum Information about a Bioinformatics Investigation (MIABi), setting the standards for biocuration and bioinformatics research, on which we will report at the next InCoB, Nov. 27 – Dec. 2, 2011 at Kuala Lumpur, Malaysia. PMID:21143792

  15. A Guide to Bioinformatics for Immunologists

    PubMed Central

    Whelan, Fiona J.; Yap, Nicholas V. L.; Surette, Michael G.; Golding, G. Brian; Bowdish, Dawn M. E.

    2013-01-01

    Bioinformatics includes a suite of methods that are cheap and approachable, many of which are easily accessible without any specialized bioinformatic training. Yet, despite this, bioinformatic tools are under-utilized by immunologists. Herein, we review a representative set of publicly available, easy-to-use bioinformatic tools using our own research on an under-annotated human gene, SCARA3, as an example. SCARA3 shares an evolutionary relationship with the class A scavenger receptors, but preliminary research showed that it was divergent enough that its function remained unclear. In our quest for more information about this gene – did it share gene sequence similarities to other scavenger receptors? Did it contain conserved protein domains? Where was it expressed in the human body? – we discovered the power and informative potential of publicly available bioinformatic tools designed with the novice in mind, which allowed us to hypothesize on the regulation, structure, and function of this protein. We argue that these tools are largely applicable to many facets of immunology research. PMID:24363654

  16. Carving a niche: establishing bioinformatics collaborations

    PubMed Central

    Lyon, Jennifer A.; Tennant, Michele R.; Messner, Kevin R.; Osterbur, David L.

    2006-01-01

    Objectives: The paper describes collaborations and partnerships developed between library bioinformatics programs and other bioinformatics-related units at four academic institutions. Methods: A call for information on bioinformatics partnerships was made via email to librarians who have participated in the National Center for Biotechnology Information's Advanced Workshop for Bioinformatics Information Specialists. Librarians from Harvard University, the University of Florida, the University of Minnesota, and Vanderbilt University responded and expressed willingness to contribute information on their institutions, programs, services, and collaborating partners. Similarities and differences in programs and collaborations were identified. Results: The four librarians have developed partnerships with other units on their campuses that can be categorized into the following areas: knowledge management, instruction, and electronic resource support. All primarily support freely accessible electronic resources, while other campus units deal with fee-based ones. These demarcations are apparent in resource provision as well as in subsequent support and instruction. Conclusions and Recommendations: Through environmental scanning and networking with colleagues, librarians who provide bioinformatics support can develop fruitful collaborations. Visibility is key to building collaborations, as is broad-based thinking in terms of potential partners. PMID:16888668

  17. Can bioinformatics help in the identification of moonlighting proteins?

    PubMed

    Hernández, Sergio; Calvo, Alejandra; Ferragut, Gabriela; Franco, Luís; Hermoso, Antoni; Amela, Isaac; Gómez, Antonio; Querol, Enrique; Cedano, Juan

    2014-12-01

    Protein multitasking or moonlighting is the capability of certain proteins to execute two or more unique biological functions. This ability to perform moonlighting functions helps us to understand one of the ways used by cells to perform many complex functions with a limited number of genes. Usually, moonlighting proteins are revealed experimentally by serendipity, and the proteins described probably represent just the tip of the iceberg. It would be helpful if bioinformatics could predict protein multifunctionality, especially because of the large amounts of sequences coming from genome projects. In the present article, we describe several approaches that use sequences, structures, interactomics and current bioinformatics algorithms and programs to try to overcome this problem. The sequence analysis has been performed: (i) by remote homology searches using PSI-BLAST, (ii) by the detection of functional motifs, and (iii) by the co-evolutionary relationship between amino acids. Programs designed to identify functional motifs/domains are basically oriented to detect the main function, but usually fail in the detection of secondary ones. Remote homology searches such as PSI-BLAST seem to be more versatile in this task, and it is a good complement for the information obtained from protein-protein interaction (PPI) databases. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can be used only in very restricted situations, but can suggest how the evolutionary process of the acquisition of the second function took place. PMID:25399591

  18. Bioinformatic characterization of plant networks

    SciTech Connect

    McDermott, Jason E.; Samudrala, Ram

    2008-06-30

    Cells and organisms are governed by networks of interactions, genetic, physical and metabolic. Large-scale experimental studies of interactions between components of biological systems have been performed for a variety of eukaryotic organisms. However, there is a dearth of such data for plants. Computational methods for prediction of relationships between proteins, primarily based on comparative genomics, provide a useful systems-level view of cellular functioning and can be used to extend information about other eukaryotes to plants. We have predicted networks for Arabidopsis thaliana, Oryza sativa indica and japonica and several plant pathogens using the Bioverse (http://bioverse.compbio.washington.edu) and show that they are similar to experimentally-derived interaction networks. Predicted interaction networks for plants can be used to provide novel functional annotations and predictions about plant phenotypes and aid in rational engineering of biosynthesis pathways.

  19. The European Bioinformatics Institute’s data resources 2014

    PubMed Central

    Brooksbank, Catherine; Bergman, Mary Todd; Apweiler, Rolf; Birney, Ewan; Thornton, Janet

    2014-01-01

    Molecular Biology has been at the heart of the ‘big data’ revolution from its very beginning, and the need for access to biological data is a common thread running from the 1965 publication of Dayhoff’s ‘Atlas of Protein Sequence and Structure’ through the Human Genome Project in the late 1990s and early 2000s to today’s population-scale sequencing initiatives. The European Bioinformatics Institute (EMBL-EBI; http://www.ebi.ac.uk) is one of three organizations worldwide that provide free access to comprehensive, integrated molecular data sets. Here, we summarize the principles underpinning the development of these public resources and provide an overview of EMBL-EBI’s database collection to complement the reviews of individual databases provided elsewhere in this issue. PMID:24271396

  20. Building international genomics collaboration for global health security

    SciTech Connect

    Cui, Helen H.; Erkkila, Tracy; Chain, Patrick S. G.; Vuyisich, Momchilo

    2015-12-07

    Genome science and technologies are transforming life sciences globally in many ways and becoming a highly desirable area for international collaboration to strengthen global health. The Genome Science Program at the Los Alamos National Laboratory is leveraging a long history of expertise in genomics research to assist multiple partner nations in advancing their genomics and bioinformatics capabilities. The capability development objectives focus on providing a molecular genomics-based scientific approach for pathogen detection, characterization, and biosurveillance applications. The general approaches include introduction of basic principles in genomics technologies, training on laboratory methodologies and bioinformatic analysis of resulting data, procurement, and installation of next-generation sequencing instruments, establishing bioinformatics software capabilities, and exploring collaborative applications of the genomics capabilities in public health. Genome centers have been established with public health and research institutions in the Republic of Georgia, Kingdom of Jordan, Uganda, and Gabon; broader collaborations in genomics applications have also been developed with research institutions in many other countries.

  1. Building International Genomics Collaboration for Global Health Security

    PubMed Central

    Cui, Helen H.; Erkkila, Tracy; Chain, Patrick S. G.; Vuyisich, Momchilo

    2015-01-01

    Genome science and technologies are transforming life sciences globally in many ways and becoming a highly desirable area for international collaboration to strengthen global health. The Genome Science Program at the Los Alamos National Laboratory is leveraging a long history of expertise in genomics research to assist multiple partner nations in advancing their genomics and bioinformatics capabilities. The capability development objectives focus on providing a molecular genomics-based scientific approach for pathogen detection, characterization, and biosurveillance applications. The general approaches include introduction of basic principles in genomics technologies, training on laboratory methodologies and bioinformatic analysis of resulting data, procurement, and installation of next-generation sequencing instruments, establishing bioinformatics software capabilities, and exploring collaborative applications of the genomics capabilities in public health. Genome centers have been established with public health and research institutions in the Republic of Georgia, Kingdom of Jordan, Uganda, and Gabon; broader collaborations in genomics applications have also been developed with research institutions in many other countries. PMID:26697418

  2. How to enhance integrated care towards the personal health paradigm?

    PubMed

    Blobel, Bernd G M E; Pharow, Peter; Norgall, Thomas

    2007-01-01

    To improve the quality and efficiency of health delivery under well-known burdens, the health service paradigm has to change from organization-centered, through process-controlled, to personal health. The growing complexity of highly distributed and fully integrated healthcare settings can only be managed through an advanced architectural approach, which has to include all dimensions of personal health. These include ICT, medicine, biomedical engineering, bioinformatics and genomics, legal and administrative aspects, and terminology and ontology. The Generic Component Model allows for different domains' concept representation and aggregation. Framework, requirements, methodology and process design possibilities for such a future-proof and meanwhile practically demonstrated approach are discussed in detail. The deployment of the Generic Component Model and the concept representation to biomedical engineering aspects of eHealth are touched upon as essential issues. PMID:17911701

  3. Bioinformatics Analysis of MAPKKK Family Genes in Medicago truncatula

    PubMed Central

    Li, Wei; Xu, Hanyun; Liu, Ying; Song, Lili; Guo, Changhong; Shu, Yongjun

    2016-01-01

    Mitogen-activated protein kinase kinase kinase (MAPKKK) is a component of the MAPK cascade pathway that plays an important role in plant growth, development, and response to abiotic stress, the functions of which have been well characterized in several plant species, such as Arabidopsis, rice, and maize. In this study, we performed genome-wide and systemic bioinformatics analysis of MAPKKK family genes in Medicago truncatula. In total, there were 73 MAPKKK family members identified by search of homologs, and they were classified into three subfamilies, MEKK, ZIK, and RAF. Based on the genomic duplication function, 72 MtMAPKKK genes were located throughout all chromosomes, but they cluster in different chromosomes. Using microarray data and high-throughput sequencing-data, we assessed their expression profiles in growth and development processes; these results provided evidence for exploring their important functions in developmental regulation, especially in the nodulation process. Furthermore, we investigated their expression in abiotic stresses by RNA-seq, which confirmed their critical roles in signal transduction and regulation processes under stress. In summary, our genome-wide, systemic characterization and expressional analysis of MtMAPKKK genes will provide insights that will be useful for characterizing the molecular functions of these genes in M. truncatula. PMID:27049397

  4. Bioinformatics Analysis of MAPKKK Family Genes in Medicago truncatula.

    PubMed

    Li, Wei; Xu, Hanyun; Liu, Ying; Song, Lili; Guo, Changhong; Shu, Yongjun

    2016-01-01

    Mitogen-activated protein kinase kinase kinase (MAPKKK) is a component of the MAPK cascade pathway that plays an important role in plant growth, development, and response to abiotic stress, the functions of which have been well characterized in several plant species, such as Arabidopsis, rice, and maize. In this study, we performed genome-wide and systemic bioinformatics analysis of MAPKKK family genes in Medicago truncatula. In total, there were 73 MAPKKK family members identified by search of homologs, and they were classified into three subfamilies, MEKK, ZIK, and RAF. Based on the genomic duplication function, 72 MtMAPKKK genes were located throughout all chromosomes, but they cluster in different chromosomes. Using microarray data and high-throughput sequencing-data, we assessed their expression profiles in growth and development processes; these results provided evidence for exploring their important functions in developmental regulation, especially in the nodulation process. Furthermore, we investigated their expression in abiotic stresses by RNA-seq, which confirmed their critical roles in signal transduction and regulation processes under stress. In summary, our genome-wide, systemic characterization and expressional analysis of MtMAPKKK genes will provide insights that will be useful for characterizing the molecular functions of these genes in M. truncatula. PMID:27049397

  5. Personalizing cancer treatment in the age of global genomic analyses: PALB2 gene mutations and the response to DNA damaging agents in pancreatic cancer.

    PubMed

    Villarroel, Maria C; Rajeshkumar, N V; Garrido-Laguna, Ignacio; De Jesus-Acosta, Ana; Jones, Siân; Maitra, Anirban; Hruban, Ralph H; Eshleman, James R; Klein, Alison; Laheru, Daniel; Donehower, Ross; Hidalgo, Manuel

    2011-01-01

    Metastasis and drug resistance are the major causes of mortality in patients with pancreatic cancer. Once developed, the progression of pancreatic cancer metastasis is virtually unstoppable with current therapies. Here, we report the remarkable clinical outcome of a patient with advanced, gemcitabine-resistant, pancreatic cancer who was later treated with DNA damaging agents, on the basis of the observation of significant activity of this class of drugs against a personalized xenograft generated from the patient's surgically resected tumor. Mitomycin C treatment, selected on the basis of its robust preclinical activity in a personalized xenograft generated from the patient's tumor, resulted in long-lasting (36+ months) tumor response. Global genomic sequencing revealed biallelic inactivation of the gene encoding PalB2 protein in this patient's cancer; the mutation is predicted to disrupt BRCA1 and BRCA2 interactions critical to DNA double-strand break repair. This work suggests that inactivation of the PALB2 gene is a determinant of response to DNA damage in pancreatic cancer and a new target for personalizing cancer treatment. Integrating personalized xenografts with unbiased exomic sequencing led to customized therapy, tailored to the genetic environment of the patient's tumor, and identification of a new biomarker of drug response in a lethal cancer. PMID:21135251

  6. The genetic association between personality and major depression or bipolar disorder. A polygenic score analysis using genome-wide association data

    PubMed Central

    Middeldorp, C M; de Moor, M H M; McGrath, L M; Gordon, S D; Blackwood, D H; Costa, P T; Terracciano, A; Krueger, R F; de Geus, E J C; Nyholt, D R; Tanaka, T; Esko, T; Madden, P A F; Derringer, J; Amin, N; Willemsen, G; Hottenga, J-J; Distel, M A; Uda, M; Sanna, S; Spinhoven, P; Hartman, C A; Ripke, S; Sullivan, P F; Realo, A; Allik, J; Heath, A C; Pergadia, M L; Agrawal, A; Lin, P; Grucza, R A; Widen, E; Cousminer, D L; Eriksson, J G; Palotie, A; Barnett, J H; Lee, P H; Luciano, M; Tenesa, A; Davies, G; Lopez, L M; Hansell, N K; Medland, S E; Ferrucci, L; Schlessinger, D; Montgomery, G W; Wright, M J; Aulchenko, Y S; Janssens, A C J W; Oostra, B A; Metspalu, A; Abecasis, G R; Deary, I J; Räikkönen, K; Bierut, L J; Martin, N G; Wray, N R; van Duijn, C M; Smoller, J W; Penninx, B W J H; Boomsma, D I

    2011-01-01

    The relationship between major depressive disorder (MDD) and bipolar disorder (BD) remains controversial. Previous research has reported differences and similarities in risk factors for MDD and BD, such as predisposing personality traits. For example, high neuroticism is related to both disorders, whereas openness to experience is specific for BD. This study examined the genetic association between personality and MDD and BD by applying polygenic scores for neuroticism, extraversion, openness to experience, agreeableness and conscientiousness to both disorders. Polygenic scores reflect the weighted sum of multiple single-nucleotide polymorphism alleles associated with the trait for an individual and were based on a meta-analysis of genome-wide association studies for personality traits including 13 835 subjects. Polygenic scores were tested for MDD in the combined Genetic Association Information Network (GAIN-MDD) and MDD2000+ samples (N=8921) and for BD in the combined Systematic Treatment Enhancement Program for Bipolar Disorder and Wellcome Trust Case–Control Consortium samples (N=6329) using logistic regression analyses. At the phenotypic level, personality dimensions were associated with MDD and BD. Polygenic neuroticism scores were significantly positively associated with MDD, whereas polygenic extraversion scores were significantly positively associated with BD. The explained variance of MDD and BD, ∼0.1%, was highly comparable to the variance explained by the polygenic personality scores in the corresponding personality traits themselves (between 0.1 and 0.4%). This indicates that the proportions of variance explained in mood disorders are at the upper limit of what could have been expected. This study suggests shared genetic risk factors for neuroticism and MDD on the one hand and for extraversion and BD on the other. PMID:22833196
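
    The abstract above describes a polygenic score as the weighted sum of an individual's trait-associated SNP alleles. The following is a minimal sketch of that arithmetic only, not the study's pipeline; the SNP identifiers, effect weights, and allele counts are hypothetical placeholders introduced for illustration.

```python
from typing import Dict


def polygenic_score(allele_counts: Dict[str, float], weights: Dict[str, float]) -> float:
    """Return the weighted sum of allele counts over SNPs present in both inputs.

    allele_counts: copies (0, 1 or 2) of the scored allele per SNP for one person.
    weights: per-SNP effect sizes, e.g. taken from a GWAS meta-analysis.
    SNPs without a genotype for this person are skipped (an assumption of this sketch).
    """
    return sum(w * allele_counts[snp] for snp, w in weights.items() if snp in allele_counts)


# Hypothetical effect sizes and genotypes; none of these values come from the study.
weights = {"snp_a": 0.5, "snp_b": -0.25, "snp_c": 1.0}
person = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

score = polygenic_score(person, weights)
print(score)
```

    In an analysis like the one described, a score of this kind would then be entered as a predictor in a logistic regression on case/control status.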

  7. The genetic association between personality and major depression or bipolar disorder. A polygenic score analysis using genome-wide association data.

    PubMed

    Middeldorp, C M; de Moor, M H M; McGrath, L M; Gordon, S D; Blackwood, D H; Costa, P T; Terracciano, A; Krueger, R F; de Geus, E J C; Nyholt, D R; Tanaka, T; Esko, T; Madden, P A F; Derringer, J; Amin, N; Willemsen, G; Hottenga, J-J; Distel, M A; Uda, M; Sanna, S; Spinhoven, P; Hartman, C A; Ripke, S; Sullivan, P F; Realo, A; Allik, J; Heath, A C; Pergadia, M L; Agrawal, A; Lin, P; Grucza, R A; Widen, E; Cousminer, D L; Eriksson, J G; Palotie, A; Barnett, J H; Lee, P H; Luciano, M; Tenesa, A; Davies, G; Lopez, L M; Hansell, N K; Medland, S E; Ferrucci, L; Schlessinger, D; Montgomery, G W; Wright, M J; Aulchenko, Y S; Janssens, A C J W; Oostra, B A; Metspalu, A; Abecasis, G R; Deary, I J; Räikkönen, K; Bierut, L J; Martin, N G; Wray, N R; van Duijn, C M; Smoller, J W; Penninx, B W J H; Boomsma, D I

    2011-01-01

    The relationship between major depressive disorder (MDD) and bipolar disorder (BD) remains controversial. Previous research has reported differences and similarities in risk factors for MDD and BD, such as predisposing personality traits. For example, high neuroticism is related to both disorders, whereas openness to experience is specific for BD. This study examined the genetic association between personality and MDD and BD by applying polygenic scores for neuroticism, extraversion, openness to experience, agreeableness and conscientiousness to both disorders. Polygenic scores reflect the weighted sum of multiple single-nucleotide polymorphism alleles associated with the trait for an individual and were based on a meta-analysis of genome-wide association studies for personality traits including 13,835 subjects. Polygenic scores were tested for MDD in the combined Genetic Association Information Network (GAIN-MDD) and MDD2000+ samples (N=8921) and for BD in the combined Systematic Treatment Enhancement Program for Bipolar Disorder and Wellcome Trust Case-Control Consortium samples (N=6329) using logistic regression analyses. At the phenotypic level, personality dimensions were associated with MDD and BD. Polygenic neuroticism scores were significantly positively associated with MDD, whereas polygenic extraversion scores were significantly positively associated with BD. The explained variance of MDD and BD, ∼0.1%, was highly comparable to the variance explained by the polygenic personality scores in the corresponding personality traits themselves (between 0.1 and 0.4%). This indicates that the proportions of variance explained in mood disorders are at the upper limit of what could have been expected. This study suggests shared genetic risk factors for neuroticism and MDD on the one hand and for extraversion and BD on the other. PMID:22833196

  8. Implementing bioinformatic workflows within the bioextract server

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows typically require the integrated use of multiple, distributed data sources and analytic tools. The BioExtract Server (http://bioextract.org) is a distributed servi...

  9. Bioinformatics in Undergraduate Education: Practical Examples

    ERIC Educational Resources Information Center

    Boyle, John A.

    2004-01-01

    Bioinformatics has emerged as an important research tool in recent years. The ability to mine large databases for relevant information has become increasingly central to many different aspects of biochemistry and molecular biology. It is important that undergraduates be introduced to the available information and methodologies. We present a…

  10. SPECIES DATABASES AND THE BIOINFORMATICS REVOLUTION.

    EPA Science Inventory

    Biological databases are having a growth spurt. Much of this results from research in genetics and biodiversity, coupled with fast-paced developments in information technology. The revolution in bioinformatics, defined by Sugden and Pennisi (2000) as the "tools and techniques for...

  11. Bioinformatics: A History of Evolution "In Silico"

    ERIC Educational Resources Information Center

    Ondrej, Vladan; Dvorak, Petr

    2012-01-01

    Bioinformatics, biological databases, and the worldwide use of computers have accelerated biological research in many fields, such as evolutionary biology. Here, we describe a primer of nucleotide sequence management and the construction of a phylogenetic tree with two examples; the two selected are from completely different groups of organisms:…

  12. "Extreme Programming" in a Bioinformatics Class

    ERIC Educational Resources Information Center

    Kelley, Scott; Alger, Christianna; Deutschman, Douglas

    2009-01-01

    The importance of Bioinformatics tools and methodology in modern biological research underscores the need for robust and effective courses at the college level. This paper describes such a course designed on the principles of cooperative learning based on a computer software industry production model called "Extreme Programming" (EP). The…

  13. KDE Bioscience: platform for bioinformatics analysis workflows.

    PubMed

    Lu, Qiang; Hao, Pei; Curcin, Vasa; He, Weizhong; Li, Yuan-Yuan; Luo, Qing-Ming; Guo, Yi-Ke; Li, Yi-Xue

    2006-08-01

    Bioinformatics is a dynamic research area in which a large number of algorithms and programs have been developed rapidly and independently without much consideration so far of the need for standardization. The lack of such common standards combined with unfriendly interfaces make it difficult for biologists to learn how to use these tools and to translate the data formats from one to another. Consequently, the construction of an integrative bioinformatics platform to facilitate biologists' research is an urgent and challenging task. KDE Bioscience is a java-based software platform that collects a variety of bioinformatics tools and provides a workflow mechanism to integrate them. Nucleotide and protein sequences from local flat files, web sites, and relational databases can be entered, annotated, and aligned. Several home-made or 3rd-party viewers are built-in to provide visualization of annotations or alignments. KDE Bioscience can also be deployed in client-server mode where simultaneous execution of the same workflow is supported for multiple users. Moreover, workflows can be published as web pages that can be executed from a web browser. The power of KDE Bioscience comes from the integrated algorithms and data sources. With its generic workflow mechanism other novel calculations and simulations can be integrated to augment the current sequence analysis functions. Because of this flexible and extensible architecture, KDE Bioscience makes an ideal integrated informatics environment for future bioinformatics or systems biology research. PMID:16260186

  14. Navigating the changing learning landscape: perspective from bioinformatics.ca

    PubMed Central

    Ouellette, B. F. Francis

    2013-01-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable in the learning continuum. Bioinformatics.ca, which hosts the Canadian Bioinformatics Workshops, has blended more traditional learning styles with current online and social learning styles. Here we share our growing experiences over the past 12 years and look toward what the future holds for bioinformatics training programs. PMID:23515468

  15. Navigating the changing learning landscape: perspective from bioinformatics.ca.

    PubMed

    Brazas, Michelle D; Ouellette, B F Francis

    2013-09-01

    With the advent of YouTube channels in bioinformatics, open platforms for problem solving in bioinformatics, active web forums in computing analyses and online resources for learning to code or use a bioinformatics tool, the more traditional continuing education bioinformatics training programs have had to adapt. Bioinformatics training programs that solely rely on traditional didactic methods are being superseded by these newer resources. Yet such face-to-face instruction is still invaluable in the learning continuum. Bioinformatics.ca, which hosts the Canadian Bioinformatics Workshops, has blended more traditional learning styles with current online and social learning styles. Here we share our growing experiences over the past 12 years and look toward what the future holds for bioinformatics training programs. PMID:23515468

  16. A Bioinformatics Reference Model: Towards a Framework for Developing and Organising Bioinformatic Resources

    NASA Astrophysics Data System (ADS)

    Hiew, Hong Liang; Bellgard, Matthew

    2007-11-01

    Life Science research faces the constant challenge of how to effectively handle an ever-growing body of bioinformatics software and online resources. The users and developers of bioinformatics resources have a diverse set of competing demands on how these resources need to be developed and organised. Unfortunately, there does not exist an adequate community-wide framework to integrate such competing demands. The problems that arise from this include unstructured standards development, the emergence of tools that do not meet specific needs of researchers, and often times a communications gap between those who use the tools and those who supply them. This paper presents an overview of the different functions and needs of bioinformatics stakeholders to determine what may be required in a community-wide framework. A Bioinformatics Reference Model is proposed as a basis for such a framework. The reference model outlines the functional relationship between research usage and technical aspects of bioinformatics resources. It separates important functions into multiple structured layers, clarifies how they relate to each other, and highlights the gaps that need to be addressed for progress towards a diverse, manageable, and sustainable body of resources. The relevance of this reference model to the bioscience research community, and its implications in progress for organising our bioinformatics resources, are discussed.

  17. Component-Based Approach for Educating Students in Bioinformatics

    ERIC Educational Resources Information Center

    Poe, D.; Venkatraman, N.; Hansen, C.; Singh, G.

    2009-01-01

    There is an increasing need for an effective method of teaching bioinformatics. Increased progress and availability of computer-based tools for educating students have led to the implementation of a computer-based system for teaching bioinformatics as described in this paper. Bioinformatics is a recent, hybrid field of study combining elements of…

  18. Sequencing an Ashkenazi reference panel supports population-targeted personal genomics and illuminates Jewish and European origins.

    PubMed

    Carmi, Shai; Hui, Ken Y; Kochav, Ethan; Liu, Xinmin; Xue, James; Grady, Fillan; Guha, Saurav; Upadhyay, Kinnari; Ben-Avraham, Dan; Mukherjee, Semanti; Bowen, B Monica; Thomas, Tinu; Vijai, Joseph; Cruts, Marc; Froyen, Guy; Lambrechts, Diether; Plaisance, Stéphane; Van Broeckhoven, Christine; Van Damme, Philip; Van Marck, Herwig; Barzilai, Nir; Darvasi, Ariel; Offit, Kenneth; Bressman, Susan; Ozelius, Laurie J; Peter, Inga; Cho, Judy H; Ostrer, Harry; Atzmon, Gil; Clark, Lorraine N; Lencz, Todd; Pe'er, Itsik

    2014-01-01

    The Ashkenazi Jewish (AJ) population is a genetic isolate close to European and Middle Eastern groups, with genetic diversity patterns conducive to disease mapping. Here we report high-depth sequencing of 128 complete genomes of AJ controls. Compared with European samples, our AJ panel has 47% more novel variants per genome and is eightfold more effective at filtering benign variants out of AJ clinical genomes. Our panel improves imputation accuracy for AJ SNP arrays by 28%, and covers at least one haplotype in ≈ 67% of any AJ genome with long, identical-by-descent segments. Reconstruction of recent AJ history from such segments confirms a recent bottleneck of merely ≈ 350 individuals. Modelling of ancient histories for AJ and European populations using their joint allele frequency spectrum determines AJ to be an even admixture of European and likely Middle Eastern origins. We date the split between the two ancestral populations to ≈ 12-25 Kyr, suggesting a predominantly Near Eastern source for the repopulation of Europe after the Last Glacial Maximum. PMID:25203624

  19. Sequencing an Ashkenazi reference panel supports population-targeted personal genomics and illuminates Jewish and European origins

    PubMed Central

    Carmi, Shai; Hui, Ken Y.; Kochav, Ethan; Liu, Xinmin; Xue, James; Grady, Fillan; Guha, Saurav; Upadhyay, Kinnari; Ben-Avraham, Dan; Mukherjee, Semanti; Bowen, B. Monica; Thomas, Tinu; Vijai, Joseph; Cruts, Marc; Froyen, Guy; Lambrechts, Diether; Plaisance, Stéphane; Van Broeckhoven, Christine; Van Damme, Philip; Van Marck, Herwig; Barzilai, Nir; Darvasi, Ariel; Offit, Kenneth; Bressman, Susan; Ozelius, Laurie J.; Peter, Inga; Cho, Judy H.; Ostrer, Harry; Atzmon, Gil; Clark, Lorraine N.; Lencz, Todd; Pe’er, Itsik

    2014-01-01

    The Ashkenazi Jewish (AJ) population is a genetic isolate close to European and Middle Eastern groups, with genetic diversity patterns conducive to disease mapping. Here we report high-depth sequencing of 128 complete genomes of AJ controls. Compared with European samples, our AJ panel has 47% more novel variants per genome and is eightfold more effective at filtering benign variants out of AJ clinical genomes. Our panel improves imputation accuracy for AJ SNP arrays by 28%, and covers at least one haplotype in ≈67% of any AJ genome with long, identical-by-descent segments. Reconstruction of recent AJ history from such segments confirms a recent bottleneck of merely ≈350 individuals. Modelling of ancient histories for AJ and European populations using their joint allele frequency spectrum determines AJ to be an even admixture of European and likely Middle Eastern origins. We date the split between the two ancestral populations to ≈12–25 Kyr, suggesting a predominantly Near Eastern source for the repopulation of Europe after the Last Glacial Maximum. PMID:25203624

  20. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis.

    PubMed

    Noar, Roslyn D; Daub, Margaret E

    2016-01-01

    Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides, however three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (< 60% similarity) to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however were strongly upregulated during disease development in banana, suggesting that they may encode

  1. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis

    PubMed Central

    Noar, Roslyn D.; Daub, Margaret E.

    2016-01-01

    Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides, however three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (< 60% similarity) to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however were strongly upregulated during disease development in banana, suggesting that they may encode

  2. Automatic Discovery and Inferencing of Complex Bioinformatics Web Interfaces

    SciTech Connect

    Ngu, A; Rocco, D; Critchlow, T; Buttler, D

    2003-12-22

    The World Wide Web provides a vast resource to genomics researchers in the form of web-based access to distributed data sources, e.g. BLAST sequence homology search interfaces. However, the process for seeking the desired scientific information is still very tedious and frustrating. While there are several known servers on genomic data (e.g., GenBank, EMBL, NCBI) that are shared and accessed frequently, new data sources are created each day in laboratories all over the world. The sharing of these newly discovered genomics results is hindered by the lack of a common interface or data exchange mechanism. Moreover, the number of autonomous genomics sources and their rate of change outpace the speed at which they can be manually identified, meaning that the available data is not being utilized to its full potential. An automated system that can find, classify, describe and wrap new sources without tedious and low-level coding of source-specific wrappers is needed to help scientists access hundreds of dynamically changing bioinformatics web data sources through a single interface. A correct classification of any kind of Web data source must address both the capability of the source and the conversation/interaction semantics which is inherent in the design of the Web data source. In this paper, we propose an automatic approach to classify Web data sources that takes into account both the capability and the conversational semantics of the source. The ability to discover the interaction pattern of a Web source leads to increased accuracy in the classification process. At the same time, it facilitates the extraction of process semantics, which is necessary for the automatic generation of wrappers that can interact correctly with the sources.

  3. Candidate genes for nicotine dependence via linkage, epistasis, and bioinformatics.

    PubMed

    Sullivan, Patrick F; Neale, Benjamin M; van den Oord, Edwin; Miles, Michael F; Neale, Michael C; Bulik, Cynthia M; Joyce, Peter R; Straub, Richard E; Kendler, Kenneth S

    2004-04-01

    Many smoking-related phenotypes are substantially heritable. One genome scan of nicotine dependence (ND) has been published and several others are in progress and should be completed in the next 5 years. The goal of this hypothesis-generating study was two-fold. First, we present further analyses of our genome scan data for ND published by Straub et al. [1999: Mol Psychiatry 4:129-144] (PMID: 10208445). Second, we used the method described by Cox et al. [1999: Nat Genet 21:213-215] (PMID: 9988276) to search for epistatic loci across the markers used in the genome scan. The overall results of the genome scan nearly reached the rigorous Lander and Kruglyak [1995: Nat Genet 11:241-247] criteria for "significant" linkage with the best findings on chromosomes 10 and 2. We then looked for correspondence between genes located in the 10 regions implicated in affected sibling pair (ASP) and epistatic linkage analyses with a list of genes suggested by microarray studies of experimental nicotine exposure and candidate genes from the literature. We found correspondence between linkage and microarray/candidate gene studies for genes involved with the mitogen-activated protein kinase (MAPK) signaling system, nuclear factor kappa B (NFKB) complex, neuropeptide Y (NPY) neurotransmission, a nicotinic receptor subunit (CHRNA2), the vesicular monoamine transporter (SLC18A2), genes in pathways implicated in human anxiety (HTR7, TDO2, and the endozepine-related protein precursor, DKFZP434A2417), and the mu 1-opioid receptor (OPRM1). Although the hypotheses resulting from these linkage and bioinformatic analyses are plausible and intriguing, their ultimate worth depends on replication in additional linkage samples and in future experimental studies. PMID:15048644

  4. The complete mitochondrial genome of the invasive Ponto-Caspian goby Ponticola kessleri obtained from high-throughput sequencing using the Ion Torrent Personal Genome Machine.

    PubMed

    Kalchhauser, Irene; Kutschera, Verena E; Burkhardt-Holm, Patricia

    2016-05-01

    We report the first complete mitochondrial genome (mitogenome) of an invasive Ponto-Caspian goby, Ponticola kessleri (bighead goby, Günther 1891). Ion Torrent PGM sequencing of total DNA from two individuals yielded a contig of 16,971 bp, with overlapping ends located in the repetitive control region, which was validated using Sanger sequencing. The final mitogenome of Ponticola kessleri has a size of 16,890 bp and contains the expected gene configuration of 13 protein-coding genes, 2 rRNA genes and 22 tRNA genes. In a comparison with complete mitogenomes from other goby species, we identified a translocation of tRNA-Glu in the mitogenome of P. kessleri. Rearrangements are unique and rare events, and can thus provide phylogenetic information. PMID:25329282
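
    The abstract above reports a 16,971 bp contig with overlapping ends and a final mitogenome of 16,890 bp. The difference between the two lengths is 81 bp, which would be consistent with removing a duplicated end of a circular molecule, although the abstract does not state the overlap length. The sketch below shows one simple way such an overlap could be trimmed; it assumes an exact match between the start and end of the contig, and the function name, the `min_overlap` parameter, and the toy sequence are hypothetical. It is not the authors' method, which relied on Sanger validation.

```python
def trim_circular_overlap(contig: str, min_overlap: int = 10) -> str:
    """Drop the longest suffix that exactly repeats the contig's own prefix.

    A linear assembly of a circular genome often ends by re-reading its start.
    Only exact matches of at least `min_overlap` bases are considered here;
    real data with sequencing errors or repeats would need an alignment step.
    """
    for k in range(len(contig) // 2, min_overlap - 1, -1):
        if contig[:k] == contig[-k:]:
            return contig[:-k]
    return contig  # no qualifying overlap found, leave the contig unchanged


# Toy example: a made-up 16-base "genome" whose first 5 bases are read again at the end.
unique = "ACGTTGCAAGGCTTAC"
contig = unique + unique[:5]

trimmed = trim_circular_overlap(contig, min_overlap=4)
print(len(contig), len(trimmed))
```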

  5. An integrated framework for reporting clinically relevant biomarkers from paired tumor/normal genomic and transcriptomic sequencing data in support of clinical trials in personalized medicine.

    PubMed

    Nasser, Sara; Kurdolgu, Ahmet A; Izatt, Tyler; Aldrich, Jessica; Russell, Megan L; Christoforides, Alexis; Tembe, Wiabhav; Keifer, Jeffery A; Corneveaux, Jason J; Byron, Sara A; Forman, Karen M; Zuccaro, Clarice; Keats, Jonathan J; Lorusso, Patricia M; Carpten, John D; Trent, Jeffrey M; Craig, David W

    2015-01-01

    The ability to rapidly sequence the tumor and germline DNA of an individual holds the eventual promise of revolutionizing our ability to match targeted therapies to tumors harboring the associated genetic biomarkers. Analyzing high throughput genomic data consisting of millions of base pairs and discovering alterations in clinically actionable genes in a structured and real time manner is at the crux of personalized testing. This requires a computational architecture that can monitor and track a system within a regulated environment as terabytes of data are reduced to a small number of therapeutically relevant variants, delivered as a diagnostic laboratory developed test. These high complexity assays require data structures that enable real-time and retrospective ad-hoc analysis, with a capability of updating to keep up with the rapidly changing genomic and therapeutic options, all under a regulated environment that is relevant under both CMS and FDA depending on application. We describe a flexible computational framework that uses a paired tumor/normal sample allowing for complete analysis and reporting in approximately 24 hours, providing identification of single nucleotide changes, small insertions and deletions, chromosomal rearrangements, gene fusions and gene expression with positive predictive values over 90%. In this paper we present the challenges in integrating clinical, genomic and annotation databases to provide interpreted draft reports which we utilize within ongoing clinical research protocols. We demonstrate the need to retire from existing performance measurements of accuracy and specificity and measure metrics that are meaningful to a genomic diagnostic environment. This paper presents a three-tier infrastructure that is currently being used to analyze an individual genome and provide available therapeutic options via a clinical report. 
Our framework utilizes a non-relational variant-centric database that is scalable to a large amount of data and
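
    The abstract's argument for moving away from accuracy and specificity toward metrics such as positive predictive value can be illustrated with a small worked example. This is a minimal sketch with made-up counts (not data from the paper), assuming a genome-scale test where almost every position is a true negative; the function name `diagnostic_metrics` is hypothetical.

```python
# Illustrative counts only (hypothetical, not taken from the paper).
def diagnostic_metrics(tp, fp, tn, fn):
    """Return common confusion-matrix metrics as a dict."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),
    }

# A human genome has on the order of 3e9 positions, nearly all true negatives.
good = diagnostic_metrics(tp=90, fp=10, tn=3_000_000_000, fn=10)
poor = diagnostic_metrics(tp=90, fp=900, tn=2_999_999_110, fn=10)

for name, m in (("good caller", good), ("poor caller", poor)):
    print(name, {k: round(v, 7) for k, v in m.items()})
```

    Under these assumed counts, specificity and accuracy are nearly indistinguishable between the two callers, while the positive predictive value drops from 0.9 to below 0.1, which is why a PPV-style metric is more informative in this setting.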

  6. Bioinformatic analysis of functional proteins involved in obesity associated with diabetes.

    PubMed

    Rao, Allam Appa; Tayaru, N Manga; Thota, Hanuman; Changalasetty, Suresh Babu; Thota, Lalitha Saroja; Gedela, Srinubabu

    2008-03-01

    The twin epidemics of diabetes and obesity pose daunting challenges worldwide. The dramatic rise in obesity-associated diabetes has resulted in an alarming increase in the incidence and prevalence of obesity, an important complication of diabetes. Differences among individuals in their susceptibility to both conditions probably reflect their genetic constitutions. The dramatic improvements in genomic and bioinformatic resources are accelerating the pace of gene discovery. It is tempting to speculate about the key susceptibility genes/proteins that bridge diabetes mellitus and obesity. In this regard, we evaluated the role of several genes/proteins that are believed to be involved in the evolution of obesity-associated diabetes by performing multiple sequence alignment with the ClustalW tool and constructing a phylogram from functional protein sequences extracted from NCBI. The phylogram was constructed using the Neighbor-Joining algorithm, a bioinformatic tool. Our bioinformatic analysis reports the resistin gene as an ominous link with obesity-associated diabetes. This bioinformatic study will be useful for future studies towards therapeutic inventions for obesity-associated type 2 diabetes. PMID:23675069
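
    The neighbor-joining step mentioned in this abstract can be sketched in a few lines. The example below is not the authors' pipeline; it only shows the pair-selection criterion (the Q matrix) that neighbor-joining applies at each iteration, using a made-up five-taxon distance matrix. The function name `nj_pick_pair` and the labels are hypothetical.

```python
def nj_pick_pair(dist):
    """Return (i, j, q) for the pair that minimizes the neighbor-joining Q criterion.

    Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k)
    """
    n = len(dist)
    row_sums = [sum(row) for row in dist]
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
            if best is None or q < best[2]:
                best = (i, j, q)
    return best

# Toy symmetric distance matrix for five taxa (made-up values).
labels = ["a", "b", "c", "d", "e"]
dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]

i, j, q = nj_pick_pair(dist)
print(f"join {labels[i]} and {labels[j]} first (Q = {q})")
```

    Note that the pair with the smallest raw distance (d and e) is not the one joined first here; the Q criterion corrects each distance for how far the two taxa are from everything else.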

  7. Comparative modeling of proteins: a method for engaging students' interest in bioinformatics tools.

    PubMed

    Badotti, Fernanda; Barbosa, Alan Sales; Reis, André Luiz Martins; do Valle, Italo Faria; Ambrósio, Lara; Bitar, Mainá

    2014-01-01

    The huge increase in data being produced in the genomic era has created a need to incorporate computers into the research process. Sequence generation and its subsequent storage, interpretation, and analysis are now entirely computer-dependent tasks. Universities all over the world have been challenged to find ways of encouraging students to acquire computational and bioinformatics skills from the undergraduate level onward in order to understand biological processes. The aim of this article is to report the experience of awakening students' interest in bioinformatics tools during a course focused on comparative modeling of proteins. The authors start by giving a full description of the course's environmental context and the students' backgrounds. Then they detail each class and present a general overview of the protein modeling protocol. The positive and negative aspects of the course are also reported, and some of the results generated in class and in projects outside the classroom are discussed. In the last section of the article, general perspectives about the course from the students' point of view are given. This work can serve as a guide for professors who teach subjects for which bioinformatics tools are useful and for universities that plan to incorporate bioinformatics into the curriculum. PMID:24167006

  8. Relax with CouchDB - Into the non-relational DBMS era of Bioinformatics

    PubMed Central

    Manyam, Ganiraju; Payton, Michelle A.; Roth, Jack A.; Abruzzo, Lynne V.; Coombes, Kevin R.

    2012-01-01

    With the proliferation of high-throughput technologies, genome-level data analysis has become common in molecular biology. Bioinformaticians are developing extensive resources to annotate and mine biological features from high-throughput data. The underlying database management systems for most bioinformatics software are based on a relational model. Modern non-relational databases offer an alternative that has flexibility, scalability, and a non-rigid design schema. Moreover, with an accelerated development pace, non-relational databases like CouchDB can be ideal tools to construct bioinformatics utilities. We describe CouchDB by presenting three new bioinformatics resources: (a) geneSmash, which collates data from bioinformatics resources and provides automated gene-centric annotations, (b) drugBase, a database of drug-target interactions with a web interface powered by geneSmash, and (c) HapMap-CN, which provides a web interface to query copy number variations from three SNP-chip HapMap datasets. In addition to the web sites, all three systems can be accessed programmatically via web services. PMID:22609849
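
    The "non-rigid design schema" point can be made concrete with a small sketch. CouchDB stores self-describing JSON documents keyed by `_id`, so two records of the same kind need not share the same fields. The documents and field names below are hypothetical and are not the geneSmash schema; the helper `fields_by_doc` is likewise an illustration only.

```python
import json

# Hypothetical gene-centric documents; field names are illustrative and
# are NOT the geneSmash schema.
docs = [
    {"_id": "TP53", "type": "gene", "chromosome": "17", "aliases": ["p53"]},
    {"_id": "EGFR", "type": "gene", "chromosome": "7",
     "drug_targets": ["erlotinib", "gefitinib"]},
]

def fields_by_doc(documents):
    """Map each document id to its sorted list of user-defined fields."""
    return {
        d["_id"]: sorted(k for k in d if not k.startswith("_"))
        for d in documents
    }

# Documents round-trip through JSON unchanged, which is what a
# document store sends and receives over HTTP.
payload = json.dumps(docs[1], sort_keys=True)
restored = json.loads(payload)

print(fields_by_doc(docs))
print(payload)
```

    In a relational design, adding `drug_targets` for only some genes would typically mean a schema change or an extra table; in a document store, the field simply appears on the documents that need it.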

  9. Relax with CouchDB--into the non-relational DBMS era of bioinformatics.

    PubMed

    Manyam, Ganiraju; Payton, Michelle A; Roth, Jack A; Abruzzo, Lynne V; Coombes, Kevin R

    2012-07-01

    With the proliferation of high-throughput technologies, genome-level data analysis has become common in molecular biology. Bioinformaticians are developing extensive resources to annotate and mine biological features from high-throughput data. The underlying database management systems for most bioinformatics software are based on a relational model. Modern non-relational databases offer an alternative that has flexibility, scalability, and a non-rigid design schema. Moreover, with an accelerated development pace, non-relational databases like CouchDB can be ideal tools to construct bioinformatics utilities. We describe CouchDB by presenting three new bioinformatics resources: (a) geneSmash, which collates data from bioinformatics resources and provides automated gene-centric annotations, (b) drugBase, a database of drug-target interactions with a web interface powered by geneSmash, and (c) HapMap-CN, which provides a web interface to query copy number variations from three SNP-chip HapMap datasets. In addition to the web sites, all three systems can be accessed programmatically via web services. PMID:22609849

  10. Genomics and privacy: implications of the new reality of closed data for the field.

    PubMed

    Greenbaum, Dov; Sboner, Andrea; Mu, Xinmeng Jasmine; Gerstein, Mark

    2011-12-01

    Open source and open data have been driving forces in bioinformatics in the past. However, privacy concerns may soon change the landscape, limiting future access to important data sets, including personal genomics data. Here we survey this situation in some detail, describing, in particular, how the large scale of the data from personal genomic sequencing makes it especially hard to share data, exacerbating the privacy problem. We also go over various aspects of genomic privacy: first, there is basic identifiability of subjects having their genome sequenced. However, even for individuals who have consented to be identified, there is the prospect of very detailed future characterization of their genotype, which, unanticipated at the time of their consent, may be more personal and invasive than the release of their medical records. We go over various computational strategies for dealing with the issue of genomic privacy. One can "slice" and reformat datasets to allow them to be partially shared while securing the most private variants. This is particularly applicable to functional genomics information, which can be largely processed without variant information. For handling the most private data there are a number of legal and technological approaches, for example, modifying the informed consent procedure to acknowledge that privacy cannot be guaranteed, and/or employing a secure cloud computing environment. Cloud computing in particular may allow access to the data in a more controlled fashion than the current practice of downloading and computing on large datasets. Furthermore, it may be particularly advantageous for small labs, given that the burden of many privacy issues falls disproportionately on them in comparison to large corporations and genome centers. Finally, we discuss how education of future genetics researchers will be important, with curriculums emphasizing privacy and data security.
However, teaching personal genomics with identifiable subjects in the

  11. Genomics and Privacy: Implications of the New Reality of Closed Data for the Field

    PubMed Central

    Greenbaum, Dov; Sboner, Andrea; Mu, Xinmeng Jasmine; Gerstein, Mark

    2011-01-01

    Open source and open data have been driving forces in bioinformatics in the past. However, privacy concerns may soon change the landscape, limiting future access to important data sets, including personal genomics data. Here we survey this situation in some detail, describing, in particular, how the large scale of the data from personal genomic sequencing makes it especially hard to share data, exacerbating the privacy problem. We also go over various aspects of genomic privacy: first, there is basic identifiability of subjects having their genome sequenced. However, even for individuals who have consented to be identified, there is the prospect of very detailed future characterization of their genotype, which, unanticipated at the time of their consent, may be more personal and invasive than the release of their medical records. We go over various computational strategies for dealing with the issue of genomic privacy. One can “slice” and reformat datasets to allow them to be partially shared while securing the most private variants. This is particularly applicable to functional genomics information, which can be largely processed without variant information. For handling the most private data there are a number of legal and technological approaches—for example, modifying the informed consent procedure to acknowledge that privacy cannot be guaranteed, and/or employing a secure cloud computing environment. Cloud computing in particular may allow access to the data in a more controlled fashion than the current practice of downloading and computing on large datasets. Furthermore, it may be particularly advantageous for small labs, given that the burden of many privacy issues falls disproportionately on them in comparison to large corporations and genome centers. Finally, we discuss how education of future genetics researchers will be important, with curriculums emphasizing privacy and data security. However, teaching personal genomics with identifiable subjects

  12. Should we have blind faith in bioinformatics software? Illustrations from the SNAP web-based tool.

    PubMed

    Robiou-du-Pont, Sébastien; Li, Aihua; Christie, Shanice; Sohani, Zahra N; Meyre, David

    2015-01-01

    Bioinformatics tools have gained popularity in biology but little is known about their validity. We aimed to assess the early contribution of 415 single nucleotide polymorphisms (SNPs) associated with eight cardio-metabolic traits at the genome-wide significance level in adults in the Family Atherosclerosis Monitoring In earLY Life (FAMILY) birth cohort. We used the popular web-based tool SNAP to assess the availability of the 415 SNPs in the Illumina Cardio-Metabochip genotyped in the FAMILY study participants. We then compared the SNAP output with the Cardio-Metabochip file provided by Illumina using chromosome and chromosomal positions of SNPs from NCBI Human Genome Browser (Genome Reference Consortium Human Build 37). With the HapMap 3 release 2 reference, 201 out of 415 SNPs were reported as missing in the Cardio-Metabochip by the SNAP output. However, the Cardio-Metabochip file revealed that 152 of these 201 SNPs were in fact present in the Cardio-Metabochip array (false negative rate of 36.6%). With the more recent 1000 Genomes Project release, we found a false-negative rate of 17.6% by comparing the outputs of SNAP and the Illumina product file. We did not find any 'false positive' SNPs (SNPs specified as available in the Cardio-Metabochip by SNAP, but not by the Cardio-Metabochip Illumina file). The Cohen's Kappa coefficient, which calculates the percentage of agreement between both methods, indicated that the validity of SNAP was fair to moderate depending on the reference used (the HapMap 3 or 1000 Genomes). In conclusion, we demonstrate that the SNAP outputs for the Cardio-Metabochip are invalid. This study illustrates the importance of systematically assessing the validity of bioinformatics tools in an independent manner. We propose a series of guidelines to improve practices in the fast-moving field of bioinformatics software implementation. PMID:25742008
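
    The agreement figures in this abstract can be approximately reproduced from the counts it gives. The sketch below assumes a 2x2 table reconstructed from the abstract for the HapMap 3 reference: 201 of 415 SNPs reported missing by SNAP, 152 of those actually present on the array, and no false positives, so the remaining 214 are taken as present in both sources and 49 as absent in both. The function name `cohen_kappa` is hypothetical, and the reconstruction is an assumption rather than the authors' own table.

```python
def cohen_kappa(both_yes, a_yes_b_no, a_no_b_yes, both_no):
    """Cohen's kappa for two raters (A, B) making a yes/no call."""
    n = both_yes + a_yes_b_no + a_no_b_yes + both_no
    observed = (both_yes + both_no) / n
    a_yes = both_yes + a_yes_b_no
    b_yes = both_yes + a_no_b_yes
    # Agreement expected by chance from the two raters' marginal rates.
    expected = (a_yes * b_yes + (n - a_yes) * (n - b_yes)) / n**2
    return (observed - expected) / (1 - expected)

# A = SNAP says "present on chip", B = Illumina product file says "present".
total = 415
snap_missing = 201
wrongly_missing = 152                       # present per Illumina file
both_present = total - snap_missing         # 214, assuming no false positives
both_absent = snap_missing - wrongly_missing  # 49

kappa = cohen_kappa(both_present, 0, wrongly_missing, both_absent)
share_wrongly_missing = wrongly_missing / total

print(f"kappa = {kappa:.3f}")
print(f"wrongly reported missing = {share_wrongly_missing:.1%}")
```

    Under these assumptions, kappa comes out at roughly 0.25, in the range conventionally labelled "fair", and the 36.6% figure appears to correspond to 152 of all 415 SNPs. Kappa is a chance-corrected agreement measure, so it is lower than the raw proportion of SNPs on which the two sources agree.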

  13. Should We Have Blind Faith in Bioinformatics Software? Illustrations from the SNAP Web-Based Tool

    PubMed Central

    Robiou-du-Pont, Sébastien; Li, Aihua; Christie, Shanice; Sohani, Zahra N.; Meyre, David

    2015-01-01

    Bioinformatics tools have gained popularity in biology but little is known about their validity. We aimed to assess the early contribution of 415 single nucleotide polymorphisms (SNPs) associated with eight cardio-metabolic traits at the genome-wide significance level in adults in the Family Atherosclerosis Monitoring In earLY Life (FAMILY) birth cohort. We used the popular web-based tool SNAP to assess the availability of the 415 SNPs in the Illumina Cardio-Metabochip genotyped in the FAMILY study participants. We then compared the SNAP output with the Cardio-Metabochip file provided by Illumina using chromosome and chromosomal positions of SNPs from NCBI Human Genome Browser (Genome Reference Consortium Human Build 37). With the HapMap 3 release 2 reference, 201 out of 415 SNPs were reported as missing in the Cardio-Metabochip by the SNAP output. However, the Cardio-Metabochip file revealed that 152 of these 201 SNPs were in fact present in the Cardio-Metabochip array (false negative rate of 36.6%). With the more recent 1000 Genomes Project release, we found a false-negative rate of 17.6% by comparing the outputs of SNAP and the Illumina product file. We did not find any ‘false positive’ SNPs (SNPs specified as available in the Cardio-Metabochip by SNAP, but not by the Cardio-Metabochip Illumina file). The Cohen’s Kappa coefficient, which calculates the percentage of agreement between both methods, indicated that the validity of SNAP was fair to moderate depending on the reference used (the HapMap 3 or 1000 Genomes). In conclusion, we demonstrate that the SNAP outputs for the Cardio-Metabochip are invalid. This study illustrates the importance of systematically assessing the validity of bioinformatics tools in an independent manner. We propose a series of guidelines to improve practices in the fast-moving field of bioinformatics software implementation. PMID:25742008

  14. The Complete Mitochondrial Genome of the Foodborne Parasitic Pathogen Cyclospora cayetanensis

    PubMed Central

    Cinar, Hediye Nese; Gopinath, Gopal; Jarvis, Karen; Murphy, Helen R.

    2015-01-01

    Cyclospora cayetanensis is a human-specific coccidian parasite responsible for several food and water-related outbreaks around the world, including the most recent ones involving over 900 persons in 2013 and 2014 outbreaks in the USA. Multicopy organellar DNA such as mitochondrion genomes have been particularly informative for detection and genetic traceback analysis in other parasites. We sequenced the C. cayetanensis genomic DNA obtained from stool samples from patients infected with Cyclospora in Nepal using the Illumina MiSeq platform. By bioinformatically filtering out the metagenomic reads of non-coccidian origin sequences and concentrating the reads by targeted alignment, we were able to obtain contigs containing Eimeria-like mitochondrial, apicoplastic and some chromosomal genomic fragments. A mitochondrial genomic sequence was assembled and confirmed by cloning and sequencing targeted PCR products amplified from Cyclospora DNA using primers based on our draft assembly sequence. The results show that the C. cayetanensis mitochondrion genome is 6274 bp in length, with 33% GC content, and likely exists in concatemeric arrays as in Eimeria mitochondrial genomes. Phylogenetic analysis of the C. cayetanensis mitochondrial genome places this organism in a tight cluster with Eimeria species. The mitochondrial genome of C. cayetanensis contains three protein coding genes, cytochrome (cytb), cytochrome C oxidase subunit 1 (cox1), and cytochrome C oxidase subunit 3 (cox3), in addition to 14 large subunit (LSU) and nine small subunit (SSU) fragmented rRNA genes. PMID:26042787
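
    The GC content reported here is simply the fraction of G and C bases in the assembled sequence. A minimal sketch of that calculation follows; the function name `gc_content` and the toy sequence are illustrative and are not the C. cayetanensis assembly.

```python
def gc_content(seq):
    """Return the fraction of bases in seq that are G or C (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    gc = sum(seq.count(base) for base in "GC")
    return gc / len(seq)

# Toy sequence, not real data.
toy = "AATTGCatatgc"
print(f"{len(toy)} bp, GC = {gc_content(toy):.1%}")
```

    For scale, a 6274 bp genome at 33% GC contains about 2,070 G or C bases.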

  15. A toolbox for developing bioinformatics software

    PubMed Central

    Potrzebowski, Wojciech; Puton, Tomasz; Rother, Magdalena; Wywial, Ewa; Bujnicki, Janusz M.

    2012-01-01

    Creating useful software is a major activity of many scientists, including bioinformaticians. Nevertheless, software development in an academic setting is often unsystematic, which can lead to problems associated with maintenance and long-term availability. Unfortunately, well-documented software development methodology is difficult to adopt, and technical measures that directly improve bioinformatic programming have not been described comprehensively. We have examined 22 software projects and have identified a set of practices for software development in an academic environment. We found them useful to plan a project, support the involvement of experts (e.g. experimentalists), and to promote higher quality and maintainability of the resulting programs. This article describes 12 techniques that facilitate a quick start into software engineering. We describe 3 of the 22 projects in detail and give many examples to illustrate the usage of particular techniques. We expect this toolbox to be useful for many bioinformatics programming projects and to the training of scientific programmers. PMID:21803787

  16. A toolbox for developing bioinformatics software.

    PubMed

    Rother, Kristian; Potrzebowski, Wojciech; Puton, Tomasz; Rother, Magdalena; Wywial, Ewa; Bujnicki, Janusz M

    2012-03-01

    Creating useful software is a major activity of many scientists, including bioinformaticians. Nevertheless, software development in an academic setting is often unsystematic, which can lead to problems associated with maintenance and long-term availability. Unfortunately, well-documented software development methodology is difficult to adopt, and technical measures that directly improve bioinformatic programming have not been described comprehensively. We have examined 22 software projects and have identified a set of practices for software development in an academic environment. We found them useful to plan a project, support the involvement of experts (e.g. experimentalists), and to promote higher quality and maintainability of the resulting programs. This article describes 12 techniques that facilitate a quick start into software engineering. We describe 3 of the 22 projects in detail and give many examples to illustrate the usage of particular techniques. We expect this toolbox to be useful for many bioinformatics programming projects and to the training of scientific programmers. PMID:21803787

  17. Bioinformatics in New Generation Flavivirus Vaccines

    PubMed Central

    Koraka, Penelope; Martina, Byron E. E.; Osterhaus, Albert D. M. E.

    2010-01-01

    Flavivirus infections are the most prevalent arthropod-borne infections world wide, often causing severe disease especially among children, the elderly, and the immunocompromised. In the absence of effective antiviral treatment, prevention through vaccination would greatly reduce morbidity and mortality associated with flavivirus infections. Despite the success of the empirically developed vaccines against yellow fever virus, Japanese encephalitis virus and tick-borne encephalitis virus, there is an increasing need for a more rational design and development of safe and effective vaccines. Several bioinformatic tools are available to support such rational vaccine design. In doing so, several parameters have to be taken into account, such as safety for the target population, overall immunogenicity of the candidate vaccine, and efficacy and longevity of the immune responses triggered. Examples of how bio-informatics is applied to assist in the rational design and improvements of vaccines, particularly flavivirus vaccines, are presented and discussed. PMID:20467477

  18. Discovery and Classification of Bioinformatics Web Services

    SciTech Connect

    Rocco, D; Critchlow, T

    2002-09-02

    The transition of the World Wide Web from a paradigm of static Web pages to one of dynamic Web services provides new and exciting opportunities for bioinformatics with respect to data dissemination, transformation, and integration. However, the rapid growth of bioinformatics services, coupled with non-standardized interfaces, diminishes the potential that these Web services offer. To face this challenge, we examine the notion of a Web service class that defines the functionality provided by a collection of interfaces. These descriptions are an integral part of a larger framework that can be used to discover, classify, and wrap Web services automatically. We discuss how this framework can be used in the context of the proliferation of sites offering BLAST sequence alignment services for specialized data sets.

  19. Bioinformatics approaches to single-cell analysis in developmental biology.

    PubMed

    Yalcin, Dicle; Hakguder, Zeynep M; Otu, Hasan H

    2016-03-01

    Individual cells within the same population show various degrees of heterogeneity, which may be better handled with single-cell analysis to address biological and clinical questions. Single-cell analysis is especially important in developmental biology as subtle spatial and temporal differences in cells have significant associations with cell fate decisions during differentiation and with the description of a particular state of a cell exhibiting an aberrant phenotype. Biotechnological advances, especially in the area of microfluidics, have led to a robust, massively parallel and multi-dimensional capturing, sorting, and lysis of single-cells and amplification of related macromolecules, which have enabled the use of imaging and omics techniques on single cells. There have been improvements in computational single-cell image analysis in developmental biology regarding feature extraction, segmentation, image enhancement and machine learning, handling limitations of optical resolution to gain new perspectives from the raw microscopy images. Omics approaches, such as transcriptomics, genomics and epigenomics, targeting gene and small RNA expression, single nucleotide and structural variations and methylation and histone modifications, rely heavily on high-throughput sequencing technologies. Although there are well-established bioinformatics methods for analysis of sequence data, there are limited bioinformatics approaches which address experimental design, sample size considerations, amplification bias, normalization, differential expression, coverage, clustering and classification issues, specifically applied at the single-cell level. In this review, we summarize biological and technological advancements, discuss challenges faced in the aforementioned data acquisition and analysis issues and present future prospects for application of single-cell analyses to developmental biology. PMID:26358759

  20. Hydroxysteroid dehydrogenases (HSDs) in bacteria: a bioinformatic perspective.

    PubMed

    Kisiela, Michael; Skarka, Adam; Ebert, Bettina; Maser, Edmund

    2012-03-01

    Steroidal compounds including cholesterol, bile acids and steroid hormones play a central role in various physiological processes such as cell signaling, growth, reproduction, and energy homeostasis. Hydroxysteroid dehydrogenases (HSDs), which belong to the superfamily of short-chain dehydrogenases/reductases (SDR) or aldo-keto reductases (AKR), are important enzymes involved in steroid hormone metabolism. HSDs function as an enzymatic switch that controls the access of receptor-active steroids to nuclear hormone receptors and thereby mediate a fine-tuning of the steroid response. The aim of this study was the identification of classified functional HSDs and the bioinformatic annotation of these proteins in all completely sequenced bacterial genomes, followed by a phylogenetic analysis. For the bioinformatic annotation we constructed specific hidden Markov models in an iterative approach to provide a reliable identification for the specific catalytic groups of HSDs. Here, we show a detailed phylogenetic analysis of 3α-, 7α-, 12α-HSDs and two further functionally related enzymes (3-ketosteroid-Δ(1)-dehydrogenase, 3-ketosteroid-Δ(4)(5α)-dehydrogenase) from the superfamily of SDRs. For some bacteria that have been previously reported to possess a specific HSD activity, we could annotate the corresponding HSD protein. The dominating phyla that were identified to express HSDs were Actinobacteria, Proteobacteria, and Firmicutes. Moreover, some evolutionarily more ancient microorganisms (e.g., Cyanobacteria and Euryarchaeota) were found as well. A large number of HSD-expressing bacteria constitute the normal human gastro-intestinal flora. Another group of bacteria were originally isolated from natural habitats like seawater, soil, marine and permafrost sediments. These bacteria include polycyclic aromatic hydrocarbon-degrading species such as Pseudomonas, Burkholderia and Rhodococcus. In conclusion, HSDs are found in a wide variety of microorganisms including

  1. Bioinformatic Analysis of HIV-1 Entry and Pathogenesis

    PubMed Central

    Aiamkitsumrit, Benjamas; Dampier, Will; Antell, Gregory; Rivera, Nina; Martin-Garcia, Julio; Pirrone, Vanessa; Nonnemacher, Michael R.; Wigdahl, Brian

    2015-01-01

    The evolution of human immunodeficiency virus type 1 (HIV-1) with respect to co-receptor utilization has been shown to be relevant to HIV-1 pathogenesis and disease. The CCR5-utilizing (R5) virus has been shown to be important in the very early stages of transmission and highly prevalent during asymptomatic infection and chronic disease. In addition, the R5 virus has been proposed to be involved in neuroinvasion and central nervous system (CNS) disease. In contrast, the CXCR4-utilizing (X4) virus is more prevalent during the course of disease progression and concurrent with the loss of CD4+ T cells. The dual-tropic virus is able to utilize both co-receptors (CXCR4 and CCR5) and has been thought to represent an intermediate transitional virus that possesses properties of both X4 and R5 viruses that can be encountered at many stages of disease. The use of computational tools and bioinformatic approaches in the prediction of HIV-1 co-receptor usage has been growing in importance with respect to understanding HIV-1 pathogenesis and disease, developing diagnostic tools, and improving the efficacy of therapeutic strategies focused on blocking viral entry. Current strategies have enhanced the sensitivity, specificity, and reproducibility relative to the prediction of co-receptor use; however, these technologies need to be improved with respect to their efficient and accurate use across the HIV-1 subtypes. The most effective approach may center on the combined use of different algorithms involving sequences within and outside of the env-V3 loop. This review focuses on the HIV-1 entry process and on co-receptor utilization, including bioinformatic tools utilized in the prediction of co-receptor usage. It also provides novel preliminary analyses for enabling identification of linkages between amino acids in V3 with other components of the HIV-1 genome and demonstrates that these linkages are different between X4 and R5 viruses. PMID:24862329
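
    As a concrete instance of sequence-based co-receptor prediction, the sketch below implements the classic "11/25 rule", a simple heuristic from the literature that calls a virus X4 when position 11 or 25 of the V3 loop carries a basic residue (arginine or lysine). It is shown only as an illustration; it is not one of the algorithms evaluated in this review, it assumes an ungapped 35-residue V3 sequence, and the function name and example sequence are assumptions made here.

```python
def predict_coreceptor_11_25(v3):
    """Apply the 11/25 rule to an ungapped 35-residue V3 amino-acid sequence.

    Returns "X4" if position 11 or 25 (1-based) is R or K, otherwise "R5".
    """
    v3 = v3.upper()
    if len(v3) != 35:
        raise ValueError("expected an ungapped 35-residue V3 loop sequence")
    basic = {"R", "K"}
    if v3[10] in basic or v3[24] in basic:
        return "X4"
    return "R5"

# Example 35-residue V3 sequence (assumed here for illustration).
v3_example = "CTRPNNNTRKSIHIGPGRAFYTTGEIIGDIRQAHC"
# Same sequence with a serine-to-arginine change at position 11.
v3_mutant = v3_example[:10] + "R" + v3_example[11:]

print(predict_coreceptor_11_25(v3_example))
print(predict_coreceptor_11_25(v3_mutant))
```

    A rule this simple looks only at two positions inside V3, which is consistent with the review's point that combining algorithms and using sequence information from outside the V3 loop may be more effective.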

  2. Comprehensive Decision Tree Models in Bioinformatics

    PubMed Central

    Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter

    2012-01-01

    Purpose: Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. Methods: This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by a so-called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. Results: The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. Conclusions: The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class

  3. VLSI Microsystem for Rapid Bioinformatic Pattern Recognition

    NASA Technical Reports Server (NTRS)

    Fang, Wai-Chi; Lue, Jaw-Chyng

    2009-01-01

    A system comprising very-large-scale integrated (VLSI) circuits is being developed as a means of bioinformatics-oriented analysis and recognition of patterns of fluorescence generated in a microarray in an advanced, highly miniaturized, portable genetic-expression-assay instrument. Such an instrument implements an on-chip combination of polymerase chain reactions and electrochemical transduction for amplification and detection of deoxyribonucleic acid (DNA).

  4. Adapting bioinformatics curricula for big data.

    PubMed

    Greene, Anna C; Giffin, Kristine A; Greene, Casey S; Moore, Jason H

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. PMID:25829469

  5. Chapter 16: text mining for translational bioinformatics.

    PubMed

    Cohen, K Bretonnel; Hunter, Lawrence E

    2013-04-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research (translating basic science results into new interventions) and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing. PMID:23633944

  6. Adapting bioinformatics curricula for big data

    PubMed Central

    Greene, Anna C.; Giffin, Kristine A.; Greene, Casey S.

    2016-01-01

    Modern technologies are capable of generating enormous amounts of data that measure complex biological systems. Computational biologists and bioinformatics scientists are increasingly being asked to use these data to reveal key systems-level properties. We review the extent to which curricula are changing in the era of big data. We identify key competencies that scientists dealing with big data are expected to possess across fields, and we use this information to propose courses to meet these growing needs. While bioinformatics programs have traditionally trained students in data-intensive science, we identify areas of particular biological, computational and statistical emphasis important for this era that can be incorporated into existing curricula. For each area, we propose a course structured around these topics, which can be adapted in whole or in parts into existing curricula. In summary, specific challenges associated with big data provide an important opportunity to update existing curricula, but we do not foresee a wholesale redesign of bioinformatics training programs. PMID:25829469

  7. Chapter 16: Text Mining for Translational Bioinformatics

    PubMed Central

    Cohen, K. Bretonnel; Hunter, Lawrence E.

    2013-01-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research—translating basic science results into new interventions—and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing. PMID:23633944
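
    The rule-based (knowledge-based) approach named in the abstract above can be illustrated with a minimal sketch. The lexicons, the function name extract_gene_drug_pairs and the co-occurrence rule are hypothetical choices made for this illustration; they are not taken from the chapter, and a real system would rely on curated vocabularies and far richer rules.

```python
import re

# Hypothetical toy lexicons, assumed for illustration only.
GENES = {"CYP2D6", "CYP2C19", "TPMT"}
DRUGS = {"codeine", "clopidogrel", "azathioprine"}


def extract_gene_drug_pairs(sentence):
    """Rule: report every (gene, drug) pair that co-occurs in one sentence."""
    tokens = re.findall(r"[A-Za-z0-9]+", sentence)
    genes = [t.upper() for t in tokens if t.upper() in GENES]
    drugs = [t.lower() for t in tokens if t.lower() in DRUGS]
    return [(gene, drug) for gene in genes for drug in drugs]


if __name__ == "__main__":
    print(extract_gene_drug_pairs("CYP2D6 variants alter codeine metabolism."))
```

    A rule this simple also reports pairs from sentences that deny any relation, which is one concrete form of the ambiguity the abstract mentions and one reason many systems combine rules with statistical models.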

  8. Bioinformatic pipelines in Python with Leaf

    PubMed Central

    2013-01-01

    Background An incremental, loosely planned development approach is often used in bioinformatic studies when dealing with custom data analysis in a rapidly changing environment. Unfortunately, the lack of a rigorous software structuring can undermine the maintainability, communicability and replicability of the process. To ameliorate this problem we propose the Leaf system, the aim of which is to seamlessly introduce the pipeline formality on top of a dynamical development process with minimum overhead for the programmer, thus providing a simple layer of software structuring. Results Leaf includes a formal language for the definition of pipelines with code that can be transparently inserted into the user’s Python code. Its syntax is designed to visually highlight dependencies in the pipeline structure it defines. While encouraging the developer to think in terms of bioinformatic pipelines, Leaf supports a number of automated features including data and session persistence, consistency checks between steps of the analysis, processing optimization and publication of the analytic protocol in the form of a hypertext. Conclusions Leaf offers a powerful balance between plan-driven and change-driven development environments in the design, management and communication of bioinformatic pipelines. Its unique features make it a valuable alternative to other related tools. PMID:23786315

  9. Bioinformatics on the Cloud Computing Platform Azure

    PubMed Central

    Shanahan, Hugh P.; Owen, Anne M.; Harrison, Andrew P.

    2014-01-01

    We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. We provide a walk through to demonstrate explicitly how Azure can be used to perform these analyses in Appendix S1 and we offer a comparison with a local computation. We note that the use of the Platform as a Service (PaaS) offering of Azure can represent a steep learning curve for bioinformatics developers who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development. PMID:25050811

  10. Bioinformatics on the cloud computing platform Azure.

    PubMed

    Shanahan, Hugh P; Owen, Anne M; Harrison, Andrew P

    2014-01-01

    We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. We provide a walk through to demonstrate explicitly how Azure can be used to perform these analyses in Appendix S1 and we offer a comparison with a local computation. We note that the use of the Platform as a Service (PaaS) offering of Azure can represent a steep learning curve for bioinformatics developers who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development. PMID:25050811

  11. Application of Bioinformatics in Chronobiology Research

    PubMed Central

    Lopes, Robson da Silva; Resende, Nathalia Maria; Honorio-França, Adenilda Cristina; França, Eduardo Luzía

    2013-01-01

    Bioinformatics and other well-established sciences, such as molecular biology, genetics, and biochemistry, provide a scientific approach for the analysis of data generated through “omics” projects that may be used in studies of chronobiology. The results of studies that apply these techniques demonstrate how they significantly aided the understanding of chronobiology. However, bioinformatics tools alone cannot eliminate the need for an understanding of the field of research or the data to be considered, nor can such tools replace analysts and researchers. It is often necessary to conduct an evaluation of the results of a data mining effort to determine the degree of reliability. To this end, familiarity with the field of investigation is necessary. It is evident that the knowledge that has been accumulated through chronobiology and the use of tools derived from bioinformatics has contributed to the recognition and understanding of the patterns and biological rhythms found in living organisms. The current work aims to develop new and important applications in the near future through chronobiology research. PMID:24187519

  12. Interoperability of GADU in using heterogeneous Grid resources for bioinformatics applications.

    SciTech Connect

    Sulakhe, D.; Rodriguez, A.; Wilde, M.; Foster, I.; Maltsev, N.; Univ. of Chicago

    2008-03-01

    Bioinformatics tools used for efficient and computationally intensive analysis of genetic sequences require large-scale computational resources to accommodate the growing data. Grid computational resources such as the Open Science Grid and TeraGrid have proved useful for scientific discovery. The genome analysis and database update system (GADU) is a high-throughput computational system developed to automate the steps involved in accessing the Grid resources for running bioinformatics applications. This paper describes the requirements for building an automated scalable system such as GADU that can run jobs on different Grids. The paper describes the resource-independent configuration of GADU using the Pegasus-based virtual data system that makes high-throughput computational tools interoperable on heterogeneous Grid resources. The paper also highlights the features implemented to make GADU a gateway to computationally intensive bioinformatics applications on the Grid. The paper will not go into the details of problems involved or the lessons learned in using individual Grid resources as it has already been published in our paper on genome analysis research environment (GNARE) and will focus primarily on the architecture that makes GADU resource independent and interoperable across heterogeneous Grid resources.

  13. Quantum Bio-Informatics II From Quantum Information to Bio-Informatics

    NASA Astrophysics Data System (ADS)

    Accardi, L.; Freudenberg, Wolfgang; Ohya, Masanori

    2009-02-01

    / H. Kamimura -- Massive collection of full-length complementary DNA clones and microarray analyses: keys to rice transcriptome analysis / S. Kikuchi -- Changes of influenza A(H5) viruses by means of entropic chaos degree / K. Sato and M. Ohya -- Basics of genome sequence analysis in bioinformatics - its fundamental ideas and problems / T. Suzuki and S. Miyazaki -- A basic introduction to gene expression studies using microarray expression data analysis / D. Wanke and J. Kilian -- Integrating biological perspectives: a quantum leap for microarray expression analysis / D. Wanke ... [et al.].

  14. [Ethical issues raised by direct-to-consumer personal genome analysis and whole body scans: discussion and contextualisation of a report by the Nuffield Council on Bioethics].

    PubMed

    Buyx, Alena M; Strech, Daniel; Schmidt, Harald

    2012-01-01

    The paradigm of personalised medicine has many different facets, further to the application of pharmacogenetics. Here we examine (direct-to-consumer) personal genome analysis and whole body scans and summarise findings from the Nuffield Council on Bioethics' recent report "Medical profiling and online medicine: the ethics of 'personalised healthcare' in a consumer age". We describe the current situation in Germany with regard to access to such services, and contextualise the Nuffield Council's report with summaries of position statements by German professional bodies. We conclude with three points that merit examination further to the analyses of the Nuffield Council's report and the German professional bodies. These concern the role of indirect evidence in considering restrictive policies, the question of whether regulations should require commercial providers to contribute to the generation of better evidence, and the option of using data from evaluations in combination with indirect evidence in justifying restrictive policies. PMID:22325105

  15. Genomic Profiling of Metastatic Gastroenteropancreatic Neuroendocrine Tumor (GEP-NET) Patients in the Personalized-Medicine Era

    PubMed Central

    Kim, Seung Tae; Lee, Su Jin; Park, Se Hoon; Park, Joon Oh; Lim, Ho Yeong; Kang, Won Ki; Lee, Jeeyun; Park, Young Suk

    2016-01-01

    Background: We have conducted molecular profiling through a high-throughput molecular test as part of our clinical practice for patients with advanced gastrointestinal (GI) cancer or rare cancers including gastroenteropancreatic neuroendocrine tumors (GEP-NETs). Herein, we report on the molecular characterization of 14 metastatic GEP-NET patients. Methods: We conducted the Ion AmpliSeq Cancer Hotspot Panel v2 (detecting 2,855 oncogenic mutations in 50 commonly mutated genes) and nCounter Copy Number Variation Assay, which was designed with 21 genes based on available targeted agents, as a high throughput genomic platform in 14 patients with metastatic GEP-NETs. Results: Among the 14 GEP-NET patients analyzed in this study, 8 patients had grade III neuroendocrine carcinoma (NEC) and 6 had grade I/II NET. Primary sites included pancreas (n=3), small intestine and ascending colon (n=3), distal colon and rectum (n=5), and unknown primary origin (n=3). The most common metastatic site was the liver. Of 14 GEP-NET patients available for mutational profiling, 7 (50.0%) patients had one or more aberrations detected. Common aberrations were as follows: SMARCB1 mutation (n=2), TP53 mutation (n=2), STK11 mutation (n=1), RET mutation (n=1), and BRAF mutation (n=1). Gene amplification by nCounter was detected in only 1 patient, showing CCNE1 amplification, and this patient also had a TP53 mutation. Conclusions: This high throughput genomic test may be useful to identify new drug targets in metastatic GEP-NET patients. Currently, we plan to conduct further genomic analysis to develop predictive and prognostic biomarkers in a larger number of GEP-NET patients. PMID:27326246

  16. CFGP: a web-based, comparative fungal genomics platform.

    PubMed

    Park, Jongsun; Park, Bongsoo; Jung, Kyongyong; Jang, Suwang; Yu, Kwangyul; Choi, Jaeyoung; Kong, Sunghyung; Park, Jaejin; Kim, Seryun; Kim, Hyojeong; Kim, Soonok; Kim, Jihyun F; Blair, Jaime E; Lee, Kwangwon; Kang, Seogchan; Lee, Yong-Hwan

    2008-01-01

    Since the completion of the Saccharomyces cerevisiae genome sequencing project in 1996, the genomes of over 80 fungal species have been sequenced or are currently being sequenced. Resulting data provide opportunities for studying and comparing fungal biology and evolution at the genome level. To support such studies, the Comparative Fungal Genomics Platform (CFGP; http://cfgp.snu.ac.kr), a web-based multifunctional informatics workbench, was developed. The CFGP comprises three layers, including the basal layer, middleware and the user interface. The data warehouse in the basal layer contains standardized genome sequences of 65 fungal species. The middleware processes queries via six analysis tools, including BLAST, ClustalW, InterProScan, SignalP 3.0, PSORT II and a newly developed tool named BLASTMatrix. The BLASTMatrix permits the identification and visualization of genes homologous to a query across multiple species. The Data-driven User Interface (DUI) of the CFGP was built on a new concept of pre-collecting data and post-executing analysis instead of the 'fill-in-the-form-and-press-SUBMIT' user interfaces utilized by most bioinformatics sites. A tool termed Favorite, which supports the management of encapsulated sequence data and provides a personalized data repository to users, is another novel feature in the DUI. PMID:17947331

  17. Characterization of microRNA expression profiles in blood and saliva using the Ion Personal Genome Machine(®) System (Ion PGM™ System).

    PubMed

    Wang, Zheng; Zhou, Di; Cao, Yandong; Hu, Zhen; Zhang, Suhua; Bian, Yingnan; Hou, Yiping; Li, Chengtao

    2016-01-01

    MicroRNA (miRNA) expression profiling is gaining interest in the forensic community because the intrinsically short fragments and tissue-specific expression patterns make miRNAs useful biomarkers for body fluid identification. Measuring the quantity of miRNAs in forensically relevant body fluids is an important step in screening specific miRNAs for body fluid identification. The recent introduction of massively parallel sequencing (MPS) has the potential for screening miRNA biomarkers at the genome-wide level, which allows the detection of both expression patterns and miRNA sequences. In this study, we employed the Ion Personal Genome Machine(®) System (Ion PGM™ System, Thermo Fisher) to characterize the distribution and expression of 2588 human mature miRNAs (miRBase v21) in 5 blood samples and 5 saliva samples. An average of 1,885,000 and 1,356,000 sequence reads were generated in blood and saliva respectively. Based on miRDong, a Perl-based tool developed for semi-automated miRNA distribution designations, and on manual verification, 6 and 19 miRNAs were identified as potentially blood-specific and saliva-specific biomarkers, respectively. This study describes a complete and reliable miRNA workflow solution based on the Ion PGM™ System, starting from efficient RNA extraction, followed by small RNA library construction and sequencing. With this workflow solution and miRDong analysis it will be possible to measure miRNA expression patterns at the genome-wide level in other forensically relevant body fluids. PMID:26600000

  18. Advances in Omics and Bioinformatics Tools for Systems Analyses of Plant Functions

    PubMed Central

    Mochida, Keiichi; Shinozaki, Kazuo

    2011-01-01

    Omics and bioinformatics are essential to understanding the molecular systems that underlie various plant functions. Recent game-changing sequencing technologies have revitalized sequencing approaches in genomics and have produced opportunities for various emerging analytical applications. Driven by technological advances, several new omics layers such as the interactome, epigenome and hormonome have emerged. Furthermore, in several plant species, the development of omics resources has progressed to address particular biological properties of individual species. Integration of knowledge from omics-based research is an emerging issue as researchers seek to identify significance, gain biological insights and promote translational research. From these perspectives, we provide this review of the emerging aspects of plant systems research based on omics and bioinformatics analyses together with their associated resources and technological advances. PMID:22156726

  19. Role of remote sensing, geographical information system (GIS) and bioinformatics in kala-azar epidemiology.

    PubMed

    Bhunia, Gouri Sankar; Dikhit, Manas Ranjan; Kesari, Shreekant; Sahoo, Ganesh Chandra; Das, Pradeep

    2011-11-01

    Visceral leishmaniasis or kala-azar is a potent parasitic infection causing death of thousands of people each year. Medicinal compounds currently available for the treatment of kala-azar have serious side effects and decreased efficacy owing to the emergence of resistant strains. The type of immune reaction is also to be considered in patients infected with Leishmania donovani (L. donovani). For complete eradication of this disease, high-level modern research is currently being applied at both the molecular level and the field level. Computational approaches like remote sensing, geographical information system (GIS) and bioinformatics are the key resources for the detection and distribution of vectors, patterns, ecological and environmental factors and genomic and proteomic analysis. Novel approaches like GIS and bioinformatics have been more appropriately utilized in determining the cause of visceral leishmaniasis and in designing strategies for preventing the disease from spreading from one region to another. PMID:23554714

  20. Role of remote sensing, geographical information system (GIS) and bioinformatics in kala-azar epidemiology

    PubMed Central

    Bhunia, Gouri Sankar; Dikhit, Manas Ranjan; Kesari, Shreekant; Sahoo, Ganesh Chandra; Das, Pradeep

    2011-01-01

    Visceral leishmaniasis or kala-azar is a potent parasitic infection causing death of thousands of people each year. Medicinal compounds currently available for the treatment of kala-azar have serious side effects and decreased efficacy owing to the emergence of resistant strains. The type of immune reaction is also to be considered in patients infected with Leishmania donovani (L. donovani). For complete eradication of this disease, high-level modern research is currently being applied at both the molecular level and the field level. Computational approaches like remote sensing, geographical information system (GIS) and bioinformatics are the key resources for the detection and distribution of vectors, patterns, ecological and environmental factors and genomic and proteomic analysis. Novel approaches like GIS and bioinformatics have been more appropriately utilized in determining the cause of visceral leishmaniasis and in designing strategies for preventing the disease from spreading from one region to another. PMID:23554714

  1. Bioinformatics of Cancer ncRNA in High Throughput Sequencing: Present State and Challenges

    PubMed Central

    Jorge, Natasha Andressa Nogueira; Ferreira, Carlos Gil; Passetti, Fabio

    2012-01-01

    Numerous genome sequencing projects have produced an unprecedented amount of data, providing significant information for the discovery of novel non-coding RNAs (ncRNAs). Several ncRNAs have been described that control gene expression and play an important role in cell differentiation and homeostasis. In the last decade, high throughput methods in conjunction with approaches in bioinformatics have been used to identify, classify, and evaluate the expression of hundreds of ncRNAs in normal and pathological states, such as cancer. Patient outcomes have already been associated with differential expression of ncRNAs in normal and tumoral tissues, providing new insights for the development of innovative therapeutic strategies in oncology. In this review, we present and discuss bioinformatics advances in the development of computational approaches to analyze and discover ncRNA data in oncology using high throughput sequencing technologies. PMID:23251139

  2. cl-dash: rapid configuration and deployment of Hadoop clusters for bioinformatics research in the cloud

    PubMed Central

    Hodor, Paul; Chawla, Amandeep; Clark, Andrew; Neal, Lauren

    2016-01-01

    Summary: One of the solutions proposed for addressing the challenge of the overwhelming abundance of genomic sequence and other biological data is the use of the Hadoop computing framework. Appropriate tools are needed to set up computational environments that facilitate research of novel bioinformatics methodology using Hadoop. Here, we present cl-dash, a complete starter kit for setting up such an environment. Configuring and deploying new Hadoop clusters can be done in minutes. Use of Amazon Web Services ensures no initial investment and minimal operation costs. Two sample bioinformatics applications help the researcher understand and learn the principles of implementing an algorithm using the MapReduce programming pattern. Availability and implementation: Source code is available at https://bitbucket.org/booz-allen-sci-comp-team/cl-dash.git. Contact: hodor_paul@bah.com PMID:26428290
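
    The MapReduce programming pattern mentioned in the abstract above can be sketched in a few lines. This is a single-process illustration under stated assumptions, not code from cl-dash and not Hadoop API usage: the k-mer counting task and the names map_kmers, shuffle, reduce_counts and count_kmers are hypothetical choices made for this example.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read, k):
    """Map step: emit a (k-mer, 1) pair for every k-length window of one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as a framework would do
    between the map and reduce phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce step: sum the values collected for one key."""
    return key, sum(values)


def count_kmers(reads, k):
    """Run map, shuffle and reduce in sequence on an in-memory list of reads."""
    mapped = chain.from_iterable(map_kmers(read, k) for read in reads)
    return dict(reduce_counts(key, values) for key, values in shuffle(mapped).items())


if __name__ == "__main__":
    print(count_kmers(["ACGTAC", "GTACGT"], k=3))
```

    Because each map call sees only one read and each reduce call sees only one key, both steps could in principle be distributed across cluster nodes; the sketch runs them in one process purely to show the data flow.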

  3. Managing Large-Scale Genomic Datasets and Translation into Clinical Practice

    PubMed Central

    2014-01-01

    Summary Objective To summarize excellent current research in the field of Bioinformatics and Translational Informatics with application in the health domain. Method We provide a synopsis of the articles selected for the IMIA Yearbook 2014, from which we attempt to derive a synthetic overview of current and future activities in the field. A first step of selection was performed by querying MEDLINE with a list of MeSH descriptors completed by a list of terms adapted to the section. Each section editor evaluated independently the set of 1,851 articles and 15 articles were retained for peer-review. Results The selection and evaluation process of this Yearbook’s section on Bioinformatics and Translational Informatics yielded three excellent articles regarding data management and genome medicine. In the first article, the authors present VEST (Variant Effect Scoring Tool) which is a supervised machine learning tool for prioritizing variants found in exome sequencing projects that are more likely involved in human Mendelian diseases. In the second article, the authors show how to infer surnames of male individuals by crossing anonymous publicly available genomic data from the Y chromosome and public genealogy data banks. The third article presents a statistical framework called iCluster+ that can perform pattern discovery in integrated cancer genomic data. This framework was able to determine different tumor subtypes in colon cancer. Conclusions The current research activities still attest the continuous convergence of Bioinformatics and Medical Informatics, with a focus this year on large-scale biological, genomic, and Electronic Health Records data. Indeed, there is a need for powerful tools for managing and interpreting complex data, but also a need for user-friendly tools developed for the clinicians in their daily practice. All the recent research and development efforts are contributing to the challenge of impacting clinically the results and even going towards a

  4. CucCAP - Developing genomic resources for the cucurbit community

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The U.S. cucurbit community has initiated a USDA-SCRI funded cucurbit genomics project, CucCAP: Leveraging applied genomics to increase disease resistance in cucurbit crops. Our primary objectives are: develop genomic and bioinformatic breeding tool kits for accelerated crop improvement across the...

  5. Microbial bioinformatics for food safety and production

    PubMed Central

    Alkema, Wynand; Boekhorst, Jos; Wels, Michiel

    2016-01-01

    In the production of fermented foods, microbes play an important role. Optimization of fermentation processes or starter culture production traditionally was a trial-and-error approach inspired by expert knowledge of the fermentation process. Current developments in high-throughput ‘omics’ technologies allow developing more rational approaches to improve fermentation processes both from the food functionality as well as from the food safety perspective. Here, the authors thematically review typical bioinformatics techniques and approaches to improve various aspects of the microbial production of fermented food products and food safety. PMID:26082168

  6. Critical Issues in Bioinformatics and Computing

    PubMed Central

    Kesh, Someswa; Raghupathi, Wullianallur

    2004-01-01

    This article provides an overview of the field of bioinformatics and its implications for the various participants. Next-generation issues facing developers (programmers), users (molecular biologists), and the general public (patients) who would benefit from the potential applications are identified. The goal is to create awareness and debate on the opportunities (such as career paths) and the challenges such as privacy that arise. A triad model of the participants' roles and responsibilities is presented along with the identification of the challenges and possible solutions. PMID:18066389

  7. Translational Bioinformatics: Past, Present, and Future

    PubMed Central

    Tenenbaum, Jessica D.

    2016-01-01

    Though a relatively young discipline, translational bioinformatics (TBI) has become a key component of biomedical research in the era of precision medicine. Development of high-throughput technologies and electronic health records has caused a paradigm shift in both healthcare and biomedical research. Novel tools and methods are required to convert increasingly voluminous datasets into information and actionable knowledge. This review provides a definition and contextualization of the term TBI, describes the discipline’s brief history and past accomplishments, as well as current foci, and concludes with predictions of future directions in the field. PMID:26876718

  8. Mobyle: a new full web bioinformatics framework

    PubMed Central

    Néron, Bertrand; Ménager, Hervé; Maufrais, Corinne; Joly, Nicolas; Maupetit, Julien; Letort, Sébastien; Carrere, Sébastien; Tuffery, Pierre; Letondal, Catherine

    2009-01-01

    Motivation: For the biologist, running bioinformatics analyses involves time-consuming management of data and tools. Users need support to organize their work, retrieve parameters and reproduce their analyses. They also need to be able to combine their analytic tools using a safe data flow software mechanism. Finally, given that scientific tools can be difficult to install, it is particularly helpful for biologists to be able to use these tools through a web user interface. However, providing a web interface for a set of tools raises the problem that a single web portal cannot offer all the existing and possible services: it is the user, again, who has to cope with copying data among a number of different services. A framework enabling portal administrators to build a network of cooperating services would therefore clearly be beneficial. Results: We have designed a system, Mobyle, to provide a flexible and usable Web environment for defining and running bioinformatics analyses. It embeds simple yet powerful data management features that allow the user to reproduce analyses and to combine tools using a hierarchical typing system. Mobyle offers invocation of services distributed over remote Mobyle servers, thus enabling a federated network of curated bioinformatics portals without the user having to learn complex concepts or to install sophisticated software. While being focused on the end user, the Mobyle system also addresses the bioinformatician's need to automate the execution of remote services: PlayMOBY is a companion tool that automates the publication of BioMOBY web services, using Mobyle program definitions. Availability: The Mobyle system is distributed under the terms of the GNU GPLv2 on the project web site (http://bioweb2.pasteur.fr/projects/mobyle/). It is already deployed on three servers: http://mobyle.pasteur.fr, http://mobyle.rpbs.univ-paris-diderot.fr and http://lipm-bioinfo.toulouse.inra.fr/Mobyle. The PlayMOBY companion is distributed under the

  9. Bioinformatics in proteomics: application, terminology, and pitfalls.

    PubMed

    Wiemer, Jan C; Prokudin, Alexander

    2004-01-01

    Bioinformatics applies data mining, i.e., modern computer-based statistics, to biomedical data. It leverages machine learning approaches, such as artificial neural networks, decision trees and clustering algorithms, and is ideally suited for handling huge amounts of data. In this article, we review the analysis of mass spectrometry data in proteomics, starting with common pre-processing steps and using single decision trees and decision tree ensembles for classification. Special emphasis is put on the pitfall of overfitting, i.e., of generating overly complex single decision trees. Finally, we discuss the pros and cons of the two different decision tree usages. PMID:15237926

  10. Robust Bioinformatics Recognition with VLSI Biochip Microsystem

    NASA Technical Reports Server (NTRS)

    Lue, Jaw-Chyng L.; Fang, Wai-Chi

    2006-01-01

    A microsystem architecture for real-time, on-site, robust bioinformatic pattern recognition and analysis has been proposed. This system is compatible with on-chip DNA analysis means such as polymerase chain reaction (PCR) amplification. A corresponding novel artificial neural network (ANN) learning algorithm, based on the error backpropagation (EBP) algorithm and using a new sigmoid-logarithmic transfer function, has been devised. Our results show that the new ANN, once trained, can recognize low fluorescence patterns better than the conventional sigmoidal ANN. A differential logarithmic imaging chip is designed for calculating the logarithm of relative intensities of fluorescence signals. The single-rail logarithmic circuit and a prototype ANN chip are designed, fabricated and characterized.

  11. Genomic Landscapes of Pancreatic Neoplasia

    PubMed Central

    Wood, Laura D.; Hruban, Ralph H.

    2015-01-01

    Pancreatic cancer is a deadly disease with a dismal prognosis. However, recent advances in sequencing and bioinformatic technology have led to the systematic characterization of the genomes of all major tumor types in the pancreas. This characterization has revealed the unique genomic landscape of each tumor type. This knowledge will pave the way for improved diagnostic and therapeutic approaches to pancreatic tumors that take advantage of the genetic alterations in these neoplasms. PMID:25812653

  12. Minimum taxonomic criteria for bacterial genome sequence depositions and announcements.

    PubMed

    Bull, Matthew J; Marchesi, Julian R; Vandamme, Peter; Plummer, Sue; Mahenthiralingam, Eshwar

    2012-04-01

    Multiple bioinformatic methods are available to analyse the information encoded within the complete genome sequence of a bacterium and accurately assign its species status or nearest phylogenetic neighbour. However, it is clear that even now in what is the third decade of bacterial genomics, taxonomically incorrect genome sequence depositions are still being made. We outline a simple scheme of bioinformatic analysis and a set of minimum criteria that should be applied to all bacterial genomic data to ensure that they are accurately assigned to the species or genus level prior to database deposition. To illustrate the utility of the bioinformatic workflow, we analysed the recently deposited genome sequence of Lactobacillus acidophilus 30SC and demonstrated that this DNA was in fact derived from a strain of Lactobacillus amylovorus. Using these methods researchers can ensure that the taxonomic accuracy of genome sequence depositions is maintained within the ever increasing nucleic acid datasets. PMID:22366464

  13. OpenHelix: bioinformatics education outside of a different box.

    PubMed

    Williams, Jennifer M; Mangan, Mary E; Perreault-Micale, Cynthia; Lathe, Scott; Sirohi, Neeraj; Lathe, Warren C

    2010-11-01

    The amount of biological data is increasing rapidly, and will continue to increase as new rapid technologies are developed. Professionals in every area of bioscience will have data management needs that require publicly available bioinformatics resources. Not all scientists desire a formal bioinformatics education but would benefit from more informal educational sources of learning. Effective bioinformatics education formats will address a broad range of scientific needs, will be aimed at a variety of user skill levels, and will be delivered in a number of different formats to address different learning styles. Informal sources of bioinformatics education that are effective are available, and will be explored in this review. PMID:20798181

  14. Translational Bioinformatics: Linking the Molecular World to the Clinical World

    PubMed Central

    Altman, RB

    2014-01-01

    Translational bioinformatics represents the union of translational medicine and bioinformatics. Translational medicine moves basic biological discoveries from the research bench into the patient-care setting and uses clinical observations to inform basic biology. It focuses on patient care, including the creation of new diagnostics, prognostics, prevention strategies, and therapies based on biological discoveries. Bioinformatics involves algorithms to represent, store, and analyze basic biological data, including DNA sequence, RNA expression, and protein and small-molecule abundance within cells. Translational bioinformatics spans these two fields; it involves the development of algorithms to analyze basic molecular and cellular data with an explicit goal of affecting clinical care. PMID:22549287

  15. OpenHelix: bioinformatics education outside of a different box

    PubMed Central

    Mangan, Mary E.; Perreault-Micale, Cynthia; Lathe, Scott; Sirohi, Neeraj; Lathe, Warren C.

    2010-01-01

    The amount of biological data is increasing rapidly, and will continue to increase as new rapid technologies are developed. Professionals in every area of bioscience will have data management needs that require publicly available bioinformatics resources. Not all scientists desire a formal bioinformatics education but would benefit from more informal educational sources of learning. Effective bioinformatics education formats will address a broad range of scientific needs, will be aimed at a variety of user skill levels, and will be delivered in a number of different formats to address different learning styles. Informal sources of bioinformatics education that are effective are available, and will be explored in this review. PMID:20798181

  16. Contribution of Bioinformatics prediction in microRNA-based cancer therapeutics

    PubMed Central

    Banwait, Jasjit K; Bastola, Dhundy R

    2014-01-01

    Despite enormous efforts, cancer remains one of the most lethal diseases in the world. With the advancement of high throughput technologies massive amounts of cancer data can be accessed and analyzed. Bioinformatics provides a platform to assist biologists in developing minimally invasive biomarkers to detect cancer, and in designing effective personalized therapies to treat cancer patients. Still, the early diagnosis, prognosis, and treatment of cancer are an open challenge for the research community. MicroRNAs (miRNAs) are small non-coding RNAs that serve to regulate gene expression. The discovery of deregulated miRNAs in cancer cells and tissues has led many to investigate the use of miRNAs as potential biomarkers for early detection, and as a therapeutic agent to treat cancer. Here we describe advancements in computational approaches to predict miRNAs and their targets, and discuss the role of bioinformatics in studying miRNAs in the context of human cancer. PMID:25450261

  17. Tools and collaborative environments for bioinformatics research

    PubMed Central

    Giugno, Rosalba; Pulvirenti, Alfredo

    2011-01-01

    Advanced research requires intensive interaction among a multitude of actors, often possessing different expertise and usually working at a distance from each other. The field of collaborative research aims to establish suitable models and technologies to properly support these interactions. In this article, we first present the reasons for an interest of Bioinformatics in this context by also suggesting some research domains that could benefit from collaborative research. We then review the principles and some of the most relevant applications of social networking, with a special attention to networks supporting scientific collaboration, by also highlighting some critical issues, such as identification of users and standardization of formats. We then introduce some systems for collaborative document creation, including wiki systems and tools for ontology development, and review some of the most interesting biological wikis. We also review the principles of Collaborative Development Environments for software and show some examples in Bioinformatics. Finally, we present the principles and some examples of Learning Management Systems. In conclusion, we try to devise some of the goals to be achieved in the short term for the exploitation of these technologies. PMID:21984743

  18. Bioinformatic Insights from Metagenomics through Visualization

    SciTech Connect

    Havre, Susan L.; Webb-Robertson, Bobbie-Jo M.; Shah, Anuj; Posse, Christian; Gopalan, Banu; Brockman, Fred J.

    2005-08-10

    Cutting-edge biological and bioinformatics research seeks a systems perspective through the analysis of multiple types of high-throughput and other experimental data for the same sample. Systems-level analysis requires the integration and fusion of such data, typically through advanced statistics and mathematics. Visualization is a complementary computational approach that supports integration and analysis of complex data or its derivatives. We present a bioinformatics visualization prototype, Juxter, which depicts categorical information derived from or assigned to these diverse data for the purpose of comparing patterns across categorizations. The visualization allows users to easily discern correlated and anomalous patterns in the data. These patterns, which might not be detected automatically by algorithms, may reveal valuable information leading to insight and discovery. We describe the visualization and interaction capabilities and demonstrate its utility in a new field, metagenomics, which combines molecular biology and genetics to identify and characterize genetic material from multi-species microbial samples.

  19. Receptor-binding sites: bioinformatic approaches.

    PubMed

    Flower, Darren R

    2006-01-01

    It is increasingly clear that both transient and long-lasting interactions between biomacromolecules and their molecular partners are the most fundamental of all biological mechanisms and lie at the conceptual heart of protein function. In particular, the protein-binding site is the most fascinating and important mechanistic arbiter of protein function. In this review, I examine the nature of protein-binding sites found in both ligand-binding receptors and substrate-binding enzymes. I highlight two important concepts underlying the identification and analysis of binding sites. The first is based on knowledge: when one knows the location of a binding site in one protein, one can "inherit" the site from one protein to another. The second approach involves the a priori prediction of a binding site from a sequence or a structure. The full and complete analysis of binding sites will necessarily involve the full range of informatic techniques ranging from sequence-based bioinformatic analysis through structural bioinformatics to computational chemistry and molecular physics. Integration of both diverse experimental and diverse theoretical approaches is thus a mandatory requirement in the evaluation of binding sites and the binding events that occur within them. PMID:16671408

  20. Bioinformatic Prediction of WSSV-Host Protein-Protein Interaction

    PubMed Central

    Sun, Zheng; Xiang, Jianhai

    2014-01-01

    WSSV is one of the most dangerous pathogens in shrimp aquaculture. However, the molecular mechanism of how WSSV interacts with shrimp is still not very clear. In the present study, bioinformatic approaches were used to predict interactions between proteins from WSSV and shrimp. The genome data of WSSV (NC_003225.1) and the constructed transcriptome data of F. chinensis were used to screen potentially interacting proteins by searching in protein interaction databases, including STRING, Reactome, and DIP. Forty-four pairs of proteins were suggested to have interactions between WSSV and the shrimp. Gene ontology analysis revealed that 6 pairs of these interacting proteins were classified into “extracellular region” or “receptor complex” GO-terms. KEGG pathway analysis showed that they were involved in the “ECM-receptor interaction pathway.” In the 6 pairs of interacting proteins, an envelope protein called “collagen-like protein” (WSSV-CLP) encoded by an early virus gene “wsv001” in WSSV interacted with 6 deduced proteins from the shrimp, including three integrin alpha (ITGA), two integrin beta (ITGB), and one syndecan (SDC). Sequence analysis on WSSV-CLP, ITGA, ITGB, and SDC revealed that they possessed the sequence features for protein-protein interactions. This study might provide new insights into the interaction mechanisms between WSSV and shrimp. PMID:24982879

  1. Using Bioinformatic Approaches to Identify Pathways Targeted by Human Leukemogens

    PubMed Central

    Thomas, Reuben; Phuong, Jimmy; McHale, Cliona M.; Zhang, Luoping

    2012-01-01

    We have applied bioinformatic approaches to identify pathways common to chemical leukemogens and to determine whether leukemogens could be distinguished from non-leukemogenic carcinogens. From all known and probable carcinogens classified by IARC and NTP, we identified 35 carcinogens that were associated with leukemia risk in human studies and 16 non-leukemogenic carcinogens. Using data on gene/protein targets available in the Comparative Toxicogenomics Database (CTD) for 29 of the leukemogens and 11 of the non-leukemogenic carcinogens, we analyzed for enrichment of all 250 human biochemical pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The top pathways targeted by the leukemogens included metabolism of xenobiotics by cytochrome P450, glutathione metabolism, neurotrophin signaling pathway, apoptosis, MAPK signaling, Toll-like receptor signaling and various cancer pathways. The 29 leukemogens formed 18 distinct clusters comprising 1 to 3 chemicals that did not correlate with known mechanism of action or with structural similarity as determined by 2D Tanimoto coefficients in the PubChem database. Unsupervised clustering and one-class support vector machines, based on the pathway data, were unable to distinguish the 29 leukemogens from 11 non-leukemogenic known and probable IARC carcinogens. However, using two-class random forests to estimate leukemogen and non-leukemogen patterns, we estimated a 76% chance of distinguishing a random leukemogen/non-leukemogen pair from each other. PMID:22851955
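    The closing figure, a 76% chance of telling a random leukemogen/non-leukemogen pair apart, reads like the pairwise interpretation of the area under the ROC curve (AUC): the fraction of positive/negative pairs in which the classifier scores the positive higher. The abstract does not name the metric, so this reading is an assumption. The sketch below shows the calculation only; the function name and the scores are hypothetical and are not data from the study.

```python
from itertools import product


def pairwise_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking, the usual convention for AUC.
    """
    wins = 0.0
    for p, n in product(pos_scores, neg_scores):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier scores, invented for illustration only.
leukemogen_scores = [0.9, 0.8, 0.6, 0.4]
non_leukemogen_scores = [0.7, 0.5, 0.3]

auc = pairwise_auc(leukemogen_scores, non_leukemogen_scores)
print(f"pairwise AUC = {auc:.2f}")
```

    With these invented scores, 9 of the 12 possible pairs are ordered correctly, so the value is 0.75.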

  2. Bioinformatics of varicella-zoster virus: Single nucleotide polymorphisms define clades and attenuated vaccine genotypes

    PubMed Central

    Chow, Vincent T.; Tipples, Graham A.; Grose, Charles

    2012-01-01

    Varicella zoster virus (VZV) is one of the human herpesviruses. To date, over 40 complete VZV genomes have been sequenced and analyzed. The VZV genome contains around 125,000 base pairs including 70 open reading frames (ORFs). Enumeration of single nucleotide polymorphisms (SNPs) has determined that the following ORFs are the most variable (in descending order): 62, 22, 29, 28, 37, 21, 54, 31, 1 and 55. ORF 62 is the major immediate early regulatory VZV gene. Further SNP analysis across the entire genome has led to the observation that VZV strains can be broadly grouped into clades within a phylogenetic tree. VZV strains collected in Singapore provided important sequence data for construction of the phylogenetic tree. Currently 5 VZV clades are recognized; they have been designated clades 1 through 5. Clades 1 and 3 include European/North American strains; clade 2 includes Asian strains, especially from Japan; and clade 5 includes strains from India. Clade 4 includes some strains from Europe, but its geographic origins need further documentation. Within clade 1, five variant viruses have been isolated with a missense mutation in the gE (ORF 68) glycoprotein; these strains have an altered increased cell spread phenotype. Bioinformatics analyses of the attenuated vaccine strains have also been performed, with a subsequent discovery of a stop-codon SNP in ORF0 as a likely attenuation determinant. Taken together, these VZV bioinformatics analyses have provided enormous insights into VZV phylogenetics as well as VZV SNPs associated with attenuation. PMID:23183312

  3. Introduction to Personalized Medicine in Diabetes Mellitus

    PubMed Central

    Glauber, Harry S.; Rishe, Naphtali; Karnieli, Eddy

    2014-01-01

    The world is facing an epidemic rise in diabetes mellitus (DM) incidence, which is challenging health funders, health systems, clinicians, and patients to understand and respond to a flood of research and knowledge. Evidence-based guidelines provide uniform management recommendations for “average” patients that rarely take into account individual variation in susceptibility to DM, to its complications, and responses to pharmacological and lifestyle interventions. Personalized medicine combines bioinformatics with genomic, proteomic, metabolomic, pharmacogenomic (“omics”) and other new technologies to explore pathophysiology and to characterize more precisely an individual’s risk for disease, as well as response to interventions. In this review we will introduce readers to personalized medicine as applied to DM, in particular the use of clinical, genetic, metabolic, and other markers of risk for DM and its chronic microvascular and macrovascular complications, as well as insights into variations in response to and tolerance of commonly used medications, dietary changes, and exercise. These advances in “omic” information and techniques also provide clues to potential pathophysiological mechanisms underlying DM and its complications. PMID:24498509

  4. ChIPseq in Yeast Species: From Chromatin Immunoprecipitation to High-Throughput Sequencing and Bioinformatics Data Analyses.

    PubMed

    Lelandais, Gaëlle; Blugeon, Corinne; Merhej, Jawad

    2016-01-01

    Chromatin immunoprecipitation (ChIP) followed by high-throughput sequencing (ChIPseq) is a powerful technique for the genome-wide location of protein DNA-binding sites. The ChIP experiment consists of treating living cells with a cross-linking agent to bind proteins to their DNA substrates. After fragmentation of DNA, specific fractions associated with a particular protein of interest are purified by immunoaffinity. They are next sequenced and identified on the reference genome using dedicated bioinformatics programs. Several technical aspects are important to obtain high-quality ChIPseq results. These include the quality of antibodies, the sequencing protocols, the use of accurate controls and the careful choice of bioinformatics tools. We present here a general protocol to perform ChIPseq analyses in yeast species. This protocol has been optimized to identify target genes of specific transcription factors but can be used for any other DNA binding proteins. PMID:26483023

  5. Structural biology and bioinformatics in drug design: opportunities and challenges for target identification and lead discovery

    PubMed Central

    Blundell, Tom L; Sibanda, Bancinyane L; Montalvão, Rinaldo Wander; Brewerton, Suzanne; Chelliah, Vijayalakshmi; Worth, Catherine L; Harmer, Nicholas J; Davies, Owen; Burke, David

    2006-01-01

    Impressive progress in genome sequencing, protein expression and high-throughput crystallography and NMR has radically transformed the opportunities to use protein three-dimensional structures to accelerate drug discovery, but the quantity and complexity of the data have ensured a central place for informatics. Structural biology and bioinformatics have assisted in lead optimization and target identification where they have well established roles; they can now contribute to lead discovery, exploiting high-throughput methods of structure determination that provide powerful approaches to screening of fragment binding. PMID:16524830

  6. Toward genome-enabled mycology.

    PubMed

    Hibbett, David S; Stajich, Jason E; Spatafora, Joseph W

    2013-01-01

    Genome-enabled mycology is a rapidly expanding field that is characterized by the pervasive use of genome-scale data and associated computational tools in all aspects of fungal biology. Genome-enabled mycology is integrative and often requires teams of researchers with diverse skills in organismal mycology, bioinformatics and molecular biology. This issue of Mycologia presents the first complete fungal genomes in the history of the journal, reflecting the ongoing transformation of mycology into a genome-enabled science. Here, we consider the prospects for genome-enabled mycology and the technical and social challenges that will need to be overcome to grow the database of complete fungal genomes and enable all fungal biologists to make use of the new data. PMID:23928422

  7. The new physician as unwitting quantum mechanic: is adapting Dirac's inference system best practice for personalized medicine, genomics, and proteomics?

    PubMed

    Robson, Barry

    2007-08-01

    What is the Best Practice for automated inference in Medical Decision Support for personalized medicine? A known system already exists as Dirac's inference system from quantum mechanics (QM) using bra-kets and bras where A and B are states, events, or measurements representing, say, clinical and biomedical rules. Dirac's system should theoretically be the universal best practice for all inference, though QM is notorious as sometimes leading to bizarre conclusions that appear not to be applicable to the macroscopic world of everyday human experience and medical practice. It is here argued that this apparent difficulty vanishes if QM is assigned one new multiplication function @, which conserves conditionality appropriately, making QM applicable to classical inference including a quantitative form of the predicate calculus. An alternative interpretation with the same consequences is if every i = √−1 in Dirac's QM is replaced by h, an entity distinct from 1 and i and arguably a hidden root of 1 such that h² = 1. With that exception, this paper is thus primarily a review of the application of Dirac's system, by application of linear algebra in the complex domain to help manipulate information about associations and ontology in complicated data. Any combined bra-ket can be shown to be composed only of the sum of QM-like bra and ket weights c(), times an exponential function of Fano's mutual information measure I(A; B) about the association between A and B, that is, an association rule from data mining. With the weights and Fano measure re-expressed as expectations on finite data using Riemann's Incomplete (i.e., Generalized) Zeta Functions, actual counts of observations for real world sparse data can be readily utilized. Finally, the paper compares identical character, distinguishability of states events or measurements, correlation, mutual information, and orthogonal character, important issues in data mining
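    Fano's mutual information between two events is conventionally defined as I(A; B) = ln[P(A, B) / (P(A) P(B))], so its exponential is the ratio of the observed joint probability to the one expected under independence. The sketch below estimates it from raw co-occurrence counts using plain relative frequencies. That is a simplifying assumption: the abstract describes expectations based on zeta functions for sparse data, which this sketch does not attempt. The function name and the counts are hypothetical.

```python
import math


def fano_mutual_information(n_ab, n_a, n_b, n_total):
    """Estimate I(A; B) = ln[P(A,B) / (P(A) P(B))] from observation counts.

    n_ab: records where A and B co-occur; n_a, n_b: records with A, with B;
    n_total: all records. Plain frequency estimates, with no sparse-data
    correction.
    """
    p_ab = n_ab / n_total
    p_a = n_a / n_total
    p_b = n_b / n_total
    return math.log(p_ab / (p_a * p_b))


# Hypothetical counts: 1000 records, A in 100, B in 200, both in 40.
i_ab = fano_mutual_information(n_ab=40, n_a=100, n_b=200, n_total=1000)
association_ratio = math.exp(i_ab)
print(f"I(A; B) = {i_ab:.3f} nats, exp(I) = {association_ratio:.2f}")
```

    With these counts A and B co-occur twice as often as independence would predict, so exp(I) is 2 and I(A; B) is ln 2; a value of zero would mean no association.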

  8. Molecular cloning and bioinformatic analysis of SPATA4 gene.

    PubMed

    Liu, Shang-feng; Ai, Chao; Ge, Zhong-qi; Liu, Hai-luo; Liu, Bo-wen; He, Shan; Wang, Zhao

    2005-11-30

    Full-length cDNA sequences of four novel SPATA4 genes in chimpanzee, cow, chicken and ascidian were identified by bioinformatic analysis using mouse or human SPATA4 cDNA fragments as electronic probes. All these genes have 6 exons, encode proteins of similar molecular weight and do not localize to sex chromosomes. The mouse SPATA4 sequence is identified as significantly changed in cryptorchidism, and it shares no significant homology with any known protein in the Swiss-Prot database except for the homologous genes in various vertebrates. Our search results showed that all SPATA4 proteins have a putative conserved domain, DUF1042. The percentages of putative SPATA4 protein sequence identity range from 30% to 99%. High similarity was also found in the 1 kb promoter regions of the human, mouse and rat SPATA4 genes. The sequences upstream of the SPATA4 promoter are also highly similar. The results of searching SymAtlas (http://symatlas.gnf.org/SymAtlas/) showed that human SPATA4 is highly expressed in testis, especially in testis interstitial tissue, Leydig cells, seminiferous tubules and germ cells. Mouse SPATA4 was observed exclusively in adult mouse testis and almost no signal was detected in other tissues. The pI values of the proteins range from 9.44 to 10.15. The predicted subcellular location of the protein is usually the nucleus, and the signal peptide probabilities for SPATA4 are always zero. Using the SNP data in NCBI, we found 33 SNPs in the genomic DNA region of the human SPATA4 gene, 29 of which lie in introns. CpG island searching shows that the CpG island regions are highly similar to each other, although their lengths differ. This research is a fundamental work in the field of bioinformatic analysis, and it also puts forward a new approach for the bioinformatic analysis of other genes. PMID:16336790

  9. Atlas – a data warehouse for integrative bioinformatics

    PubMed Central

    Shah, Sohrab P; Huang, Yong; Xu, Tao; Yuen, Macaire MS; Ling, John; Ouellette, BF Francis

    2005-01-01

    Background We present a biological data warehouse called Atlas that locally stores and integrates biological sequences, molecular interactions, homology information, functional annotations of genes, and biological ontologies. The goal of the system is to provide data, as well as a software infrastructure for bioinformatics research and development. Description The Atlas system is based on relational data models that we developed for each of the source data types. Data stored within these relational models are managed through Structured Query Language (SQL) calls that are implemented in a set of Application Programming Interfaces (APIs). The APIs include three languages: C++, Java, and Perl. The methods in these API libraries are used to construct a set of loader applications, which parse and load the source datasets into the Atlas database, and a set of toolbox applications which facilitate data retrieval. Atlas stores and integrates local instances of GenBank, RefSeq, UniProt, Human Protein Reference Database (HPRD), Biomolecular Interaction Network Database (BIND), Database of Interacting Proteins (DIP), Molecular Interactions Database (MINT), IntAct, NCBI Taxonomy, Gene Ontology (GO), Online Mendelian Inheritance in Man (OMIM), LocusLink, Entrez Gene and HomoloGene. The retrieval APIs and toolbox applications are critical components that offer end-users flexible, easy, integrated access to this data. We present use cases that use Atlas to integrate these sources for genome annotation, inference of molecular interactions across species, and gene-disease associations. Conclusion The Atlas biological data warehouse serves as data infrastructure for bioinformatics research and development. It forms the backbone of the research activities in our laboratory and facilitates the integration of disparate, heterogeneous biological sources of data enabling new scientific inferences. Atlas achieves integration of diverse data sets at two levels. First, Atlas stores data of
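    The pattern described above (relational models, SQL wrapped in API methods, loader applications and retrieval tools) can be sketched roughly as follows. This is not the Atlas schema or API: the table names, column names, function names and gene symbols are all hypothetical, and Python with an in-memory SQLite database stands in for the C++, Java and Perl APIs the abstract mentions.

```python
import sqlite3


def create_schema(conn):
    """Create a toy relational model: genes plus pairwise interactions."""
    conn.executescript(
        """
        CREATE TABLE gene (
            gene_id  INTEGER PRIMARY KEY,
            symbol   TEXT NOT NULL,
            taxon_id INTEGER NOT NULL
        );
        CREATE TABLE interaction (
            gene_a    INTEGER NOT NULL REFERENCES gene(gene_id),
            gene_b    INTEGER NOT NULL REFERENCES gene(gene_id),
            source_db TEXT NOT NULL
        );
        """
    )


def load_genes(conn, rows):
    """Loader step: insert already-parsed (gene_id, symbol, taxon_id) rows."""
    conn.executemany(
        "INSERT INTO gene (gene_id, symbol, taxon_id) VALUES (?, ?, ?)", rows
    )


def load_interactions(conn, rows):
    """Loader step: insert already-parsed (gene_a, gene_b, source_db) rows."""
    conn.executemany(
        "INSERT INTO interaction (gene_a, gene_b, source_db) VALUES (?, ?, ?)",
        rows,
    )


def partners_of(conn, symbol):
    """Retrieval step: interaction partners of a gene, with the source database."""
    query = """
        SELECT g2.symbol, i.source_db
        FROM gene AS g1
        JOIN interaction AS i ON i.gene_a = g1.gene_id
        JOIN gene AS g2 ON g2.gene_id = i.gene_b
        WHERE g1.symbol = ?
        ORDER BY g2.symbol
    """
    return conn.execute(query, (symbol,)).fetchall()


conn = sqlite3.connect(":memory:")
create_schema(conn)
# Placeholder records, invented for illustration only.
load_genes(conn, [(1, "GENE_A", 9606), (2, "GENE_B", 9606), (3, "GENE_C", 9606)])
load_interactions(conn, [(1, 2, "BIND"), (1, 3, "DIP")])
partners = partners_of(conn, "GENE_A")
print(partners)
```

    Keeping the SQL inside small functions mirrors the idea of exposing retrieval through an API rather than asking end users to write queries themselves.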

  10. Assessment of a Bioinformatics across Life Science Curricula Initiative

    ERIC Educational Resources Information Center

    Howard, David R.; Miskowski, Jennifer A.; Grunwald, Sandra K.; Abler, Michael L.

    2007-01-01

    At the University of Wisconsin-La Crosse, we have undertaken a program to integrate the study of bioinformatics across the undergraduate life science curricula. Our efforts have included incorporating bioinformatics exercises into courses in the biology, microbiology, and chemistry departments, as well as coordinating the efforts of faculty within…

  11. Evaluating an Inquiry-Based Bioinformatics Course Using Q Methodology

    ERIC Educational Resources Information Center

    Ramlo, Susan E.; McConnell, David; Duan, Zhong-Hui; Moore, Francisco B.

    2008-01-01

    Faculty at a Midwestern metropolitan public university recently developed a course on bioinformatics that emphasized collaboration and inquiry. Bioinformatics, essentially the application of computational tools to biological data, is inherently interdisciplinary. Thus part of the challenge of creating this course was serving the needs and…

  12. The 2015 Bioinformatics Open Source Conference (BOSC 2015).

    PubMed

    Harris, Nomi L; Cock, Peter J A; Lapp, Hilmar; Chapman, Brad; Davey, Rob; Fields, Christopher; Hokamp, Karsten; Munoz-Torres, Monica

    2016-02-01

    The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included "Data Science;" "Standards and Interoperability;" "Open Science and Reproducibility;" "Translational Bioinformatics;" "Visualization;" and "Bioinformatics Open Source Project Updates". In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled "Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community," that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule. PMID:26914653

  13. Generative Topic Modeling in Image Data Mining and Bioinformatics Studies

    ERIC Educational Resources Information Center

    Chen, Xin

    2012-01-01

    Probabilistic topic models have been developed for applications in various domains such as text mining, information retrieval, computer vision and bioinformatics. In this thesis, we focus on developing novel probabilistic topic models for image mining and bioinformatics studies. Specifically, a probabilistic topic-connection (PTC) model…

  14. Bioinformatics Resources for MicroRNA Discovery

    PubMed Central

    Moore, Alyssa C.; Winkjer, Jonathan S.; Tseng, Tsai-Tien

    2015-01-01

    Biomarker identification is often associated with the diagnosis and evaluation of various diseases. Recently, the role of microRNA (miRNA) has been implicated in the development of diseases, particularly cancer. With the advent of next-generation sequencing, the amount of data on miRNA has increased tremendously in the last decade, requiring new bioinformatics approaches for processing and storing new information. New strategies have been developed in mining these sequencing datasets to allow better understanding toward the actions of miRNAs. As a result, many databases have also been established to disseminate these findings. This review focuses on several curated databases of miRNAs and their targets from both predicted and validated sources. PMID:26819547

  15. Survey: Translational Bioinformatics embraces Big Data

    PubMed Central

    Shah, Nigam H.

    2015-01-01

    Summary We review the latest trends and major developments in translational bioinformatics in the year 2011–2012. Our emphasis is on highlighting the key events in the field and pointing at promising research areas for the future. The key take-home points are: Translational informatics is ready to revolutionize human health and healthcare using large-scale measurements on individuals. Data-centric approaches that compute on massive amounts of data (often called “Big Data”) to discover patterns and to make clinically relevant predictions will gain adoption. Research that bridges the latest multimodal measurement technologies with large amounts of electronic healthcare data is increasing, and is where new breakthroughs will occur. PMID:22890354

  16. Bioinformatics and molecular modeling in glycobiology

    PubMed Central

    Schloissnig, Siegfried

    2010-01-01

    The field of glycobiology is concerned with the study of the structure, properties, and biological functions of the family of biomolecules called carbohydrates. Bioinformatics for glycobiology is a particularly challenging field, because carbohydrates exhibit a high structural diversity and their chains are often branched. Significant improvements in experimental analytical methods over recent years have led to a tremendous increase in the amount of carbohydrate structure data generated. Consequently, the availability of databases and tools to store, retrieve and analyze these data in an efficient way is of fundamental importance to progress in glycobiology. In this review, the various graphical representations and sequence formats of carbohydrates are introduced, and an overview of newly developed databases, the latest developments in sequence alignment and data mining, and tools to support experimental glycan analysis are presented. Finally, the field of structural glycoinformatics and molecular modeling of carbohydrates, glycoproteins, and protein–carbohydrate interaction are reviewed. PMID:20364395

  17. Bioinformatics Analysis of Estrogen-Responsive Genes.

    PubMed

    Handel, Adam E

    2016-01-01

    Estrogen is a steroid hormone that plays critical roles in a myriad of intracellular pathways. The expression of many genes is regulated through the steroid hormone receptors ESR1 and ESR2. These bind to DNA and modulate the expression of target genes. Identification of estrogen target genes is greatly facilitated by the use of transcriptomic methods, such as RNA-seq and expression microarrays, and chromatin immunoprecipitation with massively parallel sequencing (ChIP-seq). Combining transcriptomic and ChIP-seq data enables a distinction to be drawn between direct and indirect estrogen target genes. This chapter discusses some methods of identifying estrogen target genes that do not require any expertise in programming languages or complex bioinformatics. PMID:26585125

  18. Biology and bioinformatics of myeloma cell.

    PubMed

    Abroun, Saeid; Saki, Najmaldin; Fakher, Rahim; Asghari, Farahnaz

    2012-12-01

    Multiple myeloma (MM) is a plasma cell disorder that accounts for about 10% of all hematologic cancers. The majority of patients (99%) are over 50 years of age when diagnosed. In the bone marrow (BM), stromal and hematopoietic stem cells (HSCs) are responsible for the production of blood cells. Therefore, any destruction and/or changes within the BM undesirably impact a wide range of hematopoiesis, causing diseases and influencing patient survival. In order to establish an effective therapeutic strategy, recognition of the biology and evaluation of bioinformatics models for myeloma cells are necessary to assist in determining suitable methods to cure or prevent disease complications in patients. This review presents the evaluation of molecular and cellular aspects of MM such as genetic translocation, genetic analysis, cell surface marker, transcription factors, and chemokine signaling pathways. It also briefly reviews some of the mechanisms involved in MM in order to develop a better understanding for use in future studies. PMID:23253865

  19. The rise of genomics.

    PubMed

    Weissenbach, Jean

    2016-01-01

    A brief history of the development of genomics is provided. Complete sequencing of genomes of uni- and multicellular organisms is based on important progress in sequencing and bioinformatics. Evolution of these methods is ongoing and has triggered an explosion in data production and analysis. Initial analyses focused on the inventory of genes encoding proteins. Completeness and quality of gene prediction remains crucial. Genome analyses profoundly modified our views on evolution, biodiversity and contributed to the detection of new functions, yet to be fully elucidated, such as those fulfilled by non-coding RNAs. Genomics has become the basis for the study of biology and provides the molecular support for a bunch of large-scale studies, the omics. PMID:27263360

  20. Performance characteristics of the AmpliSeq Cancer Hotspot panel v2 in combination with the Ion Torrent Next Generation Sequencing Personal Genome Machine.

    PubMed

    Butler, Kimberly S; Young, Megan Y L; Li, Zhihua; Elespuru, Rosalie K; Wood, Steven C

    2016-02-01

    Next-Generation Sequencing is a rapidly advancing technology that has research and clinical applications. For many cancers, it is important to know the precise mutation(s) present, as specific mutations could indicate or contra-indicate certain treatments as well as be indicative of prognosis. Using the Ion Torrent Personal Genome Machine and the AmpliSeq Cancer Hotspot panel v2, we sequenced two pancreatic cancer cell lines, BxPC-3 and HPAF-II, alone or in mixtures, to determine the error rate, sensitivity, and reproducibility of this system. The system resulted in coverage averaging 2000× across the various amplicons and was able to reliably and reproducibly identify mutations present at a rate of 5%. Identification of mutations present at a lower rate was possible by altering the parameters by which calls were made, but with an increase in erroneous, low-level calls. The panel was able to identify known mutations in these cell lines that are present in the COSMIC database. In addition, other, novel mutations were also identified that may prove clinically useful. The system was assessed for systematic errors such as homopolymer effects, end of amplicon effects and patterns in NO CALL sequence. Overall, the system is adequate at identifying the known, targeted mutations in the panel. PMID:26387931
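
    The coverage and detection figures quoted in this abstract can be related by simple arithmetic: the number of reads expected to carry a variant is roughly the read depth multiplied by the variant allele fraction. The sketch below is illustrative only and is not taken from the paper; the function names are hypothetical, and it ignores sequencing error and amplicon-to-amplicon variation in coverage.

```python
import math


def expected_variant_reads(depth: int, allele_fraction: float) -> float:
    """Expected number of reads supporting a variant at a given depth.

    Assumes reads sample the two alleles in proportion to the allele
    fraction, with no sequencing error (a simplifying assumption).
    """
    return depth * allele_fraction


def depth_needed(min_supporting_reads: int, allele_fraction: float) -> int:
    """Smallest depth at which the expected supporting reads reach a threshold.

    The threshold itself is a hypothetical calling parameter, not one
    reported in the abstract.
    """
    return math.ceil(min_supporting_reads / allele_fraction)


# Figures from the abstract: ~2000x average coverage, variants present at 5%.
reads_at_5_percent = expected_variant_reads(2000, 0.05)  # about 100 reads
# A hypothetical 1% variant at the same depth would have far fewer reads.
reads_at_1_percent = expected_variant_reads(2000, 0.01)  # about 20 reads
```

    With fewer supporting reads at lower allele fractions, the trade-off the abstract describes (more low-level calls, but more erroneous ones) is what one would expect.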

  1. Comparison of Online and Onsite Bioinformatics Instruction for a Fully Online Bioinformatics Master’s Program

    PubMed Central

    Obom, Kristina M.; Cummings, Patrick J.

    2007-01-01

    The completely online Master of Science in Bioinformatics program differs from the onsite program only in the mode of content delivery. Analysis of student satisfaction indicates no statistically significant difference between most online and onsite student responses; however, online and onsite students do differ significantly in their responses to a few questions on the course evaluation. Analysis of student exam performance using three assessments indicates that there was no significant difference in grades earned by students in online and onsite courses. These results suggest that our model for online bioinformatics education provides students with a rigorous course of study that is comparable to onsite course instruction and possibly provides a more rigorous course load and more opportunities for participation. PMID:23653816

  2. LXtoo: an integrated live Linux distribution for the bioinformatics community

    PubMed Central

    2012-01-01

    Background Recent advances in high-throughput technologies dramatically increase biological data generation. However, many research groups lack computing facilities and specialists. This is an obstacle that remains to be addressed. Here, we present a Linux distribution, LXtoo, to provide a flexible computing platform for bioinformatics analysis. Findings Unlike most of the existing live Linux distributions for bioinformatics limiting their usage to sequence analysis and protein structure prediction, LXtoo incorporates a comprehensive collection of bioinformatics software, including data mining tools for microarray and proteomics, protein-protein interaction analysis, and computationally complex tasks like molecular dynamics. Moreover, most of the programs have been configured and optimized for high performance computing. Conclusions LXtoo aims to provide well-supported computing environment tailored for bioinformatics research, reducing duplication of efforts in building computing infrastructure. LXtoo is distributed as a Live DVD and freely available at http://bioinformatics.jnu.edu.cn/LXtoo. PMID:22813356

  3. 4273π: Bioinformatics education on low cost ARM hardware

    PubMed Central

    2013-01-01

    Background Teaching bioinformatics at universities is complicated by typical computer classroom settings. As well as running software locally and online, students should gain experience of systems administration. For a future career in biology or bioinformatics, the installation of software is a useful skill. We propose that this may be taught by running the course on GNU/Linux running on inexpensive Raspberry Pi computer hardware, for which students may be granted full administrator access. Results We release 4273π, an operating system image for Raspberry Pi based on Raspbian Linux. This includes minor customisations for classroom use and includes our Open Access bioinformatics course, 4273π Bioinformatics for Biologists. This is based on the final-year undergraduate module BL4273, run on Raspberry Pi computers at the University of St Andrews, Semester 1, academic year 2012–2013. Conclusions 4273π is a means to teach bioinformatics, including systems administration tasks, to undergraduates at low cost. PMID:23937194

  4. Genomics and functional genomics with haloarchaea.

    PubMed

    Soppa, J; Baumann, A; Brenneis, M; Dambeck, M; Hering, O; Lange, C

    2008-09-01

    The first haloarchaeal genome was published in 2000 and today five genome sequences are available. Transcriptome and proteome analyses have been established for two and three haloarchaeal species, respectively, and more than 20 studies using these functional genomic approaches have been published in the last two years. These studies gave global overviews of metabolic regulation (aerobic and anaerobic respiration, phototrophy, carbon source usage), stress response (UV, X-rays, transition metals, osmotic and temperature stress), cell cycle-dependent transcript level regulation, and transcript half-lives. The only translatome analysis available for any prokaryotic species revealed that 10 and 20% of all transcripts are translationally regulated in Haloferax volcanii and Halobacterium salinarum, respectively. Very effective methods for the construction of in frame deletion mutants have been established recently for haloarchaea and are intensively used to unravel the biological roles of genes in this group. Bioinformatic analyses include both cross-genome comparisons as well as integration of genomic data with experimental results. The first systems biology approaches have been performed that used experimental data to construct predictive models of gene expression and metabolism, respectively. In this contribution the current status of genomics, functional genomics, and molecular genetics of haloarchaea is summarized and selected examples are discussed. PMID:18493745

  5. Personalized ophthalmology

    PubMed Central

    Porter, LF; Black, GCM

    2014-01-01

    Porter L.F., Black G.C.M. Personalized ophthalmology. Clin Genet 2014: 86: 1–11. © 2014 The Authors. Clinical Genetics published by John Wiley & Sons A/S. Published by John Wiley & Sons Ltd., 2014. Ophthalmology has been an early adopter of personalized medicine. Drawing on genomic advances to improve molecular diagnosis, such as next-generation sequencing, and basic and translational research to develop novel therapies, application of genetic technologies in ophthalmology now heralds development of gene replacement therapies for some inherited monogenic eye diseases. It also promises to alter prediction, diagnosis and management of the complex disease age-related macular degeneration. Personalized ophthalmology is underpinned by an understanding of the molecular basis of eye disease. Two important areas of focus are required for adoption of personalized approaches: disease stratification and individualization. Disease stratification relies on phenotypic and genetic assessment leading to molecular diagnosis; individualization encompasses all aspects of patient management from optimized genetic counseling and conventional therapies to trials of novel DNA-based therapies. This review discusses the clinical implications of these twin strategies. Advantages and implications of genetic testing for patients with inherited eye diseases, choice of molecular diagnostic modality, drivers for adoption of personalized ophthalmology, service planning implications, ethical considerations and future challenges are considered. Indeed, whilst many difficulties remain, personalized ophthalmology truly has the potential to revolutionize the specialty. PMID:24665880

  6. Bioinformatic identification of Mycobacterium tuberculosis proteins likely to target host cell mitochondria: virulence factors?

    PubMed Central

    2012-01-01

    Background M. tuberculosis infection either induces or inhibits host cell death, depending on the bacterial strain and the cell microenvironment. There is evidence suggesting a role for mitochondria in these processes. On the other hand, it has been shown that several bacterial proteins are able to target mitochondria, playing a critical role in bacterial pathogenesis and modulation of cell death. However, mycobacteria-derived proteins able to target host cell mitochondria are less studied. Results A bioinformatic analysis based on available genomic sequences of the common laboratory virulent reference strain Mycobacterium tuberculosis H37Rv, the avirulent strain H37Ra, the clinical isolate CDC1551, and M. bovis BCG Pasteur strain 1173P2, as well as of suitable bioinformatic tools (MitoProt II, PSORT II, and SignalP) for the in silico search for proteins likely to be secreted by mycobacteria that could target host cell mitochondria, showed that at least 19 M. tuberculosis proteins could possibly target host cell mitochondria. We experimentally tested this bioinformatic prediction on four M. tuberculosis recombinant proteins chosen from this list of 19 proteins (p27, PE_PGRS1, PE_PGRS33, and MT_1866). Confocal microscopy analyses showed that the p27 and PE_PGRS33 proteins colocalize with mitochondria. Conclusions Based on the bioinformatic analysis of whole M. tuberculosis genome sequences, we propose that at least 19 out of 4,246 M. tuberculosis predicted proteins would be able to target host cell mitochondria and, in turn, control mitochondrial physiology. Interestingly, such a list of 19 proteins includes five members of a mycobacteria specific family of proteins (PE/PE_PGRS) thought to be virulence factors, and p27, a well known virulence factor. The p27 and PE_PGRS33 proteins were shown experimentally to target mitochondria in J774 cells. Our results suggest a link between mitochondrial targeting of M. tuberculosis proteins and virulence. PMID:23259719

  7. Bioinformatics for precision medicine in oncology: principles and application to the SHIVA clinical trial

    PubMed Central

    Servant, Nicolas; Roméjon, Julien; Gestraud, Pierre; La Rosa, Philippe; Lucotte, Georges; Lair, Séverine; Bernard, Virginie; Zeitouni, Bruno; Coffin, Fanny; Jules-Clément, Gérôme; Yvon, Florent; Lermine, Alban; Poullet, Patrick; Liva, Stéphane; Pook, Stuart; Popova, Tatiana; Barette, Camille; Prud’homme, François; Dick, Jean-Gabriel; Kamal, Maud; Le Tourneau, Christophe; Barillot, Emmanuel; Hupé, Philippe

    2014-01-01

    Precision medicine (PM) requires the delivery of individually adapted medical care based on the genetic characteristics of each patient and his/her tumor. The last decade witnessed the development of high-throughput technologies such as microarrays and next-generation sequencing which paved the way to PM in the field of oncology. While the cost of these technologies decreases, we are facing an exponential increase in the amount of data produced. Our ability to use this information in daily practice relies strongly on the availability of an efficient bioinformatics system that assists in the translation of knowledge from the bench towards molecular targeting and diagnosis. Clinical trials and routine diagnoses constitute different approaches, both requiring a strong bioinformatics environment capable of (i) warranting the integration and the traceability of data, (ii) ensuring the correct processing and analyses of genomic data, and (iii) applying well-defined and reproducible procedures for workflow management and decision-making. To address the issues, a seamless information system was developed at Institut Curie which facilitates the data integration and tracks in real-time the processing of individual samples. Moreover, computational pipelines were developed to identify reliably genomic alterations and mutations from the molecular profiles of each patient. After a rigorous quality control, a meaningful report is delivered to the clinicians and biologists for the therapeutic decision. The complete bioinformatics environment and the key points of its implementation are presented in the context of the SHIVA clinical trial, a multicentric randomized phase II trial comparing targeted therapy based on tumor molecular profiling versus conventional therapy in patients with refractory cancer. The numerous challenges faced in practice during the setting up and the conduct of this trial are discussed as an illustration of PM application. PMID:24910641

  8. Continuing Education Workshops in Bioinformatics Positively Impact Research and Careers

    PubMed Central

    Brazas, Michelle D.; Ouellette, B. F. Francis

    2016-01-01

    Bioinformatics.ca has been hosting continuing education programs in introductory and advanced bioinformatics topics in Canada since 1999 and has trained more than 2,000 participants to date. These workshops have been adapted over the years to keep pace with advances in both science and technology as well as the changing landscape in available learning modalities and the bioinformatics training needs of our audience. Post-workshop surveys have been a mandatory component of each workshop and are used to ensure appropriate adjustments are made to workshops to maximize learning. However, neither bioinformatics.ca nor others offering similar training programs have explored the long-term impact of bioinformatics continuing education training. Bioinformatics.ca recently initiated a look back on the impact its workshops have had on the career trajectories, research outcomes, publications, and collaborations of its participants. Using an anonymous online survey, bioinformatics.ca analyzed responses from those surveyed and discovered its workshops have had a positive impact on collaborations, research, publications, and career progression. PMID:27281025

  9. Continuing Education Workshops in Bioinformatics Positively Impact Research and Careers.

    PubMed

    Brazas, Michelle D; Ouellette, B F Francis

    2016-06-01

    Bioinformatics.ca has been hosting continuing education programs in introductory and advanced bioinformatics topics in Canada since 1999 and has trained more than 2,000 participants to date. These workshops have been adapted over the years to keep pace with advances in both science and technology as well as the changing landscape in available learning modalities and the bioinformatics training needs of our audience. Post-workshop surveys have been a mandatory component of each workshop and are used to ensure appropriate adjustments are made to workshops to maximize learning. However, neither bioinformatics.ca nor others offering similar training programs have explored the long-term impact of bioinformatics continuing education training. Bioinformatics.ca recently initiated a look back on the impact its workshops have had on the career trajectories, research outcomes, publications, and collaborations of its participants. Using an anonymous online survey, bioinformatics.ca analyzed responses from those surveyed and discovered its workshops have had a positive impact on collaborations, research, publications, and career progression. PMID:27281025

  10. From Gigabyte to Kilobyte: A Bioinformatics Protocol for Mining Large RNA-Seq Transcriptomics Data.

    PubMed

    Li, Jilong; Hou, Jie; Sun, Lin; Wilkins, Jordan Maximillian; Lu, Yuan; Niederhuth, Chad E; Merideth, Benjamin Ryan; Mawhinney, Thomas P; Mossine, Valeri V; Greenlief, C Michael; Walker, John C; Folk, William R; Hannink, Mark; Lubahn, Dennis B; Birchler, James A; Cheng, Jianlin

    2015-01-01

    RNA-Seq techniques generate hundreds of millions of short RNA reads using next-generation sequencing (NGS). These RNA reads can be mapped to reference genomes to investigate changes of gene expression but improved procedures for mining large RNA-Seq datasets to extract valuable biological knowledge are needed. RNAMiner--a multi-level bioinformatics protocol and pipeline--has been developed for such datasets. It includes five steps: Mapping RNA-Seq reads to a reference genome, calculating gene expression values, identifying differentially expressed genes, predicting gene functions, and constructing gene regulatory networks. To demonstrate its utility, we applied RNAMiner to datasets generated from Human, Mouse, Arabidopsis thaliana, and Drosophila melanogaster cells, and successfully identified differentially expressed genes, clustered them into cohesive functional groups, and constructed novel gene regulatory networks. The RNAMiner web service is available at http://calla.rnet.missouri.edu/rnaminer/index.html. PMID:25902288

  11. From Gigabyte to Kilobyte: A Bioinformatics Protocol for Mining Large RNA-Seq Transcriptomics Data

    PubMed Central

    Li, Jilong; Hou, Jie; Sun, Lin; Wilkins, Jordan Maximillian; Lu, Yuan; Niederhuth, Chad E.; Merideth, Benjamin Ryan; Mawhinney, Thomas P.; Mossine, Valeri V.; Greenlief, C. Michael; Walker, John C.; Folk, William R.; Hannink, Mark; Lubahn, Dennis B.; Birchler, James A.; Cheng, Jianlin

    2015-01-01

    RNA-Seq techniques generate hundreds of millions of short RNA reads using next-generation sequencing (NGS). These RNA reads can be mapped to reference genomes to investigate changes of gene expression but improved procedures for mining large RNA-Seq datasets to extract valuable biological knowledge are needed. RNAMiner—a multi-level bioinformatics protocol and pipeline—has been developed for such datasets. It includes five steps: Mapping RNA-Seq reads to a reference genome, calculating gene expression values, identifying differentially expressed genes, predicting gene functions, and constructing gene regulatory networks. To demonstrate its utility, we applied RNAMiner to datasets generated from Human, Mouse, Arabidopsis thaliana, and Drosophila melanogaster cells, and successfully identified differentially expressed genes, clustered them into cohesive functional groups, and constructed novel gene regulatory networks. The RNAMiner web service is available at http://calla.rnet.missouri.edu/rnaminer/index.html. PMID:25902288

  12. ArrayTrack: a free FDA bioinformatics tool to support emerging biomedical research--an update.

    PubMed

    Xu, Joshua; Kelly, Reagan; Fang, Hong; Tong, Weida

    2010-08-01

    ArrayTrack is a Food and Drug Administration (FDA) bioinformatics tool that has been widely adopted by the research community for genomics studies. It provides an integrated environment for microarray data management, analysis and interpretation. Most of its functionality for statistical, pathway and gene ontology analysis can also be applied independently to data generated by other molecular technologies. ArrayTrack has been undergoing active development and enhancement since its inception in 2001. This review summarises its key functionalities, with emphasis on the most recent extensions in support of the evolving needs of FDA's research programmes. ArrayTrack has added capability to manage, analyse and interpret proteomics and metabolomics data after quantification of peptides and metabolites abundance, respectively. Annotation information about single nucleotide polymorphisms and quantitative trait loci has been integrated to support genetics-related studies. Other extensions have been added to manage and analyse genomics data related to bacterial food-borne pathogens. PMID:20846933

  13. Grouping and identification of sequence tags (GRIST): bioinformatics tools for the NEIBank database.

    PubMed

    Wistow, Graeme; Bernstein, Steven L; Touchman, Jeffrey W; Bouffard, Gerald; Wyatt, M Keith; Peterson, Katherine; Behal, Amita; Gao, James; Buchoff, Patee; Smith, Don

    2002-06-15

    NEIBank is a project to develop and organize genomics and bioinformatics resources for the eye. As part of this effort, tools have been developed for bioinformatics analysis and web based display of data from expressed sequence tag (EST) analyses. EST sequences are identified and formed into groups or clusters representing related transcripts from the same gene. This is carried out by a rules-based procedure called GRIST (GRouping and Identification of Sequence Tags) that uses sequence match parameters derived from BLAST programs. Linked procedures are used to eliminate non-mRNA contaminants. All data are assembled in a relational database and assembled for display as web pages with annotations and links to other informatics resources. Genome projects generate huge amounts of data that need to be classified and organized to become easily accessible to the research community. GRIST provides a useful tool for assembling and displaying the results of EST analyses. The NEIBank web site contains a growing set of pages cataloging the known transcriptional repertoire of eye tissues, derived from new NEIBank cDNA libraries and from eye-related data deposited in the dbEST section of GenBank. PMID:12107414

  14. Review of Current Methods, Applications, and Data Management for the Bioinformatics Analysis of Whole Exome Sequencing

    PubMed Central

    Bao, Riyue; Huang, Lei; Andrade, Jorge; Tan, Wei; Kibbe, Warren A; Jiang, Hongmei; Feng, Gang

    2014-01-01

    The advent of next-generation sequencing technologies has greatly promoted advances in the study of human diseases at the genomic, transcriptomic, and epigenetic levels. Exome sequencing, where the coding region of the genome is captured and sequenced at a deep level, has proven to be a cost-effective method to detect disease-causing variants and discover gene targets. In this review, we outline the general framework of whole exome sequence data analysis. We focus on established bioinformatics tools and applications that support five analytical steps: raw data quality assessment, pre-processing, alignment, post-processing, and variant analysis (detection, annotation, and prioritization). We evaluate the performance of open-source alignment programs and variant calling tools using simulated and benchmark datasets, and highlight the challenges posed by the lack of concordance among variant detection tools. Based on these results, we recommend adopting multiple tools and resources to reduce false positives and increase the sensitivity of variant calling. In addition, we briefly discuss the current status and solutions for big data management, analysis, and summarization in the field of bioinformatics. PMID:25288881
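
    The recommendation above, to combine several variant callers, can be pictured as a simple vote over call sets. The following is a minimal sketch under stated assumptions, not the review's own method: variants are represented as hypothetical (chromosome, position, ref, alt) tuples, and `consensus_calls` and its `min_support` parameter are names introduced here for illustration.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

# Hypothetical variant key: (chromosome, 1-based position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def consensus_calls(call_sets: Iterable[Set[Variant]], min_support: int = 2) -> Set[Variant]:
    """Keep variants reported by at least `min_support` callers.

    min_support=1 gives the union (favours sensitivity); a higher value
    approaches the intersection (favours fewer false positives).
    """
    counts = Counter(variant for calls in call_sets for variant in calls)
    return {variant for variant, n in counts.items() if n >= min_support}


# Toy call sets from three imaginary callers (made-up data).
caller_a = {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T")}
caller_b = {("chr1", 100, "A", "G"), ("chr3", 300, "G", "A")}
caller_c = {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T")}

high_confidence = consensus_calls([caller_a, caller_b, caller_c], min_support=2)
sensitive = consensus_calls([caller_a, caller_b, caller_c], min_support=1)
```

    In practice the call sets would first need normalizing (for example, consistent representation of indels) before exact matching is meaningful; that step is omitted here.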

  15. Opportunities and challenges provided by cloud repositories for bioinformatics-enabled drug discovery.

    PubMed

    Dalpé, Gratien; Joly, Yann

    2014-09-01

    Healthcare-related bioinformatics databases are increasingly offering the possibility to maintain, organize, and distribute DNA sequencing data. Different national and international institutions are currently hosting such databases that offer researchers website platforms where they can obtain sequencing data on which they can perform different types of analysis. Until recently, this process remained mostly one-dimensional, with most analysis concentrated on a limited amount of data. However, newer genome sequencing technology is producing a huge amount of data that current computer facilities are unable to handle. An alternative approach has been to start adopting cloud computing services for combining the information embedded in genomic and model system biology data, patient healthcare records, and clinical trials' data. In this new technological paradigm, researchers use virtual space and computing power from existing commercial or not-for-profit cloud service providers to access, store, and analyze data via different application programming interfaces. Cloud services are an alternative to the need of larger data storage; however, they raise different ethical, legal, and social issues. The purpose of this Commentary is to summarize how cloud computing can contribute to bioinformatics-based drug discovery and to highlight some of the outstanding legal, ethical, and social issues that are inherent in the use of cloud services. PMID:25195583

  16. Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses

    PubMed Central

    Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

    2014-01-01

    Due to the upcoming data deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analyses tools, efficient data sharing and retrieval has presented significant challenges. The variability in data volume results in variable computing and storage requirements, therefore biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analyses workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analyses tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and the support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as performance evaluation are presented to validate the feasibility of the proposed approach. PMID:24462600

  17. Bioinformatic analysis of peptide precursor proteins.

    PubMed

    Baggerman, G; Liu, F; Wets, G; Schoofs, L

    2005-04-01

    Neuropeptides are among the most important signal molecules in animals. Traditional identification of peptide hormones through peptide purification is a tedious and time-consuming process. With the advent of the genome sequencing projects, putative peptide precursors can be mined from the genome. However, because bioactive peptides are usually quite short in length and because the active core of a peptide is often limited to only a few amino acids, using the BLAST search engine to identify neuropeptide precursors in the genome is difficult and sometimes impossible. To overcome these shortcomings, we subject the entire set of all known Drosophila melanogaster peptide precursor sequences to motif-finding algorithms in search of a motif that is common for all prepropeptides and that could be used in the search for new peptide precursors. PMID:15891006

  18. Survey of MapReduce frame operation in bioinformatics.

    PubMed

    Zou, Quan; Li, Xu-Bin; Jiang, Wen-Rui; Lin, Zi-Yu; Li, Gui-Lin; Chen, Ke

    2014-07-01

    Bioinformatics is challenged by the fact that traditional analysis tools have difficulty in processing large-scale data from high-throughput sequencing. The open source Apache Hadoop project, which adopts the MapReduce framework and a distributed file system, has recently given bioinformatics researchers an opportunity to achieve scalable, efficient and reliable computing performance on Linux clusters and on cloud computing services. In this article, we present MapReduce frame-based applications that can be employed in the next-generation sequencing and other biological domains. In addition, we discuss the challenges faced by this field as well as the future works on parallel computing in bioinformatics. PMID:23396756
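
    The MapReduce model mentioned above can be illustrated with a single-process k-mer counter. This is a conceptual sketch only: it does not use Hadoop or any distributed runtime, and the function names (`map_kmers`, `shuffle`, `reduce_counts`) are hypothetical labels for the three phases.

```python
from collections import defaultdict
from itertools import chain
from typing import Dict, Iterable, List, Tuple


def map_kmers(read: str, k: int) -> List[Tuple[str, int]]:
    """Map phase: emit a (k-mer, 1) pair for every k-mer in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle phase: group emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce phase: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


# Toy reads (made-up data); in a real framework each read could be
# mapped on a different worker node.
reads = ["ACGTAC", "GTACGT"]
mapped = chain.from_iterable(map_kmers(read, 3) for read in reads)
kmer_counts = reduce_counts(shuffle(mapped))
```

    Because each read is mapped independently and each key is reduced independently, both phases can be spread across machines, which is the property that makes the model attractive for sequencing-scale data.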

  19. Bioinformatic Identification and Analysis of Extensins in the Plant Kingdom.

    PubMed

    Liu, Xiao; Wolfe, Richard; Welch, Lonnie R; Domozych, David S; Popper, Zoë A; Showalter, Allan M

    2016-01-01

    Extensins (EXTs) are a family of plant cell wall hydroxyproline-rich glycoproteins (HRGPs) that are implicated to play important roles in plant growth, development, and defense. Structurally, EXTs are characterized by the repeated occurrence of serine (Ser) followed by three to five prolines (Pro) residues, which are hydroxylated as hydroxyproline (Hyp) and glycosylated. Some EXTs have Tyrosine (Tyr)-X-Tyr (where X can be any amino acid) motifs that are responsible for intramolecular or intermolecular cross-linkings. EXTs can be divided into several classes: classical EXTs, short EXTs, leucine-rich repeat extensins (LRXs), proline-rich extensin-like receptor kinases (PERKs), formin-homolog EXTs (FH EXTs), chimeric EXTs, and long chimeric EXTs. To guide future research on the EXTs and understand evolutionary history of EXTs in the plant kingdom, a bioinformatics study was conducted to identify and classify EXTs from 16 fully sequenced plant genomes, including Ostreococcus lucimarinus, Chlamydomonas reinhardtii, Volvox carteri, Klebsormidium flaccidum, Physcomitrella patens, Selaginella moellendorffii, Pinus taeda, Picea abies, Brachypodium distachyon, Zea mays, Oryza sativa, Glycine max, Medicago truncatula, Brassica rapa, Solanum lycopersicum, and Solanum tuberosum, to supplement data previously obtained from Arabidopsis thaliana and Populus trichocarpa. A total of 758 EXTs were newly identified, including 87 classical EXTs, 97 short EXTs, 61 LRXs, 75 PERKs, 54 FH EXTs, 38 long chimeric EXTs, and 346 other chimeric EXTs. Several notable findings were made: (1) classical EXTs were likely derived after the terrestrialization of plants; (2) LRXs, PERKs, and FHs were derived earlier than classical EXTs; (3) monocots have few classical EXTs; (4) Eudicots have the greatest number of classical EXTs and Tyr-X-Tyr cross-linking motifs are predominantly in classical EXTs; (5) green algae have no classical EXTs but have a number of long chimeric EXTs that are absent in
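
    The sequence features described in this abstract (Ser followed by three to five Pro residues, and Tyr-X-Tyr motifs) lend themselves to a regular-expression scan. The sketch below is an illustration under stated assumptions, not the authors' pipeline: it works on one-letter amino acid codes, treats hydroxyproline as P (as it appears in a translated sequence), and uses a made-up toy sequence. The function and pattern names are hypothetical.

```python
import re
from typing import Dict

# Ser followed by three to five Pro residues (one-letter codes).
# A run of more than five Pro would still match on its first five.
SP_REPEAT = re.compile(r"SP{3,5}")

# Tyr-X-Tyr, where X is any residue; the lookahead lets overlapping
# motifs such as YAYAY be counted twice.
YXY_MOTIF = re.compile(r"(?=(Y[A-Z]Y))")


def count_ext_motifs(sequence: str) -> Dict[str, int]:
    """Count SP(3-5) repeats and Tyr-X-Tyr motifs in a protein sequence."""
    seq = sequence.upper()
    return {
        "sp_repeats": len(SP_REPEAT.findall(seq)),
        "yxy_motifs": len(YXY_MOTIF.findall(seq)),
    }


# Toy sequence, invented for illustration; not a real extensin.
toy_sequence = "MSPPPPKYVYSPPPSPPPPPYHY"
motif_counts = count_ext_motifs(toy_sequence)
```

    How many repeats qualify a protein as an EXT, and how the classes are told apart, is not stated in the abstract, so no threshold is applied here.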

  20. Bioinformatic Identification and Analysis of Extensins in the Plant Kingdom

    PubMed Central

    Liu, Xiao; Wolfe, Richard; Welch, Lonnie R.; Domozych, David S.; Popper, Zoë A.; Showalter, Allan M.

    2016-01-01

    Extensins (EXTs) are a family of plant cell wall hydroxyproline-rich glycoproteins (HRGPs) that are thought to play important roles in plant growth, development, and defense. Structurally, EXTs are characterized by the repeated occurrence of serine (Ser) followed by three to five proline (Pro) residues, which are hydroxylated as hydroxyproline (Hyp) and glycosylated. Some EXTs have tyrosine (Tyr)-X-Tyr (where X can be any amino acid) motifs that are responsible for intramolecular or intermolecular cross-linking. EXTs can be divided into several classes: classical EXTs, short EXTs, leucine-rich repeat extensins (LRXs), proline-rich extensin-like receptor kinases (PERKs), formin-homolog EXTs (FH EXTs), chimeric EXTs, and long chimeric EXTs. To guide future research on the EXTs and understand the evolutionary history of EXTs in the plant kingdom, a bioinformatics study was conducted to identify and classify EXTs from 16 fully sequenced plant genomes, including Ostreococcus lucimarinus, Chlamydomonas reinhardtii, Volvox carteri, Klebsormidium flaccidum, Physcomitrella patens, Selaginella moellendorffii, Pinus taeda, Picea abies, Brachypodium distachyon, Zea mays, Oryza sativa, Glycine max, Medicago truncatula, Brassica rapa, Solanum lycopersicum, and Solanum tuberosum, to supplement data previously obtained from Arabidopsis thaliana and Populus trichocarpa. A total of 758 EXTs were newly identified, including 87 classical EXTs, 97 short EXTs, 61 LRXs, 75 PERKs, 54 FH EXTs, 38 long chimeric EXTs, and 346 other chimeric EXTs. Several notable findings were made: (1) classical EXTs were likely derived after the terrestrialization of plants; (2) LRXs, PERKs, and FHs were derived earlier than classical EXTs; (3) monocots have few classical EXTs; (4) eudicots have the greatest number of classical EXTs and Tyr-X-Tyr cross-linking motifs are predominantly in classical EXTs; (5) green algae have no classical EXTs but have a number of long chimeric EXTs that are absent in
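
    As a rough illustration of the two motif definitions given in the abstract above (Ser followed by three to five Pro, and Tyr-X-Tyr), the Python sketch below counts both in a protein sequence written in one-letter code. The function name, the regular expressions, and the toy sequence are assumptions made for this example only; the abstract does not describe the authors' actual identification pipeline, and this is not a reconstruction of it.

```python
import re

# Ser followed by three to five Pro. A longer proline run is still counted
# once (the first five Pro are consumed); that choice is an assumption.
SP_REPEAT = re.compile(r"SP{3,5}")

# Tyr-X-Tyr, where X is any residue. The zero-width lookahead allows
# overlapping motifs such as YAYAY to be counted twice.
YXY_MOTIF = re.compile(r"(?=Y.Y)")


def count_ext_motifs(sequence: str) -> dict:
    """Count SP(3-5) repeats and Tyr-X-Tyr motifs in a one-letter sequence."""
    seq = sequence.upper()
    return {
        "sp_repeats": len(SP_REPEAT.findall(seq)),
        "yxy_motifs": len(YXY_MOTIF.findall(seq)),
    }


# Hypothetical toy sequence, not taken from any of the genomes in the study.
toy = "MASPPPPKYVYSPPPAYHYSPPPPPT"
print(count_ext_motifs(toy))
```

    Counts of this kind could feed a later classification step, but the thresholds that separate the EXT classes are not stated in the abstract and are deliberately left out here.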

  1. Development of Bioinformatic and Experimental Technologies for Identification of Prokaryotic Regulatory Networks

    SciTech Connect

    Lawrence, Charles E; McCue, Lee Ann

    2008-07-31

    The transcription regulatory network is arguably the most important foundation of cellular function, since it exerts the most fundamental control over the abundance of virtually all of a cell’s functional macromolecules. The two major components of a prokaryotic cell’s transcription regulation network are the transcription factors (TFs) and the transcription factor binding sites (TFBS); these components are connected by the binding of TFs to their cognate TFBS under appropriate environmental conditions. Comparative genomics has proven to be a powerful bioinformatics method with which to study transcription regulation on a genome-wide level. We have further extended comparative genomics technologies that we introduced over the last several years. Specifically, we developed and applied statistical approaches to analysis of correlated sequence data (i.e., sequences from closely related species). We also combined these technologies with functional genomic, proteomic and sequence data from multiple species, and developed computational technologies that provide inferences on the regulatory network connections, identifying the cognate transcription factor for predicted regulatory sites. Arguably the most important contribution of this work emerged in the course of the project. Specifically, the development of novel procedures of estimation and prediction in discrete high-dimensional settings has broad implications for biology, genomics and well beyond. We showed that these procedures enjoy advantages over existing technologies in the identification of TFBS. These efforts are aimed toward identifying a cell’s complete transcription regulatory network and underlying molecular mechanisms.

  2. [Ethical, legal, and social issues of genome research--new phase of genome research desperately requires social understanding and safeguards on the use of medical records and other personal information].

    PubMed

    Masui, Tohru; Takada, Yoko

    2003-03-01

    This article provides an overview of the use of human materials and information (human subject) in the new phase of pharmacological research and development in the current context, especially as it relates to the progress of the human genome project. In a sense, humanity has been drastically reduced to an array of DNA sequences that can be universally used in comparing living things. Pharmacological studies now acquire a unique status in bridging chemical substances to human body function. To realize its full potential, pharmacology requires both genotype and personal information, i.e. medical records and lifestyle information, as research resources. In the UK, the Medical Research Council, the Wellcome Trust, and the Department of Health had started to plan the UK Biobank to promote and support the new stage of medical and pharmacological research and development. UK Biobank will collect DNA samples, medical records, and lifestyle information from 500,000 people aged 45 to 69 years. It will follow the changes in health status of the participants for more than 10 years. The Biobank will give researchers the chance to correlate genotypic traits with phenotypic ones, i.e. common diseases. In relation to the secondary use of medical records in health research, the National Health Service (NHS) initiated a new strategy on the governance of patient information. These movements clearly demonstrated the indispensable nature of infrastructures for promoting and supporting pharmacological and medical research. We discuss the policies necessary for constructing the Japanese infrastructure. PMID:12693011

  3. Evaluation of the Ion Torrent Personal Genome Machine for Gene-Targeted Studies Using Amplicons of the Nitrogenase Gene nifH.

    PubMed

    Zhang, Bangzhou; Penton, C Ryan; Xue, Chao; Wang, Qiong; Zheng, Tianling; Tiedje, James M

    2015-07-01

    The sequencing chips and kits of the Ion Torrent Personal Genome Machine (PGM), which employs semiconductor technology to measure pH changes in polymerization events, have recently been upgraded. The quality of PGM sequences has not been reassessed, and results have not been compared in the context of a gene-targeted microbial ecology study. To address this, we compared sequence profiles across available PGM chips and chemistries and with 454 pyrosequencing data by determining error types and rates and diazotrophic community structures. The PGM was then used to assess differences in nifH-harboring bacterial community structure among four corn-based cropping systems. Using our suggested filters from mock community analyses, the overall error rates were 0.62, 0.36, and 0.39% per base for chips 318 and 314 with the 400-bp kit and chip 318 with the Hi-Q chemistry, respectively. Compared with the 400-bp kit, the Hi-Q kit reduced indel rates by 28 to 59% and produced one to seven times more reads acceptable for downstream analyses. The PGM produced higher frameshift rates than pyrosequencing that were corrected by the RDP FrameBot tool. Significant differences among platforms were identified, although the diversity indices and overall site-based conclusions remained similar. For the cropping system analyses, a total of 6,182 unique NifH operational taxonomic units at 5% amino acid dissimilarity were obtained. The current crop type, as well as the crop rotation history, significantly influenced the composition of the soil diazotrophic community detected. PMID:25911484

  4. Bioinformatics construction of the human cell surfaceome

    PubMed Central

    da Cunha, J. P. C.; Galante, P. A. F.; de Souza, J. E.; de Souza, R. F.; Carvalho, P. M.; Ohara, D. T.; Moura, R. P.; Oba-Shinja, S. M.; Marie, S. K. N.; Silva, W. A.; Perez, R. O.; Stransky, B.; Pieprzyk, M.; Moore, J.; Caballero, O.; Gama-Rodrigues, J.; Habr-Gama, A.; Kuo, W. P.; Simpson, A. J.; Camargo, A. A.; Old, Lloyd J.; de Souza, S. J.

    2009-01-01

    Cell surface proteins are excellent targets for diagnostic and therapeutic interventions. By using bioinformatics tools, we generated a catalog of 3,702 transmembrane proteins located at the surface of human cells (human cell surfaceome). We explored the genetic diversity of the human cell surfaceome at different levels, including the distribution of polymorphisms, conservation among eukaryotic species, and patterns of gene expression. By integrating expression information from a variety of sources, we were able to identify surfaceome genes with a restricted expression in normal tissues and/or differential expression in tumors, important characteristics for putative tumor targets. A high-throughput and efficient quantitative real-time PCR approach was used to validate 593 surfaceome genes selected on the basis of their expression pattern in normal and tumor samples. A number of candidates were identified as potential diagnostic and therapeutic targets for colorectal tumors and glioblastoma. Several candidate genes were also identified as coding for cell surface cancer/testis antigens. The human cell surfaceome will serve as a reference for further studies aimed at characterizing tumor targets at the surface of human cells. PMID:19805368

  5. Bioinformatic tools for microRNA dissection

    PubMed Central

    Akhtar, Most Mauluda; Micolucci, Luigina; Islam, Md Soriful; Olivieri, Fabiola; Procopio, Antonio Domenico

    2016-01-01

    Recently, microRNAs (miRNAs) have emerged as important elements of gene regulatory networks. MiRNAs are endogenous single-stranded non-coding RNAs (∼22-nt long) that regulate gene expression at the post-transcriptional level. Through pairing with mRNA, miRNAs can down-regulate gene expression by inhibiting translation or stimulating mRNA degradation. In some cases they can also up-regulate the expression of a target gene. MiRNAs influence a variety of cellular pathways that range from development to carcinogenesis. The involvement of miRNAs in several human diseases, particularly cancer, makes them potential diagnostic and prognostic biomarkers. Recent technological advances, especially high-throughput sequencing, have led to an exponential growth in the generation of miRNA-related data. A number of bioinformatic tools and databases have been devised to manage this growing body of data. We analyze 129 miRNA tools that are being used in diverse areas of miRNA research, to assist investigators in choosing the most appropriate tools for their needs. PMID:26578605

  6. Parallel evolutionary computation in bioinformatics applications.

    PubMed

    Pinho, Jorge; Sobral, João Luis; Rocha, Miguel

    2013-05-01

    A large number of optimization problems within the field of Bioinformatics require methods able to handle their inherent complexity (e.g. NP-hard problems) and also demand increased computational efforts. In this context, the use of parallel architectures is a necessity. In this work, we propose ParJECoLi, a Java-based library that offers a large set of metaheuristic methods (such as Evolutionary Algorithms) and also addresses the issue of their efficient execution on a wide range of parallel architectures. The proposed approach focuses on ease of use, making the adaptation to distinct parallel environments (multicore, cluster, grid) transparent to the user. Indeed, this work shows how the development of the optimization library can proceed independently of its adaptation for several architectures, making use of Aspect-Oriented Programming. The pluggable nature of parallelism-related modules allows users to easily configure their environment, adding parallelism modules to the base source code when needed. The performance of the platform is validated with two case studies within biological model optimization. PMID:23127284

  7. Structural bioinformatics of the human spliceosomal proteome

    PubMed Central

    Korneta, Iga; Magnus, Marcin; Bujnicki, Janusz M.

    2012-01-01

    In this work, we describe the results of a comprehensive structural bioinformatics analysis of the spliceosomal proteome. We used fold recognition analysis to complement prior data on the ordered domains of 252 human splicing proteins. Examples of newly identified domains include a PWI domain in the U5 snRNP protein 200K (hBrr2, residues 258–338), while examples of previously known domains with a newly determined fold include the DUF1115 domain of the U4/U6 di-snRNP protein 90K (hPrp3, residues 540–683). We also established a non-redundant set of experimental models of spliceosomal proteins, as well as constructed in silico models for regions without an experimental structure. The combined set of structural models is available for download. Altogether, over 90% of the ordered regions of the spliceosomal proteome can be represented structurally with a high degree of confidence. We analyzed the reduced spliceosomal proteome of the intron-poor organism Giardia lamblia, and as a result, we proposed a candidate set of ordered structural regions necessary for a functional spliceosome. The results of this work will aid experimental and structural analyses of the spliceosomal proteins and complexes, and can serve as a starting point for multiscale modeling of the structure of the entire spliceosome. PMID:22573172

  8. Identifiying human MHC supertypes using bioinformatic methods.

    PubMed

    Doytchinova, Irini A; Guan, Pingping; Flower, Darren R

    2004-04-01

    Classification of MHC molecules into supertypes in terms of peptide-binding specificities is an important issue, with direct implications for the development of epitope-based vaccines with wide population coverage. In view of extremely high MHC polymorphism (948 class I and 633 class II HLA alleles) the experimental solution of this task is presently impossible. In this study, we describe a bioinformatics strategy for classifying MHC molecules into supertypes using information drawn solely from three-dimensional protein structure. Two chemometric techniques, hierarchical clustering and principal component analysis, were used independently on a set of 783 HLA class I molecules to identify supertypes based on structural similarities and molecular interaction fields calculated for the peptide binding site. Eight supertypes were defined: A2, A3, A24, B7, B27, B44, C1, and C4. The two techniques gave 77% consensus, i.e., 605 HLA class I alleles were classified in the same supertype by both methods. The proposed strategy allowed "supertype fingerprints" to be identified. Thus, the A2 supertype fingerprint is Tyr(9)/Phe(9), Arg(97), and His(114) or Tyr(116); A3: Tyr(9)/Phe(9)/Ser(9), Ile(97)/Met(97), and Glu(114) or Asp(116); A24: Ser(9) and Met(97); B7: Asn(63) and Leu(81); B27: Glu(63) and Leu(81); B44: Ala(81); C1: Ser(77); and C4: Asn(77). PMID:15034046
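
    The consensus figure quoted in the abstract above (605 of 783 molecules, about 77%) is a simple agreement fraction between two label assignments. The Python sketch below shows one way such a fraction might be computed. The function name and the toy allele labels are hypothetical, and the sketch assumes the cluster labels from both methods have already been mapped to the same supertype names, a step the abstract does not detail.

```python
def consensus_fraction(labels_a: dict, labels_b: dict) -> float:
    """Fraction of shared alleles given the same supertype by both methods."""
    shared = labels_a.keys() & labels_b.keys()
    if not shared:
        raise ValueError("the two assignments have no alleles in common")
    agree = sum(1 for allele in shared if labels_a[allele] == labels_b[allele])
    return agree / len(shared)


# Hypothetical toy assignments, not data from the study.
by_clustering = {"allele_1": "A2", "allele_2": "A3", "allele_3": "B7", "allele_4": "B44"}
by_pca = {"allele_1": "A2", "allele_2": "A24", "allele_3": "B7", "allele_4": "B44"}
print(consensus_fraction(by_clustering, by_pca))  # 0.75

# Arithmetic check of the reported figure: 605 agreeing out of 783.
reported = 605 / 783
print(round(reported * 100))  # 77
```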

  9. Perspectives on Clinical Informatics: Integrating Large-Scale Clinical, Genomic, and Health Information for Clinical Care

    PubMed Central

    Choi, In Young; Kim, Tae-Min; Kim, Myung Shin; Mun, Seong K.

    2013-01-01

    The advances in electronic medical records (EMRs) and bioinformatics (BI) represent two significant trends in healthcare. The widespread adoption of EMR systems and the completion of the Human Genome Project developed the technologies for data acquisition, analysis, and visualization in two different domains. The massive amount of data from both clinical and biology domains is expected to provide personalized, preventive, and predictive healthcare services in the near future. The integrated use of EMR and BI data needs to consider four key informatics areas: data modeling, analytics, standardization, and privacy. Bioclinical data warehouses integrating heterogeneous patient-related clinical or omics data should be considered. The representative standardization effort by the Clinical Bioinformatics Ontology (CBO) aims to provide uniquely identified concepts to include molecular pathology terminologies. Since individual genome data are easily used to predict current and future health status, different safeguards to ensure confidentiality should be considered. In this paper, we focused on the informatics aspects of integrating the EMR community and BI community by identifying opportunities, challenges, and approaches to provide the best possible care service for our patients and the population. PMID:24465229

  10. Perspectives on clinical informatics: integrating large-scale clinical, genomic, and health information for clinical care.

    PubMed

    Choi, In Young; Kim, Tae-Min; Kim, Myung Shin; Mun, Seong K; Chung, Yeun-Jun

    2013-12-01

    The advances in electronic medical records (EMRs) and bioinformatics (BI) represent two significant trends in healthcare. The widespread adoption of EMR systems and the completion of the Human Genome Project developed the technologies for data acquisition, analysis, and visualization in two different domains. The massive amount of data from both clinical and biology domains is expected to provide personalized, preventive, and predictive healthcare services in the near future. The integrated use of EMR and BI data needs to consider four key informatics areas: data modeling, analytics, standardization, and privacy. Bioclinical data warehouses integrating heterogeneous patient-related clinical or omics data should be considered. The representative standardization effort by the Clinical Bioinformatics Ontology (CBO) aims to provide uniquely identified concepts to include molecular pathology terminologies. Since individual genome data are easily used to predict current and future health status, different safeguards to ensure confidentiality should be considered. In this paper, we focused on the informatics aspects of integrating the EMR community and BI community by identifying opportunities, challenges, and approaches to provide the best possible care service for our patients and the population. PMID:24465229

  11. Teaching Structural Bioinformatics at the Undergraduate Level

    ERIC Educational Resources Information Center

    Centeno, Nuria B.; Villa-Freixa, Jordi; Oliva, Baldomero

    2003-01-01

    Understanding the basic principles of structural biology is becoming a major subject of study in most undergraduate level programs in biology. In the genomic and proteomic age, it is becoming indispensable for biology students to master concepts related to the sequence and structure of proteins in order to develop skills that may be useful in a…

  12. Creating Bioinformatic Workflows within the BioExtract Server

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Computational workflows in bioinformatics are becoming increasingly important in the achievement of scientific advances. These workflows generally require access to multiple, distributed data sources and analytic tools. The requisite data sources may include large public data repositories, community...

  13. Survey of Natural Language Processing Techniques in Bioinformatics.

    PubMed

    Zeng, Zhiqiang; Shi, Hua; Wu, Yun; Hong, Zhiling

    2015-01-01

    Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationships can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function and detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers. PMID:26525745

  14. Survey of Natural Language Processing Techniques in Bioinformatics

    PubMed Central

    Zeng, Zhiqiang; Shi, Hua; Wu, Yun; Hong, Zhiling

    2015-01-01

    Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationships can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function and detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers. PMID:26525745

  15. Bioinformatics opportunities for identification and study of medicinal plants

    PubMed Central

    Sharma, Vivekanand

    2013-01-01

    Plants have been used as a source of medicine since historic times and several commercially important drugs are of plant-based origin. The traditional approach towards discovery of plant-based drugs often times involves significant amount of time and expenditure. These labor-intensive approaches have struggled to keep pace with the rapid development of high-throughput technologies. In the era of high volume, high-throughput data generation across the biosciences, bioinformatics plays a crucial role. This has generally been the case in the context of drug designing and discovery. However, there has been limited attention to date to the potential application of bioinformatics approaches that can leverage plant-based knowledge. Here, we review bioinformatics studies that have contributed to medicinal plants research. In particular, we highlight areas in medicinal plant research where the application of bioinformatics methodologies may result in quicker and potentially cost-effective leads toward finding plant-based remedies. PMID:22589384

  16. [An overview of feature selection algorithm in bioinformatics].

    PubMed

    Li, Xin; Ma, Li; Wang, Jinjia; Zhao, Chun

    2011-04-01

    Feature selection (FS) techniques have become an important tool in the bioinformatics field. Their core task is to select a low-dimensional set of significant features hidden in a high-dimensional data space, and thereby to reveal the underlying rules of the data. Bioinformatics data are typically high-dimensional with small sample sizes, so research on FS algorithms in bioinformatics has great prospects. In this article, we make the interested reader aware of the possibilities of feature selection, describe the basic properties of feature selection techniques, and discuss their uses in sequence analysis, microarray analysis, mass spectra analysis, etc. Finally, the current problems and prospects of feature selection algorithms in bioinformatics applications are also discussed. PMID:21604512

  17. Metagenomics and Bioinformatics in Microbial Ecology: Current Status and Beyond

    PubMed Central

    Hiraoka, Satoshi; Yang, Ching-chia; Iwasaki, Wataru

    2016-01-01

    Metagenomic approaches are now commonly used in microbial ecology to study microbial communities in more detail, including many strains that cannot be cultivated in the laboratory. Bioinformatic analyses make it possible to mine huge metagenomic datasets and discover general patterns that govern microbial ecosystems. However, the findings of typical metagenomic and bioinformatic analyses still do not completely describe the ecology and evolution of microbes in their environments. Most analyses still depend on straightforward sequence similarity searches against reference databases. We herein review the current state of metagenomics and bioinformatics in microbial ecology and discuss future directions for the field. New techniques will allow us to go beyond routine analyses and broaden our knowledge of microbial ecosystems. We need to enrich reference databases, promote platforms that enable meta- or comprehensive analyses of diverse metagenomic datasets, devise methods that utilize long-read sequence information, and develop more powerful bioinformatic methods to analyze data from diverse perspectives. PMID:27383682

  18. Bridging the Gap from Bench to Bedside--An Informatics Infrastructure for Integrating Clinical, Genomics and Environmental Data (ICGED).

    PubMed

    2015-01-01

    The abundance of heterogeneous biomedical data from a variety of sources demands the development of strategies to address data integration and management issues, so that the data can be used effectively in clinical practices and biomedical research. This research presents an Informatics Infrastructure for Integrating Clinical, Genomics and Environmental Data (ICGED) and provides a roadmap that envisions utilizing the clinical and biomedical resources in our case study. This work describes a data integration approach, proposed by ICGED, with a two-fold purpose: personalized medicine and a biomedical data storage and sharing platform. It describes our experiences integrating disease-specific clinical and genomics datasets with Data Integration and Analysis Tools (DIAT), using Informatics for Integrating Biology and the Bedside, and discusses work in progress and future work for extending DIAT, and the development of Risk Assessment and Prediction Tools, Clinical Decision Support Systems and a Bioinformatics Data Warehouse. PMID:26262353

  19. The 2015 Bioinformatics Open Source Conference (BOSC 2015)

    PubMed Central

    Harris, Nomi L.; Cock, Peter J. A.; Lapp, Hilmar

    2016-01-01

    The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included “Data Science;” “Standards and Interoperability;” “Open Science and Reproducibility;” “Translational Bioinformatics;” “Visualization;” and “Bioinformatics Open Source Project Updates”. In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled “Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community,” that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule. PMID:26914653

  20. Partnering for functional genomics research conference: Abstracts of poster presentations

    SciTech Connect

    1998-06-01

    This report contains abstracts of poster presentations given at the Functional Genomics Research Conference held April 16-17, 1998 in Oak Ridge, Tennessee. Attention is focused on the following areas: mouse mutagenesis and genomics; phenotype screening; gene expression analysis; DNA analysis technology development; bioinformatics; comparative analyses of mouse, human, and yeast sequences; and pilot projects to evaluate methodologies.