Science.gov

Sample records for gene discovery project

  1. Pine Gene Discovery Project - Final Report - 08/31/1997 - 02/28/2001

    SciTech Connect

    Whetten, R. W.; Sederoff, R. R.; Kinlaw, C.; Retzel, E.

    2001-04-30

    Integration of pines into the large scope of plant biology research depends on study of pines in parallel with study of annual plants, and on availability of research materials from pine to plant biologists interested in comparing pine with annual plant systems. The objectives of the Pine Gene Discovery Project were to obtain 10,000 partial DNA sequences of genes expressed in loblolly pine, to determine which of those pine genes were similar to known genes from other organisms, and to make the DNA sequences and isolated pine genes available to plant researchers to stimulate integration of pines into the wider scope of plant biology research. Those objectives have been completed, and the results are available to the public. Requests for pine genes have been received from a number of laboratories that would otherwise not have included pine in their research, indicating that progress is being made toward the goal of integrating pine research into the larger molecular biology research community.

  2. Tools for visualization and integration of intermediate sequencing results in large disease gene discovery projects.

    PubMed

    Rzhetsky, A; Kalachikov, S; Ye, X; Zhang, P; Russo, J J

    1998-02-16

    We describe two Java applets which are useful for insightful presentation of intermediate experimental data in gene discovery projects involving large scale sequencing. One of these applets provides a physical map of a genomic region and provides easy access to the second applet, which furnishes a detailed map of sequence contigs associated with clones on the physical map. In particular, the second applet displays all the known information about each contig, including the presence of exons, database homology 'hits', repetitive elements and other features; the graphics are linked to other World Wide Web pages, providing detailed information on each feature. These applets should be useful to other research groups working on large sequencing projects.

  3. False discoveries and models for gene discovery.

    PubMed

    van den Oord, Edwin J C G; Sullivan, Patrick F

    2003-10-01

    In the search for genes underlying complex traits, there is a tendency to impose increasingly stringent criteria to avoid false discoveries. These stringent criteria make it hard to find true effects, and we argue that it might be better to optimize our procedures for eliminating and controlling false discoveries. Focusing on achieving an acceptable ratio of true- and false-positives, we show that false discoveries could be eliminated much more efficiently using a stepwise approach. To avoid a relatively high false discovery rate, corrections for 'multiple testing' might also be needed in candidate gene studies. If the appropriate methods are used, detecting the proportion of true effects appears to be a more important determinant of the genotyping burden than the desired false discovery rate. This raises the question of whether current models for gene discovery are shaped excessively by a fear of false discoveries.
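The dependence of the false discovery rate on the proportion of true effects, noted in the abstract above, can be illustrated with a standard textbook approximation. The sketch below is not the authors' model; the function name `expected_fdr` and all parameter values are hypothetical and chosen only for illustration.

```python
def expected_fdr(alpha, power, prop_true):
    """Approximate expected false discovery rate for a batch of tests.

    alpha     -- per-test significance threshold (type I error rate)
    power     -- probability of detecting a true effect
    prop_true -- assumed proportion of tested markers with a true effect
    """
    false_pos = alpha * (1.0 - prop_true)   # expected share of false positives
    true_pos = power * prop_true            # expected share of true positives
    total = false_pos + true_pos
    return false_pos / total if total > 0 else 0.0


# Hypothetical scenarios: the same alpha gives very different FDRs
# depending on how many of the tested markers are truly associated.
for prop in (0.5, 0.1, 0.01):
    print(prop, round(expected_fdr(alpha=0.05, power=0.8, prop_true=prop), 3))
```

Under these assumptions, the threshold needed to reach a target false discovery rate is driven largely by the assumed proportion of true effects, which is consistent with the abstract's argument.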

  4. Independent Gene Discovery and Testing

    ERIC Educational Resources Information Center

    Palsule, Vrushalee; Coric, Dijana; Delancy, Russell; Dunham, Heather; Melancon, Caleb; Thompson, Dennis; Toms, Jamie; White, Ashley; Shultz, Jeffry

    2010-01-01

    A clear understanding of basic gene structure is critical when teaching molecular genetics, the central dogma and the biological sciences. We sought to create a gene-based teaching project to improve students' understanding of gene structure and to integrate this into a research project that can be implemented by instructors at the secondary level…

  5. Gene and enhancer traps for gene discovery.

    PubMed

    Rojas-Pierce, Marcela; Springer, Patricia S

    2003-01-01

    Gene traps and enhancer traps provide a valuable tool for gene discovery. With this system, genes can be identified based solely on the expression pattern of an inserted reporter gene. The use of a reporter gene, such as beta-glucuronidase (GUS), provides a very sensitive assay for the identification of tissue- and cell-type specific expression patterns. In this chapter, protocols for examining and documenting GUS reporter gene activity in individual lines are described. Methods for the amplification of sequences flanking transposant insertions and subsequent molecular and genetic characterization of individual insertions are provided.

  6. Metagenomics and novel gene discovery

    PubMed Central

    Culligan, Eamonn P; Sleator, Roy D; Marchesi, Julian R; Hill, Colin

    2014-01-01

    Metagenomics provides a means of assessing the total genetic pool of all the microbes in a particular environment, in a culture-independent manner. It has revealed unprecedented diversity in microbial community composition, which is further reflected in the encoded functional diversity of the genomes, a large proportion of which consists of novel genes. Herein, we review both sequence-based and functional metagenomic methods to uncover novel genes and outline some of the associated problems of each type of approach, as well as potential solutions. Furthermore, we discuss the potential for metagenomic biotherapeutic discovery, with a particular focus on the human gut microbiome and finally, we outline how the discovery of novel genes may be used to create bioengineered probiotics. PMID:24317337

  7. Human brain evolution: from gene discovery to phenotype discovery.

    PubMed

    Preuss, Todd M

    2012-06-26

    The rise of comparative genomics and related technologies has added important new dimensions to the study of human evolution. Our knowledge of the genes that underwent expression changes or were targets of positive selection in human evolution is rapidly increasing, as is our knowledge of gene duplications, translocations, and deletions. It is now clear that the genetic differences between humans and chimpanzees are far more extensive than previously thought; their genomes are not 98% or 99% identical. Despite the rapid growth in our understanding of the evolution of the human genome, our understanding of the relationship between genetic changes and phenotypic changes is tenuous. This is true even for the most intensively studied gene, FOXP2, which underwent positive selection in the human terminal lineage and is thought to have played an important role in the evolution of human speech and language. In part, the difficulty of connecting genes to phenotypes reflects our generally poor knowledge of human phenotypic specializations, as well as the difficulty of interpreting the consequences of genetic changes in species that are not amenable to invasive research. On the positive side, investigations of FOXP2, along with genomewide surveys of gene-expression changes and selection-driven sequence changes, offer the opportunity for "phenotype discovery," providing clues to human phenotypic specializations that were previously unsuspected. What is more, at least some of the specializations that have been proposed are amenable to testing with noninvasive experimental techniques appropriate for the study of humans and apes.

  8. Phenotypic mutant library: potential for gene discovery

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The rapid development of high-throughput and affordable Next-Generation Sequencing (NGS) techniques has renewed interest in gene discovery using forward genetics. The conventional forward genetic approach starts with isolation of mutants with a phenotype of interest, mapping the mutation within a s...

  9. The Helioviewer Project: Discovery For Everyone Everywhere

    NASA Astrophysics Data System (ADS)

    Ireland, Jack; Hughitt, K.; Müller, D.; Dimitoglou, G.; Schmiedel, P.; Fleck, B.

    2009-05-01

    There is an ever-increasing amount of solar and heliospheric data gathered from multiple sources such as space-based facilities and ground-based observatories. There are also multiple feature and event catalogs arising from human and computer-based detection methods. The Helioviewer Project is developing a suite of technologies to allow users around the world to visualize, browse and access these heterogeneous datasets in an intuitive and highly customizable fashion. Helioviewer technologies are based on the JPEG2000 file format, an extremely flexible format that allows for the efficient transfer of data (and meta-data, such as FITS keywords) between client and server. Rather than having to download an entire image and then examine the small portion (for example, an active region) that you are interested in, the JPEG2000 file format lets you preferentially download only those portions you are interested in. This dramatically reduces the amount of data transferred, making possible responsive and flexible scientific discovery applications that can browse populous archives of large images, such as those from the Solar Dynamics Observatory. In addition, the Helioviewer Project is designed to be flexible and extensible to data sources as they become available. Helioviewer.org (www.helioviewer.org) works seamlessly with the Virtual Solar Observatory (VSO) whilst an application programming interface (API) is being developed for interaction with the Solar Dynamics Observatory Heliophysics Event Knowledgebase. After a short introduction to the underlying technology, a live demonstration of the web application www.helioviewer.org will be given. We will also comment on other client applications (JHelioviewer, a Java-based browse tool), and the application of Helioviewer technology to existing and future solar and heliospheric data and feature/event repositories. This project is funded by NASA VxO and LWS awards and
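The idea of downloading only the portion of a large image that covers a region of interest can be sketched with a simple tile calculation. This is an illustration only, assuming a square-tiled image; it is not Helioviewer's implementation or a JPEG2000 API, and the function name `tiles_for_region`, the 4096-pixel image size and the 512-pixel tile size are all hypothetical.

```python
def tiles_for_region(x0, y0, x1, y1, tile_size):
    """Return (col, row) indices of tiles overlapping a pixel box.

    The box is half-open: it covers x0 <= x < x1 and y0 <= y < y1.
    """
    if x1 <= x0 or y1 <= y0 or tile_size <= 0:
        raise ValueError("region must be non-empty and tile_size positive")
    first_col, last_col = x0 // tile_size, (x1 - 1) // tile_size
    first_row, last_row = y0 // tile_size, (y1 - 1) // tile_size
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]


# Hypothetical example: an active region inside a 4096 x 4096 image
# that is split into 512 x 512 tiles.
image_size, tile_size = 4096, 512
needed = tiles_for_region(1000, 2000, 1400, 2300, tile_size)
total_tiles = (image_size // tile_size) ** 2
print(len(needed), "of", total_tiles, "tiles needed")
```

In this hypothetical case only a small fraction of the tiles overlaps the region, which is the kind of saving in transferred data that the abstract describes.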

  10. Pathway-driven discovery of epilepsy genes

    PubMed Central

    Noebels, Jeffrey

    2016-01-01

    Epilepsy genes deliver critical insights into the molecular control of brain synchronization and are revolutionizing our understanding and treatment of the disease. The epilepsy-associated genome is rapidly expanding, and two powerful complementary approaches, isolation of de novo exome variants in patients and targeted mutagenesis in model systems, account for the steep increase. In sheer number, the tally of genes linked to seizures will likely match that of cancer and exceed it in biological diversity. The proteins act within most intracellular compartments and span the molecular determinants of firing and wiring in the developing brain. Every facet of neurotransmission, from dendritic spine to exocytotic machinery, is in play, and defects of synaptic inhibition are over-represented. The contributions of somatic mutations and noncoding microRNAs are also being explored. The functional spectrum of established epilepsy genes and the arrival of rapid, precise technologies for genome editing now provide a robust scaffold to prioritize hypothesis-driven discovery and further populate this genetic proto-map. Although each gene identified offers translational potential to stratify patient care, the complexity of individual variation and covert actions of genetic modifiers may confound single-gene solutions for the clinical disorder. In vivo genetic deconstruction of epileptic networks, ex vivo validation of variant profiles in patient-derived induced pluripotent stem cells, in silico variant modeling and modifier gene discovery, now in their earliest stages, will help clarify individual patterns. Because seizures stand at the crossroads of all neuronal synchronization disorders in the developing and aging brain, the neurobiological analysis of epilepsy-associated genes provides an extraordinary gateway to new insights into higher cortical function. PMID:25710836

  11. Biomarker Gene Signature Discovery Integrating Network Knowledge

    PubMed Central

    Cun, Yupeng; Fröhlich, Holger

    2012-01-01

    Discovery of prognostic and diagnostic biomarker gene signatures for diseases, such as cancer, is seen as a major step towards better personalized medicine. During the last decade, various methods, mainly from the machine learning or statistical domain, have been proposed for that purpose. However, one important obstacle to making gene signatures a standard tool in clinical diagnosis is the typically low reproducibility of these signatures, combined with the difficulty of achieving a clear biological interpretation. For that reason, in recent years there has been growing interest in approaches that try to integrate information from molecular interaction networks. Here we review the current state of research in this field by giving an overview of the approaches proposed so far. PMID:24832044

  12. Genome-enabled Discovery of Carbon Sequestration Genes

    SciTech Connect

    Tuskan, Gerald A; Tschaplinski, Timothy J; Kalluri, Udaya C; Yin, Tongming; Yang, Xiaohan; Zhang, Xinye; Engle, Nancy L; Ranjan, Priya; Basu, Manojit M; Gunter, Lee E; Jawdy, Sara; Martin, Madhavi Z; Campbell, Alina S; DiFazio, Stephen P; Davis, John M; Hinchee, Maud; Pinnacchio, Christa; Meilan, R; Busov, V.; Strauss, S

    2009-01-01

    The fate of carbon below ground is likely to be a major factor determining the success of carbon sequestration strategies involving plants. Despite their importance, molecular processes controlling belowground C allocation and partitioning are poorly understood. This project is leveraging the Populus trichocarpa genome sequence to discover genes important to C sequestration in plants and soils. The focus is on the identification of genes that provide key control points for the flow and chemical transformations of carbon in roots, concentrating on genes that control the synthesis of chemical forms of carbon that result in slower turnover rates of soil organic matter (i.e., increased recalcitrance). We propose to enhance carbon allocation and partitioning to roots by 1) modifying the auxin signaling pathway, and the invertase family, which controls sucrose metabolism, and by 2) increasing root proliferation through transgenesis with genes known to control fine root proliferation (e.g., ANT), 3) increasing the production of recalcitrant C metabolites by identifying genes controlling secondary C metabolism by a major mQTL-based gene discovery effort, and 4) increasing aboveground productivity by enhancing drought tolerance to achieve maximum C sequestration. This broad, integrated approach is aimed at ultimately enhancing root biomass as well as root detritus longevity, providing the best prospects for significant enhancement of belowground C sequestration.

  13. Gene Discoveries Offer New Height Insights

    MedlinePlus


  14. Implementation of Discovery Projects in Statistics

    ERIC Educational Resources Information Center

    Bailey, Brad; Spence, Dianna J.; Sinn, Robb

    2013-01-01

    Researchers and statistics educators consistently suggest that students will learn statistics more effectively by conducting projects through which they actively engage in a broad spectrum of tasks integral to statistical inquiry, in the authentic context of a real-world application. In keeping with these findings, we share an implementation of…

  15. Standardized Plant Disease Evaluations will Enhance Resistance Gene Discovery

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Gene discovery and marker development using DNA based tools require plant populations with well-documented phenotypes. Related crops such as apples and pears may share a number of genes, for example resistance to common diseases, and data mining in one crop may reveal genes for the other. However, u...

  16. Setting up a kinase discovery and development project.

    PubMed

    Bollag, Gideon

    2012-01-01

    Discovery of novel kinase inhibitors has matured rapidly over the last decade. Paramount to the successful development of kinase inhibitors is appropriate selectivity for validated targets. Many different approaches have been applied over the years, with varied results. There are currently thirteen different small molecule protein kinase inhibitors on the marketplace. Interestingly, a majority of these compounds lack precise selectivity for specific targets. This will change in the coming years, as technology for achieving improved selectivity becomes more widely applied. This chapter will focus on some of the critical considerations in setting up a kinase discovery and development project, citing examples particularly targeting the Raf kinases.

  17. Discovery of Tumor Suppressor Gene Function.

    ERIC Educational Resources Information Center

    Oppenheimer, Steven B.

    1995-01-01

    This is an update of a 1991 review on tumor suppressor genes written at a time when understanding of how the genes work was limited. A recent major breakthrough in the understanding of the function of tumor suppressor genes is discussed. (LZ)

  18. SNP marker discovery in koala TLR genes.

    PubMed

    Cui, Jian; Frankham, Greta J; Johnson, Rebecca N; Polkinghorne, Adam; Timms, Peter; O'Meally, Denis; Cheng, Yuanyuan; Belov, Katherine

    2015-01-01

    Toll-like receptors (TLRs) play a crucial role in the early defence against invading pathogens, yet our understanding of TLRs in marsupial immunity is limited. Here, we describe the characterisation of nine TLRs from a koala immune tissue transcriptome and one TLR from a draft sequence of the koala genome and the subsequent development of an assay to study genetic diversity in these genes. We surveyed genetic diversity in 20 koalas from New South Wales, Australia and showed that one gene, TLR10, is monomorphic, while the other nine TLR genes have between two and 12 alleles. Forty SNPs (16 non-synonymous) were identified across the ten TLR genes. These markers provide a springboard to future studies on innate immunity in the koala, a species under threat from two major infectious diseases.

  19. A Discovery Lab for Studying Gene Regulation.

    ERIC Educational Resources Information Center

    Moss, Robert

    1997-01-01

    Presents a laboratory in which students are provided with cultures of three bacterial strains. Using the results, students will determine which of the strains corresponds to a mutant lacking a particular functional gene. (DDR)

  20. GWATCH: a web platform for automated gene association discovery analysis

    PubMed Central

    2014-01-01

    Background: As genome-wide sequence analyses for complex human disease determinants are expanding, it is increasingly necessary to develop strategies to promote discovery and validation of potential disease-gene associations. Findings: Here we present a dynamic web-based platform, GWATCH, that automates and facilitates four steps in genetic epidemiological discovery: 1) Rapid gene association search and discovery analysis of large genome-wide datasets; 2) Expanded visual display of gene associations for genome-wide variants (SNPs, indels, CNVs), including Manhattan plots, 2D and 3D snapshots of any gene region, and a dynamic genome browser illustrating gene association chromosomal regions; 3) Real-time validation/replication of candidate or putative genes suggested from other sources, limiting Bonferroni genome-wide association study (GWAS) penalties; 4) Open data release and sharing by eliminating privacy constraints (The National Human Genome Research Institute (NHGRI) Institutional Review Board (IRB), informed consent, The Health Insurance Portability and Accountability Act (HIPAA) of 1996 etc.) on unabridged results, which allows for open access comparative and meta-analysis. Conclusions: GWATCH is suitable for both GWAS and whole genome sequence association datasets. We illustrate the utility of GWATCH with three large genome-wide association studies for HIV-AIDS resistance genes screened in large multicenter cohorts; however, association datasets from any study can be uploaded and analyzed by GWATCH. PMID:25374661
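The remark about limiting Bonferroni penalties by validating candidate genes can be made concrete with the standard Bonferroni correction, in which the per-test threshold is the family-wise error rate divided by the number of tests. The sketch below is not GWATCH code; the function name `bonferroni_threshold` and the test counts (one million variants, twenty candidates) are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Hypothetical comparison: a genome-wide scan versus replication
# of a short list of candidate variants suggested by other sources.
genome_wide = bonferroni_threshold(alpha=0.05, n_tests=1_000_000)
candidates = bonferroni_threshold(alpha=0.05, n_tests=20)
print(f"genome-wide threshold: {genome_wide:.1e}")
print(f"candidate threshold:   {candidates:.1e}")
```

Testing only a pre-specified candidate set leaves a far less stringent per-test threshold than a full genome-wide scan, which is the sense in which replication limits the penalty.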

  1. Standardized plant disease evaluations will enhance resistance gene discovery

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Gene discovery and marker development using DNA-based tools require plant populations with well documented phenotypes. If dissimilar phenotype evaluation methods or data scoring techniques are employed with different crops, or at different labs for the same crops, then data mining for genetic marker...

  2. Future Mission Proposal Opportunities: Discovery, New Frontiers, and Project Prometheus

    NASA Technical Reports Server (NTRS)

    Niebur, S. M.; Morgan, T. H.; Niebur, C. S.

    2003-01-01

    The NASA Office of Space Science is expanding opportunities to propose missions to comets, asteroids, and other solar system targets. The Discovery Program continues to be popular, with two sample return missions, Stardust and Genesis, currently in operation. The New Frontiers Program, a new proposal opportunity modeled on the successful Discovery Program, begins this year with the release of its first Announcement of Opportunity. Project Prometheus, a program to develop nuclear electric power and propulsion technology intended to enable a new class of high-power, high-capability investigations, is a third opportunity to propose solar system exploration. All three classes of mission include a commitment to provide data to the Planetary Data System, any samples to the NASA Curatorial Facility at Johnson Space Center, and programs for education and public outreach.

  3. Technology development for gene discovery and full-length sequencing

    SciTech Connect

    Marcelo Bento Soares

    2004-07-19

    In previous years, with support from the U.S. Department of Energy, we developed methods for construction of normalized and subtracted cDNA libraries, and constructed hundreds of high-quality libraries for production of Expressed Sequence Tags (ESTs). Our clones were made widely available to the scientific community through the IMAGE Consortium, and millions of ESTs were produced from our libraries either by collaborators or by our own sequencing laboratory at the University of Iowa. During this grant period, we focused on (1) the development of a method for preferential cloning of tissue-specific and/or rare transcripts, (2) its utilization to expedite EST-based gene discovery for the NIH Mouse Brain Molecular Anatomy Project, (3) further development and optimization of a method for construction of full-length-enriched cDNA libraries, and (4) modification of a plasmid vector to maximize efficiency of full-length cDNA sequencing by the transposon-mediated approach. It is noteworthy that the technology developed for preferential cloning of rare mRNAs enabled identification of over 2,000 mouse transcripts differentially expressed in the hippocampus. In addition, the method that we optimized for construction of full-length-enriched cDNA libraries was successfully utilized for the production of approximately fifty libraries from the developing mouse nervous system, from which over 2,500 full-ORF-containing cDNAs have been identified and accurately sequenced in their entirety either by our group or by the NIH-Mammalian Gene Collection Program Sequencing Team.

  4. A practical drug discovery project at the undergraduate level.

    PubMed

    Fray, M Jonathan; Macdonald, Simon J F; Baldwin, Ian R; Barton, Nick; Brown, Jack; Campbell, Ian B; Churcher, Ian; Coe, Diane M; Cooper, Anthony W J; Craven, Andrew P; Fisher, Gail; Inglis, Graham G A; Kelly, Henry A; Liddle, John; Maxwell, Aoife C; Patel, Vipulkumar K; Swanson, Stephen; Wellaway, Natalie

    2013-12-01

    In this article, we describe a practical drug discovery project for third-year undergraduates. No previous knowledge of medicinal chemistry is assumed. Initial lecture workshops cover the basic principles; then students, in teams, seek to improve the profile of a weakly potent, insoluble phosphatidylinositide 3-kinase delta (PI3Kδ) inhibitor (1) through compound array design, molecular modelling, screening data analysis and the synthesis of target compounds in the laboratory. The project benefits from significant industrial support, including lectures, student mentoring and consumables. The aim is to make the learning experience as close as possible to real-life industrial situations. In total, 48 target compounds were prepared, the best of which (5b, 5j, 6b and 6ap) improved the potency and aqueous solubility of the lead compound (1) by 100-1000 fold and ≥tenfold, respectively.

  5. Distant collaboration in drug discovery: The LINK3D project

    NASA Astrophysics Data System (ADS)

    Pastor, Manuel; Benedetti, Paolo; Carotti, Angelo; Carrieri, Antonio; Díaz, Carlos; Herráiz, Cristina; Höltje, Hans-Dieter; Loza, M. Isabel; Oprea, Tudor; Padín, Fernando; Pubill, Francesc; Sanz, Ferran; Stoll, Friederike; the LINK3D Consortium

    2002-11-01

    The work describes the development of novel software supporting synchronous distant collaboration between scientists involved in drug discovery and development projects. The program allows users to visualize and share data as well as to interact in real time using standard intranets and Internet resources. Direct visualization of 2D and 3D molecular structures is supported and original tools for facilitating remote discussion have been integrated. The software is multiplatform (MS-Windows, SGI-IRIX, Linux), allowing for a seamless integration of heterogeneous working environments. The project aims to support collaboration both within and between academic and industrial institutions. Since confidentiality is very important in some scenarios, special attention has been paid to security aspects. The article presents the research carried out to gather the requirements of collaborative software in the field of drug discovery and development and describes the features of the first fully functional prototype obtained. Real-world testing activities carried out on this prototype in order to guarantee its adequacy in diverse environments are also described and discussed.

  6. INTEGRATE: gene fusion discovery using whole genome and transcriptome data

    PubMed Central

    Zhang, Jin; White, Nicole M.; Schmidt, Heather K.; Fulton, Robert S.; Tomlinson, Chad; Warren, Wesley C.; Wilson, Richard K.; Maher, Christopher A.

    2016-01-01

    While next-generation sequencing (NGS) has become the primary technology for discovering gene fusions, we are still faced with the challenge of ensuring that causative mutations are not missed while minimizing false positives. Currently, there are many computational tools that predict structural variations (SV) and gene fusions using whole genome (WGS) and transcriptome sequencing (RNA-seq) data separately. However, as both WGS and RNA-seq have their limitations when used independently, we hypothesize that the orthogonal validation from integrating both data types could generate a sensitive and specific approach for detecting high-confidence gene fusion predictions. Fortunately, decreasing NGS costs have resulted in a growing number of patients with both data types available. Therefore, we developed a gene fusion discovery tool, INTEGRATE, that leverages both RNA-seq and WGS data to reconstruct gene fusion junctions and genomic breakpoints by split-read mapping. To evaluate INTEGRATE, we compared it with eight additional gene fusion discovery tools using the well-characterized breast cell line HCC1395 and peripheral blood lymphocytes derived from the same patient (HCC1395BL). The predictions subsequently underwent a targeted validation leading to the discovery of 131 novel fusions in addition to the seven previously reported fusions. Overall, INTEGRATE missed only six of the 138 validated fusions and had the highest accuracy of the nine tools evaluated. Additionally, we applied INTEGRATE to 62 breast cancer patients from The Cancer Genome Atlas (TCGA) and found multiple recurrent gene fusions, including a subset involving estrogen receptor. Taken together, INTEGRATE is a highly sensitive and accurate tool that is freely available for academic use. PMID:26556708

  7. INTEGRATE: gene fusion discovery using whole genome and transcriptome data.

    PubMed

    Zhang, Jin; White, Nicole M; Schmidt, Heather K; Fulton, Robert S; Tomlinson, Chad; Warren, Wesley C; Wilson, Richard K; Maher, Christopher A

    2016-01-01

    While next-generation sequencing (NGS) has become the primary technology for discovering gene fusions, we are still faced with the challenge of ensuring that causative mutations are not missed while minimizing false positives. Currently, there are many computational tools that predict structural variations (SV) and gene fusions using whole genome (WGS) and transcriptome sequencing (RNA-seq) data separately. However, as both WGS and RNA-seq have their limitations when used independently, we hypothesize that the orthogonal validation from integrating both data types could generate a sensitive and specific approach for detecting high-confidence gene fusion predictions. Fortunately, decreasing NGS costs have resulted in a growing number of patients with both data types available. Therefore, we developed a gene fusion discovery tool, INTEGRATE, that leverages both RNA-seq and WGS data to reconstruct gene fusion junctions and genomic breakpoints by split-read mapping. To evaluate INTEGRATE, we compared it with eight additional gene fusion discovery tools using the well-characterized breast cell line HCC1395 and peripheral blood lymphocytes derived from the same patient (HCC1395BL). The predictions subsequently underwent a targeted validation leading to the discovery of 131 novel fusions in addition to the seven previously reported fusions. Overall, INTEGRATE missed only six of the 138 validated fusions and had the highest accuracy of the nine tools evaluated. Additionally, we applied INTEGRATE to 62 breast cancer patients from The Cancer Genome Atlas (TCGA) and found multiple recurrent gene fusions, including a subset involving estrogen receptor. Taken together, INTEGRATE is a highly sensitive and accurate tool that is freely available for academic use.

  8. Mouse models for the discovery of colorectal cancer driver genes

    PubMed Central

    Clark, Christopher R; Starr, Timothy K

    2016-01-01

    Colorectal cancer (CRC) constitutes a major public health problem as the third most commonly diagnosed and third most lethal malignancy worldwide. The prevalence and physical accessibility of colorectal tumors have made CRC an ideal model for the study of tumor genetics. Early research efforts using patient-derived CRC samples led to the discovery of several highly penetrant mutations (e.g., APC, KRAS, MMR genes) in both hereditary and sporadic CRC tumors. This knowledge has enabled researchers to develop genetically engineered and chemically induced tumor models of CRC, both of which have had a substantial impact on our understanding of the molecular basis of CRC. Despite these advances, the morbidity and mortality of CRC remain a cause for concern and highlight the need to uncover novel genetic drivers of CRC. This review focuses on mouse models of CRC with particular emphasis on a newly developed cancer gene discovery tool, the Sleeping Beauty transposon-based mutagenesis model of CRC. PMID:26811627

  9. Mouse models for the discovery of colorectal cancer driver genes.

    PubMed

    Clark, Christopher R; Starr, Timothy K

    2016-01-14

    Colorectal cancer (CRC) constitutes a major public health problem as the third most commonly diagnosed and third most lethal malignancy worldwide. The prevalence and the physical accessibility of colorectal tumors have made CRC an ideal model for the study of tumor genetics. Early research efforts using patient-derived CRC samples led to the discovery of several highly penetrant mutations (e.g., APC, KRAS, MMR genes) in both hereditary and sporadic CRC tumors. This knowledge has enabled researchers to develop genetically engineered and chemically induced tumor models of CRC, both of which have had a substantial impact on our understanding of the molecular basis of CRC. Despite these advances, the morbidity and mortality of CRC remain a cause for concern and highlight the need to uncover novel genetic drivers of CRC. This review focuses on mouse models of CRC with particular emphasis on a newly developed cancer gene discovery tool, the Sleeping Beauty transposon-based mutagenesis model of CRC.

  10. Comparative Oncogenomics for Peripheral Nerve Sheath Cancer Gene Discovery

    DTIC Science & Technology

    2015-06-01

    Steven L. Carroll, MD, PhD RECIPIENT: Medical University of South Carolina Charleston, SC 29425 REPORT DATE: June 2015 TYPE OF REPORT: Annual...Peripheral Nerve Sheath Cancer Gene Discovery 5a. CONTRACT NUMBER 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6. AUTHOR(S) Steven L. Carroll 5d...2014, when Dr. Carroll left UAB to assume a position as Professor and Chair of the Department of Pathology and Laboratory Medicine at the Medical

  11. Genomic advances for gene discovery in hereditary hearing loss.

    PubMed

    Avraham, Karen B; Kanaan, Moien

    2012-09-07

    High-throughput sequencing is changing the face of genetic diagnosis and counseling. While in the past, it would take on average 1 to 5 years to identify a mutation leading to deafness, today, the genetic basis for deafness can be determined within months in a child or adult with inherited hearing loss. Obstacles and challenges still remain, but the field is changing at a dramatic rate, making gene discovery a much easier and more efficient task than in the past.

  12. Integrated project views: decision support platform for drug discovery project teams.

    PubMed

    Baede, Eric J; den Bekker, Ernest; Boiten, Jan-Willem; Cronin, Deborah; van Gammeren, Rob; de Vlieg, Jacob

    2012-06-25

    Drug discovery teams continuously have to decide which compounds to progress and which experiments to perform next, but the data required to make informed decisions is often scattered, inaccessible, or inconsistent. In particular, data tend to be stored and represented in a compound-centric or assay-centric manner rather than project-centric as often needed for effective use in drug discovery teams. The Integrated Project Views (IPV) system has been created to fill this gap; it integrates and consolidates data from various sources in a project-oriented manner. Its automatic gathering and updating of project data not only ensures that the information is comprehensive and available on a timely basis, but also improves the data consistency. Due to the lack of suitable off-the-shelf solutions, we were prompted to develop custom functionality and algorithms geared specifically to our drug discovery decision making process. In 10 years of usage, the resulting IPV application has become very well-accepted and appreciated, which is perhaps best evidenced by the observation that standalone Excel spreadsheets are largely eliminated from project team meetings.

  13. Advances in Pig Genomics and Functional Gene Discovery

    PubMed Central

    2003-01-01

    Advances in pig gene identification, mapping and functional analysis have continued to make rapid progress. The porcine genetic linkage map now has nearly 3000 loci, including several hundred genes, and is likely to expand considerably in the next few years, with many more genes and amplified fragment length polymorphism (AFLP) markers being added to the map. The physical genetic map is also growing rapidly and has over 3000 genes and markers. Several recent quantitative trait loci (QTL) scans and candidate gene analyses have identified important chromosomal regions and individual genes associated with traits of economic interest. The commercial pig industry is actively using this information and traditional performance information to improve pig production by marker-assisted selection (MAS). Research to study the co-expression of thousands of genes is now advancing and methods to combine these approaches to aid in gene discovery are under way. The pig's role in xenotransplantation and biomedical research makes the study of its genome important for the study of human disease. This review will briefly describe advances made, directions for future research and the implications for both the pig industry and human health. PMID:18629119

  14. Discovery of a widely distributed toxin biosynthetic gene cluster

    PubMed Central

    Lee, Shaun W.; Mitchell, Douglas A.; Markley, Andrew L.; Hensler, Mary E.; Gonzalez, David; Wohlrab, Aaron; Dorrestein, Pieter C.; Nizet, Victor; Dixon, Jack E.

    2008-01-01

    Bacteriocins represent a large family of ribosomally produced peptide antibiotics. Here we describe the discovery of a widely conserved biosynthetic gene cluster for the synthesis of thiazole and oxazole heterocycles on ribosomally produced peptides. These clusters encode a toxin precursor and all necessary proteins for toxin maturation and export. Using the toxin precursor peptide and heterocycle-forming synthetase proteins from the human pathogen Streptococcus pyogenes, we demonstrate the in vitro reconstitution of streptolysin S activity. We provide evidence that the synthetase enzymes, as predicted from our bioinformatics analysis, introduce heterocycles onto precursor peptides, thereby providing molecular insight into the chemical structure of streptolysin S. Furthermore, our studies reveal that the synthetase exhibits relaxed substrate specificity and modifies toxin precursors from both related and distant species. Given our findings, it is likely that the discovery of similar peptidic toxins will rapidly expand to existing and emerging genomes. PMID:18375757

  15. The emerging role for rat models in gene discovery

    PubMed Central

    Dwinell, Melinda R.; Lazar, Jozef; Geurts, Aron M.

    2011-01-01

    Rat models have been used for many decades to study physiological and pathophysiological mechanisms. Prior to the release of the rat genome and new technologies for targeting gene manipulation, the rat had been the underdog in the genomics era, despite the abundance of physiological data compared to the mouse. The overarching goal of biomedical research is to improve health and advance medical science. Translating human disease gene discovery and validation in the rat, through the use of emerging technologies and integrated tools and databases, is providing power to understand the genetics, environmental influences, and biology of disease. In this review, we will briefly outline the rat models, bioinformatic tools, and technologies that are changing the landscape of translational research. The strategies used to translate disease traits to genes to function, and ultimately, to improve human health will be discussed. Finally, our perspectives on how rat models will continue to positively impact biomedical research will be provided. PMID:21732192

  16. Gene discovery in amyotrophic lateral sclerosis: implications for clinical management.

    PubMed

    Al-Chalabi, Ammar; van den Berg, Leonard H; Veldink, Jan

    2017-02-01

    Amyotrophic lateral sclerosis (ALS) is a neurodegenerative disease predominantly affecting upper and lower motor neurons. The disease leads to relentlessly progressive weakness of voluntary muscles, with death typically resulting from diaphragmatic failure within 2-5 years. Since the discovery of mutations in SOD1, which account for ∼2% of ALS cases, increasing efforts have been made to understand the genetic component of ALS risk, with the expectation that this insight will not only aid diagnosis and classification, but also guide personalized treatment and reveal the mechanisms that cause motor neuron death. In this Review, we outline previous and current efforts to characterize genes that are associated with ALS, describe current knowledge about the genetic architecture of ALS - including the relevance of family history - and the probable nature of future gene discoveries, and explore how our understanding of ALS genetics affects present and future clinical decisions. We observe that many gene variants associated with ALS have effect sizes between those of mutations that greatly increase risk and those of common variants that have a small effect on risk, and combine this observation with insights from next-generation sequencing to explore the implications for genetic counselling.

  17. Neuropsychiatric genomics in precision medicine: diagnostics, gene discovery, and translation

    PubMed Central

    Need, Anna C.; Goldstein, David B.

    2016-01-01

    Only a few years after its development, next-generation sequencing is rapidly becoming an essential part of clinical care for patients with serious neurological conditions, especially in the diagnosis of early-onset and severe presentations. Beyond this diagnostic role, there has been an explosion in definitive gene discovery in a range of neuropsychiatric diseases. This is providing new pointers to underlying disease biology and is beginning to outline a new framework for genetic stratification of neuropsychiatric disease, with clear relevance to both individual treatment optimization and clinical trial design. Here, we outline these developments and chart the expected impact on the treatment of neurological, neurodevelopmental, and psychiatric disease. PMID:27757059

  18. Neuropsychiatric genomics in precision medicine: diagnostics, gene discovery, and translation.

    PubMed

    Need, Anna C; Goldstein, David B

    2016-09-01

    Only a few years after its development, next-generation sequencing is rapidly becoming an essential part of clinical care for patients with serious neurological conditions, especially in the diagnosis of early-onset and severe presentations. Beyond this diagnostic role, there has been an explosion in definitive gene discovery in a range of neuropsychiatric diseases. This is providing new pointers to underlying disease biology and is beginning to outline a new framework for genetic stratification of neuropsychiatric disease, with clear relevance to both individual treatment optimization and clinical trial design. Here, we outline these developments and chart the expected impact on the treatment of neurological, neurodevelopmental, and psychiatric disease.

  19. Discovery and New Frontiers Project Budget Analysis Tool

    NASA Technical Reports Server (NTRS)

    Newhouse, Marilyn E.

    2011-01-01

    The Discovery and New Frontiers (D&NF) programs are multi-project, uncoupled programs that currently comprise 13 missions in phases A through F. The ability to fly frequent science missions to explore the solar system is the primary measure of program success. The program office uses a Budget Analysis Tool to perform "what-if" analyses, compare mission scenarios to the current program budget, and rapidly forecast the programs' ability to meet their launch rate requirements. The tool allows the user to specify the total mission cost (fixed year), mission development and operations profile by phase (percent total mission cost and duration), launch vehicle, and launch date for multiple missions. The tool automatically applies inflation and rolls up the total program costs (in real year dollars) for comparison against available program budget. Thus, the tool allows the user to rapidly and easily explore a variety of launch rates and analyze the effect of changes in future mission or launch vehicle costs, the differing development profiles or operational durations of a future mission, or a replan of a current mission on the overall program budget. Because the tool also reports average monthly costs for the specified mission profile, the development or operations cost profile can easily be validated against program experience for similar missions. While specifically designed for predicting overall program budgets for programs that develop and operate multiple missions concurrently, the basic concept of the tool (rolling up multiple, independently budgeted lines) could easily be adapted to other applications.
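
    The roll-up described above (per-phase cost shares in fixed-year dollars, inflated to real-year dollars and summed across missions) might be sketched as follows. This is not the NASA tool: the class names, the flat inflation rate, the even spreading of each phase over whole years, and the two example missions are all assumptions made for illustration, and launch vehicle costs are left out.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Phase:
    share: float  # fraction of total mission cost spent in this phase
    years: int    # phase duration in whole years (simplifying assumption)


@dataclass
class Mission:
    name: str
    total_cost: float  # fixed-year dollars, in millions
    start_year: int
    phases: list


def yearly_costs(mission, base_year, inflation):
    """Spread each phase evenly over its years, inflated to real-year dollars."""
    costs = {}
    year = mission.start_year
    for phase in mission.phases:
        per_year = mission.total_cost * phase.share / phase.years
        for _ in range(phase.years):
            factor = (1 + inflation) ** (year - base_year)
            costs[year] = costs.get(year, 0.0) + per_year * factor
            year += 1
    return costs


def roll_up(missions, base_year, inflation):
    """Sum the independently budgeted mission lines into one program line."""
    total = defaultdict(float)
    for mission in missions:
        for year, cost in yearly_costs(mission, base_year, inflation).items():
            total[year] += cost
    return dict(sorted(total.items()))


# Hypothetical missions, for illustration only.
missions = [
    Mission("Mission A", 100.0, 2011, [Phase(0.5, 2), Phase(0.5, 1)]),
    Mission("Mission B", 60.0, 2012, [Phase(1.0, 3)]),
]

flat = roll_up(missions, base_year=2011, inflation=0.0)
inflated = roll_up(missions, base_year=2011, inflation=0.03)
for year in flat:
    print(year, f"fixed-year {flat[year]:.1f}", f"real-year {inflated[year]:.1f}")
```

    Comparing the resulting yearly totals against an available budget per year is then a simple lookup, and changing a launch date amounts to shifting `start_year` and re-running the roll-up.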

  20. Non-syndromic retinal ciliopathies: translating gene discovery into therapy.

    PubMed

    Estrada-Cuzcano, Alejandro; Roepman, Ronald; Cremers, Frans P M; den Hollander, Anneke I; Mans, Dorus A

    2012-10-15

    Homozygosity mapping and exome sequencing have accelerated the discovery of gene mutations and modifier alleles implicated in inherited retinal degeneration in humans. To date, 158 genes have been found to be mutated in individuals with retinal dystrophies. Approximately one-third of the gene defects underlying retinal degeneration affect the structure and/or function of the 'connecting cilium' in photoreceptors. This structure corresponds to the transition zone of a prototypic cilium, a region with increasing relevance for ciliary homeostasis. The connecting cilium connects the inner and outer segments of the photoreceptor, mediating bi-directional transport of phototransducing proteins required for vision. In fact, the outer segment, connecting cilium and associated basal body together form a highly specialized sensory cilium, fully dedicated to photoreception and subsequent signal transduction to the brain. At least 21 genes that encode ciliary proteins are implicated in non-syndromic retinal dystrophies such as cone dystrophy, cone-rod dystrophy, Leber congenital amaurosis (LCA), macular degeneration or retinitis pigmentosa (RP). The generation and characterization of vertebrate retinal ciliopathy animal models have revealed insights into the molecular disease mechanism which are indispensable for the development and evaluation of therapeutic strategies. Gene augmentation therapy has proven to be safe and successful in restoring long-term sight in mice, dogs and humans suffering from LCA or RP. Here, we present a comprehensive overview of the genes, mutations and modifier alleles involved in non-syndromic retinal ciliopathies, review the progress in dissecting the associated retinal disease mechanisms and evaluate gene augmentation approaches to antagonize retinal degeneration in these ciliopathies.

  1. Psychiatric gene discoveries shape evidence on ADHD's biology

    PubMed Central

    Thapar, A; Martin, J; Mick, E; Arias Vásquez, A; Langley, K; Scherer, S W; Schachar, R; Crosbie, J; Williams, N; Franke, B; Elia, J; Glessner, J; Hakonarson, H; Owen, M J; Faraone, S V; O'Donovan, M C; Holmans, P

    2016-01-01

    A strong motivation for undertaking psychiatric gene discovery studies is to provide novel insights into unknown biology. Although attention-deficit hyperactivity disorder (ADHD) is highly heritable, and large, rare copy number variants (CNVs) contribute to risk, little is known about its pathogenesis and it remains commonly misunderstood. We assembled and pooled five ADHD and control CNV data sets from the United Kingdom, Ireland, United States of America, Northern Europe and Canada. Our aim was to test for enrichment of neurodevelopmental gene sets, implicated by recent exome-sequencing studies of (a) schizophrenia and (b) autism as a means of testing the hypothesis that common pathogenic mechanisms underlie ADHD and these other neurodevelopmental disorders. We also undertook hypothesis-free testing of all biological pathways. We observed significant enrichment of individual genes previously found to harbour schizophrenia de novo non-synonymous single-nucleotide variants (SNVs; P=5.4 × 10⁻⁴) and targets of the Fragile X mental retardation protein (P=0.0018). No enrichment was observed for activity-regulated cytoskeleton-associated protein (P=0.23) or N-methyl-D-aspartate receptor (P=0.74) post-synaptic signalling gene sets previously implicated in schizophrenia. Enrichment of ADHD CNV hits for genes impacted by autism de novo SNVs (P=0.019 for non-synonymous SNV genes) did not survive Bonferroni correction. Hypothesis-free testing yielded several highly significantly enriched biological pathways, including ion channel pathways. Enrichment findings were robust to multiple testing corrections and to sensitivity analyses that excluded the most significant sample. The findings reveal that CNVs in ADHD converge on biologically meaningful gene clusters, including ones now established as conferring risk of other neurodevelopmental disorders. PMID:26573769
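
    To show why a nominal P=0.019 can fail a Bonferroni correction, the sketch below applies the standard rule (compare each P value with alpha divided by the number of tests) to the five P values quoted in this abstract. Treating those five as the complete family of tests, and using alpha = 0.05, are assumptions made purely for illustration; the abstract does not state how many tests the authors corrected for.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (threshold, {name: survives}) under a Bonferroni correction."""
    if not p_values:
        raise ValueError("need at least one P value")
    threshold = alpha / len(p_values)
    return threshold, {name: p <= threshold for name, p in p_values.items()}


# P values as quoted in the abstract; the grouping into one family is assumed.
quoted = {
    "schizophrenia de novo SNV genes": 5.4e-4,
    "FMRP targets": 0.0018,
    "ARC gene set": 0.23,
    "NMDAR gene set": 0.74,
    "autism de novo SNV genes": 0.019,
}

threshold, survives = bonferroni(quoted)
print(f"per-test threshold = {threshold:.3f}")
for name, ok in survives.items():
    print(f"{name}: P={quoted[name]:g} -> {'survives' if ok else 'does not survive'}")
```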

  2. Sequencing of GJB2 in Cameroonians and Black South Africans and comparison to 1000 Genomes Project Data Support Need to Revise Strategy for Discovery of Nonsyndromic Deafness Genes in Africans.

    PubMed

    Bosch, Jason; Noubiap, Jean Jacques N; Dandara, Collet; Makubalo, Nomlindo; Wright, Galen; Entfellner, Jean-Baka Domelevo; Tiffin, Nicki; Wonkam, Ambroise

    2014-11-01

    Mutations in the GJB2 gene, encoding connexin 26, could account for 50% of congenital, nonsyndromic, recessive deafness cases in some Caucasian/Asian populations. There is a scarcity of published data in sub-Saharan Africans. We Sanger sequenced the coding region of the GJB2 gene in 205 Cameroonian and Xhosa South Africans with congenital, nonsyndromic deafness; and performed bioinformatic analysis of variations in the GJB2 gene, incorporating data from the 1000 Genomes Project. Amongst Cameroonian patients, 26.1% were familial. The majority of patients (70%) suffered from sensorineural hearing loss. Ten GJB2 genetic variants were detected by sequencing. A previously reported pathogenic mutation, g.3741_3743delTTC (p.F142del), and a putative pathogenic mutation, g.3816G>A (p.V167M), were identified in single heterozygous samples. Amongst the eight remaining variants, two novel variants, g.3318-41G>A and g.3332G>A, were reported. There were no statistically significant differences in allele frequencies between cases and controls. Principal Components Analyses differentiated between Africans, Asians, and Europeans, but only explained 40% of the variation. The present study is the first to compare African GJB2 sequences with the data from the 1000 Genomes Project and has revealed the low variation between population groups. This finding emphasizes the hypothesis that the prevalence of mutations in GJB2 in nonsyndromic deafness amongst European and Asian populations is due to founder effects arising after these individuals migrated out of Africa, and not to a putative "protective" variant in the genomic structure of GJB2 in Africans. Our results confirm that mutations in GJB2 are not associated with nonsyndromic deafness in Africans.

  3. Genome Enabled Discovery of Carbon Sequestration Genes in Poplar

    SciTech Connect

    Filichkin, Sergei; Etherington, Elizabeth; Ma, Caiping; Strauss, Steve

    2007-02-22

    The goals of the S.H. Strauss laboratory portion of 'Genome-enabled discovery of carbon sequestration genes in poplar' are (1) to explore the functions of candidate genes using Populus transformation by inserting genes provided by Oak Ridge National Laboratory (ORNL) and the University of Florida (UF) into poplar; (2) to expand the poplar transformation toolkit by developing transformation methods for important genotypes; and (3) to allow induced expression, and efficient gene suppression, in roots and other tissues. As part of the transformation improvement effort, OSU developed transformation protocols for Populus trichocarpa 'Nisqually-1' clone and an early flowering P. alba clone, 6K10. Complete descriptions of the transformation systems were published (Ma et al. 2004, Meilan et al. 2004). Twenty-one 'Nisqually-1' and 622 6K10 transgenic plants were generated. To identify root predominant promoters, a set of three promoters was tested for tissue-specific expression patterns in poplar and in Arabidopsis as a model system. A novel gene, ET304, was identified by analyzing a collection of poplar enhancer trap lines generated at OSU (Filichkin et al. 2006a, 2006b). Other promoters include the pGgMT1 root-predominant promoter from Casuarina glauca and the pAtPIN2 promoter from Arabidopsis root specific PIN2 gene. OSU tested two induction systems, alcohol- and estrogen-inducible, in multiple poplar transgenics. Ethanol proved to be the more efficient when tested in tissue culture and greenhouse conditions. Two estrogen-inducible systems were evaluated in transgenic Populus, neither of which functioned reliably in tissue culture conditions. GATEWAY-compatible plant binary vectors were designed to compare the silencing efficiency of homologous (direct) RNAi vs. heterologous (transitive) RNAi inverted repeats. A set of genes was targeted for post-transcriptional silencing in the model Arabidopsis system; these include the floral meristem identity gene (APETALA1 or

  4. Sugarcane Functional Genomics: Gene Discovery for Agronomic Trait Development

    PubMed Central

    Menossi, M.; Silva-Filho, M. C.; Vincentz, M.; Van-Sluys, M.-A.; Souza, G. M.

    2008-01-01

    Sugarcane is a highly productive crop used for centuries as the main source of sugar and recently to produce ethanol, a renewable bio-fuel energy source. There is increased interest in this crop due to the impending need to decrease fossil fuel usage. Sugarcane has a highly polyploid genome. Expressed sequence tag (EST) sequencing has significantly contributed to gene discovery and expression studies used to associate function with sugarcane genes. A significant amount of data exists on regulatory events controlling responses to herbivory, drought, and phosphate deficiency, which cause important constraints on yield and on endophytic bacteria, which are highly beneficial. The means to reduce drought, phosphate deficiency, and herbivory by the sugarcane borer have a negative impact on the environment. Improved tolerance for these constraints is being sought. Sugarcane's ability to accumulate sucrose up to 16% of its culm dry weight is a challenge for genetic manipulation. Genome-based technology such as cDNA microarray data indicates genes associated with sugar content that may be used to develop new varieties improved for sucrose content or for traits that restrict the expansion of the cultivated land. The genes can also be used as molecular markers of agronomic traits in traditional breeding programs. PMID:18273390

  5. Function-driven discovery of disease genes in zebrafish using an integrated genomics big data resource

    PubMed Central

    Shim, Hongseok; Kim, Ji Hyun; Kim, Chan Yeong; Hwang, Sohyun; Kim, Hyojin; Yang, Sunmo; Lee, Ji Eun; Lee, Insuk

    2016-01-01

    Whole exome sequencing (WES) accelerates disease gene discovery using rare genetic variants, but further statistical and functional evidence is required to avoid false-discovery. To complement variant-driven disease gene discovery, here we present function-driven disease gene discovery in zebrafish (Danio rerio), a promising human disease model owing to its high anatomical and genomic similarity to humans. To facilitate zebrafish-based function-driven disease gene discovery, we developed a genome-scale co-functional network of zebrafish genes, DanioNet (www.inetbio.org/danionet), which was constructed by Bayesian integration of genomics big data. Rigorous statistical assessment confirmed the high prediction capacity of DanioNet for a wide variety of human diseases. To demonstrate the feasibility of the function-driven disease gene discovery using DanioNet, we predicted genes for ciliopathies and performed experimental validation for eight candidate genes. We also validated the existence of heterozygous rare variants in the candidate genes of individuals with ciliopathies yet not in controls derived from the UK10K consortium, suggesting that these variants are potentially involved in enhancing the risk of ciliopathies. These results showed that an integrated genomics big data for a model animal of diseases can expand our opportunity for harnessing WES data in disease gene discovery. PMID:27903883

  6. Gene Discovery through Expressed Sequence Tag Sequencing in Trypanosoma cruzi

    PubMed Central

    Verdun, Ramiro E.; Di Paolo, Nelson; Urmenyi, Turan P.; Rondinelli, Edson; Frasch, Alberto C. C.; Sanchez, Daniel O.

    1998-01-01

    Analysis of expressed sequence tags (ESTs) constitutes a useful approach for gene identification that, in the case of human pathogens, might result in the identification of new targets for chemotherapy and vaccine development. As part of the Trypanosoma cruzi genome project, we have partially sequenced the 5′ ends of 1,949 clones to generate ESTs. The clones were randomly selected from a normalized CL Brener epimastigote cDNA library. A total of 14.6% of the clones were homologous to previously identified T. cruzi genes, while 18.4% had significant matches to genes from other organisms in the database. A total of 67% of the ESTs had no matches in the database, and thus, some of them might be T. cruzi-specific genes. Functional groups of those sequences with matches in the database were constructed according to their putative biological functions. The two largest categories were protein synthesis (23.3%) and cell surface molecules (10.8%). The information reported in this paper should be useful for researchers in the field to analyze genes and proteins of their own interest. PMID:9784549

  7. Next-generation diagnostics and disease-gene discovery with the Exomiser.

    PubMed

    Smedley, Damian; Jacobsen, Julius O B; Jäger, Marten; Köhler, Sebastian; Holtgrewe, Manuel; Schubach, Max; Siragusa, Enrico; Zemojtel, Tomasz; Buske, Orion J; Washington, Nicole L; Bone, William P; Haendel, Melissa A; Robinson, Peter N

    2015-12-01

    Exomiser is an application that prioritizes genes and variants in next-generation sequencing (NGS) projects for novel disease-gene discovery or differential diagnostics of Mendelian disease. Exomiser comprises a suite of algorithms for prioritizing exome sequences using random-walk analysis of protein interaction networks, clinical relevance and cross-species phenotype comparisons, as well as a wide range of other computational filters for variant frequency, predicted pathogenicity and pedigree analysis. In this protocol, we provide a detailed explanation of how to install Exomiser and use it to prioritize exome sequences in a number of scenarios. Exomiser requires ∼3 GB of RAM and roughly 15-90 s of computing time on a standard desktop computer to analyze a variant call format (VCF) file. Exomiser is freely available for academic use from http://www.sanger.ac.uk/science/tools/exomiser.

  8. Turning publicly available gene expression data into discoveries using gene set context analysis.

    PubMed

    Ji, Zhicheng; Vokes, Steven A; Dang, Chi V; Ji, Hongkai

    2016-01-08

    Gene Set Context Analysis (GSCA) is an open source software package to help researchers use massive amounts of publicly available gene expression data (PED) to make discoveries. Users can interactively visualize and explore gene and gene set activities in 25,000+ consistently normalized human and mouse gene expression samples representing diverse biological contexts (e.g. different cells, tissues and disease types, etc.). By providing one or multiple genes or gene sets as input and specifying a gene set activity pattern of interest, users can query the expression compendium to systematically identify biological contexts associated with the specified gene set activity pattern. In this way, researchers with new gene sets from their own experiments may discover previously unknown contexts of gene set functions and hence increase the value of their experiments. GSCA has a graphical user interface (GUI). The GUI makes the analysis convenient and customizable. Analysis results can be conveniently exported as publication quality figures and tables. GSCA is available at https://github.com/zji90/GSCA. This software significantly lowers the bar for biomedical investigators to use PED in their daily research for generating and screening hypotheses, which was previously difficult because of the complexity, heterogeneity and size of the data.

  9. Africa: the next frontier for human disease gene discovery?

    PubMed

    Ramsay, Michèle; Tiemessen, Caroline T; Choudhury, Ananyo; Soodyall, Himla

    2011-10-15

    The populations of Africa harbour the greatest human genetic diversity following an evolutionary history tracing its beginnings on the continent to time before the emergence of Homo sapiens. Signatures of selection are detectable as responses to ancient environments and cultural practices, modulated by more recent events including infectious epidemics, migrations, admixture and, of course, chance. The age of high-throughput biology is not passing Africa by. African-based cohort studies and networks with an African footprint are ideal springboards for disease-related genetic and genomic studies. Initiatives like HapMap, the 1000 Genomes Project, MalariaGEN, the INDEPTH network and Human Heredity and Health in Africa are catalysts to exploring African genetic diversity and its role in the spectrum from health to disease. The challenges are abundant in dissecting biological questions in the light of linguistic, cultural, geographic and political boundaries and their respective roles in shaping health-related profiles. Will studies based on African populations lead to a new wave of discovery of genetic contributors to disease?

  10. Risk genes for schizophrenia: translational opportunities for drug discovery.

    PubMed

    Winchester, Catherine L; Pratt, Judith A; Morris, Brian J

    2014-07-01

    Despite intensive research over many years, the treatment of schizophrenia remains a major health issue. Current and emerging treatments for schizophrenia are based upon the classical dopamine and glutamate hypotheses of disease. Existing first- and second-generation antipsychotic drugs based upon the dopamine hypothesis are limited by their inability to treat all symptom domains and their undesirable side effect profiles. Third-generation drugs based upon the glutamate hypothesis of disease are currently under evaluation but are more likely to be used as add-on treatments. Hence there is a large unmet clinical need. A major challenge in neuropsychiatric disease research is the relatively limited knowledge of disease mechanisms. However, as our understanding of the genetic causes of the disease evolves, novel strategies for the development of improved therapeutic agents will become apparent. In this review we consider the current status of knowledge of the genetic basis of schizophrenia, including methods for identifying genetic variants associated with the disorder and how they impact on gene function. Although the genetic architecture of schizophrenia is complex, some targets amenable to pharmacological intervention can be discerned. We conclude that many challenges lie ahead but that the stratification of patients according to biobehavioural constructs that cross existing disease classifications, but have common genetic and neurobiological bases, offers opportunities for new approaches to effective drug discovery.

  11. Discovery

    ERIC Educational Resources Information Center

    de Mestre, Neville

    2010-01-01

    All common fractions can be written in decimal form. In this Discovery article, the author suggests that teachers ask their students to calculate the decimals by actually doing the divisions themselves, and later on they can use a calculator to check their answers. This article presents a lesson based on the research of Bolt (1982).
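
    The by-hand division suggested in this abstract can be mirrored in a few lines: carry the remainder, multiply by ten, divide again, and stop when the remainder is zero or repeats. The sketch below is an illustration of that long-division procedure, not material from the article; the function name and the use of parentheses to mark the repeating block are choices made here.

```python
def fraction_to_decimal(numerator: int, denominator: int) -> str:
    """Long division; the repeating block, if any, is wrapped in parentheses."""
    if denominator == 0:
        raise ZeroDivisionError("denominator must be non-zero")
    negative = numerator != 0 and (numerator < 0) != (denominator < 0)
    numerator, denominator = abs(numerator), abs(denominator)

    whole, remainder = divmod(numerator, denominator)
    digits = []
    seen = {}  # remainder -> position of the digit it went on to produce
    while remainder and remainder not in seen:
        seen[remainder] = len(digits)
        digit, remainder = divmod(remainder * 10, denominator)
        digits.append(str(digit))

    result = ("-" if negative else "") + str(whole)
    if not digits:
        return result                      # exact integer, e.g. 6/3
    if remainder == 0:
        return result + "." + "".join(digits)   # terminating decimal
    start = seen[remainder]                # where the repeating block begins
    return (result + "." + "".join(digits[:start])
            + "(" + "".join(digits[start:]) + ")")


for n, d in [(1, 4), (1, 3), (1, 6), (22, 7)]:
    print(f"{n}/{d} = {fraction_to_decimal(n, d)}")
```

    A repeated remainder guarantees that the digits from that point on will repeat, which is why every common fraction either terminates or recurs.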

  12. Gene discovery for the carcinogenic human liver fluke, Opisthorchis viverrini

    PubMed Central

    Laha, Thewarach; Pinlaor, Porntip; Mulvenna, Jason; Sripa, Banchob; Sripa, Manop; Smout, Michael J; Gasser, Robin B; Brindley, Paul J; Loukas, Alex

    2007-01-01

    Background Cholangiocarcinoma (CCA) – cancer of the bile ducts – is associated with chronic infection with the liver fluke, Opisthorchis viverrini. Despite being the only eukaryote that is designated as a 'class I carcinogen' by the International Agency for Research on Cancer, little is known about its genome. Results Approximately 5,000 randomly selected cDNAs from the adult stage of O. viverrini were characterized and accounted for 1,932 contigs, representing ~14% of the entire transcriptome, and, presently, the largest sequence dataset for any species of liver fluke. Twenty percent of contigs were assigned GO classifications. Abundantly represented protein families included those involved in physiological functions that are essential to parasitism, such as anaerobic respiration, reproduction, detoxification, surface maintenance and feeding. GO assignments were well conserved in relation to other parasitic flukes, however, some categories were over-represented in O. viverrini, such as structural and motor proteins. An assessment of evolutionary relationships showed that O. viverrini was more similar to other parasitic (Clonorchis sinensis and Schistosoma japonicum) than to free-living (Schmidtea mediterranea) flatworms, and 105 sequences had close homologues in both parasitic species but not in S. mediterranea. A total of 164 O. viverrini contigs contained ORFs with signal sequences, many of which were platyhelminth-specific. Examples of convergent evolution between host and parasite secreted/membrane proteins were identified as were homologues of vaccine antigens from other helminths. Finally, ORFs representing secreted proteins with known roles in tumorigenesis were identified, and these might play roles in the pathogenesis of O. viverrini-induced CCA. Conclusion This gene discovery effort for O. viverrini should expedite molecular studies of cholangiocarcinogenesis and accelerate research focused on developing new interventions, drugs and vaccines, to

  13. Gene Prioritization for Imaging Genetics Studies Using Gene Ontology and a Stratified False Discovery Rate Approach

    PubMed Central

    Patel, Sejal; Park, Min Tae M.; Chakravarty, M. Mallar; Knight, Jo

    2016-01-01

    Imaging genetics is an emerging field in which the association between genes and neuroimaging-based quantitative phenotypes is used to explore the functional role of genes in neuroanatomy and neurophysiology in the context of healthy function and neuropsychiatric disorders. The main obstacle for researchers in the field is the high dimensionality of the data in both the imaging phenotypes and the genetic variants commonly typed. In this article, we develop a novel method that utilizes Gene Ontology, an online database, to select and prioritize certain genes, employing a stratified false discovery rate (sFDR) approach to investigate their associations with imaging phenotypes. sFDR has the potential to increase power in genome wide association studies (GWAS), and is quickly gaining traction as a method for multiple testing correction. Our novel approach addresses the pressing need in genetic research to move beyond candidate gene studies without being overburdened by a loss of power due to multiple testing. As an example of our methodology, we perform a GWAS of hippocampal volume using both the Enhancing NeuroImaging Genetics through Meta-Analysis (ENIGMA2) and the Alzheimer's Disease Neuroimaging Initiative datasets. The analysis of ENIGMA2 data yielded a set of SNPs with sFDR values between 10 and 20%. Our approach demonstrates a potential method to prioritize genes based on biological systems impaired in a disease. PMID:27092072
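
    The idea of stratifying a multiple-testing correction can be illustrated with a short sketch. This is not the authors' pipeline: it assumes one simple formulation in which Benjamini-Hochberg adjusted p-values are computed separately within each stratum (for example, SNPs in prioritized genes versus all others). The function names and the stratum labels are hypothetical.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values are monotone in the original p-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q


def stratified_qvalues(pvalues, strata):
    """Apply the BH adjustment separately within each stratum label."""
    groups = {}
    for idx, label in enumerate(strata):
        groups.setdefault(label, []).append(idx)
    q = [None] * len(pvalues)
    for indices in groups.values():
        adjusted = bh_qvalues([pvalues[i] for i in indices])
        for idx, value in zip(indices, adjusted):
            q[idx] = value
    return q
```

    With a small prioritized stratum that is enriched for true signals, the tests in that stratum face a lighter multiplicity burden than they would in a pooled correction, which is the intuition behind the potential gain in power.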

  14. Enabling drug discovery project decisions with integrated computational chemistry and informatics.

    PubMed

    Tsui, Vickie; Ortwine, Daniel F; Blaney, Jeffrey M

    2016-10-31

    Computational chemistry/informatics scientists and software engineers in Genentech Small Molecule Drug Discovery collaborate with experimental scientists in a therapeutic project-centric environment. Our mission is to enable and improve pre-clinical drug discovery design and decisions. Our goal is to deliver timely data, analysis, and modeling to our therapeutic project teams using best-in-class software tools. We describe our strategy, the organization of our group, and our approaches to reach this goal. We conclude with a summary of the interdisciplinary skills required for computational scientists and recommendations for their training.

  15. Enabling drug discovery project decisions with integrated computational chemistry and informatics

    NASA Astrophysics Data System (ADS)

    Tsui, Vickie; Ortwine, Daniel F.; Blaney, Jeffrey M.

    2016-10-01

    Computational chemistry/informatics scientists and software engineers in Genentech Small Molecule Drug Discovery collaborate with experimental scientists in a therapeutic project-centric environment. Our mission is to enable and improve pre-clinical drug discovery design and decisions. Our goal is to deliver timely data, analysis, and modeling to our therapeutic project teams using best-in-class software tools. We describe our strategy, the organization of our group, and our approaches to reach this goal. We conclude with a summary of the interdisciplinary skills required for computational scientists and recommendations for their training.

  16. A Projection and Density Estimation Method for Knowledge Discovery

    PubMed Central

    Stanski, Adam; Hellwich, Olaf

    2012-01-01

    A key ingredient to modern data analysis is probability density estimation. However, it is well known that the curse of dimensionality prevents a proper estimation of densities in high dimensions. The problem is typically circumvented by using a fixed set of assumptions about the data, e.g., by assuming partial independence of features, data on a manifold or a customized kernel. These fixed assumptions limit the applicability of a method. In this paper we propose a framework that uses a flexible set of assumptions instead. It allows a model to be tailored to various problems by means of 1d-decompositions. The approach achieves a fast runtime and is not limited by the curse of dimensionality as all estimations are performed in 1d-space. The wide range of applications is demonstrated with two very different real-world examples. The first is data mining software that allows the fully automatic discovery of patterns. The software is publicly available for evaluation. As a second example an image segmentation method is realized. It achieves state-of-the-art performance on a benchmark dataset although it uses only a fraction of the training data and very simple features. PMID:23049675

  17. The Salinas Airshower Learning And Discovery Project (SALAD)

    NASA Astrophysics Data System (ADS)

    Hernandez, Victor; Niduaza, Rommel; Ruiz Castruita, Daniel; Knox, Adrian; Ramos, Daniel; Fan, Sewan; Fatuzzo, Laura

    2015-04-01

    The SALAD project partners community college and high school STEM students in order to develop and investigate cosmic ray detector telescopes and the physical concepts behind them, using a new light sensor technology based on silicon photomultiplier (SiPM) detectors. Replacing the conventional photomultiplier with the SiPM offers notable advantages in cost and facilitates more in-depth, hands-on laboratory learning activities. The students in the SALAD project design, construct and extensively evaluate the SiPM detector modules. These SiPM modules can be completed in a short time using cost-effective components. We describe our research to implement SiPMs as readout light detectors for plastic scintillators in a cosmic ray detector telescope for use in high schools. In particular, we describe our work on the design, evaluation and assembly of (1) a fast preamplifier, (2) a simple coincidence circuit using fast comparators to discriminate the SiPM noise signal pulses, and (3) a monovibrator circuit to shape the singles plus the AND logic pulses for subsequent processing. To store the singles and coincidence counts data, an Arduino micro-controller with program sketches can be implemented. Results and findings from our work will be described and presented. US Department of Education Title V Grant Award PO31S090007

  18. Discovery of mammalian genes that participate in virus infection

    PubMed Central

    Organ, Edward L; Sheng, Jinsong; Ruley, H Earl; Rubin, Donald H

    2004-01-01

    Background Viruses are obligate intracellular parasites that rely upon the host cell for different steps in their life cycles. The characterization of cellular genes required for virus infection and/or cell killing will be essential for understanding viral life cycles, and may provide cellular targets for new antiviral therapies. Results Candidate genes required for lytic reovirus infection were identified by tagged sequence mutagenesis, a process that permits rapid identification of genes disrupted by gene entrapment. One hundred fifty-one reovirus resistant clones were selected from cell libraries containing 2 × 10^5 independently disrupted genes, of which 111 contained mutations in previously characterized genes and functionally anonymous transcription units. Collectively, the genes associated with reovirus resistance differed from genes targeted by random gene entrapment in that known mutational hot spots were underrepresented, and a number of mutations appeared to cluster around specific cellular processes, including: IGF-II expression/signalling, vesicular transport/cytoskeletal trafficking and apoptosis. Notably, several of the genes have been directly implicated in the replication of reovirus and other viruses at different steps in the viral lifecycle. Conclusions Tagged sequence mutagenesis provides a rapid, genome-wide strategy to identify candidate cellular genes required for virus infection. The candidate genes provide a starting point for mechanistic studies of cellular processes that participate in the virus lifecycle and may provide targets for novel anti-viral therapies. PMID:15522117

  19. Using the TIGR gene index databases for biological discovery.

    PubMed

    Lee, Yuandan; Quackenbush, John

    2003-11-01

    The TIGR Gene Index web pages provide access to analyses of ESTs and gene sequences for nearly 60 species, as well as a number of resources derived from these. Each species-specific database is presented using a common format with a homepage. A variety of methods exist that allow users to search each species-specific database. Methods implemented currently include nucleotide or protein sequence queries using WU-BLAST, text-based searches using various sequence identifiers, searches by gene, tissue and library name, and searches using functional classes through Gene Ontology assignments. This protocol provides guidance for using the Gene Index Databases to extract information.

  20. Recurrent and pathological gene fusions in breast cancer: current advances in genomic discovery and clinical implications.

    PubMed

    Veeraraghavan, Jamunarani; Ma, Jiacheng; Hu, Yiheng; Wang, Xiao-Song

    2016-07-01

    Gene fusions have long been considered principally as the oncogenic events of hematologic malignancies, but have recently gained wide attention in solid tumors due to several milestone discoveries and the advancement of deep sequencing technologies. With the progress in deep sequencing studies of breast cancer transcriptomes and genomes, the discovery of recurrent and pathological gene fusions in breast cancer has come into focus. Recently, driven by new deep sequencing studies, several recurrent or pathological gene fusions have been identified in breast cancer, including ESR1-CCDC170, SEC16A-NOTCH1, SEC22B-NOTCH2, and ESR1-YAP1. More importantly, most of these gene fusions are preferentially identified in the more aggressive breast cancers, such as luminal B, basal-like, or endocrine-resistant breast cancer, suggesting recurrent gene fusions as additional key driver events in these tumors other than the known drivers such as the estrogen receptor. In this paper, we have comprehensively summarized the newly identified recurrent or pathological gene fusion events in breast cancer, reviewed the contributions of new genomic and deep sequencing technologies to new fusion discovery and the integrative bioinformatics tools to analyze these data, highlighted the biological relevance and clinical implications of these fusion discoveries, and discussed future directions of gene fusion research in breast cancer.

  1. Prioritization of neurodevelopmental disease genes by discovery of new mutations

    PubMed Central

    Hoischen, Alexander; Krumm, Niklas; Eichler, Evan E.

    2014-01-01

    Advances in genome sequencing technologies have begun to revolutionize neurogenetics, allowing the full spectrum of genetic variation to be better understood in relationship to disease. Exome sequencing of hundreds to thousands of samples from patients with autism spectrum disorder, intellectual disability, epilepsy, and schizophrenia provides strong evidence of the importance of de novo and gene-disruptive events. There are now several hundred new candidate genes and targeted resequencing technologies that allow screening of dozens of genes in tens of thousands of individuals with high specificity and sensitivity. The decision of which genes to pursue depends on numerous factors including recurrence, prior evidence of overlap with pathogenic copy number variants, the position of the mutation within the protein, the mutational burden among healthy individuals, and membership of the candidate gene within disease-implicated protein networks. We discuss these emerging criteria for gene prioritization and the potential impact on the field of neuroscience. PMID:24866042

  2. LSST's Projected Near-Earth Asteroid Discovery Performance

    NASA Astrophysics Data System (ADS)

    Chesley, Steven R.; Veres, Peter

    2016-10-01

    The Large Synoptic Survey Telescope (LSST) is an ambitious project that has the potential to make major advances in Near-Earth Asteroid search efforts. With construction already underway and major optical elements complete, first light is set for 2020, followed by two years of commissioning. Once regular survey operations begin in 2022, LSST will systematically survey the observable sky over a ten-year period from its site on Cerro Pachon, Chile. With an 8.4 m aperture (6.5 m effective), 9.6 square degree field of view, and a 3.2-Gigapixel camera, LSST represents the most capable asteroid survey instrument ever built. LSST will be able to cover over 6000 square degrees of sky per clear night with single visit exposures of 30 s, reaching a faint limit of 24.5 mag in the r band. However, the cadence of survey operations is a critical factor for the near-Earth asteroid search performance, and there are multiple science drivers with different cadence objectives that are competing to shape the final survey strategy. We examine the NEA search performance of various LSST search strategies, paying particular attention to the challenges of linking large numbers of asteroid detections in the presence of noise. Our approach is to derive lists of synthetic detections for a given instantiation of the LSST survey, based on an assumed model for the populations of solar system objects from the main asteroid belt inwards to the near-Earth population. These detection lists are combined with false detection lists that model both random noise and non-random artifacts resulting from image differencing algorithms. These large detection lists are fed to the Moving Object Processing System (MOPS), which attempts to link the synthetic detections correctly without becoming confused or overwhelmed by the false detections. The LSST baseline survey cadence relies primarily on single night pairs of detections, with roughly 30-60 min separating elements of the pair. The strategy of using pairs is an

  3. Discovery of Cationic Polymers for Non-viral Gene Delivery using Combinatorial Approaches

    PubMed Central

    Barua, Sutapa; Ramos, James; Potta, Thrimoorthy; Taylor, David; Huang, Huang-Chiao; Montanez, Gabriela; Rege, Kaushal

    2015-01-01

    Gene therapy is an attractive treatment option for diseases of genetic origin, including several cancers and cardiovascular diseases. While viruses are effective vectors for delivering exogenous genes to cells, concerns related to insertional mutagenesis, immunogenicity, lack of tropism, decay and high production costs necessitate the discovery of non-viral methods. Significant efforts have been focused on cationic polymers as non-viral alternatives for gene delivery. Recent studies have employed combinatorial syntheses and parallel screening methods for enhancing the efficacy of gene delivery, biocompatibility of the delivery vehicle, and overcoming cellular level barriers as they relate to polymer-mediated transgene uptake, transport, transcription, and expression. This review summarizes and discusses recent advances in combinatorial syntheses and parallel screening of cationic polymer libraries for the discovery of efficient and safe gene delivery systems. PMID:21843141

  4. GENOME-ENABLED DISCOVERY OF CARBON SEQUESTRATION GENES IN POPLAR

    SciTech Connect

    DAVIS J M

    2007-10-11

    Plants utilize carbon by partitioning the reduced carbon obtained through photosynthesis into different compartments and into different chemistries within a cell and subsequently allocating such carbon to sink tissues throughout the plant. Since the phytohormones auxin and cytokinin are known to influence sink strength in tissues such as roots (Skoog & Miller 1957, Nordstrom et al. 2004), we hypothesized that altering the expression of genes that regulate auxin-mediated (e.g., AUX/IAA or ARF transcription factors) or cytokinin-mediated (e.g., RR transcription factors) control of root growth and development would impact carbon allocation and partitioning belowground (Fig. 1 - Renewal Proposal). Specifically, the ARF, AUX/IAA and RR transcription factor gene families mediate the effects of the growth regulators auxin and cytokinin on cell expansion, cell division and differentiation into root primordia. Invertases (IVR), whose transcript abundance is enhanced by both auxin and cytokinin, are critical components of carbon movement and therefore of carbon allocation. Thus, we initiated comparative genomic studies to identify the AUX/IAA, ARF, RR and IVR gene families in the Populus genome that could impact carbon allocation and partitioning. Bioinformatics searches using Arabidopsis gene sequences as queries identified regions with high degrees of sequence similarities in the Populus genome. These Populus sequences formed the basis of our transgenic experiments. Transgenic modification of gene expression involving members of these gene families was hypothesized to have profound effects on carbon allocation and partitioning.

  5. OpenZika: An IBM World Community Grid Project to Accelerate Zika Virus Drug Discovery

    PubMed Central

    Perryman, Alexander L.; Horta Andrade, Carolina

    2016-01-01

    The Zika virus outbreak in the Americas has caused global concern. To help accelerate this fight against Zika, we launched the OpenZika project. OpenZika is an IBM World Community Grid Project that uses distributed computing on millions of computers and Android devices to run docking experiments, in order to dock tens of millions of drug-like compounds against crystal structures and homology models of Zika proteins (and other related flavivirus targets). This will enable the identification of new candidates that can then be tested in vitro, to advance the discovery and development of new antiviral drugs against the Zika virus. The docking data is being made openly accessible so that all members of the global research community can use it to further advance drug discovery studies against Zika and other related flaviviruses. PMID:27764115

  6. Cryptococcus neoformans Virulence Gene Discovery through Insertional Mutagenesis

    PubMed Central

    Idnurm, Alexander; Reedy, Jennifer L.; Nussbaum, Jesse C.; Heitman, Joseph

    2004-01-01

    Insertional mutagenesis was applied to Cryptococcus neoformans to identify genes associated with virulence attributes. Using biolistic transformation, we generated 4,300 nourseothricin (NAT)-resistant strains, of which 590 exhibited stable resistance. We focused on mutants with defects in established virulence factors and identified two with reduced growth at 37°C, four with reduced production of the antioxidant pigment melanin, and two with an increased sensitivity to nitric oxide (NO). The NAT insertion and mutant phenotypes were genetically linked in five of eight mutants, and the DNA flanking the insertions was characterized. For the strains with altered growth at 37°C and altered melanin production, mutations were in previously uncharacterized genes, while the two NO-sensitive strains bore insertions in the flavohemoglobin gene FHB1, whose product counters NO stress. Because of the frequent instability of nourseothricin resistance associated with biolistic transformation, Agrobacterium-mediated transformation was tested. This transkingdom DNA delivery approach produced 100% stable nourseothricin-resistant transformants, and three melanin-defective strains were identified from 576 transformants, of which 2 were linked to NAT in segregation analysis. One of these mutants contained a T-DNA insertion in the promoter of the LAC1 (laccase) gene, which encodes a key enzyme required for melanin production, while the second contained an insertion in the promoter of the CLC1 gene, encoding a voltage-gated chloride channel. Clc1 and its homologs are required for ion homeostasis, and in their absence Cu+ transport into the secretory pathway is compromised, depriving laccase and other Cu+-dependent proteins of their essential cofactor. The NAT resistance cassette was optimized for cryptococcal codon usage and GC content and was then used to disrupt a mitogen-activated protein kinase gene, a predicted gene, and two putative chloride channel genes to analyze their

  7. Literature mining for the discovery of hidden connections between drugs, genes and diseases.

    PubMed

    Frijters, Raoul; van Vugt, Marianne; Smeets, Ruben; van Schaik, René; de Vlieg, Jacob; Alkema, Wynand

    2010-09-23

    The scientific literature represents a rich source for retrieval of knowledge on associations between biomedical concepts such as genes, diseases and cellular processes. A commonly used method to establish relationships between biomedical concepts from literature is co-occurrence. Apart from its use in knowledge retrieval, the co-occurrence method is also well-suited to discover new, hidden relationships between biomedical concepts following a simple ABC-principle, in which A and C have no direct relationship, but are connected via shared B-intermediates. In this paper we describe CoPub Discovery, a tool that mines the literature for new relationships between biomedical concepts. Statistical analysis using ROC curves showed that CoPub Discovery performed well over a wide range of settings and keyword thesauri. We subsequently used CoPub Discovery to search for new relationships between genes, drugs, pathways and diseases. Several of the newly found relationships were validated using independent literature sources. In addition, new predicted relationships between compounds and cell proliferation were validated and confirmed experimentally in an in vitro cell proliferation assay. The results show that CoPub Discovery is able to identify novel associations between genes, drugs, pathways and diseases that have a high probability of being biologically valid. This makes CoPub Discovery a useful tool to unravel the mechanisms behind disease, to find novel drug targets, or to find novel applications for existing drugs.
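
    The ABC-principle described above can be illustrated with a toy sketch. This is not CoPub Discovery's implementation, which relies on keyword thesauri and statistical scoring; the version below only checks raw co-occurrence, and the function names and concept labels (drug_X, gene_Y and so on) are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence(documents):
    """Map each concept to the set of concepts it shares a document with."""
    neighbors = defaultdict(set)
    for concepts in documents:
        for a, b in combinations(set(concepts), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def hidden_links(documents, start):
    """Return {C: set of B-intermediates} for concepts C that never
    co-occur with `start` (A) directly but share at least one B with it."""
    neighbors = cooccurrence(documents)
    direct = neighbors.get(start, set())
    candidates = defaultdict(set)
    for b in direct:
        for c in neighbors[b]:
            if c != start and c not in direct:
                candidates[c].add(b)
    return dict(candidates)


# Toy corpus: each inner list holds the concepts mentioned in one abstract.
docs = [
    ["drug_X", "gene_Y"],
    ["gene_Y", "disease_Z"],
    ["drug_X", "pathway_W"],
    ["pathway_W", "disease_Z"],
    ["drug_X", "disease_Q"],
]
links = hidden_links(docs, "drug_X")
```

    Here disease_Z is never mentioned together with drug_X, yet it is reachable through two shared intermediates, so it surfaces as a candidate hidden relationship, while disease_Q is excluded because it already co-occurs with drug_X directly. Ranking candidates by the number or strength of their B-intermediates would be a natural next step.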

  8. Literature Mining for the Discovery of Hidden Connections between Drugs, Genes and Diseases

    PubMed Central

    Frijters, Raoul; van Vugt, Marianne; Smeets, Ruben; van Schaik, René; de Vlieg, Jacob; Alkema, Wynand

    2010-01-01

    The scientific literature represents a rich source for retrieval of knowledge on associations between biomedical concepts such as genes, diseases and cellular processes. A commonly used method to establish relationships between biomedical concepts from literature is co-occurrence. Apart from its use in knowledge retrieval, the co-occurrence method is also well-suited to discover new, hidden relationships between biomedical concepts following a simple ABC-principle, in which A and C have no direct relationship, but are connected via shared B-intermediates. In this paper we describe CoPub Discovery, a tool that mines the literature for new relationships between biomedical concepts. Statistical analysis using ROC curves showed that CoPub Discovery performed well over a wide range of settings and keyword thesauri. We subsequently used CoPub Discovery to search for new relationships between genes, drugs, pathways and diseases. Several of the newly found relationships were validated using independent literature sources. In addition, new predicted relationships between compounds and cell proliferation were validated and confirmed experimentally in an in vitro cell proliferation assay. The results show that CoPub Discovery is able to identify novel associations between genes, drugs, pathways and diseases that have a high probability of being biologically valid. This makes CoPub Discovery a useful tool to unravel the mechanisms behind disease, to find novel drug targets, or to find novel applications for existing drugs. PMID:20885778

  9. A semiautomated approach to gene discovery through expressed sequence tag data mining: discovery of new human transporter genes.

    PubMed

    Brown, Shoshana; Chang, Jean L; Sadée, Wolfgang; Babbitt, Patricia C

    2003-01-01

    Identification and functional characterization of the genes in the human genome remain a major challenge. A principal source of publicly available information used for this purpose is the National Center for Biotechnology Information database of expressed sequence tags (dbEST), which contains over 4 million human ESTs. To extract the information buried in this data more effectively, we have developed a semiautomated method to mine dbEST for uncharacterized human genes. Starting with a single protein input sequence, a family of related proteins from all species is compiled. This entire family is then used to mine the human EST database for new gene candidates. Evaluation of putative new gene candidates in the context of a family of characterized proteins provides a framework for inference of the structure and function of the new genes. When applied to a test data set of 28 families within the major facilitator superfamily (MFS) of membrane transporters, our protocol found 73 previously characterized human MFS genes and 43 new MFS gene candidates. Development of this approach provided insights into the problems and pitfalls of automated data mining using public databases.

  10. Discovery of myosin genes by physical mapping in Dictyostelium.

    PubMed Central

    Titus, M A; Kuspa, A; Loomis, W F

    1994-01-01

    The diversity of the myosin family in a single organism, Dictyostelium discoideum, has been investigated by a strategy devised to rapidly identify and clone additional members of a gene family. An ordered array of yeast artificial chromosome clones that encompasses the Dictyostelium genome was probed at low stringency with conserved regions of the myosin motor domain to identify all possible myosin loci. The previously identified myosin loci (mchA, myoA-E) were detected by hybridization to the probes, as well as an additional seven previously unidentified loci (referred to as myoF-L). Clones corresponding to four of these additional loci (myoF, myoH-J) were obtained by using the isolated yeast artificial chromosomes as templates in a PCR employing degenerate primers specific for conserved regions of the myosin head. Sequence analysis and physical mapping of these clones confirm that these PCR products are derived from four previously unidentified myosin genes. Preliminary analysis of these sequences suggests that at least one of the genes (myoJ) encodes a member of a potentially different class of myosins. With the development of whole genome libraries for a variety of organisms, this approach can be used to rapidly explore the diversity of this and other gene families in a number of systems. PMID:7937787

  11. Resequencing and comparative genomics of Stagonospora nodorum: Sectional gene absence and effector discovery

    Technology Transfer Automated Retrieval System (TEKTRAN)

    S. nodorum is an important wheat (Triticum aestivum) pathogen in many parts of the world causing major yield losses. It was the first species in the large fungal Dothideomycete class to be genome sequenced. The reference genome sequence (SN15) has been instrumental in the discovery of genes encoding...

  12. Discovery of Novel Metastasis Genes in Breast Cancer

    DTIC Science & Technology

    2005-07-01

    fibroblast growth factor binding protein 1 >500 mouse colon, colon adenoma and skin Yes Ray et al., 2003 amplified in breast cancer 1 (AIB1) 679 BC(T), PC(T...Lloyd R.V. (1999). Analysis of anterior pituitary hormone mRNA expression in immunophenotypically characterized single cells after laser capture... pituitary. Histochem J. 33, 201-211. Van Deerlin V.M., Gill L.H. and Nelson P.T. (2002). Optimizing gene expression analysis in archival brain tissue

  13. TILLING in forage grasses for gene discovery and breeding improvement.

    PubMed

    Manzanares, Chloe; Yates, Steven; Ruckle, Michael; Nay, Michelle; Studer, Bruno

    2016-09-25

    Mutation breeding has a long-standing history and in some major crop species, many of the most important cultivars have their origin in germplasm generated by mutation induction. For almost two decades, methods for TILLING (Targeting Induced Local Lesions IN Genomes) have been established in model plant species such as Arabidopsis (Arabidopsis thaliana L.), enabling the functional analysis of genes. Recent advances in mutation detection by second generation sequencing technology have brought its utility to major crop species. However, it has remained difficult to apply similar approaches in forage and turf grasses, mainly due to their outbreeding nature maintained by an efficient self-incompatibility system. Starting with a description of the extent to which traditional mutagenesis methods have contributed to crop yield increase in the past, this review focuses on technological approaches to implement TILLING-based strategies for the improvement of forage grass breeding through forward and reverse genetics. We present first results from TILLING in allogamous forage grasses for traits such as stress tolerance and evaluate prospects for rapid implementation of beneficial alleles to forage grass breeding. In conclusion, large-scale induced mutation resources, used for forward genetic screens, constitute a valuable tool to increase the genetic diversity for breeding and can be generated with relatively small investments in forage grasses. Furthermore, large libraries of sequenced mutations can be readily established, providing enhanced opportunities to discover mutations in genes controlling traits of agricultural importance and to study gene functions by reverse genetics.

  14. Gapped permutation pattern discovery for gene order comparisons.

    PubMed

    Parida, Laxmi

    2007-01-01

    The list of species whose complete DNA sequences have been read is growing steadily, and it is believed that comparative genomics is in its early days. Permutation patterns (groups of genes in some "close" proximity) on gene sequences of genomes across species are being studied under different models to cope with this explosion of data. The challenge is to (intelligently and efficiently) analyze the genomes in the context of other genomes. In this paper, we present a generalized model that uses three notions: gapped permutation patterns (with gap g); genome clusters, via a quorum parameter K>1; and possible multiplicity in the patterns. The task is to automatically discover all permutation patterns (with possible multiplicity) that occur with gap g in at least K of the given m genomes. We present an O(log mN_I + |Σ| log|Σ| N_O) time algorithm, where m is the number of sequences, each defined on Σ, N_I is the size of the input and N_O is the size of the maximal gene clusters that appear in at least K of the m genomes.
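
    As a rough illustration of the quorum idea in this abstract, the following Python sketch checks whether one given gene set occurs with gap g in at least K genomes. It is a brute-force check under a simplified reading of the model (no multiplicity; an occurrence is a stretch in which consecutive pattern genes are separated by at most g other genes), not the algorithm of the paper. The function names are hypothetical.

```python
def occurs_with_gap(genome, pattern, g):
    """Return True if every gene of `pattern` appears in one stretch of
    `genome` where consecutive pattern genes have at most g other genes
    between them (simplified definition, no multiplicity)."""
    pattern = set(pattern)
    seen = set()
    prev = None
    for pos, gene in enumerate(genome):
        if gene not in pattern:
            continue
        # Too many intervening genes: start a new stretch here.
        if prev is not None and pos - prev - 1 > g:
            seen = set()
        seen.add(gene)
        if seen == pattern:
            return True
        prev = pos
    return False


def meets_quorum(genomes, pattern, g, K):
    """Return True if `pattern` occurs with gap g in at least K genomes."""
    return sum(occurs_with_gap(gm, pattern, g) for gm in genomes) >= K


genomes = [list("abxcd"), list("cxbxa"), list("axxbc")]
print(meets_quorum(genomes, {"a", "b", "c"}, g=1, K=2))
```

    Enumerating all such patterns efficiently is the hard part addressed by the paper; this check only verifies a single candidate.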

  15. The biological knowledge discovery by PCCF measure and PCA-F projection.

    PubMed

    Jia, Xingang; Zhu, Guanqun; Han, Qiuhong; Lu, Zuhong

    2017-01-01

    In the process of biological knowledge discovery, PCA is commonly used to complement clustering analysis, but PCA typically gives poor visualizations for most gene expression data sets. Here, we propose a PCCF measure, and use PCA-F to display clusters of PCCF, where PCCF and PCA-F are modeled from the modified cumulative probabilities of genes. From the analysis of simulated and experimental data sets, we demonstrate that PCCF is more appropriate and reliable for analyzing gene expression data than other commonly used distances or similarity measures, and that PCA-F is a good visualization technique for identifying clusters of PCCF. We focus on data sets in which the expression values of genes are collected at different time points.

  16. Scientific Discovery with the Blue Gene/L

    SciTech Connect

    Negele, John W.

    2011-12-09

    This project succeeded in developing key software optimization tools to bring fundamental QCD calculations of nucleon structure from the Terascale era through the Petascale era and prepare for the Exascale era. It also enabled fundamental QCD physics calculations and demonstrated the power of placing small versions of frontier emerging architectures at MIT to attract outstanding students to computational science. MIT also hosted a workshop on September 19, 2008 to brainstorm ways to promote computational science at top tier research universities and attract gifted students into the field, some of whom would provide the next generation of talent at our defense laboratories.

  17. Discovery of genes from feces correlated with colorectal cancer progression

    PubMed Central

    Lee, Chia-Long; Huang, Chi-Jung; Yang, Shung-Haur; Chang, Chun-Chao; Huang, Chi-Cheng; Chien, Chih-Cheng; Yang, Ruey-Neng

    2016-01-01

    Colorectal cancer (CRC) is considered to develop slowly via a progressive accumulation of genetic mutations. Markers of CRC may serve to provide the basis for decision-making, and may assist in cancer prevention, detection and prognostic prediction. DNA and messenger (m)RNA molecules that are present in human feces faithfully represent CRC manifestations. In the present study, exogenous mouse cells verified the feasibility of total fecal RNA as a marker of CRC. Furthermore, five significant genes encoding solute carrier family 15, member 4 (SLC15A4), cluster of differentiation (CD)44, 3-oxoacid CoA-transferase 1 (OXCT1), placenta-specific 8 (PLAC8) and growth arrest-specific 2 (GAS2), which are differentially expressed in the feces of CRC patients, were verified in different CRC cell lines using quantitative polymerase chain reaction. The present study demonstrated that the mRNA level of SLC15A4 was increased in the majority of CRC cell lines evaluated (SW1116, LS123, Caco-2 and T84). An increased level of CD44 mRNA was only detected in an early-stage CRC cell line, SW1116, whereas OXCT1 was expressed at higher levels in the metastatic CRC cell line CC-M3. In addition, two genes, PLAC8 and GAS2, were highly expressed in the recurrent CRC cell line SW620. Genes identified in the feces of CRC patients differed according to their clinical characteristics, and this differential expression was also detected in the corresponding CRC cell lines. In conclusion, feces represent a good marker of CRC and can be interpreted through the appropriate CRC cell lines. PMID:27900008

  18. The Bering Sea Project Archive: a Prototype for Improved Discovery and Access

    NASA Astrophysics Data System (ADS)

    Stott, D.; Mayernik, M. S.; Daniels, M. D.; Moore, J. A.; Williams, S. F.; Allison, J.

    2015-12-01

    The Bering Sea Project was a research program from 2007 through 2012 that sought to understand the impacts of climate change and dynamic sea ice cover on the eastern Bering Sea ecosystem. More than 100 scientists engaged in field data collection, original research, and ecosystem modeling to link climate, physical oceanography, plankton, fishes, seabirds, marine mammals, humans, traditional knowledge and economic outcomes. Over the six-year period of the program hundreds of multidisciplinary datasets coming from a variety of instrumentation and measurement platforms within thirty-one categories of research were processed and curated by the National Center for Atmospheric Research (NCAR) Earth Observing Laboratory (EOL). For the investigator proposing a field project, the researcher performing synthesis, or the modeler seeking data for verification, the easy discovery and access to the most relevant data is of prime importance. The heterogeneous products of oceanographic field programs such as the Bering Sea Project challenge the ability of researchers to identify which data sets, people, or tools might be relevant to their research, and to understand how certain data, instruments, or methods were used to produce particular results. EOL, as a partner in the NSF funded EarthCollab project, is using linked open data to permit the direct interlinking of information and data across platforms and projects. We are leveraging an existing open-source semantic web application, VIVO, to address connectivity gaps across distributed networks of researchers and resources and identify relevant content, independent of location. We will present our approach in connecting ontologies and integrating them within the VIVO system, using the Bering Sea Project datasets as a case study, and will provide insight into how the geosciences can leverage linked data to produce more coherent methods of information and data discovery across large multi-disciplinary projects.

  19. Network-based gene prediction for Plasmodium falciparum malaria towards genetics-based drug discovery

    PubMed Central

    2015-01-01

    Background Malaria is the most deadly parasitic infectious disease. Existing drug treatments have limited efficacy in malaria elimination, and the complex pathogenesis of the disease is not fully understood. Detecting novel malaria-associated genes not only contributes to revealing the disease pathogenesis, but also facilitates discovering new targets for anti-malaria drugs. Methods In this study, we developed a network-based approach to predict malaria-associated genes. We constructed a cross-species network to integrate human-human, parasite-parasite and human-parasite protein interactions. Then we extended the random walk algorithm on this network, and used known malaria genes as the seeds to find novel candidate genes for malaria. Results We validated our algorithms using 77 known malaria genes: 14 human genes and 63 parasite genes were ranked on average within the top 2% and top 4%, respectively, among human and parasite genomes. We also evaluated our method for predicting novel malaria genes using a set of 27 genes with literature supporting evidence. Our approach ranked 12 genes within the top 1% and 24 genes within the top 5%. In addition, we demonstrated that top-ranked candidate genes were enriched for drug targets, and identified commonalities underlying top-ranked malaria genes through pathway analysis. In summary, the candidate malaria-associated genes predicted by our data-driven approach have the potential to guide genetics-based anti-malaria drug discovery. PMID:26099491
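
    The seed-based ranking described in this abstract can be illustrated with a plain random walk with restart on a small undirected graph. This Python sketch assumes an unweighted network, a restart probability of 0.3, and seeds that all appear in the edge list; it does not reproduce the cross-species extension of the study, and all names are hypothetical.

```python
def random_walk_with_restart(edges, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score every node by its steady-state visiting probability for a
    walker that returns to the seed set with probability `restart`."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    nodes = sorted(neighbors)
    # Start (and restart) distribution: uniform over the seed genes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            # Each node spreads its mass evenly over its neighbors.
            share = (1.0 - restart) * p[u] / len(neighbors[u])
            for v in neighbors[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


def rank_candidates(scores, seeds):
    """Non-seed nodes ordered from highest to lowest score."""
    return sorted((n for n in scores if n not in seeds),
                  key=lambda n: scores[n], reverse=True)


edges = [("s", "a"), ("a", "b"), ("b", "c")]  # toy interaction network
scores = random_walk_with_restart(edges, {"s"})
print(rank_candidates(scores, {"s"}))
```

    Nodes closer to the seeds in the network receive higher scores, which is the intuition behind ranking candidate genes this way.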

  20. Arctic Research Mapping Application (ARMAP) Showcases discovery level metadata for US Funded Research Projects

    NASA Astrophysics Data System (ADS)

    Score, R.; Gaylord, A. G.; Kassin, A.; Cody, R. P.; Copenhaver, W.; Manley, W. F.; Dover, M.; Tweedie, C. E.

    2014-12-01

    The Arctic Research Mapping Application (ARMAP) is a suite of online applications and data services that support Arctic science by providing project tracking information (who's doing what, when and where in the region) for United States Government funded projects. Development of an interagency standard for tracking discovery level metadata for projects has been achieved through collaboration with the Alaska Data Integration work group. The US National Science Foundation plus 17 other agencies and organizations have adopted the standard with several entities successfully implementing XML based REST webservices. With ARMAP's web mapping applications and data services (http://armap.org), users can search for research projects by location, year, funding program, keyword, investigator, and discipline, among other variables. Key information about each project is displayed within the application with links to web pages that provide additional information. The ARMAP 2D mapping application has been significantly enhanced to include support for multiple projections, improved base maps, additional reference data layers, and optimization for better performance. In 2014, ship tracks for US National Science Foundation supported vessel based surveys have been expanded. These enhancements have been made to increase awareness of projects funded by numerous entities in the Arctic, enhance coordination for logistics support, help identify geographic gaps in research efforts and potentially foster more collaboration amongst researchers working in the region. Additionally, ARMAP can be used to demonstrate past, present, and future research efforts supported by the U.S. Government.

  1. Arctic Research Mapping Application (ARMAP) Showcases discovery level metadata for US Funded Research Projects

    NASA Astrophysics Data System (ADS)

    Gaylord, A. G.; Kassin, A.; Cody, R. P.; Manley, W. F.; Dover, M.; Score, R.; Garcia-Lavigne, D.; Tweedie, C. E.

    2013-12-01

    The Arctic Research Mapping Application (ARMAP) is a suite of online applications and data services that support Arctic science by providing project tracking information (who's doing what, when and where in the region) for United States Government funded projects. Development of an interagency standard for tracking discovery level metadata for projects has been achieved through collaboration with the Alaska Data Integration work group. The US National Science Foundation plus 17 other agencies and organizations have adopted the standard with several entities successfully implementing XML based REST webservices. With ARMAP's web mapping applications and data services (http://armap.org), users can search for research projects by location, year, funding program, keyword, investigator, and discipline, among other variables. Key information about each project is displayed within the application with links to web pages that provide additional information. The ARMAP 2D mapping application has been significantly enhanced to include support for multiple projections, improved base maps, additional reference data layers, and optimization for better performance. In 2013, ship tracks for US National Science Foundation supported vessel based surveys and health care facilities have been included in ARMAP. The additional functionality of this tool will increase awareness of projects funded by numerous entities in the Arctic, enhance coordination for logistics support, help identify geographic gaps in research efforts and potentially foster more collaboration amongst researchers working in the region. Additionally, ARMAP can be used to demonstrate the effects of the International Polar Year (IPY) on funding of different research disciplines by the U.S. Government.

  2. The Alveolate Perkinsus marinus: Biological Insights from EST Gene Discovery

    PubMed Central

    2010-01-01

    Background Perkinsus marinus, a protozoan parasite of the eastern oyster Crassostrea virginica, has devastated natural and farmed oyster populations along the Atlantic and Gulf coasts of the United States. It is classified as a member of the Perkinsozoa, a recently established phylum considered close to the ancestor of ciliates, dinoflagellates, and apicomplexans, and a key taxon for understanding unique adaptations (e.g. parasitism) within the Alveolata. Despite intense parasite pressure, no disease-resistant oysters have been identified and no effective therapies have been developed to date. Results To gain insight into the biological basis of the parasite's virulence and pathogenesis mechanisms, and to identify genes encoding potential targets for intervention, we generated >31,000 5' expressed sequence tags (ESTs) derived from four trophozoite libraries generated from two P. marinus strains. Trimming and clustering of the sequence tags yielded 7,863 unique sequences, some of which carry a spliced leader. Similarity searches revealed that 55% of these had hits in protein sequence databases, of which 1,729 had their best hit with proteins from the chromalveolates (E-value ≤ 1e-5). Some sequences are similar to those proven to be targets for effective intervention in other protozoan parasites, and include not only proteases, antioxidant enzymes, and heat shock proteins, but also those associated with relict plastids, such as acetyl-CoA carboxylase and methylerythritol phosphate pathway components, and those involved in glycan assembly, protein folding/secretion, and parasite-host interactions. Conclusions Our transcriptome analysis of P. marinus, the first for any member of the Perkinsozoa, contributes new insight into its biology and taxonomic position. It provides a very informative, albeit preliminary, glimpse into the expression of genes encoding functionally relevant proteins as potential targets for chemotherapy, and evidence for the presence of a relict ...

  3. Replication Study: Discovery and preclinical validation of drug indications using compendia of public gene expression data

    PubMed Central

    Kandela, Irawati; Aird, Fraser

    2017-01-01

    In 2015, as part of the Reproducibility Project: Cancer Biology, we published a Registered Report (Kandela et al., 2015) that described how we intended to replicate selected experiments from the paper “Discovery and Preclinical Validation of Drug Indications Using Compendia of Public Gene Expression Data” (Sirota et al., 2011). Here we report the results of those experiments. We found that cimetidine treatment in a xenograft model using A549 lung adenocarcinoma cells resulted in decreased tumor volume compared to vehicle control; however, while the effect was in the same direction as the original study (Figure 4C; Sirota et al., 2011), it was not statistically significant. Cimetidine treatment in a xenograft model using ACHN renal cell carcinoma cells did not differ from vehicle control treatment, similar to the original study (Supplemental Figure 1; Sirota et al., 2011). Doxorubicin treatment in a xenograft model using A549 lung adenocarcinoma cells did not result in a statistically significant difference compared to vehicle control despite tumor volume being reduced to levels similar to those reported in the original study (Figure 4C; Sirota et al., 2011). Finally, we report a random effects meta-analysis for each result. These meta-analyses show that the inhibition of A549 derived tumors by cimetidine resulted in a statistically significant effect, as did the inhibition of A549 derived tumors by doxorubicin. The effect of cimetidine on ACHN derived tumors was not statistically significant, as predicted. DOI: http://dx.doi.org/10.7554/eLife.17044.001 PMID:28100397
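
    To show how an original and a replication effect can be pooled in a random effects meta-analysis, here is a short Python sketch. The abstract does not say which estimator was used; this sketch assumes the DerSimonian-Laird estimator of between-study variance, and the input numbers are made up for illustration, not taken from the study.

```python
import math


def random_effects_meta(effects, variances):
    """Pool study effects with DerSimonian-Laird random effects weights.
    Returns (pooled effect, standard error, between-study variance)."""
    w = [1.0 / v for v in variances]                      # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0       # truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]        # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Hypothetical effect sizes and variances for an original study and a replication.
pooled, se, tau2 = random_effects_meta([0.0, 1.0], [0.1, 0.1])
print(pooled, se, tau2)
```

    A pooled estimate can reach statistical significance even when the replication alone does not, which matches the pattern reported for the A549 results.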

  4. A three component latent class model for robust semiparametric gene discovery.

    PubMed

    Alfo', Marco; Farcomeni, Alessio; Tardella, Luca

    2011-01-01

    We propose a robust model for discovering differentially expressed genes which directly incorporates biological significance, i.e., effect dimension. Using the so-called c-fold rule, we transform the expressions into a nominal observed random variable with three categories: below a fixed lower threshold, above a fixed upper threshold or within the two thresholds. Gene expression data is then transformed into a nominal variable with three levels possibly originated by three different distributions corresponding to under expressed, not differential, and over expressed genes. This leads to a statistical model for a 3-component mixture of trinomial distributions with suitable constraints on the parameter space. In order to obtain the maximum likelihood estimates, we show how to implement a constrained EM algorithm with a latent label for the corresponding component of each gene. Different strategies for a statistically significant gene discovery are discussed and compared. We illustrate the method on a small simulation study and a real dataset on multiple sclerosis.
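
    The three-category transformation described here can be sketched in a few lines of Python. The sketch assumes the thresholds are 1/c and c applied to an expression ratio, which is one common reading of a c-fold rule; the abstract only says the thresholds are fixed, so the exact choice is an assumption, as are the function names.

```python
from collections import Counter


def c_fold_category(ratio, c=2.0):
    """Map an expression ratio to one of three nominal levels."""
    if ratio < 1.0 / c:
        return "under"
    if ratio > c:
        return "over"
    return "not_differential"


def trinomial_counts(ratios, c=2.0):
    """Counts of the three levels for one gene across replicates."""
    counts = Counter(c_fold_category(r, c) for r in ratios)
    return [counts["under"], counts["not_differential"], counts["over"]]


print(trinomial_counts([0.4, 1.2, 3.0, 2.5]))
```

    Per-gene count vectors of this kind are the sort of input a mixture of trinomial distributions would be fitted to.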

  5. Modern plant metabolomics: Advanced natural product gene discoveries, improved technologies, and future prospects

    SciTech Connect

    Sumner, Lloyd W.; Lei, Zhentian; Nikolau, Basil J.; Saito, Kazuki

    2014-10-24

    Plant metabolomics has matured and modern plant metabolomics has accelerated gene discoveries and the elucidation of a variety of plant natural product biosynthetic pathways. This study highlights specific examples of the discovery and characterization of novel genes and enzymes associated with the biosynthesis of natural products such as flavonoids, glucosinolates, terpenoids, and alkaloids. Additional examples of the integration of metabolomics with genome-based functional characterizations of plant natural products that are important to modern pharmaceutical technology are also reviewed. This article also provides a substantial review of recent technical advances in mass spectrometry imaging, nuclear magnetic resonance imaging, integrated LC-MS-SPE-NMR for metabolite identifications, and x-ray crystallography of microgram quantities for structural determinations. The review closes with a discussion on the future prospects of metabolomics related to crop species and herbal medicine.

  6. Modern plant metabolomics: Advanced natural product gene discoveries, improved technologies, and future prospects

    DOE PAGES

    Sumner, Lloyd W.; Lei, Zhentian; Nikolau, Basil J.; ...

    2014-10-24

    Plant metabolomics has matured and modern plant metabolomics has accelerated gene discoveries and the elucidation of a variety of plant natural product biosynthetic pathways. This study highlights specific examples of the discovery and characterization of novel genes and enzymes associated with the biosynthesis of natural products such as flavonoids, glucosinolates, terpenoids, and alkaloids. Additional examples of the integration of metabolomics with genome-based functional characterizations of plant natural products that are important to modern pharmaceutical technology are also reviewed. This article also provides a substantial review of recent technical advances in mass spectrometry imaging, nuclear magnetic resonance imaging, integrated LC-MS-SPE-NMR for metabolite identifications, and x-ray crystallography of microgram quantities for structural determinations. The review closes with a discussion on the future prospects of metabolomics related to crop species and herbal medicine.

  7. Discovery of new candidate genes related to brain development using protein interaction information.

    PubMed

    Chen, Lei; Chu, Chen; Kong, Xiangyin; Huang, Tao; Cai, Yu-Dong

    2015-01-01

    Human brain development is a dramatic process composed of a series of complex and fine-tuned spatiotemporal gene expressions. A good comprehension of this process can assist us in developing the potential of our brain. However, we have only limited knowledge about the genes and gene functions that are involved in this biological process. Therefore, a substantial demand remains to discover new brain development-related genes and identify their biological functions. In this study, we aimed to discover new brain-development related genes by building a computational method. We referred to a series of computational methods used to discover new disease-related genes and developed a similar method. In this method, the shortest path algorithm was executed on a weighted graph that was constructed using protein-protein interactions. New candidate genes fell on at least one of the shortest paths connecting two known genes that are related to brain development. A randomization test was then adopted to filter positive discoveries. Of the final identified genes, several have been reported to be associated with brain development, indicating the effectiveness of the method, whereas several of the others may have potential roles in brain development.
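
    The shortest-path step in this abstract can be sketched with Dijkstra's algorithm on a small weighted graph. This Python sketch returns one shortest path per pair of known genes (ties are not enumerated) and omits the randomization test. The graph, weights, and gene names are hypothetical.

```python
import heapq
from itertools import combinations


def shortest_path(graph, source, target):
    """Dijkstra on a dict-of-dicts graph; returns the node list of one
    shortest path, or [] if target is unreachable."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def candidate_genes(graph, known):
    """Genes lying strictly inside a shortest path between two known genes."""
    found = set()
    for a, b in combinations(sorted(known), 2):
        found.update(shortest_path(graph, a, b)[1:-1])
    return found - set(known)


# Toy protein-protein interaction graph with edge weights.
graph = {
    "K1": {"X": 1.0, "Y": 5.0},
    "K2": {"X": 1.0, "Y": 5.0},
    "X": {"K1": 1.0, "K2": 1.0},
    "Y": {"K1": 5.0, "K2": 5.0},
}
print(candidate_genes(graph, {"K1", "K2"}))
```

    In the toy graph only X is reported, because the route through Y is longer.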

  8. Transcriptome profiling for discovery of genes involved in shoot apical meristem and flower development.

    PubMed

    Singh, Vikash K; Jain, Mukesh

    2014-12-01

    Flower development is one of the major developmental processes that governs seed setting in angiosperms. However, little is known about the molecular mechanisms underlying flower development in legumes. Employing RNA-seq for various stages of flower development and a few vegetative tissues in chickpea, we identified differentially expressed genes in flower tissues/stages in comparison to vegetative tissues, which are related to various biological processes and molecular functions during flower development. Here, we provide details of experimental methods, RNA-seq data (available at Gene Expression Omnibus database under GSE42679) and analysis pipeline published by Singh and colleagues in the Plant Biotechnology Journal (Singh et al., 2013), along with additional analysis for discovery of genes involved in shoot apical meristem (SAM) development. Our data provide a resource for exploring the complex molecular mechanisms underlying SAM and flower development and identification of gene targets for functional and applied genomics in legumes.

  9. Discovery of the faithfulness gene: a model of transmission and transformation of scientific information.

    PubMed

    Green, Eva G T; Clémence, Alain

    2008-09-01

    The purpose of this paper is to study the diffusion and transformation of scientific information in everyday discussions. Based on rumour models and social representations theory, the impact of interpersonal communication and pre-existing beliefs on transmission of the content of a scientific discovery was analysed. In three experiments, a communication chain was simulated to investigate how laypeople make sense of a genetic discovery first published in a scientific outlet, then reported in a mainstream newspaper and finally discussed in groups. Study 1 (N=40) demonstrated a transformation of information when the scientific discovery moved along the communication chain. During successive narratives, scientific expert terminology disappeared while scientific information associated with lay terminology persisted. Moreover, the idea of a discovery of a faithfulness gene emerged. Study 2 (N=70) revealed that transmission of the scientific message varied as a function of attitudes towards genetic explanations of behaviour (pro-genetics vs. anti-genetics). Pro-genetics employed more scientific terminology than anti-genetics. Study 3 (N=75) showed that endorsement of genetic explanations was related to descriptive accounts of the scientific information, whereas rejection of genetic explanations was related to evaluative accounts of the information.

  10. Discovery of dominant and dormant genes from expression data using a novel generalization of SNR for multi-class problems

    PubMed Central

    Tsai, Yu-Shuen; Lin, Chin-Teng; Tseng, George C; Chung, I-Fang; Pal, Nikhil Ranjan

    2008-01-01

    Background The Signal-to-Noise-Ratio (SNR) is often used for identification of biomarkers for two-class problems and no formal and useful generalization of SNR is available for multiclass problems. We propose innovative generalizations of SNR for multiclass cancer discrimination through introduction of two indices, Gene Dominant Index and Gene Dormant Index (GDIs). These two indices lead to the concepts of dominant and dormant genes with biological significance. We use these indices to develop methodologies for discovery of dominant and dormant biomarkers with interesting biological significance. The dominancy and dormancy of the identified biomarkers and their excellent discriminating power are also demonstrated pictorially using the scatterplot of individual gene and 2-D Sammon's projection of the selected set of genes. Using information from the literature we have shown that the GDI based method can identify dominant and dormant genes that play significant roles in cancer biology. These biomarkers are also used to design diagnostic prediction systems. Results and discussion To evaluate the effectiveness of the GDIs, we have used four multiclass cancer data sets (Small Round Blue Cell Tumors, Leukemia, Central Nervous System Tumors, and Lung Cancer). For each data set we demonstrate that the new indices can find biologically meaningful genes that can act as biomarkers. We then use six machine learning tools, Nearest Neighbor Classifier (NNC), Nearest Mean Classifier (NMC), Support Vector Machine (SVM) classifier with linear kernel, and SVM classifier with Gaussian kernel, where both SVMs are used in conjunction with one-vs-all (OVA) and one-vs-one (OVO) strategies. We found GDIs to be very effective in identifying biomarkers with strong class specific signatures. 
    With all six tools and for all data sets we could achieve better or comparable prediction accuracies usually with fewer marker genes than results reported in the literature using the same computational ...
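
    For context, the two-class signal-to-noise ratio that this record generalizes can be written in a few lines of Python. The sketch assumes the widely used definition (difference of class means divided by the sum of class standard deviations); the abstract does not spell out the formula, and the multiclass GDI indices are not reproduced here.

```python
from statistics import mean, stdev


def snr(class_a, class_b):
    """Two-class signal-to-noise ratio of one gene's expression values.
    Assumes each class has at least two samples and nonzero spread."""
    return (mean(class_a) - mean(class_b)) / (stdev(class_a) + stdev(class_b))


def rank_genes(expr_a, expr_b):
    """Rank genes by absolute SNR; inputs map gene name -> list of values."""
    return sorted(expr_a, key=lambda g: abs(snr(expr_a[g], expr_b[g])), reverse=True)


expr_a = {"g1": [5.0, 6.0, 7.0], "g2": [2.0, 3.0, 4.0]}
expr_b = {"g1": [1.0, 2.0, 3.0], "g2": [2.0, 3.0, 5.0]}
print(rank_genes(expr_a, expr_b))
```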

  11. Weighted gene co-expression based biomarker discovery for psoriasis detection.

    PubMed

    Sundarrajan, Sudharsana; Arumugam, Mohanapriya

    2016-11-15

    Psoriasis is a chronic inflammatory disease of the skin with an unknown aetiology. The disease manifests itself as red and silvery scaly plaques distributed over the scalp, lower back and extensor aspects of the limbs. After receiving scant consideration for quite a few years, psoriasis has now become a prominent focus for new drug development. A group of closely connected and differentially co-expressed genes may act in a network and may serve as molecular signatures for an underlying phenotype. Weighted gene coexpression network analysis (WGCNA), a systems biology approach, has been utilized for identification of new molecular targets for psoriasis. Gene coexpression relationships were investigated in 58 psoriatic lesional samples, resulting in five gene modules clustered based on the gene coexpression patterns. The coexpression pattern was validated using three psoriatic datasets. Ten highly connected and informative genes from each module were selected and termed psoriasis specific hub signatures. A random forest based binary classifier built using the expression profiles of signature genes robustly distinguished psoriatic samples from the normal samples in the validation set with an accuracy of 0.95 to 1. These signature genes may serve as potential candidates for biomarker discovery leading to new therapeutic targets. WGCNA, the network based approach, has provided an alternative path to mine out key controllers and drivers of psoriasis. The study principle from the current work can be extended to other pathological conditions.
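
    The notion of "highly connected" genes can be illustrated with the standard unsigned WGCNA adjacency, the absolute Pearson correlation raised to a soft-thresholding power. This Python sketch assumes a power of 6 and ranks genes by whole-network connectivity; the abstract does not give the study's parameters or its module-level selection procedure, and the expression values are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def connectivity(expr, beta=6):
    """Sum of adjacencies |cor|^beta from each gene to all other genes."""
    genes = list(expr)
    k = {g: 0.0 for g in genes}
    for i, gi in enumerate(genes):
        for gj in genes[i + 1:]:
            a = abs(pearson(expr[gi], expr[gj])) ** beta
            k[gi] += a
            k[gj] += a
    return k


def hub_genes(expr, top_n=10, beta=6):
    """The top_n genes with the highest connectivity."""
    k = connectivity(expr, beta)
    return sorted(k, key=k.get, reverse=True)[:top_n]


expr = {  # hypothetical expression profiles across four samples
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [1, 3, 2, 4],
    "g4": [4, 1, 3, 2],
}
print(hub_genes(expr, top_n=2))
```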

  12. Discovery of nucleotide polymorphisms in the Musa gene pool by Ecotilling.

    PubMed

    Till, Bradley J; Jankowicz-Cieslak, Joanna; Sági, László; Huynh, Owen A; Utsushi, Hiroe; Swennen, Rony; Terauchi, Ryohei; Mba, Chikelu

    2010-11-01

    Musa (banana and plantain) is an important genus for the global export market and in local markets where it provides staple food for approximately 400 million people. Hybridization and polyploidization of several (sub)species, combined with vegetative propagation and human selection have produced a complex genetic history. We describe the application of the Ecotilling method for the discovery and characterization of nucleotide polymorphisms in diploid and polyploid accessions of Musa. We discovered over 800 novel alleles in 80 accessions. Sequencing and band evaluation shows Ecotilling to be a robust and accurate platform for the discovery of polymorphisms in homologous and homeologous gene targets. In the process of validating the method, we identified two single nucleotide polymorphisms that may be deleterious for the function of a gene putatively important for phototropism. Evaluation of heterozygous polymorphism and haplotype blocks revealed a high level of nucleotide diversity in Musa accessions. We further applied a strategy for the simultaneous discovery of heterozygous and homozygous polymorphisms in diploid accessions to rapidly evaluate nucleotide diversity in accessions of the same genome type. This strategy can be used to develop hypotheses for inheritance patterns of nucleotide polymorphisms within and between genome types. We conclude that Ecotilling is suitable for diversity studies in Musa, and that it can be considered for functional genomics studies and as a tool in selecting germplasm for traditional and mutation breeding approaches.

  13. Riboswitches: discovery of drugs that target bacterial gene-regulatory RNAs.

    PubMed

    Deigan, Katherine E; Ferré-D'Amaré, Adrian R

    2011-12-20

    Riboswitches are messenger RNA (mRNA) domains that regulate gene function in response to the intracellular concentration of a variety of metabolites and second messengers. They control essential genes in many pathogenic bacteria, thus representing an inviting new class of biomolecular target for the development of antibiotics and chemical-biological tools. In this Account, we briefly review the discovery of riboswitches in the first years of the 21st century and their ensuing characterization over the past decade. We then discuss the progress achieved so far in using riboswitches as a focus for drug discovery, considering both the value of past serendipity and the particular challenges that confront current researchers. Five mechanisms of gene regulation have been determined for riboswitches. Most bacterial riboswitches modulate either transcription termination or translation initiation in response to ligand binding. All known examples of eukaryotic riboswitches, and some bacterial riboswitches, control gene expression by alternative splicing. The glmS riboswitch, which is widespread in Gram-positive bacteria, is a catalytic RNA activated by ligand binding: its self-cleavage destabilizes the mRNA of which it is part. Finally, one example of a trans-acting riboswitch is known. Three-dimensional structures have been determined for representatives of 13 structurally distinct riboswitch classes, providing atomic-level insight into their mechanisms of ligand recognition. While cellular and viral RNAs have attracted widespread interest as potential drug targets, riboswitches show special promise due to the diversity of small-molecule recognition strategies that are on display in their ligand-binding pockets. Moreover, riboswitches have evolved to recognize small-molecule ligands, which is unique among known structured RNA domains. 
Structural and biochemical advances in the study of riboswitches provide an impetus for the development of methods for the discovery of novel

  14. Ontological Discovery Environment: a system for integrating gene-phenotype associations.

    PubMed

    Baker, Erich J; Jay, Jeremy J; Philip, Vivek M; Zhang, Yun; Li, Zuopan; Kirova, Roumyana; Langston, Michael A; Chesler, Elissa J

    2009-12-01

    The wealth of genomic technologies has enabled biologists to rapidly ascribe phenotypic characters to biological substrates. Central to effective biological investigation is the operational definition of the process under investigation. We propose an elucidation of categories of biological characters, including disease relevant traits, based on natural endogenous processes and experimentally observed biological networks, pathways and systems rather than on externally manifested constructs and current semantics such as disease names and processes. The Ontological Discovery Environment (ODE) is an Internet accessible resource for the storage, sharing, retrieval and analysis of phenotype-centered genomic data sets across species and experimental model systems. Any type of data set representing gene-phenotype relationships, such as quantitative trait loci (QTL) positional candidates, literature reviews, microarray experiments, ontological or even meta-data, may serve as inputs. To demonstrate a use case leveraging the homology capabilities of ODE and its ability to synthesize diverse data sets, we conducted an analysis of genomic studies related to alcoholism. The core of ODE's gene set similarity, distance and hierarchical analysis is the creation of a bipartite network of gene-phenotype relations, a unique discrete graph approach to analysis that enables set-set matching of non-referential data. Gene sets are annotated with several levels of metadata, including community ontologies, while gene set translations compare models across species. Computationally derived gene sets are integrated into hierarchical trees based on gene-derived phenotype interdependencies. Automated set identifications are augmented by statistical tools which enable users to interpret the confidence of modeled results. 
This approach allows data integration and hypothesis discovery across multiple experimental contexts, regardless of the face similarity and semantic annotation of the experimental

  15. Ontological Discovery Environment: A system for integrating gene-phenotype associations

    PubMed Central

    Baker, Erich J.; Jay, Jeremy J.; Philip, Vivek M.; Zhang, Yun; Li, Zuopan; Kirova, Roumyana; Langston, Michael A.; Chesler, Elissa J.

    2009-01-01

    The wealth of genomic technologies has enabled biologists to rapidly ascribe phenotypic characters to biological substrates. Central to effective biological investigation is the operational definition of the process under investigation. We propose an elucidation of categories of biological characters, including disease relevant traits, based on natural endogenous processes and experimentally observed biological networks, pathways and systems rather than on externally manifested constructs and current semantics such as disease names and processes. The Ontological Discovery Environment (ODE) is an Internet accessible resource for the storage, sharing, retrieval and analysis of phenotype-centered genomic data sets across species and experimental model systems. Any type of data set representing gene-phenotype relationships, such as quantitative trait loci (QTL) positional candidates, literature reviews, microarray experiments, ontological or even meta-data, may serve as inputs. To demonstrate a use case leveraging the homology capabilities of ODE and its ability to synthesize diverse data sets, we conducted an analysis of genomic studies related to alcoholism. The core of ODE’s gene-set similarity, distance and hierarchical analysis is the creation of a bipartite network of gene-phenotype relations, a unique discrete graph approach to analysis that enables set-set matching of non-referential data. Gene sets are annotated with several levels of metadata, including community ontologies, while gene set translations compare models across species. Computationally derived gene sets are integrated into hierarchical trees based on gene-derived phenotype interdependencies. Automated set identifications are augmented by statistical tools which enable users to interpret the confidence of modeled results. 
This approach allows data integration and hypothesis discovery across multiple experimental contexts, regardless of the face similarity and semantic annotation of the experimental

  16. Gene expression, single nucleotide variant and fusion transcript discovery in archival material from breast tumors.

    PubMed

    Norton, Nadine; Sun, Zhifu; Asmann, Yan W; Serie, Daniel J; Necela, Brian M; Bhagwate, Aditya; Jen, Jin; Eckloff, Bruce W; Kalari, Krishna R; Thompson, Kevin J; Carr, Jennifer M; Kachergus, Jennifer M; Geiger, Xochiquetzal J; Perez, Edith A; Thompson, E Aubrey

    2013-01-01

    Advantages of RNA-Seq over array-based platforms are quantitative gene expression and discovery of expressed single nucleotide variants (eSNVs) and fusion transcripts from a single platform, but the sensitivity for each of these characteristics is unknown. We measured gene expression in a set of manually degraded RNAs, nine pairs of matched fresh-frozen, and FFPE RNA isolated from breast tumor with the hybridization-based NanoString nCounter (226-gene panel) and with whole transcriptome RNA-Seq using RiboZeroGold ScriptSeq V2 library preparation kits. We performed correlation analyses of gene expression between samples and across platforms. We then specifically assessed whole transcriptome expression of lincRNA and discovery of eSNVs and fusion transcripts in the FFPE RNA-Seq data. For gene expression in the manually degraded samples, we observed Pearson correlations of >0.94 and >0.80 with NanoString and ScriptSeq protocols, respectively. Gene expression data for matched fresh-frozen and FFPE samples yielded mean Pearson correlations of 0.874 and 0.783 for NanoString (226 genes) and ScriptSeq whole transcriptome protocols, respectively (p < 2 × 10^-16). Specifically for lincRNAs, we observed superb Pearson correlation (0.988) between matched fresh-frozen and FFPE pairs. FFPE samples across NanoString and RNA-Seq platforms gave a mean Pearson correlation of 0.838. In FFPE libraries, we detected 53.4% of high confidence SNVs and 24% of high confidence fusion transcripts. Sensitivity of fusion transcript detection was not overcome by an increase in depth of sequencing up to 3-fold (increase from ~56 to ~159 million reads). Both NanoString and ScriptSeq RNA-Seq technologies yield reliable gene expression data for degraded and FFPE material. The high degree of correlation between NanoString and RNA-Seq platforms suggests discovery-based whole transcriptome studies from FFPE material will produce reliable expression data. The RiboZeroGold ScriptSeq protocol performed

  17. Applying knowledge-anchored hypothesis discovery methods to advance clinical and translational research: the OAMiner project

    PubMed Central

    Jackson, Rebecca D; Best, Thomas M; Borlawsky, Tara B; Lai, Albert M; James, Stephen; Gurcan, Metin N

    2012-01-01

    The conduct of clinical and translational research regularly involves the use of a variety of heterogeneous and large-scale data resources. Scalable methods for the integrative analysis of such resources, particularly when attempting to leverage computable domain knowledge in order to generate actionable hypotheses in a high-throughput manner, remain an open area of research. In this report, we describe both a generalizable design pattern for such integrative knowledge-anchored hypothesis discovery operations and our experience in applying that design pattern in the experimental context of a set of driving research questions related to the publicly available Osteoarthritis Initiative data repository. We believe that this ‘test bed’ project and the lessons learned during its execution are both generalizable and representative of common clinical and translational research paradigms. PMID:22647689

  18. Systems Pharmacology-Based Discovery of Natural Products for Precision Oncology Through Targeting Cancer Mutated Genes.

    PubMed

    Fang, J; Cai, C; Wang, Q; Lin, P; Zhao, Z; Cheng, F

    2017-03-01

    Massive cancer genomics data have facilitated the rapid revolution of a novel oncology drug discovery paradigm through targeting clinically relevant driver genes or mutations for the development of precision oncology. Natural products with polypharmacological profiles have been demonstrated as promising agents for the development of novel cancer therapies. In this study, we developed an integrated systems pharmacology framework that facilitated identifying potential natural products that target mutated genes across 15 cancer types or subtypes in the realm of precision medicine. High performance was achieved for our systems pharmacology framework. In case studies, we computationally identified novel anticancer indications for several US Food and Drug Administration-approved or clinically investigational natural products (e.g., resveratrol, quercetin, genistein, and fisetin) through targeting significantly mutated genes in multiple cancer types. In summary, this study provides a powerful tool for the development of molecularly targeted cancer therapies through targeting the clinically actionable alterations by exploiting the systems pharmacology of natural products.

  19. Systems Pharmacology‐Based Discovery of Natural Products for Precision Oncology Through Targeting Cancer Mutated Genes

    PubMed Central

    Fang, J; Cai, C; Wang, Q; Lin, P

    2017-01-01

    Massive cancer genomics data have facilitated the rapid revolution of a novel oncology drug discovery paradigm through targeting clinically relevant driver genes or mutations for the development of precision oncology. Natural products with polypharmacological profiles have been demonstrated as promising agents for the development of novel cancer therapies. In this study, we developed an integrated systems pharmacology framework that facilitated identifying potential natural products that target mutated genes across 15 cancer types or subtypes in the realm of precision medicine. High performance was achieved for our systems pharmacology framework. In case studies, we computationally identified novel anticancer indications for several US Food and Drug Administration‐approved or clinically investigational natural products (e.g., resveratrol, quercetin, genistein, and fisetin) through targeting significantly mutated genes in multiple cancer types. In summary, this study provides a powerful tool for the development of molecularly targeted cancer therapies through targeting the clinically actionable alterations by exploiting the systems pharmacology of natural products. PMID:28294568

  20. Gene Discovery of Modular Diterpene Metabolism in Nonmodel Systems

    PubMed Central

    Zerbe, Philipp; Hamberger, Björn; Yuen, Macaire M.S.; Chiang, Angela; Sandhu, Harpreet K.; Madilao, Lina L.; Nguyen, Anh; Hamberger, Britta; Bach, Søren Spanner; Bohlmann, Jörg

    2013-01-01

    Plants produce over 10,000 different diterpenes of specialized (secondary) metabolism, and fewer diterpenes of general (primary) metabolism. Specialized diterpenes may have functions in ecological interactions of plants with other organisms and also benefit humanity as pharmaceuticals, fragrances, resins, and other industrial bioproducts. Examples of high-value diterpenes are taxol and forskolin pharmaceuticals or ambroxide fragrances. Yields and purity of diterpenes obtained from natural sources or by chemical synthesis are often insufficient for large-volume or high-end applications. Improvement of agricultural or biotechnological diterpene production requires knowledge of biosynthetic genes and enzymes. However, specialized diterpene pathways are extremely diverse across the plant kingdom, and most specialized diterpenes are taxonomically restricted to a few plant species, genera, or families. Consequently, there is no single reference system to guide gene discovery and rapid annotation of specialized diterpene pathways. Functional diversification of genes and plasticity of enzyme functions of these pathways further complicate correct annotation. To address this challenge, we used a set of 10 different plant species to develop a general strategy for diterpene gene discovery in nonmodel systems. The approach combines metabolite-guided transcriptome resources, custom diterpene synthase (diTPS) and cytochrome P450 reference gene databases, phylogenies, and, as shown for select diTPSs, single and coupled enzyme assays using microbial and plant expression systems. In the 10 species, we identified 46 new diTPS candidates and over 400 putatively terpenoid-related P450s in a resource of nearly 1 million predicted transcripts of diterpene-accumulating tissues. Phylogenetic patterns of lineage-specific blooms of genes guided functional characterization. PMID:23613273

  1. A Review of Whole-Exome Sequencing Efforts Toward Hereditary Breast Cancer Susceptibility Gene Discovery.

    PubMed

    Chandler, Madison R; Bilgili, Erin P; Merner, Nancy D

    2016-09-01

    Inherited genetic risk factors contribute toward breast cancer (BC) onset. BC risk variants can be divided into three categories of penetrance (high, moderate, and low) that reflect the probability of developing the disease. Traditional BC susceptibility gene discovery approaches that searched for high- and moderate-risk variants in familial BC cases have had limited success; to date, these risk variants explain only ∼30% of familial BC cases. Next-generation sequencing technologies can be used to search for novel high and moderate BC risk variants, and this manuscript reviews 12 familial BC whole-exome sequencing efforts. Study design, filtering strategies, and segregation and validation analyses are discussed. Overall, only a modest number of novel BC risk genes were identified, and 90% and 97% of the exome-sequenced families and cases, respectively, had no BC risk variants reported. It is important to learn from these studies and consider alternate strategies in order to make further advances. The discovery of new BC susceptibility genes is critical for improved risk assessment and to provide insight toward disease mechanisms for the development of more effective therapies.

  2. Discovery of the rhizopodin biosynthetic gene cluster in Stigmatella aurantiaca Sg a15 by genome mining.

    PubMed

    Pistorius, Dominik; Müller, Rolf

    2012-02-13

    The field of bacterial natural product research is currently undergoing a paradigm change concerning the discovery of natural products. Previously most efforts were based on isolation of the most abundant compound in an extract, or on tracking bioactivity. However, traditional activity-guided approaches are limited by the available test panels and frequently lead to the rediscovery of already known compounds. The constantly increasing availability of bacterial genome sequences provides the potential for the discovery of a huge number of new natural compounds by in silico identification of biosynthetic gene clusters. Examination of the information on the biosynthetic machinery can further prevent rediscovery of known compounds, and can help identify so far unknown biosynthetic pathways of known compounds. By in silico screening of the genome of the myxobacterium Stigmatella aurantiaca Sg a15, a trans-AT polyketide synthase/non-ribosomal peptide synthetase (PKS/NRPS) gene cluster was identified that could not be correlated to any secondary metabolite known to be produced by this strain. Targeted gene inactivation and analysis of extracts from the resulting mutants by high performance liquid chromatography coupled to high resolution mass spectrometry (HPLC-HRMS), in combination with the use of statistical tools, resulted in the identification of a compound that was absent in the mutants' extracts. By matching with our in-house database of myxobacterial secondary metabolites, this compound was identified as rhizopodin. A detailed analysis of the rhizopodin biosynthetic machinery is presented in this manuscript.

  3. In silico mining and PCR-based approaches to transcription factor discovery in non-model plants: gene discovery of the WRKY transcription factors in conifers.

    PubMed

    Liu, Jun-Jun; Xiang, Yu

    2011-01-01

    WRKY transcription factors are key regulators of numerous biological processes in plant growth and development, as well as plant responses to abiotic and biotic stresses. Research on biological functions of plant WRKY genes has focused in the past on model plant species or species with largely characterized transcriptomes. However, a variety of non-model plants, such as forest conifers, are essential as feed, biofuel, and wood or for sustainable ecosystems. Identification of WRKY genes in these non-model plants is equally important for understanding the evolutionary and function-adaptive processes of this transcription factor family. Because of limited genomic information, the rarity of regulatory gene mRNAs in transcriptomes, and the sequence divergence to model organism genes, identification of transcription factors in non-model plants using methods similar to those generally used for model plants is difficult. This chapter describes a gene family discovery strategy for identification of WRKY transcription factors in conifers by a combination of in silico-based prediction and PCR-based experimental approaches. Compared to traditional cDNA library screening or EST sequencing at transcriptome scales, this integrated gene discovery strategy provides fast, simple, reliable, and specific methods to unveil the WRKY gene family at both genome and transcriptome levels in non-model plants.
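
    The in silico half of such a strategy can be illustrated with a very small sketch. It assumes only the well-known fact that WRKY domains carry a conserved WRKYGQK core; the abstract does not describe the authors' actual pipeline, so the function name `find_wrky_candidates`, the toy sequences, and the choice to match only the canonical core (variant cores are ignored here) are all hypothetical.

```python
import re

# Canonical WRKY core heptapeptide. Variant cores exist but are not
# handled in this illustrative sketch (an assumption for simplicity).
WRKY_CORE = re.compile(r"WRKYGQK")


def find_wrky_candidates(proteins):
    """Return {name: [0-based start positions]} for sequences with the core motif.

    `proteins` maps a sequence name to a predicted protein sequence
    (one-letter amino acid codes). Sequences without the motif are omitted.
    """
    hits = {}
    for name, seq in proteins.items():
        positions = [m.start() for m in WRKY_CORE.finditer(seq.upper())]
        if positions:
            hits[name] = positions
    return hits


# Toy, made-up sequences used purely to exercise the function.
toy_proteins = {
    "toy_1": "MSTDGWRKYGQKAVKNS",
    "toy_2": "MKLVAAGGSTPLK",
}
candidates = find_wrky_candidates(toy_proteins)
print(candidates)
```

    In practice, candidates found this way would still need experimental confirmation, which is the role the abstract assigns to the PCR-based step.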

  4. High-throughput platform for the discovery of elicitors of silent bacterial gene clusters

    PubMed Central

    Seyedsayamdost, Mohammad R.

    2014-01-01

    Over the past decade, bacterial genome sequences have revealed an immense reservoir of biosynthetic gene clusters, sets of contiguous genes that have the potential to produce drugs or drug-like molecules. However, the majority of these gene clusters appear to be inactive for unknown reasons prompting terms such as “cryptic” or “silent” to describe them. Because natural products have been a major source of therapeutic molecules, methods that rationally activate these silent clusters would have a profound impact on drug discovery. Herein, a new strategy is outlined for awakening silent gene clusters using small molecule elicitors. In this method, a genetic reporter construct affords a facile read-out for activation of the silent cluster of interest, while high-throughput screening of small molecule libraries provides potential inducers. This approach was applied to two cryptic gene clusters in the pathogenic model Burkholderia thailandensis. The results not only demonstrate a prominent activation of these two clusters, but also reveal that the majority of elicitors are themselves antibiotics, most in common clinical use. Antibiotics, which kill B. thailandensis at high concentrations, act as inducers of secondary metabolism at low concentrations. One of these antibiotics, trimethoprim, served as a global activator of secondary metabolism by inducing at least five biosynthetic pathways. Further application of this strategy promises to uncover the regulatory networks that activate silent gene clusters while at the same time providing access to the vast array of cryptic molecules found in bacteria. PMID:24808135

  5. Inherited retinal diseases in dogs: advances in gene/mutation discovery.

    PubMed

    Miyadera, Keiko

    Inherited retinal diseases (RDs) are vision-threatening conditions affecting humans as well as many domestic animals. Through many years of clinical studies of the domestic dog population, a wide array of RDs has been phenotypically characterized. Extensive effort to map the causative gene and to identify the underlying mutation followed. Through candidate gene, linkage analysis, genome-wide association studies, and more recently, by means of next-generation sequencing, as many as 31 mutations in 24 genes have been identified as the underlying cause for canine RDs. Most of these genes have been associated with human RDs providing opportunities to study their roles in the disease pathogenesis and in normal visual function. The canine model has also contributed in developing new treatments such as gene therapy which has been clinically applied to human patients. Meanwhile, with increasing knowledge of the molecular architecture of RDs in different subpopulations of dogs, the conventional understanding of RDs as a simple monogenic disease is beginning to change. Emerging evidence of modifiers that alter the disease outcome is complicating the interpretation of DNA tests. In this review, advances in the gene/mutation discovery approaches and the emerging genetic complexity of canine RDs are discussed.

  6. Inherited retinal diseases in dogs: advances in gene/mutation discovery

    PubMed Central

    Miyadera, Keiko

    2015-01-01

    Inherited retinal diseases (RDs) are vision-threatening conditions affecting humans as well as many domestic animals. Through many years of clinical studies of the domestic dog population, a wide array of RDs has been phenotypically characterized. Extensive effort to map the causative gene and to identify the underlying mutation followed. Through candidate gene, linkage analysis, genome-wide association studies, and more recently, by means of next-generation sequencing, as many as 31 mutations in 24 genes have been identified as the underlying cause for canine RDs. Most of these genes have been associated with human RDs providing opportunities to study their roles in the disease pathogenesis and in normal visual function. The canine model has also contributed in developing new treatments such as gene therapy which has been clinically applied to human patients. Meanwhile, with increasing knowledge of the molecular architecture of RDs in different subpopulations of dogs, the conventional understanding of RDs as a simple monogenic disease is beginning to change. Emerging evidence of modifiers that alter the disease outcome is complicating the interpretation of DNA tests. In this review, advances in the gene/mutation discovery approaches and the emerging genetic complexity of canine RDs are discussed. PMID:26120276

  7. Repurposed transcriptomic data facilitate discovery of innate immunity toll-like receptor (TLR) Genes across Lophotrochozoa.

    PubMed

    Halanych, Kenneth M; Kocot, Kevin M

    2014-10-01

    The growing volume of genomic data from across life represents opportunities for deriving valuable biological information from data that were initially collected for another purpose. Here, we use transcriptomes collected for phylogenomic studies to search for toll-like receptor (TLR) genes in poorly sampled lophotrochozoan clades (Annelida, Mollusca, Brachiopoda, Phoronida, and Entoprocta) and one ecdysozoan clade (Priapulida). TLR genes are involved in innate immunity across animals by recognizing potential microbial infection. They have an extracellular leucine-rich repeat (LRR) domain connected to a transmembrane domain and an intracellular toll/interleukin-1 receptor (TIR) domain. Consequently, these genes are important in initiating a signaling pathway to trigger defense. We found at least one TLR ortholog in all but two taxa examined, suggesting that a broad array of lophotrochozoans may have innate immune systems similar to those observed in vertebrates and arthropods. Comparison to the SMART database confirmed the presence of both the LRR and the TIR protein motifs characteristic of TLR genes. Because we looked at only one transcriptome per species, discovery of TLR genes was limited for most taxa. However, several TLR-like genes that vary in the number and placement of LRR domains were found in phoronids. Additionally, several contigs contained LRR domains but lacked TIR domains, suggesting they were not TLRs. Many of these LRR-containing contigs had other domains (e.g., immunoglobulin) and are likely involved in innate immunity.

  8. Transcriptome sequencing of black grouse (Tetrao tetrix) for immune gene discovery and microsatellite development

    PubMed Central

    Wang, Biao; Ekblom, Robert; Castoe, Todd A.; Jones, Eleanor P.; Kozma, Radoslav; Bongcam-Rudloff, Erik; Pollock, David D.; Höglund, Jacob

    2012-01-01

    The black grouse (Tetrao tetrix) is a galliform bird species that is important for both ecological studies and conservation genetics. Here, we report the sequencing of the spleen transcriptome of black grouse using 454 GS FLX Titanium sequencing. We performed a large-scale gene discovery analysis with a focus on genes that might be related to fitness in this species and also identified a large set of microsatellites. In total, we obtained 182 179 quality-filtered sequencing reads that we assembled into 9035 contigs. Using these contigs and 15 794 length-filtered (greater than 200 bp) singletons, we identified 7762 transcripts that appear to be homologues of chicken genes. A specific BLAST search with an emphasis on immune genes found 308 homologous chicken genes that have immune function, including ten major histocompatibility complex-related genes located on chicken chromosome 16. We also identified 1300 expressed sequence tag microsatellites and were able to design suitable flanking primers for 526 of these. A preliminary test of the polymorphism of the microsatellites found 10 polymorphic microsatellites of the 102 tested. Genomic resources generated in this study should greatly benefit future ecological, evolutionary and conservation genetic studies on this species. PMID:22724064
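
    The abstract does not say which tool or thresholds were used to detect the microsatellites. As a minimal sketch of the general idea only, the following assumes a simple criterion (a 2 to 6 bp motif repeated at least five times in tandem) and scans a sequence with a regular expression; the function name `find_microsatellites`, the thresholds, and the example sequence are illustrative assumptions, not the authors' method.

```python
import re

# A motif of 2-6 bases followed by at least four more copies of itself,
# i.e. five or more tandem repeats in total (thresholds are assumptions).
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{4,}")


def find_microsatellites(seq):
    """Return a list of (start, motif, repeat_count) for simple tandem repeats.

    `start` is the 0-based position of the repeat tract in `seq`.
    Tracts made of a single repeated base (homopolymers) are skipped.
    """
    results = []
    for match in SSR_PATTERN.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:  # e.g. "AA" repeated: a homopolymer run
            continue
        repeats = len(match.group(0)) // len(motif)
        results.append((match.start(), motif, repeats))
    return results


# Made-up sequence containing one (AC)6 tract, for demonstration only.
example = "GGTT" + "AC" * 6 + "TTGA"
ssrs = find_microsatellites(example)
print(ssrs)
```

    Designing flanking primers, as described in the abstract, would be a separate step applied to the sequence on either side of each reported tract.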

  9. Transcriptome sequencing of black grouse (Tetrao tetrix) for immune gene discovery and microsatellite development.

    PubMed

    Wang, Biao; Ekblom, Robert; Castoe, Todd A; Jones, Eleanor P; Kozma, Radoslav; Bongcam-Rudloff, Erik; Pollock, David D; Höglund, Jacob

    2012-04-01

    The black grouse (Tetrao tetrix) is a galliform bird species that is important for both ecological studies and conservation genetics. Here, we report the sequencing of the spleen transcriptome of black grouse using 454 GS FLX Titanium sequencing. We performed a large-scale gene discovery analysis with a focus on genes that might be related to fitness in this species and also identified a large set of microsatellites. In total, we obtained 182 179 quality-filtered sequencing reads that we assembled into 9035 contigs. Using these contigs and 15 794 length-filtered (greater than 200 bp) singletons, we identified 7762 transcripts that appear to be homologues of chicken genes. A specific BLAST search with an emphasis on immune genes found 308 homologous chicken genes that have immune function, including ten major histocompatibility complex-related genes located on chicken chromosome 16. We also identified 1300 expressed sequence tag microsatellites and were able to design suitable flanking primers for 526 of these. A preliminary test of the polymorphism of the microsatellites found 10 polymorphic microsatellites of the 102 tested. Genomic resources generated in this study should greatly benefit future ecological, evolutionary and conservation genetic studies on this species.

  10. Evaluation of Gene Association Methods for Coexpression Network Construction and Biological Knowledge Discovery

    PubMed Central

    Kumari, Sapna; Nie, Jeff; Chen, Huann-Sheng; Ma, Hao; Stewart, Ron; Li, Xiang; Lu, Meng-Zhu; Taylor, William M.; Wei, Hairong

    2012-01-01

    Background: Constructing coexpression networks and performing network analysis using large-scale gene expression data sets is an effective way to uncover new biological knowledge; however, the methods used for gene association in constructing these coexpression networks have not been thoroughly evaluated. Since different methods lead to structurally different coexpression networks and provide different information, selecting the optimal gene association method is critical. Methods and Results: In this study, we compared eight gene association methods – Spearman rank correlation, Weighted Rank Correlation, Kendall, Hoeffding's D measure, Theil-Sen, Rank Theil-Sen, Distance Covariance, and Pearson – and focused on their true knowledge discovery rates in associating pathway genes and constructing coordination networks of regulatory genes. We also examined the behaviors of different methods on microarray data with different properties, and whether the biological processes affect the efficiency of different methods. Conclusions: We found that the Spearman, Hoeffding and Kendall methods are effective in identifying coexpressed pathway genes, whereas the Theil-Sen, Rank Theil-Sen, Spearman, and Weighted Rank methods perform well in identifying coordinated transcription factors that control the same biological processes and traits. Surprisingly, the widely used Pearson method is generally less efficient, and so is the Distance Covariance method that can find gene pairs of multiple relationships. Some of our analyses clearly show that the Pearson and Distance Covariance methods behave distinctly from the other six methods. The efficiencies of different methods vary with the data properties to some degree and are largely contingent upon the biological processes, which necessitates a pre-analysis to identify the best performing method for gene association and coexpression network construction. PMID:23226279
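
    To make the difference between two of the compared association measures concrete, here is a small sketch that computes Pearson and Spearman correlation for a pair of made-up expression profiles related monotonically but not linearly. It is not taken from the study; the helper names and the toy values are assumptions chosen for illustration, and Spearman is computed in the standard way as the Pearson correlation of average ranks.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy profiles: gene_b rises monotonically with gene_a, but as a cube.
gene_a = [1, 2, 3, 4, 5]
gene_b = [v ** 3 for v in gene_a]
r_pearson = pearson(gene_a, gene_b)
r_spearman = spearman(gene_a, gene_b)
print(r_pearson, r_spearman)
```

    On these toy values the rank-based measure reports a perfect association while Pearson does not, which is one simple way the two can disagree; the study's own conclusions rest on its evaluation of real data, not on an example like this.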

  11. Genomics-driven discovery of the pneumocandin biosynthetic gene cluster in the fungus Glarea lozoyensis

    PubMed Central

    2013-01-01

    Background: The antifungal therapy caspofungin is a semi-synthetic derivative of pneumocandin B0, a lipohexapeptide produced by the fungus Glarea lozoyensis, and was the first member of the echinocandin class approved for human therapy. The nonribosomal peptide synthetase (NRPS)-polyketide synthases (PKS) gene cluster responsible for pneumocandin biosynthesis from G. lozoyensis has not been elucidated to date. In this study, we report the elucidation of the pneumocandin biosynthetic gene cluster by whole genome sequencing of the G. lozoyensis wild-type strain ATCC 20868. Results: The pneumocandin biosynthetic gene cluster contains a NRPS (GLNRPS4) and a PKS (GLPKS4) arranged in tandem, two cytochrome P450 monooxygenases, seven other modifying enzymes, and genes for L-homotyrosine biosynthesis, a component of the peptide core. Thus, the pneumocandin biosynthetic gene cluster is significantly more autonomous and organized than that of the recently characterized echinocandin B gene cluster. Disruption mutants of GLNRPS4 and GLPKS4 no longer produced the pneumocandins (A0 and B0), and the Δglnrps4 and Δglpks4 mutants lost antifungal activity against the human pathogenic fungus Candida albicans. In addition to pneumocandins, the G. lozoyensis genome encodes a rich repertoire of natural product-encoding genes including 24 PKSs, six NRPSs, five PKS-NRPS hybrids, two dimethylallyl tryptophan synthases, and 14 terpene synthases. Conclusions: Characterization of the gene cluster provides a blueprint for engineering new pneumocandin derivatives with improved pharmacological properties. Whole genome estimation of the secondary metabolite-encoding genes from G. lozoyensis provides yet another example of the huge potential for drug discovery from natural products from the fungal kingdom. PMID:23688303

  12. Gene Overexpression Resources in Cereals for Functional Genomics and Discovery of Useful Genes

    PubMed Central

    Abe, Kiyomi; Ichikawa, Hiroaki

    2016-01-01

    Identification and elucidation of functions of plant genes is valuable for both basic and applied research. In addition to natural variation in model plants, numerous loss-of-function resources have been produced by mutagenesis with chemicals, irradiation, or insertions of transposable elements or T-DNA. However, we may be unable to observe loss-of-function phenotypes for genes with functionally redundant homologs and for those essential for growth and development. To offset such disadvantages, gain-of-function transgenic resources have been exploited. Activation-tagged lines have been generated using obligatory overexpression of endogenous genes by random insertion of an enhancer. Recent progress in DNA sequencing technology and bioinformatics has enabled the preparation of genomewide collections of full-length cDNAs (fl-cDNAs) in some model species. Using the fl-cDNA clones, a novel gain-of-function strategy, Fl-cDNA OvereXpressor gene (FOX)-hunting system, has been developed. A mutant phenotype in a FOX line can be directly attributed to the overexpressed fl-cDNA. Investigating a large population of FOX lines could reveal important genes conferring favorable phenotypes for crop breeding. Alternatively, a unique loss-of-function approach Chimeric REpressor gene Silencing Technology (CRES-T) has been developed. In CRES-T, overexpression of a chimeric repressor, composed of the coding sequence of a transcription factor (TF) and short peptide designated as the repression domain, could interfere with the action of endogenous TF in plants. Although plant TFs usually consist of gene families, CRES-T is effective, in principle, even for the TFs with functional redundancy. In this review, we focus on the current status of the gene-overexpression strategies and resources for identifying and elucidating novel functions of cereal genes. We discuss the potential of these research tools for identifying useful genes and phenotypes for application in crop breeding. PMID

  13. Discovery of rice essential genes by characterizing a CRISPR-edited mutation of closely related rice MAP kinase genes.

    PubMed

    Minkenberg, Bastian; Xie, Kabin; Yang, Yinong

    2017-02-01

    The clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 nuclease (Cas9) system depends on a guide RNA (gRNA) to specify its target. By efficiently co-expressing multiple gRNAs that target different genomic sites, the polycistronic tRNA-gRNA gene (PTG) strategy enables multiplex gene editing in the family of closely related mitogen-activated protein kinase (MPK) genes in Oryza sativa (rice). In this study, we identified MPK1 and MPK6 (Arabidopsis AtMPK6 and AtMPK4 orthologs, respectively) as essential genes for rice development by finding the preservation of MPK functional alleles and normal phenotypes in CRISPR-edited mutants. The true knock-out mutants of MPK1 were severely dwarfed and sterile, and homozygous mpk1 seeds from heterozygous parents were defective in embryo development. By contrast, heterozygous mpk6 mutant plants completely failed to produce homozygous mpk6 seeds. In addition, the functional importance of specific MPK features could be evaluated by characterizing CRISPR-induced allelic variation in the conserved kinase domain of MPK6. By simultaneously targeting between two and eight genomic sites in the closely related MPK genes, we demonstrated 45-86% frequency of biallelic mutations and the successful creation of single, double and quadruple gene mutants. Indels and fragment deletion were both stably inherited to the next generations, and transgene-free mutants of rice MPK genes were readily obtained via genetic segregation, thereby eliminating any positional effects of transgene insertions. Taken together, our study reveals the essentiality of MPK1 and MPK6 in rice development, and enables the functional discovery of previously inaccessible genes or domains with phenotypes masked by lethality or redundancy.

  14. MORPHIN: a web tool for human disease research by projecting model organism biology onto a human integrated gene network.

    PubMed

    Hwang, Sohyun; Kim, Eiru; Yang, Sunmo; Marcotte, Edward M; Lee, Insuk

    2014-07-01

    Despite recent advances in human genetics, model organisms are indispensable for human disease research. Most human disease pathways are evolutionally conserved among other species, where they may phenocopy the human condition or be associated with seemingly unrelated phenotypes. Much of the known gene-to-phenotype association information is distributed across diverse databases, growing rapidly due to new experimental techniques. Accessible bioinformatics tools will therefore facilitate translation of discoveries from model organisms into human disease biology. Here, we present a web-based discovery tool for human disease studies, MORPHIN (model organisms projected on a human integrated gene network), which prioritizes the most relevant human diseases for a given set of model organism genes, potentially highlighting new model systems for human diseases and providing context to model organism studies. Conceptually, MORPHIN investigates human diseases by an orthology-based projection of a set of model organism genes onto a genome-scale human gene network. MORPHIN then prioritizes human diseases by relevance to the projected model organism genes using two distinct methods: a conventional overlap-based gene set enrichment analysis and a network-based measure of closeness between the query and disease gene sets capable of detecting associations undetectable by the conventional overlap-based methods. MORPHIN is freely accessible at http://www.inetbio.org/morphin.
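
    The conventional overlap-based enrichment mentioned above is commonly scored with a hypergeometric tail probability. The sketch below shows that idea only; it is not MORPHIN's implementation, it does not cover the network-based closeness measure, and the gene names, disease sets and universe size are invented for illustration. It assumes every gene belongs to the stated universe.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, disease_size, query_size, overlap):
    """P(X >= overlap) when query_size genes are drawn without replacement
    from a universe that contains disease_size disease genes."""
    upper = min(disease_size, query_size)
    total = comb(universe_size, query_size)
    tail = sum(
        comb(disease_size, k) * comb(universe_size - disease_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def rank_diseases(query_genes, disease_gene_sets, universe_size):
    """Return (p_value, disease, overlap) tuples, most enriched first."""
    query = set(query_genes)
    scored = []
    for disease, genes in disease_gene_sets.items():
        gene_set = set(genes)
        overlap = len(query & gene_set)
        p = hypergeom_enrichment_p(universe_size, len(gene_set), len(query), overlap)
        scored.append((p, disease, overlap))
    return sorted(scored)


# Hypothetical projected human orthologs and disease gene sets.
query = ["g1", "g2", "g3", "g4"]
diseases = {
    "disease_x": ["g1", "g2", "g5", "g6", "g7"],
    "disease_y": ["g8", "g9", "g10", "g11", "g12"],
}
for p, disease, overlap in rank_diseases(query, diseases, universe_size=20):
    print(disease, overlap, p)
```

    A real analysis would also correct the p-values for testing many diseases at once; that step is omitted here.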

  15. Clustering cancer gene expression data by projective clustering ensemble

    PubMed Central

    Yu, Xianxue; Yu, Guoxian

    2017-01-01

    Gene expression data analysis has paramount implications for gene treatments, cancer diagnosis and other domains. Clustering is an important and promising tool to analyze gene expression data. Gene expression data is often characterized by a large number of genes but limited samples; thus various projective clustering techniques and ensemble techniques have been suggested to combat these challenges. However, it is rather challenging to combine these two kinds of techniques to avoid the curse of dimensionality and to boost the performance of gene expression data clustering. In this paper, we employ a projective clustering ensemble (PCE) to integrate the advantages of projective clustering and ensemble clustering, and to avoid the dilemma of combining multiple projective clusterings. Our experimental results on publicly available cancer gene expression data show that PCE can improve the quality of clustering gene expression data by at least 4.5% (on average) over other related techniques, including dimensionality reduction based single clustering and ensemble approaches. The empirical study demonstrates that, to further boost the performance of clustering cancer gene expression data, it is necessary and promising to combine projective clustering with ensemble clustering. PCE can serve as an effective alternative technique for clustering gene expression data. PMID:28234920

  16. Gene discovery using next-generation pyrosequencing to develop ESTs for Phalaenopsis orchids

    PubMed Central

    2011-01-01

    Background Orchids are one of the most diversified angiosperms, but few genomic resources are available for these non-model plants. In addition to its ecological significance, Phalaenopsis is economically important to the floriculture industry worldwide. We aimed to use massively parallel 454 pyrosequencing for a global characterization of the Phalaenopsis transcriptome. Results To maximize sequence diversity, we pooled RNA from 10 samples of different tissues, various developmental stages, and biotic- or abiotic-stressed plants. We obtained 206,960 expressed sequence tags (ESTs) with an average read length of 228 bp. These reads were assembled into 8,233 contigs and 34,630 singletons. The unigenes were searched against the NCBI non-redundant (NR) protein database. Based on sequence similarity with known proteins, these analyses identified 22,234 different genes (E-value cutoff, e-7). Assembled sequences were annotated with Gene Ontology, Gene Family and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Among these annotations, over 780 unigenes encoding putative transcription factors were identified. Conclusion Pyrosequencing was effective in identifying a large set of unigenes from Phalaenopsis. The informative EST dataset we developed constitutes a much-needed resource for discovery of genes involved in various biological processes in Phalaenopsis and other orchid species. These transcribed sequences will narrow the gap between study of model organisms with many genomic resources and species that are important for ecological and evolutionary studies. PMID:21749684

  17. Discovery of midgut genes for the RNA interference control of corn rootworm

    PubMed Central

    Hu, Xu; Richtman, Nina M.; Zhao, Jian-Zhou; Duncan, Keith E.; Niu, Xiping; Procyk, Lisa A.; Oneal, Meghan A.; Kernodle, Bliss M.; Steimel, Joseph P.; Crane, Virginia C.; Sandahl, Gary; Ritland, Julie L.; Howard, Richard J.; Presnail, James K.; Lu, Albert L.; Wu, Gusui

    2016-01-01

    RNA interference (RNAi) is a promising new technology for corn rootworm control. This paper presents the discovery of two new gene targets, dvssj1 and dvssj2, in western corn rootworm (WCR). Dvssj1 and dvssj2 are orthologs of the Drosophila genes snakeskin (ssk) and mesh, respectively. These genes encode membrane proteins associated with smooth septate junctions (SSJ), which are required for intestinal barrier function. Based on bioinformatics analysis, dvssj1 appears to be an arthropod-specific gene. Diet-based insect feeding assays using double-stranded RNA (dsRNA) targeting dvssj1 and dvssj2 demonstrate targeted mRNA suppression, larval growth inhibition, and mortality. In RNAi-treated WCR, injury to the midgut was manifested by “blebbing” of the midgut epithelium into the gut lumen. Ultrastructural examination of midgut epithelial cells revealed apoptosis and regenerative activities. Transgenic plants expressing dsRNA targeting dvssj1 show insecticidal activity and significant plant protection from WCR damage. The data indicate that dvssj1 and dvssj2 are effective gene targets for the control of WCR using RNAi technology, by apparent suppression of production of their respective smooth septate junction membrane proteins located within the intestinal lining, leading to growth inhibition and mortality. PMID:27464714

  18. Discovery of a novel tumour metastasis-promoting gene, NVM-1.

    PubMed

    Thiele, Wilko; Novac, Natalia; Mink, Sigrun; Schreiber, Caroline; Plaumann, Diana; Fritzmann, Johannes; Cremers, Natascha; Rothley, Melanie; Schwager, Christian; Regiert, Thomas; Huber, Peter E; Stein, Ulrike; Schlag, Peter; Moll, Jürgen; Abdollahi, Amir; Sleeman, Jonathan P

    2011-09-01

    We have previously reported that over-expression of a panel of 119 genes correlates with the metastatic potential of pancreatic carcinoma cells. We sought to identify and functionally characterize candidate tumour metastasis promoting genes among this library using a secondary phenotype-assisted screen. Here we report the discovery of the metastasis-promoting function of a hitherto not characterized gene located on chromosome 14 (ORF138), which we have named 'novel metastasis-promoting gene 1' (NVM-1). The NVM-1 transcript is extensively alternatively spliced, is expressed endogenously in a number of different tissues, and is strongly over-expressed at the protein level in a variety of human tumour types. Importantly, NVM-1 expression stimulates the migratory and invasive behaviour of tumour cells and promotes metastasis formation in experimental animals in vivo. Up-regulation of FMNL2 and MT1E and down-regulation of TIMP4 and MHC-I is observed as a consequence of NVM-1 expression. Together these data identify NVM-1 as a gene that is functionally involved in tumour metastasis, and suggest that NVM-1 may constitute a promising therapeutic target for inhibition of tumour metastasis.

  19. Discovery of nucleotide polymorphisms in the Musa gene pool by Ecotilling

    PubMed Central

    Jankowicz-Cieslak, Joanna; Sági, László; Huynh, Owen A.; Utsushi, Hiroe; Swennen, Rony; Terauchi, Ryohei; Mba, Chikelu

    2010-01-01

    Musa (banana and plantain) is an important genus for the global export market and in local markets where it provides staple food for approximately 400 million people. Hybridization and polyploidization of several (sub)species, combined with vegetative propagation and human selection have produced a complex genetic history. We describe the application of the Ecotilling method for the discovery and characterization of nucleotide polymorphisms in diploid and polyploid accessions of Musa. We discovered over 800 novel alleles in 80 accessions. Sequencing and band evaluation shows Ecotilling to be a robust and accurate platform for the discovery of polymorphisms in homologous and homeologous gene targets. In the process of validating the method, we identified two single nucleotide polymorphisms that may be deleterious for the function of a gene putatively important for phototropism. Evaluation of heterozygous polymorphism and haplotype blocks revealed a high level of nucleotide diversity in Musa accessions. We further applied a strategy for the simultaneous discovery of heterozygous and homozygous polymorphisms in diploid accessions to rapidly evaluate nucleotide diversity in accessions of the same genome type. This strategy can be used to develop hypotheses for inheritance patterns of nucleotide polymorphisms within and between genome types. We conclude that Ecotilling is suitable for diversity studies in Musa, that it can be considered for functional genomics studies and as tool in selecting germplasm for traditional and mutation breeding approaches. Electronic supplementary material The online version of this article (doi:10.1007/s00122-010-1395-5) contains supplementary material, which is available to authorized users. PMID:20589365

  20. Using Osteoclast Differentiation as a Model for Gene Discovery in an Undergraduate Cell Biology Laboratory

    ERIC Educational Resources Information Center

    Birnbaum, Mark J.; Picco, Jenna; Clements, Meghan; Witwicka, Hanna; Yang, Meiheng; Hoey, Margaret T.; Odgren, Paul R.

    2010-01-01

    A key goal of molecular/cell biology/biotechnology is to identify essential genes in virtually every physiological process to uncover basic mechanisms of cell function and to establish potential targets of drug therapy combating human disease. This article describes a semester-long, project-oriented molecular/cellular/biotechnology laboratory…

  1. Long Serial Analysis of Gene Expression for Gene Discovery and Transcriptome Profiling in the Widespread Marine Coccolithophore Emiliania huxleyi

    PubMed Central

    Dyhrman, Sonya T.; Haley, Sheean T.; Birkeland, Shanda R.; Wurch, Louie L.; Cipriano, Michael J.; McArthur, Andrew G.

    2006-01-01

    The abundant and widespread coccolithophore Emiliania huxleyi plays an important role in mediating CO2 exchange between the ocean and the atmosphere through its impact on marine photosynthesis and calcification. Here, we use long serial analysis of gene expression (SAGE) to identify E. huxleyi genes responsive to nitrogen (N) or phosphorus (P) starvation. Long SAGE is an elegant approach for examining quantitative and comprehensive gene expression patterns without a priori knowledge of gene sequences via the detection of 21-bp nucleotide sequence tags. E. huxleyi appears to have a robust transcriptional-level response to macronutrient deficiency, with 42 tags uniquely present or up-regulated twofold or greater in the N-starved library and 128 tags uniquely present or up-regulated twofold or greater in the P-starved library. The expression patterns of several tags were validated with reverse transcriptase PCR. Roughly 48% of these differentially expressed tags could be mapped to publicly available genomic or expressed sequence tag (EST) sequence data. For example, in the P-starved library a number of the tags mapped to genes with a role in P scavenging, including a putative phosphate-repressible permease and a putative polyphosphate synthetase. In short, the long SAGE analyses have (i) identified many new differentially regulated gene sequences, (ii) assigned regulation data to EST sequences with no database homology and unknown function, and (iii) highlighted previously uncharacterized aspects of E. huxleyi N and P physiology. To this end, our long SAGE libraries provide a new public resource for gene discovery and transcriptional analysis in this biogeochemically important marine organism. PMID:16391051
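
    The "uniquely present or up-regulated twofold or greater" criterion can be pictured as a comparison of library-normalized tag counts. The sketch below is only an illustration under that assumption: the tag names and counts are invented, real tags would be 21-bp sequences, and any statistical testing the authors may have applied is not reproduced here.

```python
def upregulated_tags(control, treated, min_fold=2.0):
    """Map each tag to its fold change (treated vs. control frequency) if it is
    at least `min_fold`; tags absent from the control library map to infinity."""
    control_total = sum(control.values())
    treated_total = sum(treated.values())
    hits = {}
    for tag, count in treated.items():
        control_count = control.get(tag, 0)
        if control_count == 0:
            hits[tag] = float("inf")  # uniquely present in the treated library
            continue
        # (count / treated_total) / (control_count / control_total),
        # rearranged so the integer products are formed before dividing.
        fold = (count * control_total) / (control_count * treated_total)
        if fold >= min_fold:
            hits[tag] = fold
    return hits


# Hypothetical tag counts for a nutrient-replete and a P-starved library.
replete = {"tag_A": 10, "tag_B": 40, "tag_C": 50}
p_starved = {"tag_A": 30, "tag_B": 40, "tag_C": 20, "tag_D": 10}

print(upregulated_tags(replete, p_starved))
```

    Normalizing by library size matters because two SAGE libraries rarely contain the same total number of sequenced tags.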

  2. Effector genomics accelerates discovery and functional profiling of potato disease resistance and phytophthora infestans avirulence genes.

    PubMed

    Vleeshouwers, Vivianne G A A; Rietman, Hendrik; Krenek, Pavel; Champouret, Nicolas; Young, Carolyn; Oh, Sang-Keun; Wang, Miqia; Bouwmeester, Klaas; Vosman, Ben; Visser, Richard G F; Jacobsen, Evert; Govers, Francine; Kamoun, Sophien; Van der Vossen, Edwin A G

    2008-08-06

    Potato is the world's fourth largest food crop yet it continues to endure late blight, a devastating disease caused by the Irish famine pathogen Phytophthora infestans. Breeding broad-spectrum disease resistance (R) genes into potato (Solanum tuberosum) is the best strategy for genetically managing late blight but current approaches are slow and inefficient. We used a repertoire of effector genes predicted computationally from the P. infestans genome to accelerate the identification, functional characterization, and cloning of potentially broad-spectrum R genes. An initial set of 54 effectors containing a signal peptide and a RXLR motif was profiled for activation of innate immunity (avirulence or Avr activity) on wild Solanum species and tentative Avr candidates were identified. The RXLR effector family IpiO induced hypersensitive responses (HR) in S. stoloniferum, S. papita and the more distantly related S. bulbocastanum, the source of the R gene Rpi-blb1. Genetic studies with S. stoloniferum showed cosegregation of resistance to P. infestans and response to IpiO. Transient co-expression of IpiO with Rpi-blb1 in a heterologous Nicotiana benthamiana system identified IpiO as Avr-blb1. A candidate gene approach led to the rapid cloning of S. stoloniferum Rpi-sto1 and S. papita Rpi-pta1, which are functionally equivalent to Rpi-blb1. Our findings indicate that effector genomics enables discovery and functional profiling of late blight R genes and Avr genes at an unprecedented rate and promises to accelerate the engineering of late blight resistant potato varieties.

  3. Effector Genomics Accelerates Discovery and Functional Profiling of Potato Disease Resistance and Phytophthora Infestans Avirulence Genes

    PubMed Central

    Vleeshouwers, Vivianne G. A. A.; Rietman, Hendrik; Krenek, Pavel; Champouret, Nicolas; Young, Carolyn; Oh, Sang-Keun; Wang, Miqia; Bouwmeester, Klaas; Vosman, Ben; Visser, Richard G. F.; Jacobsen, Evert; Govers, Francine; Kamoun, Sophien; Van der Vossen, Edwin A. G.

    2008-01-01

    Potato is the world's fourth largest food crop yet it continues to endure late blight, a devastating disease caused by the Irish famine pathogen Phytophthora infestans. Breeding broad-spectrum disease resistance (R) genes into potato (Solanum tuberosum) is the best strategy for genetically managing late blight but current approaches are slow and inefficient. We used a repertoire of effector genes predicted computationally from the P. infestans genome to accelerate the identification, functional characterization, and cloning of potentially broad-spectrum R genes. An initial set of 54 effectors containing a signal peptide and a RXLR motif was profiled for activation of innate immunity (avirulence or Avr activity) on wild Solanum species and tentative Avr candidates were identified. The RXLR effector family IpiO induced hypersensitive responses (HR) in S. stoloniferum, S. papita and the more distantly related S. bulbocastanum, the source of the R gene Rpi-blb1. Genetic studies with S. stoloniferum showed cosegregation of resistance to P. infestans and response to IpiO. Transient co-expression of IpiO with Rpi-blb1 in a heterologous Nicotiana benthamiana system identified IpiO as Avr-blb1. A candidate gene approach led to the rapid cloning of S. stoloniferum Rpi-sto1 and S. papita Rpi-pta1, which are functionally equivalent to Rpi-blb1. Our findings indicate that effector genomics enables discovery and functional profiling of late blight R genes and Avr genes at an unprecedented rate and promises to accelerate the engineering of late blight resistant potato varieties. PMID:18682852

  4. Gene discovery by chemical mutagenesis and whole-genome sequencing in Dictyostelium

    PubMed Central

    Li, Cheng-Lin Frank; Santhanam, Balaji; Webb, Amanda Nicole; Zupan, Blaž

    2016-01-01

    Whole-genome sequencing is a useful approach for identification of chemical-induced lesions, but previous applications involved tedious genetic mapping to pinpoint the causative mutations. We propose that saturation mutagenesis under low mutagenic loads, followed by whole-genome sequencing, should allow direct implication of genes by identifying multiple independent alleles of each relevant gene. We tested the hypothesis by performing three genetic screens with chemical mutagenesis in the social soil amoeba Dictyostelium discoideum. Through genome sequencing, we successfully identified mutant genes with multiple alleles in near-saturation screens, including resistance to intense illumination and strong suppressors of defects in an allorecognition pathway. We tested the causality of the mutations by comparison to published data and by direct complementation tests, finding both dominant and recessive causative mutations. Therefore, our strategy provides a cost- and time-efficient approach to gene discovery by integrating chemical mutagenesis and whole-genome sequencing. The method should be applicable to many microbial systems, and it is expected to revolutionize the field of functional genomics in Dictyostelium by greatly expanding the mutation spectrum relative to other common mutagenesis methods. PMID:27307293
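
    The core idea, implicating a gene when several independent alleles of it turn up across mutants, can be sketched as a simple tally. The record layout, gene names and the rule that identical lesions count as one allele are assumptions made for this illustration and are not taken from the paper's pipeline.

```python
from collections import defaultdict


def genes_with_multiple_alleles(mutations, min_alleles=2):
    """mutations: iterable of (mutant_id, gene, position, change).

    Identical lesions (same position and change) seen in several mutants are
    counted once, since they may not be independent events (an assumption).
    """
    alleles = defaultdict(set)
    for _mutant_id, gene, position, change in mutations:
        alleles[gene].add((position, change))
    return {
        gene: len(found)
        for gene, found in alleles.items()
        if len(found) >= min_alleles
    }


# Hypothetical variant calls from five sequenced mutants.
calls = [
    ("m1", "geneA", 100, "G>A"),
    ("m2", "geneA", 250, "C>T"),
    ("m3", "geneB", 40, "G>A"),
    ("m4", "geneB", 40, "G>A"),  # same lesion as m3, counted once
    ("m5", "geneC", 7, "C>T"),
]
print(genes_with_multiple_alleles(calls))
```

    Under a low mutagenic load each mutant carries few background lesions, so a gene hit repeatedly by distinct lesions is unlikely to be a coincidence.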

  5. Discovery of candidate genes and pathways in the endometrium regulating ovine blastocyst growth and conceptus elongation.

    PubMed

    Satterfield, M Carey; Song, Gwonhwa; Kochan, Kelli J; Riggs, Penny K; Simmons, Rebecca M; Elsik, Christine G; Adelson, David L; Bazer, Fuller W; Zhou, Huaijun; Spencer, Thomas E

    2009-10-07

    Establishment of pregnancy in ruminants requires blastocyst growth to form an elongated conceptus that produces interferon tau, the pregnancy recognition signal, and initiates implantation. Blastocyst growth and development requires secretions from the uterine endometrium. An early increase in circulating concentrations of progesterone (P4) stimulates blastocyst growth and elongation in ruminants. This study utilized sheep as a model to identify candidate genes and regulatory networks in the endometrium that govern preimplantation blastocyst growth and development. Ewes were treated daily with either P4 or corn oil vehicle from day 1.5 after mating to either day 9 or day 12 of pregnancy when endometrium was obtained by hysterectomy. Microarray analyses revealed many differentially expressed genes in the endometria affected by day of pregnancy and early P4 treatment. In situ hybridization analyses revealed that many differentially expressed genes were expressed in a cell-specific manner within the endometrium. The Database for Annotation, Visualization, and Integrated Discovery (DAVID) was used to identify functional groups of genes and biological processes in the endometrium that are associated with growth and development of preimplantation blastocysts. Notably, biological processes affected by day of pregnancy and/or early P4 treatment included lipid biosynthesis and metabolism, angiogenesis, transport, extracellular space, defense and inflammatory response, proteolysis, amino acid transport and metabolism, and hormone metabolism. This transcriptomic data provides novel insights into the biology of endometrial function and preimplantation blastocyst growth and development in sheep.

  6. SNP discovery in candidate adaptive genes using exon capture in a free-ranging alpine ungulate

    USGS Publications Warehouse

    Roffler, Gretchen H.; Amish, Stephen J.; Smith, Seth; Cosart, Ted F.; Kardos, Marty; Schwartz, Michael K.; Luikart, Gordon

    2016-01-01

    Identification of genes underlying genomic signatures of natural selection is key to understanding adaptation to local conditions. We used targeted resequencing to identify SNP markers in 5321 candidate adaptive genes associated with known immunological, metabolic and growth functions in ovids and other ungulates. We selectively targeted 8161 exons in protein-coding and nearby 5′ and 3′ untranslated regions of chosen candidate genes. Targeted sequences were taken from bighorn sheep (Ovis canadensis) exon capture data and directly from the domestic sheep genome (Ovis aries v. 3; oviAri3). The bighorn sheep sequences used in the Dall's sheep (Ovis dalli dalli) exon capture aligned to 2350 genes on the oviAri3 genome with an average of 2 exons each. We developed a microfluidic qPCR-based SNP chip to genotype 476 Dall's sheep from locations across their range and test for patterns of selection. Using multiple corroborating approaches (lositan and bayescan), we detected 28 SNP loci potentially under selection. We additionally identified candidate loci significantly associated with latitude, longitude, precipitation and temperature, suggesting local environmental adaptation. The three methods demonstrated consistent support for natural selection on nine genes with immune and disease-regulating functions (e.g. Ovar-DRA, APC, BATF2, MAGEB18), cell regulation signalling pathways (e.g. KRIT1, PI3K, ORRC3), and respiratory health (CYSLTR1). Characterizing adaptive allele distributions from novel genetic techniques will facilitate investigation of the influence of environmental variation on local adaptation of a northern alpine ungulate throughout its range. This research demonstrated the utility of exon capture for gene-targeted SNP discovery and subsequent SNP chip genotyping using low-quality samples in a nonmodel species.

  7. Leveraging gene-environment interactions and endotypes for asthma gene discovery.

    PubMed

    Bønnelykke, Klaus; Ober, Carole

    2016-03-01

    Asthma is a heterogeneous clinical syndrome that includes subtypes of disease with different underlying causes and disease mechanisms. Asthma is caused by a complex interaction between genes and environmental exposures; early-life exposures in particular play an important role. Asthma is also heritable, and a number of susceptibility variants have been discovered in genome-wide association studies, although the known risk alleles explain only a small proportion of the heritability. In this review, we present evidence supporting the hypothesis that focusing on more specific asthma phenotypes, such as childhood asthma with severe exacerbations, and on relevant exposures that are involved in gene-environment interactions (GEIs), such as rhinovirus infections, will improve detection of asthma genes and our understanding of the underlying mechanisms. We will discuss the challenges of considering GEIs and the advantages of studying responses to asthma-associated exposures in clinical birth cohorts, as well as in cell models of GEIs, to dissect the context-specific nature of genotypic risks, to prioritize variants in genome-wide association studies, and to identify pathways involved in pathogenesis in subgroups of patients. We propose that such approaches, in spite of their many challenges, present great opportunities for better understanding of asthma pathogenesis and heterogeneity and, ultimately, for improving prevention and treatment of disease.

  8. SAGExplore: a web server for unambiguous tag mapping in serial analysis of gene expression oriented to gene discovery and annotation.

    PubMed

    Norambuena, Tomás; Malig, Rodrigo; Melo, Francisco

    2007-07-01

    We describe a web server for the accurate mapping of experimental tags in serial analysis of gene expression (SAGE). The core of the server relies on a database of genomic virtual tags built by a recently described method that attempts to reduce the amount of ambiguous assignments for those tags that are not unique in the genome. The method provides a complete annotation of potential virtual SAGE tags within a genome, along with an estimation of their confidence for experimental observation that ranks tags that present multiple matches in the genome. The output of the server consists of a table in HTML format that contains links to a graphic representation of the results and to some external servers and databases, facilitating the tasks of analysis of gene expression and gene discovery. Also, a table in tab delimited text format is produced, allowing the user to export the results into custom databases and software for further analysis. The current server version provides the most accurate and complete SAGE tag mapping source that is available for the yeast organism. In the near future, this server will also allow the accurate mapping of experimental SAGE-tags from other model organisms such as human, mouse, frog and fly. The server is freely available on the web at: http://dna.bio.puc.cl/SAGExplore.html.

  9. Sleeping Beauty transposon insertional mutagenesis based mouse models for cancer gene discovery.

    PubMed

    Moriarity, Branden S; Largaespada, David A

    2015-02-01

    Large-scale genomic efforts to study human cancer, such as The Cancer Genome Atlas (TCGA), have identified numerous cancer drivers in a wide variety of tumor types. However, there are limitations to this approach: the mutations and expression or copy number changes that are identified are not always clearly functionally relevant, and only annotated genes and genetic elements are thoroughly queried. The use of complementary, nonbiased, functional approaches to identify drivers of cancer development and progression is ideal to maximize the rate at which cancer discoveries are achieved. One such approach that has been successful is the use of the Sleeping Beauty (SB) transposon-based mutagenesis system in mice. This system uses a conditionally expressed transposase and mutagenic transposon allele to target mutagenesis to somatic cells of a given tissue in mice to cause random mutations leading to tumor development. Analysis of tumors for transposon common insertion sites (CIS) identifies candidate cancer genes specific to that tumor type. While similar screens have been performed in mice with the PiggyBac (PB) transposon and viral approaches, we limit extensive discussion to SB. Here we discuss the basic structure of these screens, screens that have been performed, and methods used to identify CIS.

  10. Sleeping Beauty transposon insertional mutagenesis based mouse models for cancer gene discovery

    PubMed Central

    Moriarity, Branden S; Largaespada, David A

    2016-01-01

    Large-scale genomic efforts to study human cancer, such as The Cancer Genome Atlas (TCGA), have identified numerous cancer drivers in a wide variety of tumor types. However, there are limitations to this approach: the mutations and expression or copy number changes that are identified are not always clearly functionally relevant, and only annotated genes and genetic elements are thoroughly queried. The use of complementary, nonbiased, functional approaches to identify drivers of cancer development and progression is ideal to maximize the rate at which cancer discoveries are achieved. One such approach that has been successful is the use of the Sleeping Beauty (SB) transposon-based mutagenesis system in mice. This system uses a conditionally expressed transposase and mutagenic transposon allele to target mutagenesis to somatic cells of a given tissue in mice to cause random mutations leading to tumor development. Analysis of tumors for transposon common insertion sites (CIS) identifies candidate cancer genes specific to that tumor type. While similar screens have been performed in mice with the PiggyBac (PB) transposon and viral approaches, we limit extensive discussion to SB. Here we discuss the basic structure of these screens, screens that have been performed, and methods used to identify CIS. PMID:26051241
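
    A common insertion site can be pictured as a genomic window hit by transposon insertions in more tumors than chance would suggest. The sketch below uses a deliberately simplified greedy window with a fixed tumor-count threshold; the window size, threshold, tumor identifiers and coordinates are all hypothetical, and published screens rely on statistical methods rather than a fixed cutoff like this one.

```python
from collections import defaultdict


def common_insertion_sites(insertions, window=20000, min_tumors=3):
    """insertions: iterable of (tumor_id, chromosome, position).

    Greedily groups sorted insertions that fall within `window` bp of the first
    insertion in the group, and reports groups hit in >= `min_tumors` tumors as
    (chromosome, start, end, n_tumors).
    """
    by_chrom = defaultdict(list)
    for tumor, chrom, pos in insertions:
        by_chrom[chrom].append((pos, tumor))

    sites = []
    for chrom in sorted(by_chrom):
        hits = sorted(by_chrom[chrom])
        cluster = [hits[0]]
        # The trailing None flushes the last open cluster.
        for hit in hits[1:] + [None]:
            if hit is not None and hit[0] - cluster[0][0] <= window:
                cluster.append(hit)
                continue
            tumors = {tumor for _pos, tumor in cluster}
            if len(tumors) >= min_tumors:
                sites.append((chrom, cluster[0][0], cluster[-1][0], len(tumors)))
            if hit is not None:
                cluster = [hit]
    return sites


# Hypothetical mapped insertions from four tumors.
mapped = [
    ("t1", "chr1", 1000),
    ("t2", "chr1", 5000),
    ("t3", "chr1", 12000),
    ("t1", "chr1", 900000),
    ("t4", "chr2", 300),
    ("t4", "chr2", 400),
]
print(common_insertion_sites(mapped))
```

    Counting distinct tumors rather than raw insertions keeps a single tumor with many nearby insertions (as on chr2 above) from being reported as a CIS.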

  11. Discovery of genes essential for heme biosynthesis through large-scale gene expression analysis

    PubMed Central

    Nilsson, Roland; Schultz, Iman J.; Pierce, Eric L.; Soltis, Kathleen A.; Naranuntarat, Amornrat; Ward, Diane M.; Baughman, Joshua; Paradkar, Prasad N.; Kingsley, Paul D.; Culotta, Valeria C.; Kaplan, Jerry; Palis, James; Paw, Barry H.; Mootha, Vamsi K.

    2009-01-01

    Heme biosynthesis consists of a series of eight enzymatic reactions that originate in mitochondria and continue in the cytosol before returning to mitochondria. Although these core enzymes are well studied, additional mitochondrial transporters and regulatory factors are predicted to be required. To discover such unknown components, we utilized a large-scale computational screen to identify mitochondrial proteins whose transcripts consistently co-express with the core machinery of heme biosynthesis. We identified SLC25A39, SLC22A4 and TMEM14C, which are putative mitochondrial transporters, as well as C1orf69 and ISCA1, which are iron-sulfur cluster proteins. Targeted knockdowns of all five genes in zebrafish resulted in profound anemia without impacting erythroid lineage specification. Moreover, silencing of Slc25a39 in murine erythroleukemia cells impaired iron incorporation into protoporphyrin IX, and vertebrate Slc25a39 complemented an iron homeostasis defect in the orthologous yeast mtm1Δ deletion mutant. Our results advance the molecular understanding of heme biosynthesis and offer promising candidate genes for inherited anemias. PMID:19656490
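
    As a rough illustration of the co-expression screen described above, the following Python sketch ranks candidate genes by their mean Pearson correlation with a set of core pathway genes. The gene names and expression values are hypothetical toy data, and averaging plain correlations over a single dataset is an assumption made for illustration; it is not the scoring scheme of the published screen.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length expression profiles.

    Assumes neither profile is constant (a constant profile has zero
    variance and would cause a division by zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def rank_by_coexpression(
    candidates: Dict[str, List[float]],
    core: Dict[str, List[float]],
) -> List[Tuple[str, float]]:
    """Rank candidates by mean correlation with the core genes, best first."""
    scores = {
        name: sum(pearson(profile, ref) for ref in core.values()) / len(core)
        for name, profile in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical expression values across four samples.
    core_genes = {"core_a": [1, 2, 3, 4], "core_b": [2, 4, 6, 8]}
    candidate_genes = {
        "cand_up": [10, 20, 30, 40],   # rises with the core genes
        "cand_down": [4, 3, 2, 1],     # falls as the core genes rise
        "cand_flat": [1, 2, 1, 2],     # only weakly related
    }
    for name, score in rank_by_coexpression(candidate_genes, core_genes):
        print(f"{name}\t{score:.3f}")
```

    In this toy ranking the candidate that tracks the core genes comes first and the anti-correlated one comes last; a real screen would also need a rule for deciding how high a score must be before a gene is followed up experimentally.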

  12. Contemporary Physics Education Project (CPEP) - an organization dedicated to presenting the latest physics discoveries in an understandable and colorful style

    NASA Astrophysics Data System (ADS)

    Matis, Howard

    2017-01-01

    Since 1986, the Contemporary Physics Education Project has been creating colorful posters and materials that describe physics. Its mission is to provide information so that the public can easily access current physics knowledge. Our publications summarize the latest discoveries and describe areas of current research. We have created materials in Particle Physics, Fusion Science, Nuclear Physics, and Cosmology. All of these materials are carefully vetted for scientific accuracy. CPEP presents them at workshops and distributes the materials widely so that teachers and students can use them in their classrooms. In this talk, I will describe what CPEP has created, including its most recent project that features the recent discovery of gravitational waves. To further our mission, our organization is actively seeking physicists and teachers to extend our outreach.

  13. Accelerating Gene Discovery by Phenotyping Whole-Genome Sequenced Multi-mutation Strains and Using the Sequence Kernel Association Test (SKAT)

    PubMed Central

    Garland, Stephanie J.; Mohan, Swetha; Flibotte, Stephane; Muncaster, Quintin; Cai, Jerry; Rademakers, Suzanne; Moerman, Donald G.; Leroux, Michel R.

    2016-01-01

    Forward genetic screens represent powerful, unbiased approaches to uncover novel components in any biological process. Such screens suffer from a major bottleneck, however, namely the cloning of corresponding genes causing the phenotypic variation. Reverse genetic screens have been employed as a way to circumvent this issue, but can often be limited in scope. Here we demonstrate an innovative approach to gene discovery. Using C. elegans as a model system, we used a whole-genome sequenced multi-mutation library, from the Million Mutation Project, together with the Sequence Kernel Association Test (SKAT), to rapidly screen for and identify genes associated with a phenotype of interest, namely defects in dye-filling of ciliated sensory neurons. Such anomalies in dye-filling are often associated with the disruption of cilia, organelles which in humans are implicated in sensory physiology (including vision, smell and hearing), development and disease. Beyond identifying several well characterised dye-filling genes, our approach uncovered three genes not previously linked to ciliated sensory neuron development or function. From these putative novel dye-filling genes, we confirmed the involvement of BGNT-1.1 in ciliated sensory neuron function and morphogenesis. BGNT-1.1 functions at the trans-Golgi network of sheath cells (glia) to influence dye-filling and cilium length, in a cell non-autonomous manner. Notably, BGNT-1.1 is the orthologue of human B3GNT1/B4GAT1, a glycosyltransferase associated with Walker-Warburg syndrome (WWS). WWS is a multigenic disorder characterised by muscular dystrophy as well as brain and eye anomalies. Together, our work unveils an effective and innovative approach to gene discovery, and provides the first evidence that B3GNT1-associated Walker-Warburg syndrome may be considered a ciliopathy. PMID:27508411

  14. Genes after the human genome project.

    PubMed

    Baetu, Tudor M

    2012-03-01

    While the Human Genome Nomenclature Committee (HGNC) concept of the gene can accommodate a wide variety of genomic sequences contributing to phenotypic outcomes, it fails to specify how sequences should be grouped when dealing with complex loci consisting of adjacent/overlapping sequences contributing to the same phenotype, distant sequences shown to contribute to the same gene product, and partially overlapping sequences identified by different techniques. The purpose of this paper is to review recently proposed concepts of the gene and critically assess how well they succeed in addressing the above problems while preserving the degree of generality achieved by the HGNC concept. I conclude that a dynamic interplay between mapping and syntax-based concepts is required in order to satisfy these desiderata.

  15. Improving data discovery and usability through commentary and user feedback: the CHARMe project

    NASA Astrophysics Data System (ADS)

    Alegre, R.; Blower, J. D.

    2014-12-01

    Earth science datasets are highly diverse. Users of these datasets are similarly varied, ranging from research scientists through industrial users to government decision- and policy-makers. It is very important for these users to understand the applicability of any dataset to their particular problem so that they can select the most appropriate data sources for their needs. Although data providers often provide rich supporting information in the form of metadata, typically this information does not include community usage information that can help other users judge fitness-for-purpose. The CHARMe project (http://www.charme.org.uk) is filling this gap by developing a system for sharing "commentary metadata". These are annotations that are generated and shared by the user community and include: Links between publications and datasets. The CHARMe system can record information about why a particular dataset was used (e.g. the paper may describe the dataset, it may use the dataset as a source, or it may be publishing results of a dataset assessment). These publications may appear in the peer-reviewed literature, or may be technical reports, websites or blog posts. Free-text comments supplied by the user. Provenance information, including links between datasets and descriptions of processing algorithms and sensors. External events that may affect data quality (e.g. large volcanic eruptions or El Niño events); we call these "significant events". Data quality information, e.g. system maturity indices. Commentary information can be linked to anything that can be uniquely identified (e.g. a dataset with a DOI or a persistent web address). It is also possible to associate commentary with particular subsets of datasets, for example to highlight an issue that is confined to a particular geographic region. We will demonstrate tools that show these capabilities in action, showing how users can apply commentary information during data discovery, visualization and analysis. The

  16. A model organism for new gene discovery by cDNA sequencing

    SciTech Connect

    El-Sayed, N.M.; Donelson, J.E.; Alarcon, C.M.

    1994-09-01

    One method of new gene discovery is single-pass sequencing of cDNAs to identify expressed sequence tags (ESTs). Model organisms can have biological properties that make their use advantageous over studies with humans. One such model organism with advantages for cDNA sequencing is the African trypanosome T. brucei rhodesiense. This organism has the same 40-nucleotide sequence (splice leader sequence) on the 5′ end of all mRNAs. We have constructed a 5′ cDNA library by priming off the splice leader sequence and have begun sequencing this cDNA library. To date, approximately 500 such cDNA ESTs have been examined. Forty-three percent of the sequences sampled from the trypanosome cDNA library have significant similarities to sequences already in the protein and translated nucleic acid databases. Among these are cDNA sequences which encode previously reported T. brucei proteins such as the VSG, tubulin, calflagin, etc., and proteins previously identified in other trypanosomatids. Other cDNAs display significant similarities to genes in unrelated organisms encoding several ribosomal proteins, metabolic enzymes, GTP binding proteins, transcription factors, cyclophilin, nucleosomal histones, histone H1, and a macrophage stress protein, among others. The 57% of the cDNAs that are not similar to sequences currently in the databases likely encode both trypanosome-specific proteins and housekeeping proteins shared with other eukaryotes. These cDNA ESTs provide new avenues of research for exploring both the biochemistry and the genome organization of this parasite, as well as a resource for identifying the 5′ sequence of novel genes likely to have homology to genes expressed in other organisms.

  17. Molecular Networking and Pattern-Based Genome Mining Improves Discovery of Biosynthetic Gene Clusters and their Products from Salinispora Species

    SciTech Connect

    Duncan, Katherine R.; Crüsemann, Max; Lechner, Anna; Sarkar, Anindita; Li, Jie; Ziemert, Nadine; Wang, Mingxun; Bandeira, Nuno; Moore, Bradley S.; Dorrestein, Pieter C.; Jensen, Paul R.

    2015-04-09

    Genome sequencing has revealed that bacteria contain many more biosynthetic gene clusters than predicted based on the number of secondary metabolites discovered to date. While this biosynthetic reservoir has fostered interest in new tools for natural product discovery, there remains a gap between gene cluster detection and compound discovery. In this paper, we apply molecular networking and the new concept of pattern-based genome mining to 35 Salinispora strains, including 30 for which draft genome sequences were either available or obtained for this study. The results provide a method to simultaneously compare large numbers of complex microbial extracts, which facilitated the identification of media components, known compounds and their derivatives, and new compounds that could be prioritized for structure elucidation. Finally, these efforts revealed considerable metabolite diversity and led to several molecular family-gene cluster pairings, of which the quinomycin-type depsipeptide retimycin A was characterized and linked to gene cluster NRPS40 using pattern-based bioinformatic approaches.

  18. ESTs from a wild Arachis species for gene discovery and marker development

    PubMed Central

    Proite, Karina; Leal-Bertioli, Soraya CM; Bertioli, David J; Moretzsohn, Márcio C; da Silva, Felipe R; Martins, Natalia F; Guimarães, Patrícia M

    2007-01-01

    Background: Due to its origin, peanut has a very narrow genetic background. Wild relatives can be a source of genetic variability for cultivated peanut. In this study, the transcriptome of the wild species Arachis stenosperma accession V10309 was analyzed. Results: ESTs were produced from four cDNA libraries of RNAs extracted from leaves and roots of A. stenosperma. Randomly selected cDNA clones were sequenced to generate 8,785 ESTs, of which 6,264 (71.3%) were of high quality and grouped into 3,500 clusters: 963 contigs and 2,537 singlets. Only 55.9% matched homologous sequences of known genes. ESTs were classified into 23 different categories according to putative protein functions. Numerous sequences related to disease resistance, drought tolerance and human health were identified. Two hundred and six microsatellites were found and markers have been developed for 188 of these. The microsatellite profile was analyzed and compared to other transcribed and genomic sequence data. Conclusion: This is, to date, the first report on the analysis of the transcriptome of a wild relative of peanut. The ESTs produced in this study are a valuable resource for gene discovery, the characterization of new wild alleles, and for marker development. The ESTs were released in GenBank [GenBank:EH041934 to EH048197]. PMID:17302987

  19. How Formal Methods Impels Discovery: A Short History of an Air Traffic Management Project

    NASA Technical Reports Server (NTRS)

    Butler, Ricky W.; Hagen, George; Maddalon, Jeffrey M.; Munoz, Cesar A.; Narkawicz, Anthony; Dowek, Gilles

    2010-01-01

    In this paper we describe a process of algorithmic discovery that was driven by our goal of achieving complete, mechanically verified algorithms that compute conflict prevention bands for use in en route air traffic management. The algorithms were originally defined in the PVS specification language and subsequently have been implemented in Java and C++. We do not present the proofs in this paper: instead, we describe the process of discovery and the key ideas that enabled the final formal proof of correctness.

  20. Adeno-Associated Virus at 50: A Golden Anniversary of Discovery, Research, and Gene Therapy Success—A Personal Perspective

    PubMed Central

    Hastie, Eric

    2015-01-01

    Fifty years after the discovery of adeno-associated virus (AAV) and more than 30 years after the first gene transfer experiment was conducted, dozens of gene therapy clinical trials are in progress, one vector is approved for use in Europe, and breakthroughs in virus modification and disease modeling are paving the way for a revolution in the treatment of rare diseases, cancer, as well as HIV. This review will provide a historical perspective on the progression of AAV for gene therapy from discovery to the clinic, focusing on contributions from the Samulski lab regarding basic science and cloning of AAV, optimized large-scale production of vectors, preclinical large animal studies and safety data, vector modifications for improved efficacy, and successful clinical applications. PMID:25807962

  1. Using Phenomic Analysis of Photosynthetic Function for Abiotic Stress Response Gene Discovery

    PubMed Central

    Rungrat, Tepsuda; Awlia, Mariam; Brown, Tim; Cheng, Riyan; Sirault, Xavier; Fajkus, Jiri; Trtilek, Martin; Furbank, Bob; Badger, Murray; Tester, Mark; Pogson, Barry J; Borevitz, Justin O; Wilson, Pip

    2016-01-01

    Monitoring the photosynthetic performance of plants is a major key to understanding how plants adapt to their growth conditions. Stress tolerance traits have a high genetic complexity as plants are constantly, and unavoidably, exposed to numerous stress factors, which limits their growth rates in the natural environment. Arabidopsis thaliana, with its broad genetic diversity and wide climatic range, has been shown to successfully adapt to stressful conditions to ensure the completion of its life cycle. As a result, A. thaliana has become a robust and renowned plant model system for studying natural variation and conducting gene discovery studies. Genome wide association studies (GWAS) in restructured populations combining natural and recombinant lines is a particularly effective way to identify the genetic basis of complex traits. As most abiotic stresses affect photosynthetic activity, chlorophyll fluorescence measurements are a potential phenotyping technique for monitoring plant performance under stress conditions. This review focuses on the use of chlorophyll fluorescence as a tool to study genetic variation underlying the stress tolerance responses to abiotic stress in A. thaliana. PMID:27695390

  2. Comprehensive Analysis of MILE Gene Expression Data Set Advances Discovery of Leukaemia Type and Subtype Biomarkers.

    PubMed

    Labaj, Wojciech; Papiez, Anna; Polanski, Andrzej; Polanska, Joanna

    2017-03-01

    Large collections of data in studies on cancer such as leukaemia provoke the necessity of applying tailored analysis algorithms to ensure supreme information extraction. In this work, a custom-fit pipeline is demonstrated for thorough investigation of the voluminous MILE gene expression data set. Three analyses are accomplished, each for gaining a deeper understanding of the processes underlying leukaemia types and subtypes. First, the main disease groups are tested for differential expression against the healthy control as in a standard case-control study. Here, the basic knowledge on molecular mechanisms is confirmed quantitatively and by literature references. Second, pairwise comparison testing is performed for juxtaposing the main leukaemia types among each other. In this case by means of the Dice coefficient similarity measure the general relations are pointed out. Moreover, lists of candidate main leukaemia group biomarkers are proposed. Finally, with this approach being successful, the third analysis provides insight into all of the studied subtypes, followed by the emergence of four leukaemia subtype biomarkers. In addition, the class enhanced DEG signature obtained on the basis of novel pipeline processing leads to significantly better classification power of multi-class data classifiers. The developed methodology consisting of batch effect adjustment, adaptive noise and feature filtration coupled with adequate statistical testing and biomarker definition proves to be an effective approach towards knowledge discovery in high-throughput molecular biology experiments.
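
    The Dice coefficient mentioned above can be sketched in a few lines of Python. The example assumes that the items being compared are sets of differentially expressed gene identifiers, which the abstract does not state explicitly, and the gene names are placeholders rather than results from the MILE data set.

```python
from typing import Iterable


def dice_coefficient(a: Iterable[str], b: Iterable[str]) -> float:
    """Dice similarity of two collections, treated as sets.

    Returns 2 * |A intersect B| / (|A| + |B|), a value between 0 (no shared
    items) and 1 (identical sets). Two empty sets are treated as identical
    by convention.
    """
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))


if __name__ == "__main__":
    # Placeholder gene lists for two hypothetical disease groups.
    group_1 = ["G1", "G2", "G3"]
    group_2 = ["G2", "G3", "G4"]
    print(f"Dice similarity: {dice_coefficient(group_1, group_2):.3f}")
```

    Two lists that share two of their three genes score 2 x 2 / (3 + 3), or about 0.667; computing the score for every pair of groups gives a similarity matrix from which general relations between groups can be read.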

  3. Paradigm of Tunable Clustering Using Binarization of Consensus Partition Matrices (Bi-CoPaM) for Gene Discovery

    PubMed Central

    Abu-Jamous, Basel; Fa, Rui; Roberts, David J.; Nandi, Asoke K.

    2013-01-01

    Clustering analysis has a growing role in the study of co-expressed genes for gene discovery. Conventional binary and fuzzy clustering do not embrace the biological reality that some genes may be irrelevant for a problem and not be assigned to a cluster, while other genes may participate in several biological functions and should simultaneously belong to multiple clusters. Also, these algorithms cannot generate tight clusters that focus on their cores or wide clusters that overlap and contain all possibly relevant genes. In this paper, a new clustering paradigm is proposed. In this paradigm, all three eventualities of a gene being exclusively assigned to a single cluster, being assigned to multiple clusters, and being not assigned to any cluster are possible. These possibilities are realised through the primary novelty of the introduction of tunable binarization techniques. Results from multiple clustering experiments are aggregated to generate one fuzzy consensus partition matrix (CoPaM), which is then binarized to obtain the final binary partitions. This is referred to as Binarization of Consensus Partition Matrices (Bi-CoPaM). The method has been tested with a set of synthetic datasets and a set of five real yeast cell-cycle datasets. The results demonstrate its validity in generating relevant tight, wide, and complementary clusters that can meet requirements of different gene discovery studies. PMID:23409186
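
    The aggregate-then-binarize idea described above can be illustrated with a minimal Python sketch. It makes two simplifying assumptions that are not taken from the abstract: the partitions from the individual clustering runs are already aligned (cluster k means the same thing in every run), and binarization uses a single membership threshold as the tuning parameter. The function names and toy data are hypothetical.

```python
from typing import List

Matrix = List[List[float]]  # rows are clusters, columns are genes


def consensus_partition_matrix(partitions: List[Matrix]) -> Matrix:
    """Average aligned partition matrices into one fuzzy consensus matrix.

    Each entry is the fraction of clustering runs that placed the gene
    in that cluster.
    """
    n_runs = len(partitions)
    n_clusters = len(partitions[0])
    n_genes = len(partitions[0][0])
    return [
        [sum(p[k][g] for p in partitions) / n_runs for g in range(n_genes)]
        for k in range(n_clusters)
    ]


def binarize(copam: Matrix, threshold: float) -> List[List[int]]:
    """Assign a gene to every cluster whose membership reaches the threshold.

    High thresholds give tight clusters and leave some genes unassigned;
    low thresholds give wide clusters that may overlap.
    """
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be in the interval (0, 1]")
    return [[1 if m >= threshold else 0 for m in row] for row in copam]


if __name__ == "__main__":
    # Three hypothetical clustering runs over four genes and two clusters.
    runs = [
        [[1, 1, 0, 0], [0, 0, 1, 1]],
        [[1, 1, 0, 1], [0, 0, 1, 0]],
        [[1, 0, 0, 0], [0, 1, 1, 1]],
    ]
    copam = consensus_partition_matrix(runs)
    for threshold in (1.0, 0.5, 0.3):
        print(threshold, binarize(copam, threshold))
```

    With this toy input, a threshold of 1.0 leaves the two genes on which the runs disagree unassigned, 0.5 assigns every gene to exactly one cluster, and 0.3 places the disputed genes in both clusters, which mirrors the three eventualities listed in the abstract.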

  4. An Evaluation of Active Learning Causal Discovery Methods for Reverse-Engineering Local Causal Pathways of Gene Regulation.

    PubMed

    Ma, Sisi; Kemmeren, Patrick; Aliferis, Constantin F; Statnikov, Alexander

    2016-03-04

    Reverse-engineering of causal pathways that implicate diseases and vital cellular functions is a fundamental problem in biomedicine. Discovery of the local causal pathway of a target variable (that consists of its direct causes and direct effects) is essential for effective intervention and can facilitate accurate diagnosis and prognosis. Recent research has provided several active learning methods that can leverage passively observed high-throughput data to draft causal pathways and then refine the inferred relations with a limited number of experiments. The current study provides a comprehensive evaluation of the performance of active learning methods for local causal pathway discovery in real biological data. Specifically, 54 active learning methods/variants from 3 families of algorithms were applied for local causal pathways reconstruction of gene regulation for 5 transcription factors in S. cerevisiae. Four aspects of the methods' performance were assessed, including adjacency discovery quality, edge orientation accuracy, complete pathway discovery quality, and experimental cost. The results of this study show that some methods provide significant performance benefits over others and therefore should be routinely used for local causal pathway discovery tasks. This study also demonstrates the feasibility of local causal pathway reconstruction in real biological systems with significant quality and low experimental cost.

  5. A combination of gene expression ranking and co-expression network analysis increases discovery rate in large-scale mutant screens for novel Arabidopsis thaliana abiotic stress genes.

    PubMed

    Ransbotyn, Vanessa; Yeger-Lotem, Esti; Basha, Omer; Acuna, Tania; Verduyn, Christoph; Gordon, Michal; Chalifa-Caspi, Vered; Hannah, Matthew A; Barak, Simon

    2015-05-01

    As challenges to food security increase, the demand for lead genes for improving crop production is growing. However, genetic screens of plant mutants typically yield very low frequencies of desired phenotypes. Here, we present a powerful computational approach for selecting candidate genes for screening insertion mutants. We combined ranking of Arabidopsis thaliana regulatory genes according to their expression in response to multiple abiotic stresses (Multiple Stress [MST] score), with stress-responsive RNA co-expression network analysis to select candidate multiple stress regulatory (MSTR) genes. Screening of 62 T-DNA insertion mutants defective in candidate MSTR genes for abiotic stress germination phenotypes yielded a remarkable hit rate of up to 62%; this gene discovery rate is 48-fold greater than that of other large-scale insertional mutant screens. Moreover, the MST score of these genes could be used to prioritize them for screening. To evaluate the contribution of the co-expression analysis, we screened 64 additional mutant lines of MST-scored genes that did not appear in the RNA co-expression network. The screening of these MST-scored genes yielded a gene discovery rate of 36%, which is much higher than that of classic mutant screens but not as high as when picking candidate genes from the co-expression network. The MSTR co-expression network that we created, AraSTressRegNet, is publicly available at http://netbio.bgu.ac.il/arnet. This systems biology-based screening approach combining gene ranking and network analysis could be generally applicable to enhancing identification of genes regulating additional processes in plants and other organisms provided that suitable transcriptome data are available.

  6. The Berkeley Drosophila Genome Project gene disruption project: Single P-element insertions mutating 25% of vital Drosophila genes.

    PubMed Central

    Spradling, A C; Stern, D; Beaton, A; Rhem, E J; Laverty, T; Mozden, N; Misra, S; Rubin, G M

    1999-01-01

    A fundamental goal of genetics and functional genomics is to identify and mutate every gene in model organisms such as Drosophila melanogaster. The Berkeley Drosophila Genome Project (BDGP) gene disruption project generates single P-element insertion strains that each mutate unique genomic open reading frames. Such strains strongly facilitate further genetic and molecular studies of the disrupted loci, but it has remained unclear if P elements can be used to mutate all Drosophila genes. We now report that the primary collection has grown to contain 1045 strains that disrupt more than 25% of the estimated 3600 Drosophila genes that are essential for adult viability. Of these P insertions, 67% have been verified by genetic tests to cause the associated recessive mutant phenotypes, and the validity of most of the remaining lines is predicted on statistical grounds. Sequences flanking >920 insertions have been determined to exactly position them in the genome and to identify 376 potentially affected transcripts from collections of EST sequences. Strains in the BDGP collection are available from the Bloomington Stock Center and have already assisted the research community in characterizing >250 Drosophila genes. The likely identity of 131 additional genes in the collection is reported here. Our results show that Drosophila genes have a wide range of sensitivity to inactivation by P elements, and provide a rationale for greatly expanding the BDGP primary collection based entirely on insertion site sequencing. We predict that this approach can bring >85% of all Drosophila open reading frames under experimental control. PMID:10471706

  7. De novo Assembly and Characterization of the Transcriptome of Broomcorn Millet (Panicum miliaceum L.) for Gene Discovery and Marker Development

    PubMed Central

    Yue, Hong; Wang, Le; Liu, Hui; Yue, Wenjie; Du, Xianghong; Song, Weining; Nie, Xiaojun

    2016-01-01

    Broomcorn millet (Panicum miliaceum L.) is one of the world’s oldest cultivated cereals, which is well-adapted to extreme environments such as drought, heat, and salinity with an efficient C4 carbon fixation. Discovery and identification of genes involved in these processes will provide valuable information to improve the crop for meeting the challenge of global climate change. However, the lack of genetic resources and genomic information makes gene discovery and molecular mechanism studies very difficult. Here, we sequenced and assembled the transcriptome of broomcorn millet using Illumina sequencing technology. After sequencing, a total of 45,406,730 and 51,160,820 clean paired-end reads were obtained for two genotypes Yumi No. 2 and Yumi No. 3. These reads were mixed and then assembled into 113,643 unigenes, with the length ranging from 351 to 15,691 bp, of which 62,543 contigs could be assigned to 315 gene ontology (GO) categories. Kyoto Encyclopedia of Genes and Genomes (KEGG) and Clusters of Orthologous Groups (COG) analyses mapped 15,514 unigenes into 202 KEGG pathways and 51,020 unigenes to 25 COG categories, respectively. Furthermore, 35,216 simple sequence repeats (SSRs) were identified in 27,055 unigene sequences, of which trinucleotides were the most abundant repeat unit, accounting for 66.72% of SSRs. In addition, 292 differentially expressed genes were identified between the two genotypes, which were significantly enriched in 88 GO terms and 12 KEGG pathways. Finally, the expression patterns of four selected transcripts were validated through quantitative reverse transcription polymerase chain reaction analysis. Our study sequenced and assembled the transcriptome of broomcorn millet for the first time, which not only provides a rich sequence resource for gene discovery and marker development in this important crop, but will also facilitate the further investigation of the molecular mechanism of its favored agronomic traits and beyond. PMID

  8. Informatics-guided procurement of patient samples for biomarker discovery projects in cancer research.

    PubMed

    Suh, K Stephen; Remache, Yvonne K; Patel, Jalpa S; Chen, Steve H; Haystrand, Russell; Ford, Peggy; Shaikh, Anadil M; Wang, Jian; Goy, Andre H

    2009-02-01

    Modern cancer research for biomarker discovery requires solving several tasks that are directly involved with patient sample procurement. One requirement is to construct a highly efficient workflow on the clinical side so that procurement generates a consistent supply of high-quality samples for research. This undertaking needs a network of interdepartmental collaboration and participation at various levels, including physical human interactions, information technology implementations and a bioinformatics tool that is highly effective and user-friendly for the busy clinicians and researchers associated with sample procurement. Collegial participation that is sequential but continual from one department to another demands dedicated bioinformatics software coordinating between the institutional clinic and the tissue repository facility. Participants in the process include admissions, the consenting process, phlebotomy, the surgery center and pathology. During this multistep procedure, clinical data are collected for detailed analytical endpoints to supplement the logistics of defining and validating the discovery of biomarkers.

  9. De Novo Transcriptomic Analysis of Peripheral Blood Lymphocytes from the Chinese Goose: Gene Discovery and Immune System Pathway Description

    PubMed Central

    Tariq, Mansoor; Chen, Rong; Yuan, Hongyu; Liu, Yanjie; Wu, Yanan; Wang, Junya; Xia, Chun

    2015-01-01

    Background: The Chinese goose is one of the most economically important poultry birds and is a natural reservoir for many avian viruses. However, the nature and regulation of the innate and adaptive immune systems of this waterfowl species are not completely understood due to limited information on the goose genome. Recently, transcriptome sequencing technology has been applied in genomic studies focused on novel gene discovery. Thus, this study described the transcriptome of goose peripheral blood lymphocytes to identify immunity-relevant genes. Principal Findings: The transcriptome of goose peripheral blood lymphocytes was sequenced by Illumina-Solexa technology and assembled de novo. In total, 211,198 unigenes were assembled from the 69.36 million cleaned reads. The average length, N50 size and the maximum length of the assembled unigenes were 687 bp, 1,298 bp and 18,992 bp, respectively. A total of 36,854 unigenes showed similarity by BLAST search against the NCBI non-redundant (Nr) protein database. For functional classification, 163,161 unigenes were classified into three Gene Ontology (GO) categories and 67 subcategories. A total of 15,334 unigenes were annotated into 25 eukaryotic orthologous groups (KOG) categories. The Kyoto Encyclopedia of Genes and Genomes (KEGG) database annotated 39,585 unigenes into six biological functional groups and 308 pathways. Among the 2,757 unigenes that participated in the 15 immune system KEGG pathways, 125 of the most important immune-relevant genes were summarized and analyzed by STRING analysis to identify gene interactions and relationships. Moreover, 10 genes were confirmed by PCR and analyzed. Of these 125 unigenes, 109 unigenes, approximately 87%, were not previously identified in the goose. Conclusion: This de novo transcriptome analysis could provide important Chinese goose sequence information and highlights the value of new gene discovery, pathways investigation and immune system gene identification, and comparison with
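
    The N50 statistic reported in this record can be computed as in the following Python sketch. It uses the common definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length); the example lengths are made up for illustration and are not data from the study.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    summed length of all sequences.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-negative lengths")


if __name__ == "__main__":
    # Made-up unigene lengths in bp; total is 32, half is 16.
    example = [2, 2, 2, 3, 3, 4, 8, 8]
    print(f"N50 = {n50(example)} bp")
```

    Because N50 weights long sequences more heavily than the mean does, an assembly can have an N50 well above its average length, as is the case for the values quoted in the abstract.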

  10. Knowledge Discovery and Digital Cartography for the ALS (Linguistic Atlas of Sicily) Project

    NASA Astrophysics Data System (ADS)

    Gentile, Antonio; Pirrone, Roberto; Russo, Giuseppe

    In this paper the latest developments of the ALS (Linguistic Atlas of Sicily) project are presented. The ALS project aims to define methodologies and tools to support research in the socio-linguistic field. Different types of variables (both quantitative and qualitative) are involved.

  11. Genome-Scale Discovery of Cell Wall Biosynthesis Genes in Populus (JGI Seventh Annual User Meeting 2012: Genomics of Energy and Environment)

    ScienceCinema

    Muchero, Wellington [Oak Ridge National Laboratory]

    2016-07-12

    Wellington Muchero from Oak Ridge National Laboratory gives a talk titled "Discovery of Cell Wall Biosynthesis Genes in Populus" at the JGI 7th Annual Users Meeting: Genomics of Energy & Environment Meeting on March 22, 2012 in Walnut Creek, California.

  12. IMG-ABC: A Knowledge Base To Fuel Discovery of Biosynthetic Gene Clusters and Novel Secondary Metabolites

    PubMed Central

    Hadjithomas, Michalis; Chen, I-Min Amy; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Szeto, Ernest; Huang, Jinghua; Reddy, T. B. K.; Cimermančič, Peter; Fischbach, Michael A.; Ivanova, Natalia N.; Markowitz, Victor M.

    2015-01-01

    In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of “big” genomic data for discovering small molecules. IMG-ABC relies on IMG’s comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC’s focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. PMID:26173699

  13. Molecular Networking and Pattern-Based Genome Mining Improves Discovery of Biosynthetic Gene Clusters and their Products from Salinispora Species

    DOE PAGES

    Duncan, Katherine R.; Crüsemann, Max; Lechner, Anna; ...

    2015-04-09

    Genome sequencing has revealed that bacteria contain many more biosynthetic gene clusters than predicted based on the number of secondary metabolites discovered to date. While this biosynthetic reservoir has fostered interest in new tools for natural product discovery, there remains a gap between gene cluster detection and compound discovery. In this paper, we apply molecular networking and the new concept of pattern-based genome mining to 35 Salinispora strains, including 30 for which draft genome sequences were either available or obtained for this study. The results provide a method to simultaneously compare large numbers of complex microbial extracts, which facilitated the identification of media components, known compounds and their derivatives, and new compounds that could be prioritized for structure elucidation. Finally, these efforts revealed considerable metabolite diversity and led to several molecular family-gene cluster pairings, of which the quinomycin-type depsipeptide retimycin A was characterized and linked to gene cluster NRPS40 using pattern-based bioinformatic approaches.

  14. IMG-ABC: An Atlas of Biosynthetic Gene Clusters to Fuel the Discovery of Novel Secondary Metabolites

    SciTech Connect

    Chen, I-Min; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Huang, Jinghua; Reddy, T. B.K.; Cimermancic, Peter; Fischbach, Michael; Ivanova, Natalia; Markowitz, Victor; Kyrpides, Nikos; Pati, Amrita

    2014-10-28

    In the discovery of secondary metabolites (SMs), large-scale analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of relevant computational resources. We present IMG-ABC (https://img.jgi.doe.gov/abc/) -- An Atlas of Biosynthetic gene Clusters within the Integrated Microbial Genomes (IMG) system. IMG-ABC is a rich repository of both validated and predicted biosynthetic clusters (BCs) in cultured isolates, single-cells and metagenomes linked with the SM chemicals they produce and enhanced with focused analysis tools within IMG. The underlying scalable framework enables traversal of phylogenetic dark matter and chemical structure space -- serving as a doorway to a new era in the discovery of novel molecules.

  15. The BDGP gene disruption project: single transposon insertions associated with 40% of Drosophila genes.

    PubMed Central

    Bellen, Hugo J; Levis, Robert W; Liao, Guochun; He, Yuchun; Carlson, Joseph W; Tsang, Garson; Evans-Holm, Martha; Hiesinger, P Robin; Schulze, Karen L; Rubin, Gerald M; Hoskins, Roger A; Spradling, Allan C

    2004-01-01

    The Berkeley Drosophila Genome Project (BDGP) strives to disrupt each Drosophila gene by the insertion of a single transposable element. As part of this effort, transposons in >30,000 fly strains were localized and analyzed relative to predicted Drosophila gene structures. Approximately 6300 lines that maximize genomic coverage were selected to be sent to the Bloomington Stock Center for public distribution, bringing the size of the BDGP gene disruption collection to 7140 lines. It now includes individual lines predicted to disrupt 5362 of the 13,666 currently annotated Drosophila genes (39%). Other lines contain an insertion at least 2 kb from others in the collection and likely mutate additional incompletely annotated or uncharacterized genes and chromosomal regulatory elements. The remaining strains contain insertions likely to disrupt alternative gene promoters or to allow gene misexpression. The expanded BDGP gene disruption collection provides a public resource that will facilitate the application of Drosophila genetics to diverse biological problems. Finally, the project reveals new insight into how transposons interact with a eukaryotic genome and helps define optimal strategies for using insertional mutagenesis as a genomic tool. PMID:15238527

  16. The BDGP gene disruption project: Single transposon insertions associated with 40 percent of Drosophila genes

    SciTech Connect

    Bellen, Hugo J.; Levis, Robert W.; Liao, Guochun; He, Yuchun; Carlson, Joseph W.; Tsang, Garson; Evans-Holm, Martha; Hiesinger, P. Robin; Schulze, Karen L.; Rubin, Gerald M.; Hoskins, Roger A.; Spradling, Allan C.

    2004-01-13

    The Berkeley Drosophila Genome Project (BDGP) strives to disrupt each Drosophila gene by the insertion of a single transposable element. As part of this effort, transposons in more than 30,000 fly strains were localized and analyzed relative to predicted Drosophila gene structures. Approximately 6,300 lines that maximize genomic coverage were selected to be sent to the Bloomington Stock Center for public distribution, bringing the size of the BDGP gene disruption collection to 7,140 lines. It now includes individual lines predicted to disrupt 5,362 of the 13,666 currently annotated Drosophila genes (39 percent). Other lines contain an insertion at least 2 kb from others in the collection and likely mutate additional incompletely annotated or uncharacterized genes and chromosomal regulatory elements. The remaining strains contain insertions likely to disrupt alternative gene promoters or to allow gene mis-expression. The expanded BDGP gene disruption collection provides a public resource that will facilitate the application of Drosophila genetics to diverse biological problems. Finally, the project reveals new insight into how transposons interact with a eukaryotic genome and helps define optimal strategies for using insertional mutagenesis as a genomic tool.

  17. Discovery of germline-related genes in Cephalochordate amphioxus: A genome wide survey using genome annotation and transcriptome data.

    PubMed

    Yue, Jia-Xing; Li, Kun-Lung; Yu, Jr-Kai

    2015-12-01

    The generation of germline cells is a critical process in the reproduction of multicellular organisms. Studies in animal models have identified a common repertoire of genes that play essential roles in primordial germ cell (PGC) formation. However, comparative studies also indicate that the timing and regulation of this core genetic program vary considerably in different animals, raising intriguing questions regarding the evolution of PGC developmental mechanisms in metazoans. Cephalochordates (commonly called amphioxus or lancelets) represent one of the invertebrate chordate groups and can provide important information about the evolution of developmental mechanisms in the chordate lineage. In this study, we used genome and transcriptome data to identify germline-related genes in two distantly related cephalochordate species, Branchiostoma floridae and Asymmetron lucayanum. Branchiostoma and Asymmetron diverged more than 120 MYA, and the most conspicuous difference between them is their gonadal morphology. We used important germline developmental genes in several model animals to search the amphioxus genome and transcriptome dataset for conserved homologs. We also annotated the assembled transcriptome data using Gene Ontology (GO) terms to facilitate the discovery of putative genes associated with germ cell development and reproductive functions in amphioxus. We further confirmed the expression of 14 genes in developing oocytes or mature eggs using whole mount in situ hybridization, suggesting their potential functions in amphioxus germ cell development. The results of this global survey provide a useful resource for testing potential functions of candidate germline-related genes in cephalochordates and for investigating differences in gonad developmental mechanisms between Branchiostoma and Asymmetron species.

  18. Using false discovery rates to benchmark SNP-callers in next-generation sequencing projects.

    PubMed

    Farrer, Rhys A; Henk, Daniel A; MacLean, Dan; Studholme, David J; Fisher, Matthew C

    2013-01-01

    Sequence alignments form the basis for many comparative and population genomic studies. Alignment tools provide a range of accuracies dependent on the divergence between the sequences and the alignment methods. Despite widespread use, there is no standard method for assessing the accuracy of a dataset and alignment strategy after resequencing. We present a framework and tool for determining the overall accuracies of an input read dataset, alignment and SNP-calling method, provided that an isolate in that dataset has a corresponding or closely related reference sequence available. In addition to this tool for comparing False Discovery Rates (FDR), we include a method for determining homozygous and heterozygous positions from an alignment using binomial probabilities for an expected error rate. We benchmark this method against other SNP callers using our FDR method with three fungal genomes, finding that it was able to achieve a high level of accuracy. These tools are available at http://cfdr.sourceforge.net/.
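
    The two ideas in this abstract, a false discovery rate for a set of SNP calls and a binomial test against an expected error rate, can be illustrated with a minimal Python sketch. This is not the cfdr tool's actual algorithm: the function names, the 1% error rate, and the 0.05 significance threshold are all assumptions chosen for illustration.

```python
from math import comb


def false_discovery_rate(called, truth):
    """FDR = false positives / all positive calls (0.0 if nothing was called)."""
    called, truth = set(called), set(truth)
    if not called:
        return 0.0
    return len(called - truth) / len(called)


def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def call_position(ref_count, alt_count, error_rate=0.01, alpha=0.05):
    """Classify one aligned position from its read counts.

    Hypothetical rule: an allele is treated as sequencing error when seeing
    at least that many reads by error alone is not unlikely (tail >= alpha).
    """
    n = ref_count + alt_count
    if n == 0:
        return "no_call"
    if binom_tail(alt_count, n, error_rate) >= alpha:
        return "hom_ref"  # alternate reads are explainable as errors
    if binom_tail(ref_count, n, error_rate) >= alpha:
        return "hom_alt"  # reference reads are explainable as errors
    return "het"          # neither allele is explainable as error alone
```

    Under these assumed thresholds, a position with 29 reference reads and 1 alternate read is called homozygous reference, while a 15/15 split is called heterozygous.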

  19. Using False Discovery Rates to Benchmark SNP-callers in next-generation sequencing projects

    PubMed Central

    Farrer, Rhys A.; Henk, Daniel A.; MacLean, Dan; Studholme, David J.; Fisher, Matthew C.

    2013-01-01

    Sequence alignments form the basis for many comparative and population genomic studies. Alignment tools provide a range of accuracies dependent on the divergence between the sequences and the alignment methods. Despite widespread use, there is no standard method for assessing the accuracy of a dataset and alignment strategy after resequencing. We present a framework and tool for determining the overall accuracies of an input read dataset, alignment and SNP-calling method, provided that an isolate in that dataset has a corresponding or closely related reference sequence available. In addition to this tool for comparing False Discovery Rates (FDR), we include a method for determining homozygous and heterozygous positions from an alignment using binomial probabilities for an expected error rate. We benchmark this method against other SNP callers using our FDR method with three fungal genomes, finding that it was able to achieve a high level of accuracy. These tools are available at http://cfdr.sourceforge.net/. PMID:23518929

  20. The AEROPATH project targeting Pseudomonas aeruginosa: crystallographic studies for assessment of potential targets in early-stage drug discovery

    PubMed Central

    Moynie, Lucille; Schnell, Robert; McMahon, Stephen A.; Sandalova, Tatyana; Boulkerou, Wassila Abdelli; Schmidberger, Jason W.; Alphey, Magnus; Cukier, Cyprian; Duthie, Fraser; Kopec, Jolanta; Liu, Huanting; Jacewicz, Agata; Hunter, William N.; Naismith, James H.; Schneider, Gunter

    2013-01-01

    Bacterial infections are increasingly difficult to treat owing to the spread of antibiotic resistance. A major concern is Gram-negative bacteria, for which the discovery of new antimicrobial drugs has been particularly scarce. In an effort to accelerate early steps in drug discovery, the EU-funded AEROPATH project aims to identify novel targets in the opportunistic pathogen Pseudomonas aeruginosa by applying a multidisciplinary approach encompassing target validation, structural characterization, assay development and hit identification from small-molecule libraries. Here, the strategies used for target selection are described and progress in protein production and structure analysis is reported. Of the 102 selected targets, 84 could be produced in soluble form and the de novo structures of 39 proteins have been determined. The crystal structures of eight of these targets, ranging from hypothetical unknown proteins to metabolic enzymes from different functional classes (PA1645, PA1648, PA2169, PA3770, PA4098, PA4485, PA4992 and PA5259), are reported here. The structural information is expected to provide a firm basis for the improvement of hit compounds identified from fragment-based and high-throughput screening campaigns. PMID:23295481

  1. The AEROPATH project targeting Pseudomonas aeruginosa: crystallographic studies for assessment of potential targets in early-stage drug discovery.

    PubMed

    Moynie, Lucille; Schnell, Robert; McMahon, Stephen A; Sandalova, Tatyana; Boulkerou, Wassila Abdelli; Schmidberger, Jason W; Alphey, Magnus; Cukier, Cyprian; Duthie, Fraser; Kopec, Jolanta; Liu, Huanting; Jacewicz, Agata; Hunter, William N; Naismith, James H; Schneider, Gunter

    2013-01-01

    Bacterial infections are increasingly difficult to treat owing to the spread of antibiotic resistance. A major concern is Gram-negative bacteria, for which the discovery of new antimicrobial drugs has been particularly scarce. In an effort to accelerate early steps in drug discovery, the EU-funded AEROPATH project aims to identify novel targets in the opportunistic pathogen Pseudomonas aeruginosa by applying a multidisciplinary approach encompassing target validation, structural characterization, assay development and hit identification from small-molecule libraries. Here, the strategies used for target selection are described and progress in protein production and structure analysis is reported. Of the 102 selected targets, 84 could be produced in soluble form and the de novo structures of 39 proteins have been determined. The crystal structures of eight of these targets, ranging from hypothetical unknown proteins to metabolic enzymes from different functional classes (PA1645, PA1648, PA2169, PA3770, PA4098, PA4485, PA4992 and PA5259), are reported here. The structural information is expected to provide a firm basis for the improvement of hit compounds identified from fragment-based and high-throughput screening campaigns.

  2. Scientific Discovery through Advanced Computing (SciDAC-3) Partnership Project Annual Report

    SciTech Connect

    Hoffman, Forest M.; Bochev, Pavel B.; Cameron-Smith, Philip J.; Easter, Richard C.; Elliott, Scott M.; Ghan, Steven J.; Liu, Xiaohong; Lowrie, Robert B.; Lucas, Donald D.; Ma, Po-lun; Sacks, William J.; Shrivastava, Manish; Singh, Balwinder; Tautges, Timothy J.; Taylor, Mark A.; Vertenstein, Mariana; Worley, Patrick H.

    2014-01-15

    The Applying Computationally Efficient Schemes for BioGeochemical Cycles (ACES4BGC) Project is advancing the predictive capabilities of Earth System Models (ESMs) by reducing two of the largest sources of uncertainty, aerosols and biospheric feedbacks, with a highly efficient computational approach. In particular, this project is implementing and optimizing new computationally efficient tracer advection algorithms for large numbers of tracer species; adding important biogeochemical interactions between the atmosphere, land, and ocean models; and applying uncertainty quantification (UQ) techniques to constrain process parameters and evaluate uncertainties in feedbacks between biogeochemical cycles and the climate system.

  3. Antibiotic discovery throughout the Small World Initiative: A molecular strategy to identify biosynthetic gene clusters involved in antagonistic activity.

    PubMed

    Davis, Elizabeth; Sloan, Tyler; Aurelius, Krista; Barbour, Angela; Bodey, Elijah; Clark, Brigette; Dennis, Celeste; Drown, Rachel; Fleming, Megan; Humbert, Allison; Glasgo, Elizabeth; Kerns, Trent; Lingro, Kelly; McMillin, MacKenzie; Meyer, Aaron; Pope, Breanna; Stalevicz, April; Steffen, Brittney; Steindl, Austin; Williams, Carolyn; Wimberley, Carmen; Zenas, Robert; Butela, Kristen; Wildschutte, Hans

    2017-01-22

    The emergence of bacterial pathogens resistant to all known antibiotics is a global health crisis. Adding to this problem is that major pharmaceutical companies have shifted away from antibiotic discovery due to low profitability. As a result, the pipeline of new antibiotics is essentially dry and many bacteria now resist the effects of most commonly used drugs. To address this global health concern, citizen science through the Small World Initiative (SWI) was formed in 2012. As part of SWI, students isolate bacteria from their local environments, characterize the strains, and assay for antibiotic production. During the 2015 fall semester at Bowling Green State University, students isolated 77 soil-derived bacteria and genetically characterized strains using the 16S rRNA gene, identified strains exhibiting antagonistic activity, and performed an expanded SWI workflow using transposon mutagenesis to identify a biosynthetic gene cluster involved in toxigenic compound production. We identified one mutant with loss of antagonistic activity and through subsequent whole-genome sequencing and linker-mediated PCR identified a 24.9 kb biosynthetic gene locus likely involved in inhibitory activity in that mutant. Further assessment against human pathogens demonstrated the inhibition of Bacillus cereus, Listeria monocytogenes, and methicillin-resistant Staphylococcus aureus in the presence of this compound, thus supporting our molecular strategy as an effective research pipeline for SWI antibiotic discovery and genetic characterization.

  4. Project ARCHIMEDES: Applications, Reasoning and Concepts for High School Instructors: Making Educational Discoveries and Expanding Skills.

    ERIC Educational Resources Information Center

    Lea, Suzanne M.

    Project ARCHIMEDES was designed in cooperation with local teachers to enhance concept understanding of teachers of physics and physical sciences, to increase use of electronics and computers in the classroom, and to introduce research on students' misconceptions in physics, teaching methods for identifying and remediating misconceptions, and ways…

  5. Ataxin1L is a regulator of HSC function highlighting the utility of cross-tissue comparisons for gene discovery.

    PubMed

    Kahle, Juliette J; Souroullas, George P; Yu, Peng; Zohren, Fabian; Lee, Yoontae; Shaw, Chad A; Zoghbi, Huda Y; Goodell, Margaret A

    2013-03-01

    Hematopoietic stem cells (HSCs) are rare quiescent cells that continuously replenish the cellular components of the peripheral blood. Observing that the ataxia-associated gene Ataxin-1-like (Atxn1L) was highly expressed in HSCs, we examined its role in HSC function through in vitro and in vivo assays. Mice lacking Atxn1L had greater numbers of HSCs that regenerated the blood more quickly than their wild-type counterparts. Molecular analyses indicated Atxn1L null HSCs had gene expression changes that regulate a program consistent with their higher level of proliferation, suggesting that Atxn1L is a novel regulator of HSC quiescence. To determine if additional brain-associated genes were candidates for hematologic regulation, we examined genes encoding proteins from autism- and ataxia-associated protein-protein interaction networks for their representation in hematopoietic cell populations. The interactomes were found to be highly enriched for proteins encoded by genes specifically expressed in HSCs relative to their differentiated progeny. Our data suggest a heretofore unappreciated similarity between regulatory modules in the brain and HSCs, offering a new strategy for novel gene discovery in both systems.

  6. Markus Maeurer on the LifeGene project.

    PubMed

    Maeurer, Markus

    2010-09-01

    This year will see the full-scale roll-out of the LifeGene study. Coordinated by the Karolinska Institutet (Stockholm, Sweden), the project looks set to be one of the largest health studies performed in the world, aiming to enroll 500,000 Swedes in order to determine the relationship between disease and environmental, lifestyle and hereditary factors. International experts from a broad spectrum of scientific fields will collaborate on this project, including Markus Maeurer, Chair of the infections working group in LifeGene (www.lifegene.se). Maeurer leads the additional study on influenza-like illnesses, which is currently being run within the LifeGene project. Maeurer studied medicine in Germany, Switzerland and the USA. He is board-certified in medical microbiology, has served as Assistant Professor of Surgery at the University of Pittsburgh Medical School (PA, USA) in the Department of Surgical Oncology and Immunotherapy, as Professor of Medical Microbiology at the University of Mainz, Germany, and is now Professor of Clinical Immunology at the Microbiology and Tumor Cell Biology Center at the Karolinska Institutet. Maeurer has conducted experiments with nonhuman primate models to study TB vaccine take, with the aim to profile markers of immune protection. His general interests are in the field of immune reconstitution and memory immune responses, with a particular focus on gauging protective T-cell responses and antibody immune signatures using high-content peptide microarrays. He has published more than 100 original articles, ten book chapters and serves as a reviewer for a number of international scientific journals.

  7. Mapping Our Genes: The Genome Projects: How Big, How Fast

    DOE R&D Accomplishments Database

    1988-04-01

    For the past 2 years, scientific and technical journals in biology and medicine have extensively covered a debate about whether and how to determine the function and order of human genes on human chromosomes and when to determine the sequence of molecular building blocks that comprise DNA in those chromosomes. In 1987, these issues rose to become part of the public agenda. The debate involves science, technology, and politics. Congress is responsible for “writing the rules” of what various federal agencies do and for funding their work. This report surveys the points made so far in the debate, focusing on those that most directly influence the policy options facing the US Congress. Congressional interest focused on how to assess the rationales for conducting human genome projects, how to fund human genome projects (at what level and through which mechanisms), how to coordinate the scientific and technical programs of the several federal agencies and private interests already supporting various genome projects, and how to strike a balance regarding the impact of genome projects on international scientific cooperation and international economic competition in biotechnology. The Office of Technology Assessment (OTA) prepared this report with the assistance of several hundred experts throughout the world.

  8. Mapping our genes: The genome projects: How big, how fast

    SciTech Connect

    none,

    1988-04-01

    For the past 2 years, scientific and technical journals in biology and medicine have extensively covered a debate about whether and how to determine the function and order of human genes on human chromosomes and when to determine the sequence of molecular building blocks that comprise DNA in those chromosomes. In 1987, these issues rose to become part of the public agenda. The debate involves science, technology, and politics. Congress is responsible for “writing the rules” of what various federal agencies do and for funding their work. This report surveys the points made so far in the debate, focusing on those that most directly influence the policy options facing the US Congress. Congressional interest focused on how to assess the rationales for conducting human genome projects, how to fund human genome projects (at what level and through which mechanisms), how to coordinate the scientific and technical programs of the several federal agencies and private interests already supporting various genome projects, and how to strike a balance regarding the impact of genome projects on international scientific cooperation and international economic competition in biotechnology. OTA prepared this report with the assistance of several hundred experts throughout the world. 342 refs., 26 figs., 11 tabs.

  9. Weeding out the genes: the Arabidopsis genome project.

    PubMed

    Martienssen, R A

    2000-05-01

    The Arabidopsis genome sequence is scheduled for completion at the end of this year (December 2000). It will be the first higher plant genome to be sequenced, and will allow a detailed comparison with bacterial, yeast and animal genomes. Already, two of the five chromosomes have been sequenced, and we have had our first glimpse of higher eukaryotic centromeres, and the structure of heterochromatin. The implications for understanding plant gene function, genome structure and genome organization are profound. In this review, the lessons learned for future genome projects are reviewed as well as a summary of the initial findings in Arabidopsis.

  10. Serendipitous Discovery of a Projected Pair of QSOs Separated by 4.5 arcsec on the Sky

    NASA Astrophysics Data System (ADS)

    Heintz, K. E.; Fynbo, J. P. U.; Krogager, J.-K.; Vestergaard, M.; Møller, P.; Arabsalmani, M.; Geier, S.; Noterdaeme, P.; Ledoux, C.; Saturni, F. G.; Venemans, B.

    2016-07-01

    We present the serendipitous discovery of a projected pair of quasi-stellar objects (QSOs) with an angular separation of Δθ = 4.50 arcsec. The redshifts of the two QSOs are widely different: one, our program target, is a QSO with a spectrum consistent with being a narrow line Seyfert 1 active galactic nucleus at z = 2.05. For this target we detect Lyα, C iv, and C iii]. The other QSO, which by chance was included on the spectroscopic slit, is a Type 1 QSO at a redshift of z = 1.68, for which we detect C iv, C iii], and Mg ii. We compare this system to previously detected projected QSO pairs and find that only about a dozen previously known pairs have smaller angular separation. Based on observations made with the Nordic Optical Telescope, on the island of La Palma jointly operated by Denmark, Finland, Iceland, Norway, and Sweden, in the Spanish Observatorio del Roque de los Muchachos of the Instituto de Astrofisica de Canarias.

  11. Discovery of agents that eradicate leukemia stem cells using an in silico screen of public gene expression data

    PubMed Central

    Hassane, Duane C.; Guzman, Monica L.; Corbett, Cheryl; Li, Xiaojie; Abboud, Ramzi; Young, Fay; Liesveld, Jane L.; Carroll, Martin

    2008-01-01

    Increasing evidence indicates that malignant stem cells are important for the pathogenesis of acute myelogenous leukemia (AML) and represent a reservoir of cells that drive the development of AML and relapse. Therefore, new treatment regimens are necessary to prevent relapse and improve therapeutic outcomes. Previous studies have shown that the sesquiterpene lactone, parthenolide (PTL), ablates bulk, progenitor, and stem AML cells while causing no appreciable toxicity to normal hematopoietic cells. Thus, PTL must evoke cellular responses capable of mediating AML selective cell death. Given recent advances in chemical genomics such as gene expression-based high-throughput screening (GE-HTS) and the Connectivity Map, we hypothesized that the gene expression signature resulting from treatment of primary AML with PTL could be used to search for similar signatures in publicly available gene expression profiles deposited into the Gene Expression Omnibus (GEO). We therefore devised a broad in silico screen of the GEO database using the PTL gene expression signature as a template and discovered 2 new agents, celastrol and 4-hydroxy-2-nonenal, that effectively eradicate AML at the bulk, progenitor, and stem cell level. These findings suggest the use of multicenter collections of high-throughput data to facilitate discovery of leukemia drugs and drug targets. PMID:18305216
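
    The idea of using one expression signature as a template to rank other public profiles can be sketched in Python as follows. This is a deliberately simplified illustration and not the authors' screening procedure: the scoring rule (mean change of template up-genes minus mean change of template down-genes), the function names, and the dictionary layout are all assumptions.

```python
def signature_score(profile, up_genes, down_genes):
    """Score one expression profile against a template signature.

    profile maps gene name -> log fold change. The score is high when the
    template's up-genes are induced and its down-genes are repressed.
    Returns 0.0 when the profile measures none of the up or down genes.
    """
    up = [profile[g] for g in up_genes if g in profile]
    down = [profile[g] for g in down_genes if g in profile]
    if not up or not down:
        return 0.0
    return sum(up) / len(up) - sum(down) / len(down)


def rank_profiles(profiles, up_genes, down_genes):
    """Return profile names ordered from most to least template-like."""
    scored = [
        (signature_score(p, up_genes, down_genes), name)
        for name, p in profiles.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)]
```

    A real screen would typically use a rank-based statistic and a significance estimate rather than a raw difference of means; the sketch only shows the template-matching step.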

  12. The MY NASA DATA Project: Tools and a Collaboration Space for Knowledge Discovery

    NASA Astrophysics Data System (ADS)

    Chambers, L. H.; Alston, E. J.; Diones, D. D.; Moore, S. W.; Oots, P. C.; Phelps, C. S.

    2006-05-01

    The Atmospheric Science Data Center (ASDC) at NASA Langley Research Center is charged with serving a wide user community that is interested in its large data holdings in the areas of Aerosols, Clouds, Radiation Budget, and Tropospheric Chemistry. Most of the data holdings, however, are in large files with specialized data formats. The MY NASA DATA (mynasadata.larc.nasa.gov) project began in 2004, as part of the NASA Research, Education, and Applications Solutions Network (REASoN), in order to open this important resource to a broader community including K-12 education and citizen scientists. MY NASA DATA (short for Mentoring and inquirY using NASA Data on Atmospheric and earth science for Teachers and Amateurs) consists of a web space that collects tools, lesson plans, and specially developed documentation to help the target audience more easily use the vast collection of NASA data about the Earth System. The core piece of the MY NASA DATA project is the creation of microsets (both static and custom) that make data easily accessible. The installation of a Live Access Server (LAS) greatly enhanced the ability for teachers, students, and citizen scientists to create and explore custom microsets of Earth System Science data. The LAS, which is an open source software tool using emerging data standards, also allows the MY NASA DATA team to make available data on other aspects of the Earth System from collaborating data centers. We are currently working with the Physical Oceanography DAAC at the Jet Propulsion Laboratory to bring in several parameters describing the ocean. In addition, MY NASA DATA serves as a central space for the K-12 community to share resources. The site already includes a dozen User-contributed lesson plans. This year we will be focusing on the Citizen Science portion of the site, and will be welcoming user-contributed project ideas, as well as reports of completed projects. An e-mentor network has also been created to involve a wider community in

  13. The Extragalactic Distance Scale Key Project VIII. The Discovery of Cepheids and a New Distance to NGC 3621 Using the Hubble Space Telescope

    NASA Technical Reports Server (NTRS)

    Rawson, D. M.; Mould, J. R.; Macri, L. M.; Huchra, J. P.; Kennicutt, R. C.; Harding, P.; Freedman, W. L.; Hill, R. J.; Phelps, R. L.; Madore, B. F.; Silbermann, N. A.; Graham, J. A.; Ferrarese, L.; Ford, H. C.; Illingworth, G. D.; Hoessel, J. G.; Han, M.; Hughes, S. M.; Saha, A.; Stetson, P. B.

    1996-01-01

    We report on the discovery of Cepheids in the field spiral galaxy NGC 3621, based on observations made with the Wide Field and Planetary Camera 2 on board the Hubble Space Telescope (HST). NGC 3621 is one of 18 galaxies observed as part of the HST Key Project on the Extragalactic Distance Scale, which aims to measure the Hubble Constant to 10 percent accuracy.

  14. A hybrid computational method for the discovery of novel reproduction-related genes.

    PubMed

    Chen, Lei; Chu, Chen; Kong, Xiangyin; Huang, Guohua; Huang, Tao; Cai, Yu-Dong

    2015-01-01

    Uncovering the molecular mechanisms underlying reproduction is of great importance to infertility treatment and to the generation of healthy offspring. In this study, we discovered novel reproduction-related genes with a hybrid computational method, integrating three different types of method, which offered new clues for further reproduction research. This method was first executed on a weighted graph, constructed based on known protein-protein interactions, to search the shortest paths connecting any two known reproduction-related genes. Genes occurring in these paths were deemed to have a special relationship with reproduction. These newly discovered genes were filtered with a randomization test. Then, the remaining genes were further selected according to their associations with known reproduction-related genes measured by protein-protein interaction score and alignment score obtained by BLAST. The in-depth analysis of the high confidence novel reproduction genes revealed hidden mechanisms of reproduction and provided guidelines for further experimental validations.
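
    The shortest-path step described above can be illustrated with a minimal sketch. It assumes an undirected graph stored as nested dictionaries and edge weights where a lower value means a stronger interaction; the abstract does not say how weights were derived from interaction scores, so that convention, the gene names, and the function names are all hypothetical. The randomization test and BLAST-based filtering are not shown.

```python
import heapq
from itertools import combinations


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns one shortest path as a node list, or None."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                prev[neighbour] = node
                heapq.heappush(heap, (nd, neighbour))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def candidate_genes(graph, known_genes):
    """Collect interior nodes of shortest paths between every pair of known genes."""
    known = set(known_genes)
    candidates = set()
    for a, b in combinations(sorted(known), 2):
        path = shortest_path(graph, a, b)
        if path:
            candidates.update(g for g in path[1:-1] if g not in known)
    return candidates


# Toy network with hypothetical gene names (assumption: lower weight = stronger link).
edges = [("K1", "X", 1.0), ("X", "K2", 1.0),
         ("K1", "Y", 5.0), ("Y", "K2", 5.0), ("K2", "Z", 1.0)]
toy_graph = {}
for u, v, w in edges:
    toy_graph.setdefault(u, {})[v] = w
    toy_graph.setdefault(v, {})[u] = w

print(candidate_genes(toy_graph, ["K1", "K2"]))
```

    In this toy network only X lies on the cheapest route between the two known genes, so it is the only candidate returned; Y sits on a costlier route and Z is a dead end.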

  15. Targetfinder.org: a resource for systematic discovery of transcription factor target genes

    PubMed Central

    Kiełbasa, Szymon M.; Blüthgen, Nils; Fähling, Michael

    2010-01-01

    Targetfinder.org (http://targetfinder.org/) provides a web-based resource for finding genes that show a similar expression pattern to a group of user-selected genes. It is based on a large-scale gene expression compendium (>1200 experiments, >13 000 genes). The primary application of Targetfinder.org is to expand a list of known transcription factor targets by new candidate target genes. The user submits a group of genes (the ‘seed’), and as a result the web site provides a list of other genes ranked by similarity of their expression to the expression of the seed genes. Additionally, the web site provides information on a recovery/cross-validation test to check for consistency of the provided seed and the quality of the ranking. Furthermore, the web site allows users to analyse affinities of a selected transcription factor to the promoter regions of the top-ranked genes in order to select the best new candidate target genes for further experimental analysis. PMID:20460454
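
    The seed-based ranking idea can be sketched as follows. The abstract does not state which similarity measure the site uses, so this example assumes Pearson correlation against the mean seed profile; the function names and the toy expression values are hypothetical and are not taken from Targetfinder.org.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def rank_by_seed(expression, seed):
    """Rank non-seed genes by correlation with the mean expression of the seed genes."""
    seed_profiles = [expression[g] for g in seed]
    centroid = [sum(vals) / len(vals) for vals in zip(*seed_profiles)]
    scores = {g: pearson(profile, centroid)
              for g, profile in expression.items() if g not in seed}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy compendium: gene -> expression across four experiments (made-up values).
toy_expression = {
    "seedA": [1.0, 2.0, 3.0, 4.0],
    "seedB": [2.0, 3.0, 4.0, 5.0],
    "geneC": [10.0, 20.0, 30.0, 40.0],
    "geneD": [4.0, 3.0, 2.0, 1.0],
    "geneE": [1.0, 3.0, 2.0, 4.0],
}
ranking = rank_by_seed(toy_expression, {"seedA", "seedB"})
for gene, score in ranking:
    print(gene, round(score, 3))
```

    Here geneC follows the seed profile exactly and ranks first, geneE is partly similar, and geneD is anti-correlated and ranks last.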

  16. Co-clustering phenome–genome for phenotype classification and disease gene discovery

    PubMed Central

    Hwang, TaeHyun; Atluri, Gowtham; Xie, MaoQiang; Dey, Sanjoy; Hong, Changjin; Kumar, Vipin; Kuang, Rui

    2012-01-01

    Understanding the categorization of human diseases is critical for reliably identifying disease causal genes. Recently, genome-wide studies of abnormal chromosomal locations related to diseases have mapped >2000 phenotype–gene relations, which provide valuable information for classifying diseases and identifying candidate genes as drug targets. In this article, a regularized non-negative matrix tri-factorization (R-NMTF) algorithm is introduced to co-cluster phenotypes and genes, and simultaneously detect associations between the detected phenotype clusters and gene clusters. The R-NMTF algorithm factorizes the phenotype–gene association matrix under the prior knowledge from phenotype similarity network and protein–protein interaction network, supervised by the label information from known disease classes and biological pathways. In the experiments on disease phenotype–gene associations in OMIM and KEGG disease pathways, R-NMTF significantly improved the classification of disease phenotypes and disease pathway genes compared with support vector machines and Label Propagation in cross-validation on the annotated phenotypes and genes. The newly predicted phenotypes in each disease class are highly consistent with human phenotype ontology annotations. The roles of the new member genes in the disease pathways are examined and validated in the protein–protein interaction subnetworks. Extensive literature review also confirmed many new members of the disease classes and pathways as well as the predicted associations between disease phenotype classes and pathways. PMID:22735708
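
    For orientation, a generic graph-regularized non-negative matrix tri-factorization can be written as below. This is the textbook form and not necessarily the exact R-NMTF objective of the paper; every symbol is an assumption introduced here. X is the phenotype-by-gene association matrix, F and G are non-negative phenotype and gene cluster membership matrices, S holds the associations between phenotype clusters and gene clusters, L_P and L_G are graph Laplacians of the phenotype similarity network and the protein-protein interaction network, and alpha and beta are regularization weights. The supervision from known disease classes and pathways is omitted.

```latex
\min_{F \ge 0,\; S \ge 0,\; G \ge 0}\;
\lVert X - F S G^{\top} \rVert_F^{2}
+ \alpha \,\operatorname{tr}\!\left(F^{\top} L_P F\right)
+ \beta \,\operatorname{tr}\!\left(G^{\top} L_G G\right)
```

    The first term asks the three factors to reconstruct the observed associations; the two trace terms penalize assigning strongly connected phenotypes, or strongly interacting genes, to different clusters.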

  17. Parallel antibody germline gene and haplotype analyses support the validity of immunoglobulin germline gene inference and discovery.

    PubMed

    Kirik, Ufuk; Greiff, Lennart; Levander, Fredrik; Ohlin, Mats

    2017-04-04

    Analysis of antibody repertoire development and specific antibody responses important for e.g. autoimmune conditions, allergy, and protection against disease is supported by high throughput sequencing and associated bioinformatics pipelines that describe the diversity of the encoded antibody variable domains. Proper assignment of sequences to germline genes is important for many such processes, for instance in the analysis of somatic hypermutation. Germline gene inference from antibody-encoding transcriptomes, by using tools such as TIgGER or IgDiscover, has a potential to enhance the quality of such analyses. These tools may also be used to identify germline genes not previously known. In this study, we exploited such software for germline gene inference and define aspects of analysis settings and pre-existing knowledge of germline genes that affect the outcome of gene inference. Furthermore, we demonstrate the capacity of IGHJ and IGHD haplotype inference, whenever subjects are heterozygous with respect to such genes, to lend support to IGHV gene inference in general, and to the identification of novel alleles presently not recognized by germline gene reference directories. We propose that such haplotype analysis shall, whenever possible, be used in future best practice to support the outcome of germline gene inference. IGHJ-directed haplotype inference was also used to identify haplotypes not expressing some IGHV germline genes. In particular, we identified a haplotype that did not express several major germline genes such as IGHV1-8, IGHV3-9, IGHV3-15, IGHV1-18, IGHV3-21, and IGHV3-23. We envisage that haplotype analysis will provide an efficient approach to identify subjects for further studies of the link between the available immunoglobulin repertoire and outcomes of immune responses.

  18. IMG-ABC. A knowledge base to fuel discovery of biosynthetic gene clusters and novel secondary metabolites

    SciTech Connect

    Hadjithomas, Michalis; Chen, I-Min Amy; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Szeto, Ernest; Huang, Jinghua; Reddy, T. B. K.; Cimermančič, Peter; Fischbach, Michael A.; Ivanova, Natalia N.; Markowitz, Victor M.; Kyrpides, Nikos C.; Pati, Amrita

    2015-07-14

    In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of “big” genomic data for discovering small molecules. IMG-ABC relies on IMG’s comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC’s focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. IMG-ABC is the largest publicly available database of predicted and experimental biosynthetic gene clusters and the secondary metabolites they produce. The system also includes powerful search and analysis tools that are integrated with IMG’s extensive genomic/metagenomic data and analysis tool kits. As new research on biosynthetic gene clusters and secondary metabolites is published and more genomes are sequenced, IMG

  19. IMG-ABC. A knowledge base to fuel discovery of biosynthetic gene clusters and novel secondary metabolites

    DOE PAGES

    Hadjithomas, Michalis; Chen, I-Min Amy; Chu, Ken; ...

    2015-07-14

    In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of “big” genomic data for discovering small molecules. IMG-ABC relies on IMG’s comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC’s focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. IMG-ABC is the largest publicly available database of predicted and experimental biosynthetic gene clusters and the secondary metabolites they produce. The system also includes powerful search and analysis tools that are integrated with IMG’s extensive genomic/metagenomic data and analysis tool kits. As new research on biosynthetic gene clusters and secondary metabolites is published and more genomes are sequenced, IMG

  20. Use of eQTL Analysis for the Discovery of Target Genes Identified by GWAS

    DTIC Science & Technology

    2013-04-01

    density representation of the scatterplot with %GC on the x-axis and log2(GeneCount) on the y-axis. A loess smoother line is shown indicating the...general pattern of all the Gene Count values for this particular subject. Similarly, Figure 55 to 79 shows the loess smoother line for each subject. Based...Distribution of Total Gene Counts) for each Subject by RunID 59 Figure 55: Distribution of Percent GC versus log2(Gene Count + 1) with a loess smoother for

  1. Discovery of Core Biotic Stress Responsive Genes in Arabidopsis by Weighted Gene Co-Expression Network Analysis

    PubMed Central

    Amrine, Katherine C. H.; Blanco-Ulate, Barbara; Cantu, Dario

    2015-01-01

    Intricate signal networks and transcriptional regulators translate the recognition of pathogens into defense responses. In this study, we carried out a gene co-expression analysis of all currently publicly available microarray data, which were generated in experiments that studied the interaction of the model plant Arabidopsis thaliana with microbial pathogens. This work was conducted to identify (i) modules of functionally related co-expressed genes that are differentially expressed in response to multiple biotic stresses, and (ii) hub genes that may function as core regulators of disease responses. Using Weighted Gene Co-expression Network Analysis (WGCNA) we constructed an undirected network leveraging a rich curated expression dataset comprising 272 microarrays that involved microbial infections of Arabidopsis plants with a wide array of fungal and bacterial pathogens with biotrophic, hemibiotrophic, and necrotrophic lifestyles. WGCNA produced a network with scale-free and small-world properties composed of 205 distinct clusters of co-expressed genes. Modules of functionally related co-expressed genes that are differentially regulated in response to multiple pathogens were identified by integrating differential gene expression testing with functional enrichment analyses of gene ontology terms, known disease associated genes, transcriptional regulators, and cis-regulatory elements. The significance of functional enrichments was validated by comparisons with randomly generated networks. Network topology was then analyzed to identify intra- and inter-modular gene hubs. Based on high connectivity, and centrality in meta-modules that are clearly enriched in defense responses, we propose a list of 66 target genes for reverse genetic experiments to further dissect the Arabidopsis immune system. Our results show that statistical-based data trimming prior to network analysis allows the integration of expression datasets generated by different groups, under different
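
    The core quantities behind a weighted co-expression network can be sketched in a few lines. This example assumes the unsigned soft-threshold adjacency a_ij = |cor(x_i, x_j)|^beta and ranks genes by connectivity (the sum of their adjacencies); the value beta = 6 and the toy expression profiles are assumptions, and module detection, topological overlap, and the enrichment tests mentioned in the abstract are not shown. It is a pure-Python illustration, not the WGCNA R package.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def soft_threshold_adjacency(expression, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(x_i, x_j)| ** beta for i != j."""
    genes = sorted(expression)
    adjacency = {g: {} for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            a = abs(pearson(expression[g], expression[h])) ** beta
            adjacency[g][h] = a
            adjacency[h][g] = a
    return adjacency


def connectivity(adjacency):
    """Connectivity k_i: sum of a gene's adjacencies to all other genes."""
    return {g: sum(neighbours.values()) for g, neighbours in adjacency.items()}


# Toy profiles across five arrays (made-up values, hypothetical gene names).
toy_expression = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.0, 4.0, 6.0, 8.0, 10.0],
    "g3": [5.0, 4.0, 3.0, 2.0, 1.0],
    "g4": [1.0, -1.0, 1.0, -1.0, 1.0],
}
k = connectivity(soft_threshold_adjacency(toy_expression))
for gene in sorted(k, key=k.get, reverse=True):
    print(gene, round(k[gene], 3))
```

    Because the network is unsigned, the anti-correlated g3 is as strongly connected to g1 and g2 as they are to each other, while the uncorrelated g4 ends up with near-zero connectivity and would not be a hub candidate.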

  2. Exploring the Transcriptome Landscape of Pomegranate Fruit Peel for Natural Product Biosynthetic Gene and SSR Marker Discovery.

    PubMed

    Ono, Nadia Nicole; Britton, Monica Therese; Fass, Joseph Nathaniel; Nicolet, Charles Meyer; Lin, Dawei; Tian, Li

    2011-10-01

    Pomegranate fruit peel is rich in bioactive plant natural products, such as hydrolyzable tannins and anthocyanins. Despite their documented roles in human nutrition and fruit quality, genes involved in natural product biosynthesis have not been cloned from pomegranate and very little sequence information is available on pomegranate in the public domain. Shotgun transcriptome sequencing of pomegranate fruit peel cDNA was performed using RNA-Seq on the Illumina Genome Analyzer platform. Over 100 million raw sequence reads were obtained and assembled into 9,839 transcriptome assemblies (TAs) (>200 bp). Candidate genes for hydrolyzable tannin, anthocyanin, flavonoid, terpenoid and fatty acid biosynthesis and/or regulation were identified. Three lipid transfer proteins were obtained that may contribute to the previously reported IgE reactivity of pomegranate fruit extracts. In addition, 115 SSR markers were identified from the pomegranate fruit peel transcriptome and primers were designed for 77 SSR markers. The pomegranate fruit peel transcriptome set provides a valuable platform for natural product biosynthetic gene and SSR marker discovery in pomegranate. This work also demonstrates that next-generation transcriptome sequencing is an economical and effective approach for investigating natural product biosynthesis, identifying genes controlling important agronomic traits, and discovering molecular markers in non-model specialty crop species.
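
    A simple way to scan assembled transcripts for SSR (microsatellite) candidates is a regular expression with a back-reference. The abstract does not give the criteria used in the study, so this sketch assumes motifs of 2 to 6 bp repeated at least five times in tandem and skips single-base runs; the thresholds, the function name, and the toy sequence are all hypothetical.

```python
import re

# Assumed criteria: a 2-6 bp motif followed by at least four more copies of itself.
# The lazy quantifier makes the shortest repeating unit the reported motif.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{4,}")


def find_ssrs(sequence):
    """Return (start, motif, repeat_count) for each tandem repeat found."""
    hits = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip mononucleotide runs such as AAAAAAAAAA
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


# Made-up sequence: an (AT)6 repeat, an (AAG)5 repeat, and a poly-A tail.
toy_sequence = "GGCC" + "AT" * 6 + "CCGTA" + "AAG" * 5 + "TTGC" + "A" * 12
for start, motif, repeats in find_ssrs(toy_sequence):
    print(start, motif, repeats)
```

    Primer design around each hit, which the study also performed, would need flanking sequence on both sides and is outside the scope of this sketch.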

  3. Human Genome Project discoveries: Dialectics and rhetoric in the science of genetics

    NASA Astrophysics Data System (ADS)

    Robidoux, Charlotte A.

    The Human Genome Project (HGP), a $437 million effort that began in 1990 to chart the chemical sequence of our three billion base pairs of DNA, was completed in 2003, marking the 50th anniversary of the work that established the definitive structure of the molecule. This study considered how dialectical and rhetorical arguments functioned in the science, political, and public forums over a 20-year period, from 1980 to 2000, to advance human genome research and to establish the official project. I argue that Aristotle's continuum of knowledge--which ranges from the probable on one end to certified or demonstrated knowledge on the other--provides useful distinctions for analyzing scientific reasoning. While contemporary scientific research seeks to discover certified knowledge, investigators generally employ the hypothetico-deductive or scientific method, which often yields probable rather than certain findings, making these dialectical in nature. Analysis of the discourse describing human genome research revealed the use of numerous rhetorical figures and topics. Persuasive and probable reasoning were necessary for scientists to characterize unknown genetic phenomena, to secure interest in and funding for large-scale human genome research, to solve scientific problems, to issue probable findings, to convince colleagues and government officials that the findings were sound and to disseminate information to the public. Both government and private venture scientists drew on these tools of reasoning to promote their methods of mapping and sequencing the genome. The debate over how to carry out sequencing was rooted in conflicting values. Scientists representing the academic tradition valued a more conservative method that would establish high quality results, and those supporting private industry valued an unconventional approach that would yield products and profits more quickly. Values in turn influenced political and public forum arguments. Agency representatives and investors sided

  4. Discovery of differentially expressed genes in cashmere goat (Capra hircus) hair follicles by RNA sequencing.

    PubMed

    Qiao, X; Wu, J H; Wu, R B; Su, R; Li, C; Zhang, Y J; Wang, R J; Zhao, Y H; Fan, Y X; Zhang, W G; Li, J Q

    2016-09-02

    The mammalian hair follicle (HF) is a unique, highly regenerative organ with a distinct developmental cycle. Cashmere goat (Capra hircus) HFs can be divided into two categories based on structure and development time: primary and secondary follicles. To identify differentially expressed genes (DEGs) in the primary and secondary HFs of cashmere goats, the RNA sequencing of six individuals from Arbas, Inner Mongolia, was performed. A total of 617 DEGs were identified; 297 were upregulated while 320 were downregulated. Gene ontology analysis revealed that the main functions of the upregulated genes were electron transport, respiratory electron transport, mitochondrial electron transport, and gene expression. The downregulated genes were mainly involved in cell autophagy, protein complexes, neutrophil aggregation, and bacterial fungal defense reactions. According to the Kyoto Encyclopedia of Genes and Genomes database, these genes are mainly involved in the metabolism of cysteine and methionine, RNA polymerization, and the MAPK signaling pathway, and were enriched in primary follicles. A microRNA-target network revealed that secondary follicles are involved in several important biological processes, such as the synthesis of keratin-associated proteins and enzymes involved in amino acid biosynthesis. In summary, these findings will increase our understanding of the complex molecular mechanisms of HF development and cycling, and provide a basis for the further study of the genes and functions of HF development.

  5. From metagenomic gene discovery to enzymatic breakdown of crosslinks in agricultural fibers for functional products

    Technology Transfer Automated Retrieval System (TEKTRAN)

    From the rumen microflora, more than twenty novel genes involved in the hydrolysis of glucuronoarabinoxylans have been discovered and isolated. The specific genes functioning in the breakdown of crosslinkages have been cloned and expressed in E. coli, and the active enzymes purified and extensively ...

  6. Discovery of putative capsaicin biosynthetic genes by RNA-Seq and digital gene expression analysis of pepper.

    PubMed

    Zhang, Zi-Xin; Zhao, Shu-Niu; Liu, Gao-Feng; Huang, Zu-Mei; Cao, Zhen-Mu; Cheng, Shan-Han; Lin, Shi-Sen

    2016-10-19

    The Indian pepper 'Guijiangwang' (Capsicum frutescens L.), one of the world's hottest chili peppers, is rich in capsaicinoids. The accumulation of the alkaloid capsaicin and its analogs in the epidermal cells of the placenta contribute to the pungency of Capsicum fruits. To identify putative genes involved in capsaicin biosynthesis, RNA-Seq was used to analyze the pepper's expression profiles over five developmental stages. Five cDNA libraries were constructed from the total RNA of placental tissue and sequenced using an Illumina HiSeq 2000. More than 19 million clean reads were obtained from each library, and greater than 50% of the reads were assignable to reference genes. Digital gene expression (DGE) profile analysis using Solexa sequencing was performed at five fruit developmental stages and resulted in the identification of 135 genes of known function; their expression patterns were compared to the capsaicin accumulation pattern. Ten genes of known function were identified as most likely to be involved in regulating capsaicin synthesis. Additionally, 20 new candidate genes were identified related to capsaicin synthesis. We use a combination of RNA-Seq and DGE analyses to contribute to the understanding of the biosynthetic regulatory mechanism(s) of secondary metabolites in a nonmodel plant and to identify candidate enzyme-encoding genes.

  7. Discovery of putative capsaicin biosynthetic genes by RNA-Seq and digital gene expression analysis of pepper

    PubMed Central

    Zhang, Zi-Xin; Zhao, Shu-Niu; Liu, Gao-Feng; Huang, Zu-Mei; Cao, Zhen-Mu; Cheng, Shan-Han; Lin, Shi-Sen

    2016-01-01

    The Indian pepper ‘Guijiangwang’ (Capsicum frutescens L.), one of the world’s hottest chili peppers, is rich in capsaicinoids. The accumulation of the alkaloid capsaicin and its analogs in the epidermal cells of the placenta contribute to the pungency of Capsicum fruits. To identify putative genes involved in capsaicin biosynthesis, RNA-Seq was used to analyze the pepper’s expression profiles over five developmental stages. Five cDNA libraries were constructed from the total RNA of placental tissue and sequenced using an Illumina HiSeq 2000. More than 19 million clean reads were obtained from each library, and greater than 50% of the reads were assignable to reference genes. Digital gene expression (DGE) profile analysis using Solexa sequencing was performed at five fruit developmental stages and resulted in the identification of 135 genes of known function; their expression patterns were compared to the capsaicin accumulation pattern. Ten genes of known function were identified as most likely to be involved in regulating capsaicin synthesis. Additionally, 20 new candidate genes were identified related to capsaicin synthesis. We use a combination of RNA-Seq and DGE analyses to contribute to the understanding of the biosynthetic regulatory mechanism(s) of secondary metabolites in a nonmodel plant and to identify candidate enzyme-encoding genes. PMID:27756914

  8. Molecular Profiling of Breast Cancer Cell Lines Defines Relevant Tumor Models and Provides a Resource for Cancer Gene Discovery

    PubMed Central

    Bocanegra, Melanie; Choi, Yoon-La; Girard, Luc; Gandhi, Jeet; Kwei, Kevin A.; Hernandez-Boussard, Tina; Wang, Pei; Gazdar, Adi F.; Minna, John D.; Pollack, Jonathan R.

    2009-01-01

    candidate breast cancer genes. Conclusions: Overall, breast cancer cell lines were genetically more complex than tumors, but retained expression patterns with relevance to the luminal-basal subtype distinction. The compendium of molecular profiles defines cell lines suitable for investigations of subtype-specific pathobiology, cancer stem cell biology, biomarkers and therapies, and provides a resource for discovery of new breast cancer genes. PMID:19582160

  9. Discovery of Unusual Biaryl Polyketides by Activation of a Silent Streptomyces venezuelae Biosynthetic Gene Cluster.

    PubMed

    Thanapipatsiri, Anyarat; Gomez-Escribano, Juan Pablo; Song, Lijiang; Bibb, Maureen J; Al-Bassam, Mahmoud; Chandra, Govind; Thamchaipenet, Arinthip; Challis, Gregory L; Bibb, Mervyn J

    2016-11-17

    Comparative transcriptional profiling of a ΔbldM mutant of Streptomyces venezuelae with its unmodified progenitor revealed that the expression of a cryptic biosynthetic gene cluster containing both type I and type III polyketide synthase genes is activated in the mutant. The 29.5 kb gene cluster, which was predicted to encode an unusual biaryl metabolite, which we named venemycin, and potentially halogenated derivatives, contains 16 genes including one, vemR, that encodes a transcriptional activator of the large ATP-binding LuxR-like (LAL) family. Constitutive expression of vemR in the ΔbldM mutant led to the production of sufficient venemycin for structural characterisation, confirming its unusual biaryl structure. Co-expression of the venemycin biosynthetic gene cluster and vemR in the heterologous host Streptomyces coelicolor also resulted in venemycin production. Although the gene cluster encodes two halogenases and a flavin reductase, constitutive expression of all three genes led to the accumulation only of a monohalogenated venemycin derivative, both in the native producer and the heterologous host. A competition experiment in which equimolar quantities of sodium chloride and sodium bromide were fed to the venemycin-producing strains resulted in the preferential incorporation of bromine, thus suggesting that bromide is the preferred substrate for one or both halogenases.

  10. Discovery of Unusual Biaryl Polyketides by Activation of a Silent Streptomyces venezuelae Biosynthetic Gene Cluster

    PubMed Central

    Thanapipatsiri, Anyarat; Gomez‐Escribano, Juan Pablo; Song, Lijiang; Bibb, Maureen J.; Al‐Bassam, Mahmoud; Chandra, Govind

    2016-01-01

    Comparative transcriptional profiling of a ΔbldM mutant of Streptomyces venezuelae with its unmodified progenitor revealed that the expression of a cryptic biosynthetic gene cluster containing both type I and type III polyketide synthase genes is activated in the mutant. The 29.5 kb gene cluster, which was predicted to encode an unusual biaryl metabolite, which we named venemycin, and potentially halogenated derivatives, contains 16 genes including one, vemR, that encodes a transcriptional activator of the large ATP-binding LuxR-like (LAL) family. Constitutive expression of vemR in the ΔbldM mutant led to the production of sufficient venemycin for structural characterisation, confirming its unusual biaryl structure. Co-expression of the venemycin biosynthetic gene cluster and vemR in the heterologous host Streptomyces coelicolor also resulted in venemycin production. Although the gene cluster encodes two halogenases and a flavin reductase, constitutive expression of all three genes led to the accumulation only of a monohalogenated venemycin derivative, both in the native producer and the heterologous host. A competition experiment in which equimolar quantities of sodium chloride and sodium bromide were fed to the venemycin-producing strains resulted in the preferential incorporation of bromine, thus suggesting that bromide is the preferred substrate for one or both halogenases. PMID:27605017

  11. Gene discovery for the bark beetle-vectored fungal tree pathogen Grosmannia clavigera

    PubMed Central

    2010-01-01

    Background: Grosmannia clavigera is a bark beetle-vectored fungal pathogen of pines that causes wood discoloration and may kill trees by disrupting nutrient and water transport. Trees respond to attacks from beetles and associated fungi by releasing terpenoid and phenolic defense compounds. It is unclear which genes are important for G. clavigera's ability to overcome antifungal pine terpenoids and phenolics. Results: We constructed seven cDNA libraries from eight G. clavigera isolates grown under various culture conditions, and Sanger sequenced the 5' and 3' ends of 25,000 cDNA clones, resulting in 44,288 high quality ESTs. The assembled dataset of unique transcripts (unigenes) consists of 6,265 contigs and 2,459 singletons that mapped to 6,467 locations on the G. clavigera reference genome, representing ~70% of the predicted G. clavigera genes. Although only 54% of the unigenes matched characterized proteins at the NCBI database, this dataset extensively covers major metabolic pathways, cellular processes, and genes necessary for response to environmental stimuli and genetic information processing. Furthermore, we identified genes expressed in spores prior to germination, and genes involved in response to treatment with lodgepole pine phloem extract (LPPE). Conclusions: We provide a comprehensively annotated EST dataset for G. clavigera that represents a rich resource for gene characterization in this and other ophiostomatoid fungi. Genes expressed in response to LPPE treatment are indicative of fungal oxidative stress response. We identified two clusters of potentially functionally related genes responsive to LPPE treatment. Furthermore, we report a simple method for identifying contig misassemblies in de novo assembled EST collections caused by gene overlap on the genome. PMID:20920358

  12. Use of eQTL Analysis for the Discovery of Target Genes Identified by GWAS

    DTIC Science & Technology

    2014-04-01

    of the scatterplot with %GC on the x-axis and log2(GeneCount) on the y-axis. A loess smoother line is shown indicating the general pattern of all the...Gene Count values for this particular subject. Similarly, Figure 55 to 79 shows the loess smoother line for each subject. Based on this plot, it...Counts) for each Subject by RunID 59 Figure 55: Distribution of Percent GC versus log2(Gene Count + 1) with a loess smoother for each subject by

  13. DISCOVERY OF A LOW-MASS COMPANION TO A METAL-RICH F STAR WITH THE MARVELS PILOT PROJECT

    SciTech Connect

    Fleming, Scott W.; Ge Jian; Mahadevan, Suvrath; Lee, Brian; Cuong Nguyen, Duy; Morehead, Robert C.; Wan Xiaoke; Zhao Bo; Liu Jian; Guo Pengcheng; Kane, Stephen R.; Eastman, Jason D.; Siverd, Robert J.; Scott Gaudi, B.; Niedzielski, Andrzej; Sivarani, Thirupathi; Stassun, Keivan G.; Gary, Bruce; Wolszczan, Alex; Barnes, Rory

    2010-08-01

    We report the discovery of a low-mass companion orbiting the metal-rich, main sequence F star TYC 2949-00557-1 during the Multi-object APO Radial Velocity Exoplanet Large-area Survey (MARVELS) pilot project. The host star has an effective temperature T_eff = 6135 ± 40 K, log g = 4.4 ± 0.1, and [Fe/H] = 0.32 ± 0.01, indicating a mass of M = 1.25 ± 0.09 M_sun and a radius of R = 1.15 ± 0.15 R_sun. The companion has an orbital period of 5.69449 ± 0.00023 days and straddles the hydrogen burning limit with a minimum mass of 64 M_J, and thus may be an example of the rare class of brown dwarfs orbiting at distances comparable to those of 'Hot Jupiters'. We present relative photometry that demonstrates that the host star is photometrically stable at the few millimagnitude level on time scales of hours to years, and rules out transits for a companion of radius ≳0.8 R_J at the 95% confidence level. Tidal analysis of the system suggests that the star and companion are likely in a double synchronous state where both rotational and orbital synchronization have been achieved. This is the first low-mass companion detected with a multi-object, dispersed, fixed-delay interferometer.

  14. Discovery and validation of gene classifiers for endocrine-disrupting chemicals in zebrafish (Danio rerio)

    EPA Science Inventory

    Development and application of transcriptomics-based gene classifiers for ecotoxicological applications lag far behind those of human biomedical science. Many such classifiers discovered thus far lack vigorous statistical and experimental validations, with their stability and rel...

  15. G-SESAME: web tools for GO-term-based gene similarity analysis and knowledge discovery

    PubMed Central

    Du, Zhidian; Li, Lin; Chen, Chin-Fu; Yu, Philip S.; Wang, James Z.

    2009-01-01

    We have developed a set of online tools for measuring the semantic similarities of Gene Ontology (GO) terms and the functional similarities of gene products, and for further discovering biomedical knowledge from the GO database. The tools have been used about 6.9 million times by 417 institutions from 43 countries since October 2006. The online tools are available at: http://bioinformatics.clemson.edu/G-SESAME. PMID:19491312

  16. Discovery of diversity in xylan biosynthetic genes by transcriptional profiling of a heteroxylan containing mucilaginous tissue

    PubMed Central

    Jensen, Jacob K.; Johnson, Nathan; Wilkerson, Curtis G.

    2013-01-01

    The exact biochemical steps of xylan backbone synthesis remain elusive. In Arabidopsis, three non-redundant genes from two glycosyltransferase (GT) families, IRX9 and IRX14 from GT43 and IRX10 from GT47, are candidates for forming the xylan backbone. In other plants, evidence exists that different tissues express these three genes at widely different levels, which suggests that diversity in the makeup of the xylan synthase complex exists. Recently we have profiled the transcripts present in the developing mucilaginous tissue of psyllium (Plantago ovata Forsk). This tissue was found to have high expression levels of an IRX10 homolog, but very low levels of the two GT43 family members. This contrasts with recent wheat endosperm tissue profiling that found a relatively high abundance of the GT43 family members. We have performed an in-depth analysis of all GT genes expressed in four developmental stages of the psyllium mucilaginous layer and in a single stage of the psyllium stem using RNA-Seq. This analysis revealed several IRX10 homologs, an expansion in GT61 (homologs of At3g18170/At3g18180), and several GTs from other GT families that are highly abundant and specifically expressed in the mucilaginous tissue. Our current hypothesis is that the four IRX10 genes present in the mucilaginous tissues have evolved to function without the GT43 genes. These four genes represent some of the most divergent IRX10 genes identified to date. Conversely, those present in the psyllium stem are very similar to those in other eudicots. This suggests these genes are under selective pressure, likely due to the synthesis of the various xylan structures present in mucilage, which has a different biochemical role than the xylan present in secondary walls. The numerous GT61 family members also show a wide sequence diversity and may be responsible for the larger number of side chain structures present in the psyllium mucilage. PMID:23761806

  17. Discovery and characterization of two novel salt-tolerance genes in Puccinellia tenuiflora.

    PubMed

    Li, Ying; Takano, Tetsuo; Liu, Shenkui

    2014-09-18

    Puccinellia tenuiflora is a monocotyledonous halophyte that is able to survive in extreme saline soil environments at an alkaline pH range of 9-10. In this study, we transformed full-length cDNAs of P. tenuiflora into Saccharomyces cerevisiae by using the full-length cDNA over-expressing gene-hunting system to identify novel salt-tolerance genes. In all, 32 yeast clones overexpressing P. tenuiflora cDNA were obtained by screening under NaCl stress conditions; of these, 31 clones showed stronger tolerance to NaCl and were amplified using polymerase chain reaction (PCR) and sequenced. Four novel genes encoding proteins with unknown function were identified; these genes had no homology with genes from higher plants. Of the four isolated genes, two that encoded proteins with two transmembrane domains showed the strongest resistance to 1.3 M NaCl. RT-PCR and northern blot analysis of P. tenuiflora cultured cells confirmed the endogenous NaCl-induced expression of the two proteins. Both of the proteins conferred better tolerance in yeasts to high salt, alkaline and osmotic conditions, some heavy metals and H2O2 stress. Thus, we inferred that the two novel proteins might alleviate oxidative and other stresses in P. tenuiflora.

  18. Genome-wide discovery of Pax7 target genes during development.

    PubMed

    White, Robert B; Ziman, Melanie R

    2008-03-14

    Pax7 plays critical roles in development of brain, spinal cord, neural crest, and skeletal muscle. As a sequence-specific DNA-binding transcription factor, any direct functional role played by Pax7 during development is mediated through target gene selection. Thus, we have sought to identify genes targeted by Pax7 during embryonic development using an unbiased chromatin immunoprecipitation (ChIP) cloning assay to isolate cis-regulatory regions bound by Pax7 in vivo. Sequencing and genomic localization of a library of chromatin-DNA fragments bound by Pax7 has identified 34 candidate Pax7 target genes, with occupancy of a selection confirmed with independent chromatin enrichment tests (ChIP-PCR). To assess the capacity of Pax7 to regulate transcription from these loci, we have cloned alternate transcripts of Pax7 (differing significantly in their DNA binding domain) into expression vectors and transfected cultured cells with these constructs, then analyzed target gene expression levels using RT-PCR. We show that Pax7 directly occupies sites within genes encoding transcription factors Gbx1 and Eya4, the neurogenic cytokine receptor ciliary neurotrophic factor receptor, the neuronal potassium channel Kcnk2, and the signal transduction kinase Camk1d in vivo and regulates the transcriptional state of these genes in cultured cells. This analysis gives us greater insight into the direct functional role played by Pax7 during embryonic development.

  19. Use of model organism and disease databases to support matchmaking for human disease gene discovery.

    PubMed

    Mungall, Christopher J; Washington, Nicole L; Nguyen-Xuan, Jeremy; Condit, Christopher; Smedley, Damian; Köhler, Sebastian; Groza, Tudor; Shefchek, Kent; Hochheiser, Harry; Robinson, Peter N; Lewis, Suzanna E; Haendel, Melissa A

    2015-10-01

    The Matchmaker Exchange application programming interface (API) allows searching a patient's genotypic or phenotypic profiles across clinical sites, for the purposes of cohort discovery and variant disease causal validation. This API can be used not only to search for matching patients, but also to match against public disease and model organism data. These public disease data enable matching known diseases and variant-phenotype associations using phenotype semantic similarity algorithms developed by the Monarch Initiative. The model data can provide additional evidence to aid diagnosis, suggest relevant models for disease mechanism and treatment exploration, and identify collaborators across the translational divide. The Monarch Initiative provides an implementation of this API for searching multiple integrated sources of data that contextualize the knowledge about any given patient or patient family into the greater biomedical knowledge landscape. While this corpus of data can aid diagnosis, it is also the beginning of research to improve understanding of rare human diseases.
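
    The idea of matching a patient's phenotypic profile against disease or model organism profiles can be illustrated with a deliberately simple set-overlap score. The sketch below is not the Monarch Initiative's semantic similarity algorithm, which the abstract does not specify; it substitutes plain Jaccard similarity over sets of phenotype terms. The function names, profile names, and term identifiers are all hypothetical placeholders.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets of phenotype term IDs (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_matches(patient_terms, candidates):
    """Return (name, score) pairs sorted from best to worst match."""
    scored = [(name, jaccard(patient_terms, terms)) for name, terms in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical profiles; the term IDs are placeholders, not real annotations.
patient = {"TERM:1", "TERM:2", "TERM:3"}
candidates = {
    "disease_A": {"TERM:1", "TERM:2", "TERM:3", "TERM:4"},
    "mouse_model_B": {"TERM:3", "TERM:5"},
    "disease_C": {"TERM:6"},
}
ranking = rank_matches(patient, candidates)
```

    A set-overlap score ignores the ontology hierarchy, so two closely related but non-identical terms contribute nothing; a semantic similarity measure would presumably give such pairs partial credit.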

  20. Characterization of Greater Middle Eastern genetic variation for enhanced disease gene discovery

    PubMed Central

    Scott, Eric M.; Halees, Anason; Itan, Yuval; Spencer, Emily G.; He, Yupeng; Azab, Mostafa Abdellateef; Gabriel, Stacey B.; Belkadi, Aziz; Boisson, Bertrand; Abel, Laurent; Clark, Andrew G.; Alkuraya, Fowzan S.; Casanova, Jean-Laurent; Gleeson, Joseph G.

    2016-01-01

    The Greater Middle East (GME) has been a central hub of human migration and population admixture. The tradition of consanguinity, variably practiced in the Gulf region, North Africa, and Central Asia [1–3], has resulted in an elevated burden of recessive disease [4]. Here we generated a whole exome GME variome from 1,111 unrelated subjects. We detected substantial diversity from sub-geographies, continental and subregional admixture, and several ancient founder populations with little evidence of bottlenecks. Measured consanguinity was an order of magnitude above that of other sampled populations, and included an increased burden of runs of homozygosity (ROH), but no evidence for reduced burden of deleterious variation due to classically theorized ‘genetic purging’. Applying this database to unsolved GME recessive conditions reduced the number of potential disease-causing variants by 4–7-fold. These results reveal the variegated GME genetic architecture and support future human genetic discoveries in Mendelian and population genetics. PMID:27428751

  1. Discovery of Antibiotics-derived Polymers for Gene Delivery using Combinatorial Synthesis and Cheminformatics Modeling

    PubMed Central

    Potta, Thrimoorthy; Zhen, Zhuo; Grandhi, Taraka Sai Pavan; Christensen, Matthew D.; Ramos, James; Breneman, Curt M.; Rege, Kaushal

    2014-01-01

    We describe the combinatorial synthesis and cheminformatics modeling of aminoglycoside antibiotics-derived polymers for transgene delivery and expression. Fifty-six polymers were synthesized by polymerizing aminoglycosides with diglycidyl ether cross-linkers. Parallel screening identified several lead polymers that produced high transgene expression levels in cells. The role of polymer physicochemical properties in determining efficacy of transgene expression was investigated using Quantitative Structure-Activity Relationship (QSAR) cheminformatics models based on Support Vector Regression (SVR) and ‘building block’ polymer structures. The QSAR model exhibited high predictive ability, and investigation of descriptors in the model, using molecular visualization and correlation plots, indicated that physicochemical attributes related to both aminoglycosides and diglycidyl ethers facilitated transgene expression. This work synergistically combines combinatorial synthesis and parallel screening with cheminformatics-based QSAR models for discovery and physicochemical elucidation of effective antibiotics-derived polymers for transgene delivery in medicine and biotechnology. PMID:24331709

  2. Natural and man-made V-gene repertoires for antibody discovery

    PubMed Central

    Finlay, William J. J.; Almagro, Juan C.

    2012-01-01

    Antibodies are the fastest-growing segment of the biologics market. The success of antibody-based drugs resides in their exquisite specificity, high potency, stability, solubility, safety, and relatively inexpensive manufacturing process in comparison with other biologics. We outline here the structural studies and fundamental principles that define how antibodies interact with diverse targets. We also describe the antibody repertoires and affinity maturation mechanisms of humans, mice, and chickens, plus the use of novel single-domain antibodies in camelids and sharks. These species all utilize diverse evolutionary solutions to generate specific and high affinity antibodies and illustrate the plasticity of natural antibody repertoires. In addition, we discuss the multiple variations of man-made antibody repertoires designed and validated in the last two decades, which have served as tools to explore how the size, diversity, and composition of a repertoire impact the antibody discovery process. PMID:23162556

  3. Discovery of antibiotics-derived polymers for gene delivery using combinatorial synthesis and cheminformatics modeling.

    PubMed

    Potta, Thrimoorthy; Zhen, Zhuo; Grandhi, Taraka Sai Pavan; Christensen, Matthew D; Ramos, James; Breneman, Curt M; Rege, Kaushal

    2014-02-01

    We describe the combinatorial synthesis and cheminformatics modeling of aminoglycoside antibiotics-derived polymers for transgene delivery and expression. Fifty-six polymers were synthesized by polymerizing aminoglycosides with diglycidyl ether cross-linkers. Parallel screening identified several lead polymers that produced high transgene expression levels in cells. The role of polymer physicochemical properties in determining efficacy of transgene expression was investigated using Quantitative Structure-Activity Relationship (QSAR) cheminformatics models based on Support Vector Regression (SVR) and 'building block' polymer structures. The QSAR model exhibited high predictive ability, and investigation of descriptors in the model, using molecular visualization and correlation plots, indicated that physicochemical attributes related to both aminoglycosides and diglycidyl ethers facilitated transgene expression. This work synergistically combines combinatorial synthesis and parallel screening with cheminformatics-based QSAR models for discovery and physicochemical elucidation of effective antibiotics-derived polymers for transgene delivery in medicine and biotechnology.

  4. DisGeNET: a discovery platform for the dynamical exploration of human diseases and their genes

    PubMed Central

    Piñero, Janet; Queralt-Rosinach, Núria; Bravo, Àlex; Deu-Pons, Jordi; Bauer-Mehren, Anna; Baron, Martin; Sanz, Ferran; Furlong, Laura I.

    2015-01-01

    DisGeNET is a comprehensive discovery platform designed to address a variety of questions concerning the genetic underpinning of human diseases. DisGeNET contains over 380 000 associations between >16 000 genes and 13 000 diseases, which makes it one of the largest repositories currently available of its kind. DisGeNET integrates expert-curated databases with text-mined data, covers information on Mendelian and complex diseases, and includes data from animal disease models. It features a score based on the supporting evidence to prioritize gene-disease associations. It is an open access resource available through a web interface, a Cytoscape plugin and as a Semantic Web resource. The web interface supports user-friendly data exploration and navigation. DisGeNET data can also be analysed via the DisGeNET Cytoscape plugin, and enriched with the annotations of other plugins of this popular network analysis software suite. Finally, the information contained in DisGeNET can be expanded and complemented using Semantic Web technologies and linked to a variety of resources already present in the Linked Data cloud. Hence, DisGeNET offers one of the most comprehensive collections of human gene-disease associations and a valuable set of tools for investigating the molecular mechanisms underlying diseases of genetic origin, designed to fulfill the needs of different user profiles, including bioinformaticians, biologists and health-care practitioners. Database URL: http://www.disgenet.org/ PMID:25877637

  5. DisGeNET: a discovery platform for the dynamical exploration of human diseases and their genes.

    PubMed

    Piñero, Janet; Queralt-Rosinach, Núria; Bravo, Àlex; Deu-Pons, Jordi; Bauer-Mehren, Anna; Baron, Martin; Sanz, Ferran; Furlong, Laura I

    2015-01-01

    DisGeNET is a comprehensive discovery platform designed to address a variety of questions concerning the genetic underpinning of human diseases. DisGeNET contains over 380,000 associations between >16,000 genes and 13,000 diseases, which makes it one of the largest repositories currently available of its kind. DisGeNET integrates expert-curated databases with text-mined data, covers information on Mendelian and complex diseases, and includes data from animal disease models. It features a score based on the supporting evidence to prioritize gene-disease associations. It is an open access resource available through a web interface, a Cytoscape plugin and as a Semantic Web resource. The web interface supports user-friendly data exploration and navigation. DisGeNET data can also be analysed via the DisGeNET Cytoscape plugin, and enriched with the annotations of other plugins of this popular network analysis software suite. Finally, the information contained in DisGeNET can be expanded and complemented using Semantic Web technologies and linked to a variety of resources already present in the Linked Data cloud. Hence, DisGeNET offers one of the most comprehensive collections of human gene-disease associations and a valuable set of tools for investigating the molecular mechanisms underlying diseases of genetic origin, designed to fulfill the needs of different user profiles, including bioinformaticians, biologists and health-care practitioners. Database URL: http://www.disgenet.org/

  6. De novo assembly of the common bean transcriptome using short reads for the discovery of drought-responsive genes.

    PubMed

    Wu, Jing; Wang, Lanfen; Li, Long; Wang, Shumin

    2014-01-01

    The common bean (Phaseolus vulgaris L.) is one of the most important food legumes, far ahead of other legumes. The average grain yield of the common bean worldwide is much lower than its potential yield, primarily due to drought in the field. However, the gene network that mediates plant responses to drought stress remains largely unknown in this species. The major goal of our study was to identify, on a large scale, genes involved in drought stress using RNA-seq. First, we assembled 270 million high-quality trimmed reads into a non-redundant set of 62,828 unigenes, representing approximately 49 Mb of unique transcriptome sequences. Of these unigenes, 26,501 (42.2%) had significant similarity with unigenes/predicted proteins from other legumes or sequenced plants. All unigenes were functionally annotated within the GO, COG and KEGG pathways. The strategy for de novo assembly of transcriptome data generated here will be useful in other legume plant transcriptome studies. Second, we identified 10,482 SSRs and 4,099 SNPs in transcripts. The large number of genetic markers provides a resource for gene discovery and development of functional molecular markers. Finally, we found differentially expressed genes (DEGs) between terminal drought and optimal irrigation treatments and between the two different genotypes Long 22-0579 (drought tolerant) and Naihua (drought sensitive). DEGs were confirmed by quantitative real-time PCR assays, which indicated that these genes are functionally associated with the drought-stress response. These resources will be helpful for basic and applied research for genome analysis and crop drought resistance improvement in the common bean.
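
    The assembly figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that "Mb" means 10^6 bases; the mean unigene length it derives is only an estimate from the rounded totals and is not a value reported in the abstract. The variable names are illustrative.

```python
total_unigenes = 62_828
annotated_unigenes = 26_501
transcriptome_bases = 49_000_000  # "approximately 49 Mb", assuming 1 Mb = 10**6 bases

# Share of unigenes with significant similarity to other legumes or sequenced plants.
annotated_pct = round(100 * annotated_unigenes / total_unigenes, 1)

# Rough mean unigene length implied by the rounded totals (an estimate only).
mean_length_bp = round(transcriptome_bases / total_unigenes)

print(f"annotated: {annotated_pct}%  mean unigene length: ~{mean_length_bp} bp")
```

    The computed share agrees with the 42.2% stated in the abstract.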

  7. Transcriptome Analysis and Discovery of Genes Involved in Immune Pathways from Hepatopancreas of Microbial Challenged Mitten Crab Eriocheir sinensis

    PubMed Central

    Li, Xihong; Cui, Zhaoxia; Liu, Yuan; Song, Chengwen; Shi, Guohui

    2013-01-01

    Background: The Chinese mitten crab Eriocheir sinensis is an economically important crustacean that has been seriously affected by various diseases, which calls for more information on immune-relevant genes at the genome level. Recently, high-throughput RNA sequencing (RNA-seq) technology has provided a powerful and efficient method for transcript analysis and immune gene discovery. Methods/Principal Findings: A cDNA library from hepatopancreas of E. sinensis challenged by a mixture of three pathogen strains (Gram-positive bacteria Micrococcus luteus, Gram-negative bacteria Vibrio alginolyticus and fungi Pichia pastoris; 10^8 cfu·mL^-1) was constructed and randomly sequenced using Illumina technology. In total, 39.76 million clean reads were assembled into 70,300 unigenes. After ruling out short-length and low-quality sequences, 52,074 non-redundant unigenes were compared to public databases for homology searching, and 17,617 of them showed high similarity to sequences in the NCBI non-redundant protein (Nr) database. For function classification and pathway assignment, 18,734 (36.00%) unigenes were categorized into three Gene Ontology (GO) categories, 12,243 (23.51%) were classified into 25 Clusters of Orthologous Groups (COG), and 8,983 (17.25%) were assigned to six Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Potentially, 24, 14, 47 and 132 unigenes were characterized as being involved in the Toll, IMD, JAK-STAT and MAPK pathways, respectively. Conclusions/Significance: This is the first systematic transcriptome analysis of components relating to innate immune pathways in E. sinensis. Functional genes and putative pathways identified here will contribute to a better understanding of the immune system and to the prevention of various diseases in crab. PMID:23874555

  8. Discovery in silico and characterization in vitro of novel genes exclusively expressed in the mouse epididymis.

    PubMed

    Penttinen, Jenni; Pujianto, Dwi Ari; Sipila, Petra; Huhtaniemi, Ilpo; Poutanen, Matti

    2003-11-01

    Epididymal proteins interact with sperm during their passage through the epididymis and thus contribute to the maturation and fertilizing capacity of the spermatozoa. In the present study we have discovered five novel epididymis-specific genes through in silico analysis of expressed sequence tags (ESTs) at the UniGene library collection. The strategy used is a powerful way to discover novel epididymis-specific genes. The full-length cDNA sequences were determined, and computational tools were used to characterize the genomic structures and to predict putative functions for the encoded proteins. In vitro analyses revealed that all five genes characterized were highly expressed in the defined areas of the epididymis, and they were not expressed at significant levels in any other tissue. Three of the genes were named on the basis of their putative functions: Spint4 (serine protease inhibitor, Kunitz type 4), and Rnase9 and Rnase10 (ribonuclease, Rnase A family 9 and 10), while for the ESTs AV381130 and AV381126 no putative functions could be predicted. The expression of Spint4, Rnase9, and AV381130 was found to be under a direct or indirect regulation by androgens, while the expression of Rnase10 is regulated by a testicular factor(s) other than androgen. None of the genes were expressed in the immature epididymis, while mRNAs were detected from d 17 onward, at the time of maturation of epididymal epithelium. However, the expression of AV381130 was not detected until d 30 after birth, indicating a close connection between gene expression and puberty.

  9. Drosophila and Caenorhabditis elegans as Discovery Platforms for Genes Involved in Human Alcohol Use Disorder

    PubMed Central

    Grotewiel, Mike; Bettinger, Jill C.

    2015-01-01

    Background: Despite the profound clinical significance and strong heritability of alcohol use disorder (AUD), we do not yet have a comprehensive understanding of the naturally occurring genetic variance within the human genome that drives its development. This lack of understanding is likely to be due in part to the large phenotypic and genetic heterogeneities that underlie human AUD. As a complement to genetic studies in humans, many laboratories are using the invertebrate model organisms (iMOs) Drosophila melanogaster (fruit fly) and Caenorhabditis elegans (nematode worm) to identify genetic mechanisms that influence the effects of alcohol (ethanol) on behavior. While these extremely powerful models have identified many genes that influence the behavioral responses to alcohol, in most cases it has remained unclear whether results from behavioral–genetic studies in iMOs are directly applicable to understanding the genetic basis of human AUD. Methods: In this review, we critically evaluate the utility of the fly and worm models for identifying genes that influence AUD in humans. Results: Based on results published through early 2015, studies in flies and worms have identified 91 and 50 genes, respectively, that influence 1 or more aspects of behavioral responses to alcohol. Collectively, these fly and worm genes correspond to 293 orthologous genes in humans. Intriguingly, 51 of these 293 human genes have been implicated in AUD by at least 1 study in human populations. Conclusions: Our analyses strongly suggest that the Drosophila and C. elegans models have considerable utility for identifying orthologs of genes that influence human AUD. PMID:26173477

  10. Phylogenomic Analysis of Natural Products Biosynthetic Gene Clusters Allows Discovery of Arseno-Organic Metabolites in Model Streptomycetes

    PubMed Central

    Cruz-Morales, Pablo; Kopp, Johannes Florian; Martínez-Guerrero, Christian; Yáñez-Guerra, Luis Alfonso; Selem-Mojica, Nelly; Ramos-Aboites, Hilda; Feldmann, Jörg; Barona-Gómez, Francisco

    2016-01-01

    Natural products from microbes have provided humans with beneficial antibiotics for millennia. However, a decline in the pace of antibiotic discovery exerts pressure on human health as antibiotic resistance spreads, a challenge that may be better faced by unveiling chemical diversity produced by microbes. Current microbial genome mining approaches have revitalized research into antibiotics, but the empirical nature of these methods limits the chemical space that is explored. Here, we address the problem of finding novel pathways by incorporating evolutionary principles into genome mining. We recapitulated the evolutionary history of twenty-three enzyme families previously uninvestigated in the context of natural product biosynthesis in Actinobacteria, the most proficient producers of natural products. Our genome evolutionary analyses were based on the assumption that expanded and repurposed enzyme families from central metabolism occur frequently and thus have the potential to catalyze new conversions in the context of natural products biosynthesis. Our analyses led to the discovery of biosynthetic gene clusters coding for hidden chemical diversity, as validated by comparing our predictions with those from state-of-the-art genome mining tools, as well as by experimentally demonstrating the existence of a biosynthetic pathway for arseno-organic metabolites in Streptomyces coelicolor and Streptomyces lividans using a combined gene knockout and metabolite profiling strategy. As our approach does not rely solely on sequence similarity searches of previously identified biosynthetic enzymes, these results establish the basis for the development of an evolutionary-driven genome mining tool termed EvoMining that complements current platforms. We anticipate that by doing so real ‘chemical dark matter’ will be unveiled. PMID:27289100

  11. Phylogenomic Analysis of Natural Products Biosynthetic Gene Clusters Allows Discovery of Arseno-Organic Metabolites in Model Streptomycetes.

    PubMed

    Cruz-Morales, Pablo; Kopp, Johannes Florian; Martínez-Guerrero, Christian; Yáñez-Guerra, Luis Alfonso; Selem-Mojica, Nelly; Ramos-Aboites, Hilda; Feldmann, Jörg; Barona-Gómez, Francisco

    2016-07-02

    Natural products from microbes have provided humans with beneficial antibiotics for millennia. However, a decline in the pace of antibiotic discovery exerts pressure on human health as antibiotic resistance spreads, a challenge that may be better faced by unveiling chemical diversity produced by microbes. Current microbial genome mining approaches have revitalized research into antibiotics, but the empirical nature of these methods limits the chemical space that is explored. Here, we address the problem of finding novel pathways by incorporating evolutionary principles into genome mining. We recapitulated the evolutionary history of twenty-three enzyme families previously uninvestigated in the context of natural product biosynthesis in Actinobacteria, the most proficient producers of natural products. Our genome evolutionary analyses were based on the assumption that expanded and repurposed enzyme families from central metabolism occur frequently and thus have the potential to catalyze new conversions in the context of natural products biosynthesis. Our analyses led to the discovery of biosynthetic gene clusters coding for hidden chemical diversity, as validated by comparing our predictions with those from state-of-the-art genome mining tools, as well as by experimentally demonstrating the existence of a biosynthetic pathway for arseno-organic metabolites in Streptomyces coelicolor and Streptomyces lividans using a combined gene knockout and metabolite profiling strategy. As our approach does not rely solely on sequence similarity searches of previously identified biosynthetic enzymes, these results establish the basis for the development of an evolutionary-driven genome mining tool termed EvoMining that complements current platforms. We anticipate that by doing so real 'chemical dark matter' will be unveiled.

  12. Diversity of ribulose-1,5-bisphosphate carboxylase/oxygenase large-subunit genes in the MgCl2-dominated deep hypersaline anoxic basin discovery.

    PubMed

    van der Wielen, Paul W J J

    2006-06-01

    Partial sequences of the form I (cbbL) and form II (cbbM) ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO) large subunit genes were obtained from the brine and interface of the MgCl2-dominated deep hypersaline anoxic basin Discovery. The cbbL and cbbM genes were found in both the brine and the interface of the Discovery Basin but were absent in the overlying seawater. The diversity of both genes in the brine and interface was low, which might be caused by the extreme saline conditions in Discovery of approximately 5 M MgCl2. None of the retrieved sequences were closely related to sequences deposited in the GenBank database. A phylogenetic analysis demonstrated that the cbbL sequences were affiliated with a Thiobacillus sp. or with one of the RuBisCO genes from Hydrogenovibrio marinus. The cbbM sequences clustered with thiobacilli or formed a new group with no close relatives. The results imply that bacteria with the potential for carbon dioxide fixation and chemoautotrophy are present in the Discovery Basin. This is the first report demonstrating that RuBisCO genes are present under hypersaline conditions of 5 M MgCl2.

  13. Essential Gene Discovery in the Basidiomycete Cryptococcus neoformans for Antifungal Drug Target Prioritization

    PubMed Central

    Ianiri, Giuseppe

    2015-01-01

    Fungal diseases represent a major burden to health care globally. As with other pathogenic microbes, there is a limited number of agents suitable for use in treating fungal diseases, and resistance to these agents can develop rapidly. Cryptococcus neoformans is a basidiomycete fungus that causes cryptococcosis worldwide in both immunocompromised and healthy individuals. As a basidiomycete, it diverged from other common pathogenic or model ascomycete fungi more than 500 million years ago. Here, we report C. neoformans genes that are essential for viability as identified through forward and reverse genetic approaches, using an engineered diploid strain and genetic segregation after meiosis. The forward genetic approach involved generating random insertional mutants in the diploid strain, inducing meiosis and sporulation, and selecting for haploid cells with counterselection of the insertion event. More than 2,500 mutants were analyzed, and transfer DNA (T-DNA) insertions in several genes required for viability were identified. The genes include those encoding the thioredoxin reductase (Trr1), a ribosome assembly factor (Rsa4), an mRNA-capping component (Cet1), and others. For targeted gene replacement, the C. neoformans homologs of 35 genes required for viability in ascomycete fungi were disrupted, meiosis and sporulation were induced, and haploid progeny were evaluated for their ability to grow on selective media. Twenty-one (60%) were found to be required for viability in C. neoformans. These genes are involved in mitochondrial translation, ergosterol biosynthesis, and RNA-related functions. The heterozygous diploid mutants were evaluated for haploinsufficiency on a number of perturbing agents and drugs, revealing phenotypes due to the loss of one copy of an essential gene in C. neoformans. This study expands the knowledge of the essential genes in fungi using a basidiomycete as a model organism. Genes that have no mammalian homologs and are essential

  14. Genomic Approaches For the Discovery of Genes Mutated in Inherited Retinal Degeneration

    PubMed Central

    Siemiatkowska, Anna M.; Collin, Rob W.J.; den Hollander, Anneke I.; Cremers, Frans P.M.

    2014-01-01

    In view of their high degree of genetic heterogeneity, inherited retinal diseases (IRDs) pose a significant challenge for identifying novel genetic causes. Thus far, more than 200 genes have been found to be mutated in IRDs, which together contain causal variants in >80% of the cases. Accurate genetic diagnostics is particularly important for isolated cases, in which X-linked and de novo autosomal dominant variants are not uncommon. In addition, new gene- or mutation-specific therapies are emerging, underlining the importance of identifying causative mutations in each individual. Sanger sequencing of selected genes followed by cost-effective targeted next-generation sequencing (NGS) can identify defects in known IRD-associated genes in the majority of the cases. Exome NGS in combination with genetic linkage or homozygosity mapping studies can aid the identification of the remaining causal genes. As these are thought to be mutated in <1% of the cases, validation through functional modeling in, for example, zebrafish and/or replication through the genotyping of large patient cohorts is required. In the near future, whole genome NGS in combination with transcriptome NGS may reveal mutations that are currently hidden in the noncoding regions of the human genome. PMID:24939053

  15. Discovery of Molecular Mechanisms of Traditional Chinese Medicinal Formula Si-Wu-Tang Using Gene Expression Microarray and Connectivity Map

    PubMed Central

    Wen, Zhining; Wang, Zhijun; Wang, Steven; Ravula, Ranadheer; Yang, Lun; Xu, Jun; Wang, Charles; Zuo, Zhong; Chow, Moses S. S.; Shi, Leming; Huang, Ying

    2011-01-01

    To pursue a systematic approach to discovery of mechanisms of action of traditional Chinese medicine (TCM), we used microarrays, bioinformatics and the “Connectivity Map” (CMAP) to examine TCM-induced changes in gene expression. We demonstrated that this approach can be used to elucidate new molecular targets using a model TCM herbal formula Si-Wu-Tang (SWT) which is widely used for women's health. The human breast cancer MCF-7 cells treated with 0.1 µM estradiol or 2.56 mg/ml of SWT showed dramatic gene expression changes, while no significant change was detected for ferulic acid, a known bioactive compound of SWT. Pathway analysis using differentially expressed genes related to the treatment effect identified that expression of genes in the nuclear factor erythroid 2-related factor 2 (Nrf2) cytoprotective pathway was most significantly affected by SWT, but not by estradiol or ferulic acid. The Nrf2-regulated genes HMOX1, GCLC, GCLM, SLC7A11 and NQO1 were upregulated by SWT in a dose-dependent manner, which was validated by real-time RT-PCR. Consistently, treatment with SWT and its four herbal ingredients resulted in an increased antioxidant response element (ARE)-luciferase reporter activity in MCF-7 and HEK293 cells. Furthermore, the gene expression profile of differentially expressed genes related to SWT treatment was compared with those of 1,309 compounds in the CMAP database. The CMAP profiles of estradiol-treated MCF-7 cells showed an excellent match with SWT treatment, consistent with SWT's widely claimed use for women's diseases and indicating a phytoestrogenic effect. The CMAP profiles of chemopreventive agents withaferin A and resveratrol also showed high similarity to the profiles of SWT. This study identified SWT as an Nrf2 activator and phytoestrogen, suggesting its use as a nontoxic chemopreventive agent, and demonstrated the feasibility of combining microarray gene expression profiling with CMAP mining to discover mechanisms of actions

  16. Gene discovery and functional assessment of rare copy-number variants in neurodevelopmental disorders.

    PubMed

    Iyer, Janani; Girirajan, Santhosh

    2015-09-01

    Rare copy-number variants (CNVs) are a significant cause of neurodevelopmental disorders. The sequence architecture of the human genome predisposes certain individuals to deletions and duplications within specific genomic regions. While assessment of individuals with different breakpoints has identified causal genes for certain rare CNVs, deriving gene-phenotype correlations for rare CNVs with similar breakpoints has been challenging. We present a comprehensive review of the literature related to genetic architecture that is predisposed to recurrent rearrangements, and functional evaluation of deletions, duplications and candidate genes within rare CNV intervals using mouse, zebrafish and fruit fly models. It is clear that phenotypic assessment and complete genetic evaluation of large cohorts of individuals carrying specific CNVs and functional evaluation using multiple animal models are necessary to understand the molecular genetic basis of neurodevelopmental disorders.

  17. Classification, subtype discovery, and prediction of outcome in pediatric acute lymphoblastic leukemia by gene expression profiling.

    PubMed

    Yeoh, Eng-Juh; Ross, Mary E; Shurtleff, Sheila A; Williams, W Kent; Patel, Divyen; Mahfouz, Rami; Behm, Fred G; Raimondi, Susana C; Relling, Mary V; Patel, Anami; Cheng, Cheng; Campana, Dario; Wilkins, Dawn; Zhou, Xiaodong; Li, Jinyan; Liu, Huiqing; Pui, Ching-Hon; Evans, William E; Naeve, Clayton; Wong, Limsoon; Downing, James R

    2002-03-01

    Treatment of pediatric acute lymphoblastic leukemia (ALL) is based on the concept of tailoring the intensity of therapy to a patient's risk of relapse. To determine whether gene expression profiling could enhance risk assignment, we used oligonucleotide microarrays to analyze the pattern of genes expressed in leukemic blasts from 360 pediatric ALL patients. Distinct expression profiles identified each of the prognostically important leukemia subtypes, including T-ALL, E2A-PBX1, BCR-ABL, TEL-AML1, MLL rearrangement, and hyperdiploid >50 chromosomes. In addition, another ALL subgroup was identified based on its unique expression profile. Examination of the genes comprising the expression signatures provided important insights into the biology of these leukemia subgroups. Further, within some genetic subgroups, expression profiles identified those patients that would eventually fail therapy. Thus, the single platform of expression profiling should enhance the accurate risk stratification of pediatric ALL patients.

  18. Prior knowledge driven Granger causality analysis on gene regulatory network discovery

    SciTech Connect

    Yao, Shun; Yoo, Shinjae; Yu, Dantong

    2015-08-28

    Our study focuses on discovering gene regulatory networks from time series gene expression data using the Granger causality (GC) model. However, the number of available time points (T) usually is much smaller than the number of target genes (n) in biological datasets. The widely applied pairwise GC model (PGC) and other regularization strategies can lead to a significant number of false identifications when n>>T. In this study, we proposed a new method, viz., CGC-2SPR (CGC using two-step prior Ridge regularization) to resolve the problem by incorporating prior biological knowledge about a target gene data set. In our simulation experiments, the proposed new methodology CGC-2SPR showed significant performance improvement in terms of accuracy over other widely used GC modeling (PGC, Ridge and Lasso) and MI-based (MRNET and ARACNE) methods. In addition, we applied CGC-2SPR to a real biological dataset, i.e., the yeast metabolic cycle, and discovered more true positive edges with CGC-2SPR than with the other existing methods. In our research, we noticed a “1+1>2” effect when we combined prior knowledge and gene expression data to discover regulatory networks. Based on causality networks, we made a functional prediction that the Abm1 gene (its functions previously were unknown) might be related to the yeast’s responses to different levels of glucose. In conclusion, our research improves causality modeling by combining heterogeneous knowledge, which is well aligned with the future direction in systems biology. Furthermore, we proposed a method of Monte Carlo significance estimation (MCSE) to calculate the edge significances, which provide statistical meaning to the discovered causality networks. All of our data and source code will be available under the link https://bitbucket.org/dtyu/granger-causality/wiki/Home.
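
    The idea behind a ridge-regularized Granger causality test can be illustrated with a minimal sketch. This is not the authors' CGC-2SPR method: it is a simplified pairwise, lag-1 test with a plain ridge penalty and no prior knowledge, and the function names, the penalty value `lam`, and the toy series are all assumptions made for illustration. Both series are assumed to be mean-centred, so no intercept is fitted.

```python
import math
import random


def ridge_fit_2(u, v, y, lam):
    """Closed-form ridge fit of y ~ a*u + b*v (no intercept)."""
    suu = sum(p * p for p in u) + lam
    svv = sum(q * q for q in v) + lam
    suv = sum(p * q for p, q in zip(u, v))
    suy = sum(p * t for p, t in zip(u, y))
    svy = sum(q * t for q, t in zip(v, y))
    det = suu * svv - suv * suv  # positive whenever lam > 0
    a = (suy * svv - suv * svy) / det
    b = (suu * svy - suv * suy) / det
    return a, b


def granger_score(x, y, lam=1.0):
    """Log ratio of residual sums of squares for 'x Granger-causes y' at lag 1.

    A large positive value means that adding the past of x improves the
    prediction of y beyond what the past of y alone achieves.
    """
    target, own_past, other_past = y[1:], y[:-1], x[:-1]

    # Restricted model: y_t ~ a * y_{t-1}
    a_r = sum(p * t for p, t in zip(own_past, target)) / (
        sum(p * p for p in own_past) + lam
    )
    rss_r = sum((t - a_r * p) ** 2 for p, t in zip(own_past, target))

    # Full model: y_t ~ a * y_{t-1} + b * x_{t-1}
    a_f, b_f = ridge_fit_2(own_past, other_past, target, lam)
    rss_f = sum(
        (t - a_f * p - b_f * q) ** 2
        for p, q, t in zip(own_past, other_past, target)
    )
    return math.log(rss_r / rss_f)


# Hypothetical toy data: x drives y with a one-step delay plus small noise.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, 200)]

print("x -> y:", granger_score(x, y))
print("y -> x:", granger_score(y, x))
```

    In this toy setting the score for x -> y should be clearly larger than the score for y -> x. With many genes and few time points, such pairwise scores become unreliable, which is the situation the abstract says motivates the use of prior knowledge.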

  19. Prior knowledge driven Granger causality analysis on gene regulatory network discovery

    DOE PAGES

    Yao, Shun; Yoo, Shinjae; Yu, Dantong

    2015-08-28

    Our study focuses on discovering gene regulatory networks from time series gene expression data using the Granger causality (GC) model. However, the number of available time points (T) usually is much smaller than the number of target genes (n) in biological datasets. The widely applied pairwise GC model (PGC) and other regularization strategies can lead to a significant number of false identifications when n>>T. In this study, we proposed a new method, viz., CGC-2SPR (CGC using two-step prior Ridge regularization) to resolve the problem by incorporating prior biological knowledge about a target gene data set. In our simulation experiments, the proposed new methodology CGC-2SPR showed significant performance improvement in terms of accuracy over other widely used GC modeling (PGC, Ridge and Lasso) and MI-based (MRNET and ARACNE) methods. In addition, we applied CGC-2SPR to a real biological dataset, i.e., the yeast metabolic cycle, and discovered more true positive edges with CGC-2SPR than with the other existing methods. In our research, we noticed a “1+1>2” effect when we combined prior knowledge and gene expression data to discover regulatory networks. Based on causality networks, we made a functional prediction that the Abm1 gene (its functions previously were unknown) might be related to the yeast’s responses to different levels of glucose. In conclusion, our research improves causality modeling by combining heterogeneous knowledge, which is well aligned with the future direction in systems biology. Furthermore, we proposed a method of Monte Carlo significance estimation (MCSE) to calculate the edge significances, which provide statistical meaning to the discovered causality networks. All of our data and source code will be available under the link https://bitbucket.org/dtyu/granger-causality/wiki/Home.

  20. RNA-Seq Based De Novo Transcriptome Assembly and Gene Discovery of Cistanche deserticola Fleshy Stem

    PubMed Central

    Yao, Fuwen; Li, Cuiping; Tang, Qingli; Sun, Min; Sun, Gaoyuan; Hu, Songnian; Yu, Jun; Song, Shuhui

    2015-01-01

    Background Cistanche deserticola is a completely non-photosynthetic parasitic plant with great medicinal value, mainly distributed in the deserts of Northwest China. Its dried fleshy stem is a crucial tonic in traditional Chinese medicine, used mainly to improve male sexual function and strengthen immunity, but few mechanistic studies have been conducted, partly due to the lack of genomic and transcriptomic resources. Results In this study, we performed deep transcriptome sequencing of the fleshy stem of C. deserticola, and about 80 million reads were generated using Illumina paired-end sequencing on the HiSeq 2000 platform. Using the Trinity assembler, we obtained 95,787 transcript sequences with transcript lengths ranging from 200 bp to 15,698 bp, an average length of 950 bases and an N50 length of 1,519 bases. A total of 63,957 transcripts were identified as actively expressed with FPKM ≥ 0.5, of which 30,098 transcripts were annotated with gene descriptions or gene ontology terms by sequence similarity analyses against several public databases (Uniprot, NR and Nt at NCBI, and KEGG). Furthermore, we identified key enzyme genes involved in biosynthesis of lignin and phenylethanoid glycosides (PhGs), which are known to be the primary active ingredients. Four phenylalanine ammonia-lyase (PAL) genes, encoding the first key enzyme in lignin and PhG biosynthesis, were identified based on sequence comparison and phylogenetic analysis. Two biosynthesis pathways of PhGs were also proposed for the first time. Conclusions In all, we completed a global analysis of the C. deserticola fleshy stem transcriptome using RNA-seq technology. A collection of enzyme genes related to biosynthesis of lignin and phenylethanoid glycosides was identified from the assembled and annotated transcripts, and the gene family of PAL was also predicted. The sequence data from this study will provide a valuable resource for conducting future phenylethanoid glycoside biosynthesis research and functional genomic studies
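
    The N50 statistic reported above can be computed from a list of transcript lengths. The following is a minimal sketch using the common definition (the length L such that sequences of length at least L cover half of the total assembled bases). The function name and the toy lengths are hypothetical, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    total number of bases.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy transcript lengths in bases.
toy_lengths = [200, 300, 500, 1000, 1500, 2500]
print(n50(toy_lengths))  # prints 1500
```

    Because long sequences dominate the running total, the N50 of an assembly is typically larger than its mean length, as in the figures quoted in the abstract (1,519 versus 950 bases).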

  1. A 13-gene signature prognostic of HPV-negative OSCC: discovery and external validation

    PubMed Central

    Lohavanichbutr, Pawadee; Méndez, Eduardo; Holsinger, F. Christopher; Rue, Tessa C.; Zhang, Yuzheng; Houck, John; Upton, Melissa P.; Futran, Neal; Schwartz, Stephen M.; Wang, Pei; Chen, Chu

    2013-01-01

    Purpose To identify a prognostic gene signature for HPV-negative OSCC patients. Experimental Design Two gene expression datasets were used; a training dataset from the Fred Hutchinson Cancer Research Center (FHCRC) (n=97), and a validation dataset from the MD Anderson Cancer Center (MDACC) (n=71). We applied L1/L2-penalized Cox regression models to the FHCRC data on the 131–gene signature previously identified to be prognostic in OSCC patients to identify a prognostic model specific for high-risk HPV-negative OSCC patients. The models were tested with the MDACC dataset using a receiver operating characteristic analysis. Results A 13-gene model was identified as the best predictor of HPV-negative OSCC-specific survival in the training dataset. The risk score for each patient in the validation dataset was calculated from this model and dichotomized at the median. The estimated 2-year mortality (± SE) of patients with high risk scores was 47.1 (±9.24)% compared with 6.35 (± 4.42)% for patients with low risk scores. ROC analyses showed that the areas under the curve for the age, gender, and treatment modality-adjusted models with risk score (0.78, 95%CI: 0.74-0.86) and risk score plus tumor stage (0.79, 95%CI: 0.75-0.87) were substantially higher than for the model with tumor stage (0.54, 95%CI: 0.48-0.62). Conclusions We identified and validated a 13-gene signature that is considerably better than tumor stage in predicting survival of HPV-negative OSCC patients. Further evaluation of this gene signature as a prognostic marker in other populations of patients with HPV-negative OSCC is warranted. PMID:23319825
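
    The step of scoring each patient and splitting the cohort at the median can be sketched as follows. This assumes the risk score is the usual linear predictor of a Cox model (a weighted sum of expression values), which the abstract does not state explicitly. The coefficients and expression values below are made up for illustration and are not the published 13-gene model.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear predictor: sum of coefficient * expression over the signature genes."""
    return sum(x * b for x, b in zip(expression, coefficients))


def dichotomize_at_median(scores):
    """Label each score as high risk (True) if it lies above the cohort median."""
    cutoff = median(scores)
    return [s > cutoff for s in scores]


# Hypothetical 3-gene coefficients and a 4-patient validation cohort.
coefficients = [0.5, -0.2, 0.3]
cohort = [
    [1.0, 0.0, 2.0],
    [0.5, 1.0, 0.0],
    [2.0, 1.0, 1.0],
    [0.0, 0.0, 0.5],
]

scores = [risk_score(patient, coefficients) for patient in cohort]
high_risk = dichotomize_at_median(scores)
print(high_risk)  # prints [True, False, True, False]
```

    Survival in the high-risk and low-risk groups can then be compared, which corresponds to the 2-year mortality comparison reported in the abstract.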

  2. Plant gravitropic signal transduction: A network analysis leads to gene discovery

    NASA Astrophysics Data System (ADS)

    Wyatt, Sarah

    Gravity plays a fundamental role in plant growth and development. Although a significant body of research has helped define the events of gravity perception, the role of the plant growth regulator auxin, and the mechanisms resulting in the gravity response, the events of signal transduction, those that link the biophysical action of perception to a biochemical signal that results in auxin redistribution, those that regulate the gravitropic effects on plant growth, remain, for the most part, a “black box.” Using a cold effect, dubbed the gravity persistent signal (GPS) response, we developed a mutant screen to specifically identify components of the signal transduction pathway. Cloning of the GPS genes has identified new proteins involved in gravitropic signaling. We have further exploited the GPS response using a multi-faceted approach including gene expression microarrays, proteomics analysis, bioinformatics analysis and continued mutant analysis to identify additional genes and physiological and biochemical processes. Gene expression data provided the foundation of a regulatory network for gravitropic signaling. Based on these gene expression data and related data sets/information from the literature/repositories, we constructed a gravitropic signaling network for Arabidopsis inflorescence stems. To generate the network, both a dynamic Bayesian network approach and a time-lagged correlation coefficient approach were used. The dynamic Bayesian network added existing information of protein-protein interaction while the time-lagged correlation coefficient allowed incorporation of temporal regulation and thus could incorporate the time-course metric from the data set. Thus the methods complemented each other and provided us with a more comprehensive evaluation of connections. Each method generated a list of possible interactions associated with a statistical significance value. The two networks were then overlaid to generate a more rigorous, intersected

  3. High-throughput proteomics integrated with gene microarray for discovery of colorectal cancer potential biomarkers

    PubMed Central

    Zhong, Chenhan; Li, Dan; Zhai, Xiaohui; Hu, Wangxiong; Guo, Cheng; Yuan, Ying; Zheng, Shu

    2016-01-01

    Proteins, as executives of genes' instructions, are responsible for cellular phenotypes. Integrating proteomics with gene microarray, we conducted this study to identify potential protein biomarkers of colorectal cancer (CRC). Isobaric tags for relative and absolute quantitation (iTRAQ) labeling mass spectrometry (MS) was applied to screen and identify differentially expressed proteins between paired CRC and adjacent normal mucosa. Meanwhile, Affymetrix U133plus2.0 microarrays were used to perform gene microarray analysis. Verification experiments included immunohistochemistry (IHC), western blot and enzyme-linked immunosorbent assay (ELISA) of selected proteins. Overall, 5469 differentially expressed proteins were detected with iTRAQ-MS from 24 matched CRC and adjacent normal tissues. Gene microarray analysis identified 39859 differential genes from 52 patients. Of these, 3083 differential proteins had corresponding differentially expressed genes, with 245 proteins and their genes showing >1.5-fold change in expression level. Gene ontology enrichment analysis revealed that up-regulated proteins were more involved in cell adhesion and motion than down-regulated proteins. In addition, up-regulated proteins were more likely to be located in the nucleus and vesicles. Further verification experiments with IHC confirmed differential expression levels of 5 proteins (S100 calcium-binding protein A9, annexin A3, nicotinamide phosphoribosyltransferase, carboxylesterase 2 and calcium activated chloride channel A1) between CRC and normal tissues. In addition, western blot showed a stepwise increase of annexin A3 abundance in normal colorectal mucosa, adenoma and CRC tissues. ELISA results revealed significantly higher serum levels of S100 calcium-binding protein A9 and annexin A3 in CRC patients than in healthy controls, validating the diagnostic value of these proteins. Cell experiments showed that inhibition of annexin A3 could suppress CRC cell proliferation and aggressiveness. S100 calcium

  4. Discovery and identification of candidate genes from the chitinase gene family for Verticillium dahliae resistance in cotton

    PubMed Central

    Xu, Jun; Xu, Xiaoyang; Tian, Liangliang; Wang, Guilin; Zhang, Xueying; Wang, Xinyu; Guo, Wangzhen

    2016-01-01

    Verticillium dahliae, a destructive and soil-borne fungal pathogen, causes massive losses in cotton yields. However, the resistance mechanism to V. dahliae in cotton is still poorly understood. Accumulating evidence indicates that chitinases are crucial hydrolytic enzymes, which attack fungal pathogens by catalyzing the fungal cell wall degradation. Although they form a large gene family, the chitinase genes (Chis) have not to date been systematically analyzed and effectively utilized in cotton. Here, we identified 47, 49, 92, and 116 Chis from four sequenced cotton species, diploid Gossypium raimondii (D5), G. arboreum (A2), tetraploid G. hirsutum acc. TM-1 (AD1), and G. barbadense acc. 3–79 (AD2), respectively. The orthologous genes were not in one-to-one correspondence in the diploid and tetraploid cotton species, implying changes in the number of Chis in different cotton species during the evolution of Gossypium. Phylogenetic classification indicated that these Chis could be classified into six groups, with distinguishable structural characteristics. The expression patterns of Chis indicated their various expressions in different organs and tissues, and in the V. dahliae response. Silencing of Chi23, Chi32, or Chi47 in cotton significantly impaired the resistance to V. dahliae, suggesting these genes might act as positive regulators in disease resistance to V. dahliae. PMID:27354165

  5. Transcriptome Analysis and Discovery of Genes Relevant to Development in Bradysia odoriphaga at Three Developmental Stages

    PubMed Central

    Wang, Wenbo; Chen, Hao; Zhou, Xianhong; Zhuang, Qianying; Yu, Yi; Li, Rumei

    2016-01-01

    Bradysia odoriphaga (Diptera: Sciaridae) is the most important pest of Chinese chive (Allium tuberosum) in Asia; however, the molecular genetics are poorly understood. To explore the molecular biological mechanism of development, Illumina sequencing and de novo assembly were performed in the third-instar, fourth-instar, and pupal B. odoriphaga. The study resulted in 16.2 Gb of clean data and 47,578 unigenes (≥125 bp) contained in 7,632,430 contigs, 46.21% of which were annotated from non-redundant protein (NR), Gene Ontology (GO), Clusters of Orthologous Groups (COG), Eukaryotic Orthologous Groups (KOG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. It was found that 19.67% of unigenes matched the homologous species mainly, including Aedes aegypti, Culex quinquefasciatus, Ceratitis capitata, and Anopheles gambiae. According to differentially expressed gene (DEG) analysis, 143, 490, and 309 DEGs were annotated as involved in the developmental process in the GO database respectively, in the comparisons of third-instar and fourth-instar larvae, third-instar larvae and pupae, and fourth-instar larvae and pupae. Twenty-five genes were closely related to these processes, including developmental process, reproduction process, and reproductive organs development and programmed cell death (PCD). The information of unigenes assembled in B. odoriphaga through transcriptome and DEG analyses could provide a detailed genetic basis and regulated information for elaborating the developmental mechanism from the larval, pre-pupal to pupal stages of B. odoriphaga. PMID:26891450

  6. Human Transporter Database: Comprehensive Knowledge and Discovery Tools in the Human Transporter Genes

    PubMed Central

    Ye, Adam Y.; Liu, Qing-Rong; Li, Chuan-Yun; Zhao, Min; Qu, Hong

    2014-01-01

    Transporters are essential in homeostatic exchange of endogenous and exogenous substances at the systematic, organic, cellular, and subcellular levels. Gene mutations of transporters are often related to pharmacogenetic traits. Recent developments in high throughput technologies on genomics, transcriptomics and proteomics allow in-depth studies of transporter genes in normal cellular processes and diverse disease conditions. The flood of high throughput data has resulted in an urgent need for an updated knowledgebase with curated, organized, and annotated human transporters in an easily accessible form. Using a pipeline combining automated keyword queries, sequence similarity search and manual curation on transporters, we collected 1,555 human non-redundant transporter genes to develop the Human Transporter Database (HTD) (http://htd.cbi.pku.edu.cn). Based on the extensive annotations, global properties of the transporter genes were illustrated, such as expression patterns and polymorphisms in relationships with their ligands. We noted that the human transporters were enriched in many fundamental biological processes such as oxidative phosphorylation and cardiac muscle contraction, and significantly associated with Mendelian and complex diseases such as epilepsy and sudden infant death syndrome. Overall, HTD provides a well-organized interface that helps research communities search detailed molecular and genetic information on transporters for the development of personalized medicine. PMID:24558441

  7. Discovery of genes involved with learning and memory: an experimental synthesis of Hirschian and Benzerian perspectives.

    PubMed

    Tully, T

    1996-11-26

    The biological bases of learning and memory are being revealed today with a wide array of molecular approaches, most of which entail the analysis of dysfunction produced by gene disruptions. This perspective derives both from early "genetic dissections" of learning in mutant Drosophila by Seymour Benzer and colleagues and from earlier behavior-genetic analyses of learning in Diptera by Jerry Hirsch and coworkers. Three quantitative-genetic insights derived from these latter studies serve as guiding principles for the former. First, interacting polygenes underlie complex traits. Consequently, learning/memory defects associated with single-gene mutants can be quantified accurately only in equilibrated, heterogeneous genetic backgrounds. Second, complex behavioral responses will be composed of genetically distinct functional components. Thus, genetic dissection of complex traits into specific biobehavioral properties is likely. Finally, disruptions of genes involved with learning/memory are likely to have pleiotropic effects. As a result, task-relevant sensorimotor responses required for normal learning must be assessed carefully to interpret performance in learning/memory experiments. In addition, more specific conclusions will be obtained from reverse-genetic experiments, in which gene disruptions are restricted in time and/or space.

  8. Transcriptome Analysis and Discovery of Genes Relevant to Development in Bradysia odoriphaga at Three Developmental Stages.

    PubMed

    Gao, Huanhuan; Zhai, Yifan; Wang, Wenbo; Chen, Hao; Zhou, Xianhong; Zhuang, Qianying; Yu, Yi; Li, Rumei

    2016-01-01

    Bradysia odoriphaga (Diptera: Sciaridae) is the most important pest of Chinese chive (Allium tuberosum) in Asia; however, the molecular genetics are poorly understood. To explore the molecular biological mechanism of development, Illumina sequencing and de novo assembly were performed in the third-instar, fourth-instar, and pupal B. odoriphaga. The study resulted in 16.2 Gb of clean data and 47,578 unigenes (≥125 bp) contained in 7,632,430 contigs, 46.21% of which were annotated from non-redundant protein (NR), Gene Ontology (GO), Clusters of Orthologous Groups (COG), Eukaryotic Orthologous Groups (KOG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. It was found that 19.67% of unigenes matched the homologous species mainly, including Aedes aegypti, Culex quinquefasciatus, Ceratitis capitata, and Anopheles gambiae. According to differentially expressed gene (DEG) analysis, 143, 490, and 309 DEGs were annotated as involved in the developmental process in the GO database respectively, in the comparisons of third-instar and fourth-instar larvae, third-instar larvae and pupae, and fourth-instar larvae and pupae. Twenty-five genes were closely related to these processes, including developmental process, reproduction process, and reproductive organs development and programmed cell death (PCD). The information of unigenes assembled in B. odoriphaga through transcriptome and DEG analyses could provide a detailed genetic basis and regulated information for elaborating the developmental mechanism from the larval, pre-pupal to pupal stages of B. odoriphaga.

  9. Discovery of Chemosensory Genes in the Oriental Fruit Fly, Bactrocera dorsalis

    PubMed Central

    Wu, Zhongzhen; Zhang, He; Wang, Zhengbing; Bin, Shuying; He, Hualiang; Lin, Jintian

    2015-01-01

    The oriental fruit fly, Bactrocera dorsalis, is a devastating fruit fly pest in tropical and sub-tropical countries. Like other insects, this fly uses its chemosensory system to efficiently interact with its environment. However, our understanding of the molecular components comprising B. dorsalis chemosensory system is limited. Using next generation sequencing technologies, we sequenced the transcriptome of four B. dorsalis developmental stages: egg, larva, pupa and adult chemosensory tissues. A total of 31 candidate odorant binding proteins (OBPs), 4 candidate chemosensory proteins (CSPs), 23 candidate odorant receptors (ORs), 11 candidate ionotropic receptors (IRs), 6 candidate gustatory receptors (GRs) and 3 candidate sensory neuron membrane proteins (SNMPs) were identified. The tissue distributions of the OBP and CSP transcripts were determined by RT-PCR and a subset of nine genes were further characterized. The predicted proteins from these genes shared high sequence similarity to Drosophila melanogaster pheromone binding protein related proteins (PBPRPs). Interestingly, one OBP (BdorOBP19c) was exclusively expressed in the sex pheromone glands of mature females. RT-PCR was also used to compare the expression of the candidate genes in the antennae of male and female B. dorsalis adults. These antennae-enriched OBPs, CSPs, ORs, IRs and SNMPs could play a role in the detection of pheromones and general odorants and thus could be useful target genes for the integrated pest management of B. dorsalis and other agricultural pests. PMID:26070069

  10. A Sorghum Mutant Resource as an Efficient Platform for Gene Discovery in Grasses

    PubMed Central

    Burke, John; Chen, Junping; Wang, Bo; Hayes, Chad; Emendack, Yves

    2016-01-01

    Sorghum (Sorghum bicolor) is a versatile C4 crop and a model for research in family Poaceae. High-quality genome sequence is available for the elite inbred line BTx623, but functional validation of genes remains challenging due to the limited genomic and germplasm resources available for comprehensive analysis of induced mutations. In this study, we generated 6400 pedigreed M4 mutant pools from EMS-mutagenized BTx623 seeds through single-seed descent. Whole-genome sequencing of 256 phenotyped mutant lines revealed >1.8 million canonical EMS-induced mutations, affecting >95% of genes in the sorghum genome. The vast majority (97.5%) of the induced mutations were distinct from natural variations. To demonstrate the utility of the sequenced sorghum mutant resource, we performed reverse genetics to identify eight genes potentially affecting drought tolerance, three of which had allelic mutations and two of which exhibited exact cosegregation with the phenotype of interest. Our results establish that a large-scale resource of sequenced pedigreed mutants provides an efficient platform for functional validation of genes in sorghum, thereby accelerating sorghum breeding. Moreover, findings made in sorghum could be readily translated to other members of the Poaceae via integrated genomics approaches. PMID:27354556

  11. Large-Scale Discovery of Disease-Disease and Disease-Gene Associations

    PubMed Central

    Gligorijevic, Djordje; Stojanovic, Jelena; Djuric, Nemanja; Radosavljevic, Vladan; Grbovic, Mihajlo; Kulathinal, Rob J.; Obradovic, Zoran

    2016-01-01

    Data-driven phenotype analyses on Electronic Health Record (EHR) data have recently drawn benefits across many areas of clinical practice, uncovering new links in the medical sciences that can potentially affect the well-being of millions of patients. In this paper, EHR data is used to discover novel relationships between diseases by studying their comorbidities (co-occurrences in patients). A novel embedding model is designed to extract knowledge from disease comorbidities by learning from a large-scale EHR database comprising more than 35 million inpatient cases spanning nearly a decade, revealing significant improvements on disease phenotyping over current computational approaches. In addition, the use of the proposed methodology is extended to discover novel disease-gene associations by including valuable domain knowledge from genome-wide association studies. To evaluate our approach, its effectiveness is compared against a held-out set where, again, it revealed very compelling results. For selected diseases, we further identify candidate gene lists for which disease-gene associations were not studied previously. Our approach thus provides biomedical researchers with new tools to filter genes of interest, reducing costly lab studies. PMID:27578529

  12. A Sorghum Mutant Resource as an Efficient Platform for Gene Discovery in Grasses.

    PubMed

    Jiao, Yinping; Burke, John; Chopra, Ratan; Burow, Gloria; Chen, Junping; Wang, Bo; Hayes, Chad; Emendack, Yves; Ware, Doreen; Xin, Zhanguo

    2016-07-01

    Sorghum (Sorghum bicolor) is a versatile C4 crop and a model for research in family Poaceae. High-quality genome sequence is available for the elite inbred line BTx623, but functional validation of genes remains challenging due to the limited genomic and germplasm resources available for comprehensive analysis of induced mutations. In this study, we generated 6400 pedigreed M4 mutant pools from EMS-mutagenized BTx623 seeds through single-seed descent. Whole-genome sequencing of 256 phenotyped mutant lines revealed >1.8 million canonical EMS-induced mutations, affecting >95% of genes in the sorghum genome. The vast majority (97.5%) of the induced mutations were distinct from natural variations. To demonstrate the utility of the sequenced sorghum mutant resource, we performed reverse genetics to identify eight genes potentially affecting drought tolerance, three of which had allelic mutations and two of which exhibited exact cosegregation with the phenotype of interest. Our results establish that a large-scale resource of sequenced pedigreed mutants provides an efficient platform for functional validation of genes in sorghum, thereby accelerating sorghum breeding. Moreover, findings made in sorghum could be readily translated to other members of the Poaceae via integrated genomics approaches.
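
    The notion of a "canonical" EMS-induced mutation in the abstract above can be illustrated with a short sketch. EMS predominantly causes G-to-A and C-to-T transitions, so a variant call is treated here as canonical when its reference and alternate alleles match one of those two changes. The variant tuples, chromosome names and positions are hypothetical toy data, not values from the study, and the study's actual calling pipeline is not described in this record.

```python
# Minimal sketch: flag canonical EMS-type changes (G>A and C>T transitions).
# All variant records below are invented for illustration.
CANONICAL_EMS = {("G", "A"), ("C", "T")}


def is_canonical_ems(ref, alt):
    """Return True if the ref>alt change is a G>A or C>T transition."""
    return (ref.upper(), alt.upper()) in CANONICAL_EMS


def canonical_fraction(variants):
    """Fraction of (chrom, pos, ref, alt) records that are canonical EMS changes."""
    if not variants:
        return 0.0
    hits = sum(is_canonical_ems(ref, alt) for _, _, ref, alt in variants)
    return hits / len(variants)


toy_variants = [
    ("Chr01", 1200, "G", "A"),  # canonical
    ("Chr01", 5400, "C", "T"),  # canonical
    ("Chr02", 880, "A", "G"),   # not canonical
    ("Chr03", 42, "c", "t"),    # canonical, lower-case input is normalised
]
fraction = canonical_fraction(toy_variants)
```

    On the toy input, three of the four records are classified as canonical.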

  13. Transcriptome Analysis of Catharanthus roseus for Gene Discovery and Expression Profiling

    PubMed Central

    Sharma, Raghvendra; Sinha, Alok K.; Jain, Mukesh

    2014-01-01

    The medicinal plant Catharanthus roseus accumulates a wide range of terpenoid indole alkaloids, which are well-documented therapeutic agents. In this study, deep transcriptome sequencing of C. roseus was carried out to identify the pathways and enzymes (genes) involved in biosynthesis of these compounds. About 343 million reads were generated from different tissues (leaf, flower and root) of C. roseus using the Illumina platform. Optimization of de novo assembly involving a two-step process resulted in a total of 59,220 unique transcripts with an average length of 1284 bp. Comprehensive functional annotation and gene ontology (GO) analysis revealed the representation of many genes involved in different biological processes and molecular functions. In total, 65% of C. roseus transcripts showed homology with sequences available in various public repositories, while the remaining 35% of unigenes may be considered C. roseus specific. In silico analysis revealed the presence of 11,620 genic simple sequence repeats (excluding mono-nucleotide repeats) and 1820 transcription factor encoding genes in the C. roseus transcriptome. Expression analysis showed roots and leaves to be actively participating in bisindole alkaloid production, with clear indication that enzymes involved in the pathway of vindoline and vinblastine biosynthesis are restricted to aerial tissues. Such a large-scale transcriptome study provides a rich source for understanding plant-specialized metabolism, and is expected to promote research towards production of plant-derived pharmaceuticals. PMID:25072156
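
    As a rough illustration of the in silico search for simple sequence repeats mentioned above, the sketch below scans a transcript for tandem repeats of 2 to 6 nt motifs and skips mono-nucleotide runs. The function name, the minimum repeat count of 4 and the test sequence are assumptions made for this example; the abstract does not state which tool or thresholds the authors used.

```python
import re


def find_ssrs(sequence, min_repeats=4):
    """Return (start, motif, repeat_count) for tandem repeats of 2-6 nt motifs.

    min_repeats is a hypothetical threshold chosen for this sketch.
    """
    # A lazily matched motif of 2-6 bases, followed by at least
    # (min_repeats - 1) further tandem copies of the same motif.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            # Runs such as AAAAAAAA are mono-nucleotide repeats; exclude them.
            continue
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Invented transcript: (AT)x5, a poly-A run, then (CAG)x4.
toy_transcript = "GGC" + "AT" * 5 + "CCG" + "A" * 10 + "CAG" * 4
ssrs = find_ssrs(toy_transcript)
```

    For the toy transcript this reports the dinucleotide and trinucleotide repeats and ignores the poly-A stretch.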

  14. Biomarker discovery and gene expression responses in Lycopersicon esculentum root exposed to lead.

    PubMed

    Hou, Jing; Bai, Lili; Xie, Yujia; Liu, Xinhui; Cui, Baoshan

    2015-12-15

    Gene expression analysis has shown particular promise for the identification of molecular biomarkers that can be used for further evaluation of potential toxicity of chemicals present in agricultural soil. In this study, we focused on the development of molecular markers to detect Pb toxicity in agricultural soil. Using the results obtained from microarray analysis, twelve Pb-responsive genes were selected and tested in different Pb concentrations to examine their concentration-response characteristics using real-time quantitative polymerase chain reaction (RT-qPCR). All the Pb treatments set in our study could generally induce the differential expression of the 12 genes, while the lowest observable adverse effect concentrations (LOAEC) of Pb for seed germination, root elongation, biomass and structural modification were 1,297, 177, 177, and 1,297 mg Pb/kg soil, respectively, suggesting that the transcriptional approach was more sensitive than the traditional end points of death, growth, and morphology for the evaluation of Pb toxicity. The relative expression of glycoalkaloid metabolism 1 (P=-0.790), ethylene-responsive transcription factor ERF017 (P=-0.686) and CASP-like protein 4C2 (P=-0.652) demonstrates a dose-dependent response with Pb content in roots, implying that the three genes can be used as sensitive bioindicators of Pb stress in Lycopersicon esculentum.

  15. Gene expression and epigenetic discovery screen reveal methylation of SFRP2 in prostate cancer.

    PubMed

    Perry, Antoinette S; O'Hurley, Gillian; Raheem, Omer A; Brennan, Kevin; Wong, Simon; O'Grady, Anthony; Kennedy, Anne-Marie; Marignol, Laure; Murphy, Therese M; Sullivan, Linda; Barrett, Ciara; Loftus, Barbara; Thornhill, John; Hewitt, Stephen M; Lawler, Mark; Kay, Elaine; Lynch, Thomas; Hollywood, Donal

    2013-04-15

    Aberrant activation of Wnts is common in human cancers, including prostate. Hypermethylation associated transcriptional silencing of Wnt antagonist genes SFRPs (Secreted Frizzled-Related Proteins) is a frequent oncogenic event. The significance of this is not known in prostate cancer. The objectives of our study were to (i) profile Wnt signaling related gene expression and (ii) investigate methylation of Wnt antagonist genes in prostate cancer. Using TaqMan Low Density Arrays, we identified 15 Wnt signaling related genes with significantly altered expression in prostate cancer; the majority of which were upregulated in tumors. Notably, histologically benign tissue from men with prostate cancer appeared more similar to tumor (r = 0.76) than to benign prostatic hyperplasia (BPH; r = 0.57, p < 0.001). Overall, the expression profile was highly similar between tumors of high (≥ 7) and low (≤ 6) Gleason scores. Pharmacological demethylation of PC-3 cells with 5-Aza-CdR reactivated 39 genes (≥ 2-fold); 40% of which inhibit Wnt signaling. Methylation frequencies in prostate cancer were 10% (2/20) (SFRP1), 64.86% (48/74) (SFRP2), 0% (0/20) (SFRP4) and 60% (12/20) (SFRP5). SFRP2 methylation was detected at significantly lower frequencies in high-grade prostatic intraepithelial neoplasia (HGPIN; 30%, (6/20), p = 0.0096), tumor adjacent benign areas (8.82%, (7/69), p < 0.0001) and BPH (11.43% (4/35), p < 0.0001). The quantitative level of SFRP2 methylation (normalized index of methylation) was also significantly higher in tumors (116) than in the other samples (HGPIN = 7.45, HB = 0.47, and BPH = 0.12). We show that SFRP2 hypermethylation is a common event in prostate cancer. SFRP2 methylation in combination with other epigenetic markers may be a useful biomarker of prostate cancer.

  16. The human gene connectome as a map of short cuts for morbid allele discovery.

    PubMed

    Itan, Yuval; Zhang, Shen-Ying; Vogt, Guillaume; Abhyankar, Avinash; Herman, Melina; Nitschke, Patrick; Fried, Dror; Quintana-Murci, Lluis; Abel, Laurent; Casanova, Jean-Laurent

    2013-04-02

    High-throughput genomic data reveal thousands of gene variants per patient, and it is often difficult to determine which of these variants underlies disease in a given individual. However, at the population level, there may be some degree of phenotypic homogeneity, with alterations of specific physiological pathways underlying the pathogenesis of a particular disease. We describe here the human gene connectome (HGC) as a unique approach for human mendelian genetic research, facilitating the interpretation of abundant genetic data from patients with the same disease, and guiding subsequent experimental investigations. We first defined the set of the shortest plausible biological distances, routes, and degrees of separation between all pairs of human genes by applying a shortest distance algorithm to the full human gene network. We then designed a hypothesis-driven application of the HGC, in which we generated a Toll-like receptor 3-specific connectome useful for the genetic dissection of inborn errors of Toll-like receptor 3 immunity. In addition, we developed a functional genomic alignment approach from the HGC. In functional genomic alignment, the genes are clustered according to biological distance (rather than the traditional molecular evolutionary genetic distance), as estimated from the HGC. Finally, we compared the HGC with three state-of-the-art methods: String, FunCoup, and HumanNet. We demonstrated that the existing methods are more suitable for polygenic studies, whereas HGC approaches are more suitable for monogenic studies. The HGC and functional genomic alignment data and computer programs are freely available to noncommercial users from http://lab.rockefeller.edu/casanova/HGC and should facilitate the genome-wide selection of disease-causing candidate alleles for experimental validation.
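
    The core computation described above, finding the shortest biological distance from a gene of interest to every other gene in a network, can be sketched with Dijkstra's algorithm. This is only an illustrative sketch: the abstract says "a shortest distance algorithm" without naming one, and the gene labels and edge weights below are hypothetical placeholders rather than HGC data.

```python
import heapq


def biological_distances(network, core_gene):
    """Dijkstra's algorithm: shortest distance from core_gene to each reachable gene.

    network maps gene -> {neighbor: non-negative edge weight}.
    """
    dist = {core_gene: 0.0}
    heap = [(0.0, core_gene)]
    while heap:
        d, gene = heapq.heappop(heap)
        if d > dist.get(gene, float("inf")):
            continue  # stale heap entry; a shorter route was already found
        for neighbor, weight in network.get(gene, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


# Invented toy network; smaller weights mean closer biological relatedness.
toy_network = {
    "CORE": {"G1": 1.0, "G2": 4.0},
    "G1": {"CORE": 1.0, "G2": 2.0, "G3": 5.0},
    "G2": {"CORE": 4.0, "G1": 2.0, "G3": 1.0},
    "G3": {"G1": 5.0, "G2": 1.0},
    "G4": {},  # disconnected gene, never reached
}
distances = biological_distances(toy_network, "CORE")
ranked = sorted(distances, key=distances.get)  # genes ordered by distance to CORE
```

    Ranking genes by their distance to a core gene gives a gene-specific connectome in miniature; candidate variants in genes near the top of the list would be examined first.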

  17. The human gene connectome as a map of short cuts for morbid allele discovery

    PubMed Central

    Itan, Yuval; Zhang, Shen-Ying; Vogt, Guillaume; Abhyankar, Avinash; Herman, Melina; Nitschke, Patrick; Fried, Dror; Quintana-Murci, Lluis; Abel, Laurent; Casanova, Jean-Laurent

    2013-01-01

    High-throughput genomic data reveal thousands of gene variants per patient, and it is often difficult to determine which of these variants underlies disease in a given individual. However, at the population level, there may be some degree of phenotypic homogeneity, with alterations of specific physiological pathways underlying the pathogenesis of a particular disease. We describe here the human gene connectome (HGC) as a unique approach for human Mendelian genetic research, facilitating the interpretation of abundant genetic data from patients with the same disease, and guiding subsequent experimental investigations. We first defined the set of the shortest plausible biological distances, routes, and degrees of separation between all pairs of human genes by applying a shortest distance algorithm to the full human gene network. We then designed a hypothesis-driven application of the HGC, in which we generated a Toll-like receptor 3-specific connectome useful for the genetic dissection of inborn errors of Toll-like receptor 3 immunity. In addition, we developed a functional genomic alignment approach from the HGC. In functional genomic alignment, the genes are clustered according to biological distance (rather than the traditional molecular evolutionary genetic distance), as estimated from the HGC. Finally, we compared the HGC with three state-of-the-art methods: String, FunCoup, and HumanNet. We demonstrated that the existing methods are more suitable for polygenic studies, whereas HGC approaches are more suitable for monogenic studies. The HGC and functional genomic alignment data and computer programs are freely available to noncommercial users from http://lab.rockefeller.edu/casanova/HGC and should facilitate the genome-wide selection of disease-causing candidate alleles for experimental validation. PMID:23509278

  18. Characterization of Greater Middle Eastern genetic variation for enhanced disease gene discovery.

    PubMed

    Scott, Eric M; Halees, Anason; Itan, Yuval; Spencer, Emily G; He, Yupeng; Azab, Mostafa Abdellateef; Gabriel, Stacey B; Belkadi, Aziz; Boisson, Bertrand; Abel, Laurent; Clark, Andrew G; Alkuraya, Fowzan S; Casanova, Jean-Laurent; Gleeson, Joseph G

    2016-09-01

    The Greater Middle East (GME) has been a central hub of human migration and population admixture. The tradition of consanguinity, variably practiced in the Persian Gulf region, North Africa, and Central Asia, has resulted in an elevated burden of recessive disease. Here we generated a whole-exome GME variome from 1,111 unrelated subjects. We detected substantial diversity and admixture in continental and subregional populations, corresponding to several ancient founder populations with little evidence of bottlenecks. Measured consanguinity rates were an order of magnitude above those in other sampled populations, and the GME population exhibited an increased burden of runs of homozygosity (ROHs) but showed no evidence for reduced burden of deleterious variation due to classically theorized 'genetic purging'. Applying this database to unsolved recessive conditions in the GME population reduced the number of potential disease-causing variants by four- to sevenfold. These results show variegated genetic architecture in GME populations and support future human genetic discoveries in Mendelian and population genetics.

  19. Leveraging a Sturge-Weber Gene Discovery: An Agenda for Future Research.

    PubMed

    Comi, Anne M; Sahin, Mustafa; Hammill, Adrienne; Kaplan, Emma H; Juhász, Csaba; North, Paula; Ball, Karen L; Levin, Alex V; Cohen, Bernard; Morris, Jill; Lo, Warren; Roach, E Steve

    2016-05-01

    Sturge-Weber syndrome (SWS) is a vascular neurocutaneous disorder that results from a somatic mosaic mutation in GNAQ, which is also responsible for isolated port-wine birthmarks. Infants with SWS are born with a cutaneous capillary malformation (port-wine birthmark) of the forehead or upper eyelid which can signal an increased risk of brain and/or eye involvement prior to the onset of specific symptoms. This symptom-free interval represents a time when a targeted intervention could help to minimize the neurological and ophthalmologic manifestations of the disorder. This paper summarizes a 2015 SWS workshop in Bethesda, Maryland that was sponsored by the National Institutes of Health. Meeting attendees included a diverse group of clinical and translational researchers with a goal of establishing research priorities for the next few years. The initial portion of the meeting included a thorough review of the recent genetic discovery and what is known of the pathogenesis of SWS. Breakout sessions related to neurology, dermatology, and ophthalmology aimed to establish SWS research priorities in each field. Key priorities for future development include the need for clinical consensus guidelines, further work to develop a clinical trial network, improvement of tissue banking for research purposes, and the need for multiple animal and cell culture models of SWS.

  20. InFusion: Advancing Discovery of Fusion Genes and Chimeric Transcripts from Deep RNA-Sequencing Data

    PubMed Central

    Okonechnikov, Konstantin; Imai-Matsushima, Aki; Seitz, Alexander; Meyer, Thomas F.; Garcia-Alcalde, Fernando

    2016-01-01

    Analysis of fusion transcripts has become increasingly important due to their link with cancer development. Since high-throughput sequencing approaches survey fusion events exhaustively, several computational methods for the detection of gene fusions from RNA-seq data have been developed. This kind of analysis, however, is complicated by native trans-splicing events, the splicing-induced complexity of the transcriptome and biases and artefacts introduced in experiments and data analysis. There are a number of tools available for the detection of fusions from RNA-seq data; however, certain differences in specificity and sensitivity between commonly used approaches have been found. The ability to detect gene fusions of different types, including isoform fusions and fusions involving non-coding regions, has not been thoroughly studied yet. Here, we propose a novel computational toolkit called InFusion for fusion gene detection from RNA-seq data. InFusion introduces several unique features, such as discovery of fusions involving intergenic regions, and detection of anti-sense transcription in chimeric RNAs based on strand-specificity. Our approach demonstrates superior detection accuracy on simulated data and several public RNA-seq datasets. This improved performance was also evident when evaluating data from RNA deep-sequencing of two well-established prostate cancer cell lines. InFusion identified 26 novel fusion events that were validated in vitro, including alternatively spliced gene fusion isoforms and chimeric transcripts that include intergenic regions. The toolkit is freely available to download from http://bitbucket.org/kokonech/infusion. PMID:27907167

  1. Ellis Englesberg and the discovery of positive control in gene regulation.

    PubMed

    Hahn, Steven

    2014-10-01

    Based on his work with the Escherichia coli L-arabinose operon, Ellis Englesberg proposed in 1965 that the regulatory gene araC was an "activator gene" required for positive control of the ara operon. This challenged the widely held belief in a universal mechanism of negative regulation proposed earlier by Jacob and Monod. For years, Englesberg's model was met with deep skepticism. Despite much frustration with complex ad hoc explanations used to challenge his model, Englesberg persisted until the evidence for positive control in ara and other systems became overwhelming. Englesberg's pioneering work enriched the original operon model and had a lasting impact in opening new and exciting ways of thinking about transcriptional regulation.

  2. Case-Only Exome Sequencing and Complex Disease Susceptibility Gene Discovery: Study Design Considerations

    PubMed Central

    Wu, Lang; Schaid, Daniel J.; Sicotte, Hugues; Wieben, Eric D.; Li, Hu; Petersen, Gloria M.

    2015-01-01

    Whole exome sequencing (WES) provides an unprecedented opportunity to identify the potential etiologic role of rare functional variants in human complex diseases. Large-scale collaborations have generated germline WES data on patients with a number of diseases, especially cancer, but less often on healthy controls under the same sequencing procedures. These data can be a valuable resource for identifying new disease susceptibility loci, if study designs are appropriately applied. This review describes suggested strategies and technical considerations when focusing on case-only study designs that use WES data in complex disease scenarios. These include variant filtering based on frequency and functionality, gene prioritization, interrogation of different data types, and targeted sequencing validation. We propose that if case-only WES designs are applied in an appropriate manner, new susceptibility genes containing rare variants for human complex diseases can be detected. PMID:25371537
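
    Two of the steps listed above, variant filtering by frequency and functionality followed by gene prioritization, can be sketched as follows. This is a minimal sketch under stated assumptions: the 1% allele-frequency cutoff, the set of consequence labels treated as functional, the ranking by variant count, and all gene names and values are hypothetical choices for illustration, not recommendations taken from the review.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    population_af: float  # allele frequency in a reference population
    consequence: str      # e.g. "missense", "stop_gained", "synonymous"


# Hypothetical set of consequences considered functional in this sketch.
FUNCTIONAL = {"missense", "stop_gained", "frameshift", "splice_site"}


def filter_rare_functional(variants, max_af=0.01):
    """Keep variants that are both rare (AF <= max_af) and predicted functional."""
    return [
        v for v in variants
        if v.population_af <= max_af and v.consequence in FUNCTIONAL
    ]


def prioritize_genes(variants):
    """Rank genes by number of retained variants (ties broken alphabetically)."""
    counts = {}
    for v in variants:
        counts[v.gene] = counts.get(v.gene, 0) + 1
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Invented case-only variant calls.
case_calls = [
    Variant("GENE_A", 0.001, "stop_gained"),
    Variant("GENE_A", 0.0005, "missense"),
    Variant("GENE_B", 0.20, "missense"),     # too common, removed
    Variant("GENE_C", 0.002, "synonymous"),  # not functional, removed
    Variant("GENE_D", 0.004, "frameshift"),
]
kept = filter_rare_functional(case_calls)
ranking = prioritize_genes(kept)
```

    In a real case-only design the ranked genes would still need the targeted sequencing validation the review mentions; the count-based ranking here is only a placeholder for a proper prioritization scheme.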

  3. Discovery of immune-related genes expressed in hemocytes of the tarantula spider Acanthoscurria gomesiana.

    PubMed

    Lorenzini, Daniel M; da Silva, Pedro I; Soares, Marcelo B; Arruda, Paulo; Setubal, João; Daffre, Sirlei

    2006-01-01

    The present study reports the identification of immune related transcripts from hemocytes of the spider Acanthoscurria gomesiana by high throughput sequencing of expressed sequence tags (ESTs). To generate ESTs from hemocytes, two cDNA libraries were prepared: one by directional cloning (primary) and the other by the normalization of the first (normalized). A total of 7584 clones were sequenced and the identical ESTs were clustered, resulting in 3723 assembled sequences (AS). At least 20% of these sequences are putative novel genes. The automatic functional annotation of AS based on Gene Ontology revealed several abundant transcripts related to the following functional classes: hemocyanin, lectin, and structural constituents of ribosome and cytoskeleton. From this annotation, 73 transcripts possibly involved in immune response were also identified, suggesting the existence of several molecular processes not previously described for spiders, such as: pathogen recognition, coagulation, complement activation, cell adhesion and intracellular signaling pathway for the activation of cellular defenses.

  4. Legal developments and practical implications of gene patenting on targeted drug discovery and development.

    PubMed

    Klein, R D

    2010-06-01

    The use of genetic information to design and guide therapies creates novel patent issues. Gene patents have been integral to the introduction of new biologics, but their role in diagnostic testing is controversial. Genotype-phenotype associations are at the heart of personalized medicine. The intellectual property rules by which these biological relationships are governed have profound implications for the growth of this field. Several cases currently before the courts may add reason and clarity to the law in this area.

  5. Use of eQTL Analysis for the Discovery of Target Genes Identified by GWAS

    DTIC Science & Technology

    2012-04-01

    prostate tissue-specific expression quantitative trait loci (eQTL) dataset; and 2) utilize this dataset to identify candidate genes for existing...set of 500 samples of normal prostate tissue sampled from men with PC. To date, we have pre-screened normal prostate tissue with the use of H&E...stained sections from 4000 men having a radical prostatectomy in order to identify those cases meeting our strict selection criteria for further

  6. Discovery of Novel Mammary Developmental and Cancer Genes Using ENU Mutagenesis

    DTIC Science & Technology

    2002-10-01

    same gene. This is being undertaken using both wild type backgrounds and sensitising mono- and bi-transgenic backgrounds. To date 20 phenotypes have had...introduction of a transgenic oncogene into the background will enable ENU to provide a second hit in the carcinogenic process. We believe, based on the proven...that includes the use of this bi-transgenic animal. We have established a collaboration with Dr Lewis Chodosh to make these animals available to us

  7. Genome-wide discovery of cis-elements in promoter sequences using gene expression.

    PubMed

    Troukhan, Maxim; Tatarinova, Tatiana; Bouck, John; Flavell, Richard B; Alexandrov, Nickolai N

    2009-04-01

    The availability of complete or nearly complete genome sequences, a large number of 5' expressed sequence tags, and significant public expression data allow for a more accurate identification of cis-elements regulating gene expression. We have implemented a global approach that takes advantage of available expression data, genomic sequences, and transcript information to predict cis-elements associated with specific expression patterns. The key components of our approach are: (1) precise identification of transcription start sites, (2) specific locations of cis-elements relative to the transcription start site, and (3) assessment of statistical significance for all sequence motifs. By applying our method to promoters of Arabidopsis thaliana and Mus musculus, we have identified motifs that affect gene expression under specific environmental conditions or in certain tissues. We also found that the presence of the TATA box is associated with increased variability of gene expression. Strong correlation between our results and experimentally determined motifs shows that the method is capable of predicting new functionally important cis-elements in promoter sequences.

  8. Discovery of Inhibitors of Aberrant Gene Transcription from Libraries of DNA Binding Molecules: Inhibition of LEF-1 Mediated Gene Transcription and Oncogenic Transformation

    PubMed Central

    Stover, James S.; Shi, Jin; Jin, Wei; Vogt, Peter K.; Boger, Dale L.

    2009-01-01

    The screening of a >9000 compound library of synthetic DNA binding molecules for selective binding to the consensus sequence of the transcription factor LEF-1 followed by assessment of the candidate compounds in a series of assays that characterized functional activity (disruption of DNA–LEF-1 binding) at the intended target and site (inhibition of intracellular LEF-1 mediated gene transcription) resulting in a desired phenotypic cellular change (inhibit LEF-1 driven cell transformation) provided two lead compounds: lefmycin-1 and lefmycin-2. The sequence of screens defining the approach assures that activity in the final functional assay may be directly related to the inhibition of gene transcription and DNA binding properties of the identified molecules. Central to the implementation of this generalized approach to the discovery of DNA binding small molecule inhibitors of gene transcription was: (1) the use of a technically non-demanding fluorescent intercalator displacement (FID) assay for initial assessment of the DNA binding affinity and selectivity of a library of compounds for any sequence of interest, and (2) the technology used to prepare a sufficiently large library of DNA binding compounds. PMID:19216569

  9. Discovery of inhibitors of aberrant gene transcription from Libraries of DNA binding molecules: inhibition of LEF-1-mediated gene transcription and oncogenic transformation.

    PubMed

    Stover, James S; Shi, Jin; Jin, Wei; Vogt, Peter K; Boger, Dale L

    2009-03-11

    The screening of a >9000 compound library of synthetic DNA binding molecules for selective binding to the consensus sequence of the transcription factor LEF-1 followed by assessment of the candidate compounds in a series of assays that characterized functional activity (disruption of DNA-LEF-1 binding) at the intended target and site (inhibition of intracellular LEF-1-mediated gene transcription) resulting in a desired phenotypic cellular change (inhibit LEF-1-driven cell transformation) provided two lead compounds: lefmycin-1 and lefmycin-2. The sequence of screens defining the approach assures that activity in the final functional assay may be directly related to the inhibition of gene transcription and DNA binding properties of the identified molecules. Central to the implementation of this generalized approach to the discovery of DNA binding small molecule inhibitors of gene transcription was (1) the use of a technically nondemanding fluorescent intercalator displacement (FID) assay for initial assessment of the DNA binding affinity and selectivity of a library of compounds for any sequence of interest, and (2) the technology used to prepare a sufficiently large library of DNA binding compounds.

  10. Large-scale gene discovery in the pea aphid Acyrthosiphon pisum (Hemiptera)

    PubMed Central

    Sabater-Muñoz, Beatriz; Legeai, Fabrice; Rispe, Claude; Bonhomme, Joël; Dearden, Peter; Dossat, Carole; Duclert, Aymeric; Gauthier, Jean-Pierre; Ducray, Danièle Giblot; Hunter, Wayne; Dang, Phat; Kambhampati, Srini; Martinez-Torres, David; Cortes, Teresa; Moya, Andrès; Nakabachi, Atsushi; Philippe, Cathy; Prunier-Leterme, Nathalie; Rahbé, Yvan; Simon, Jean-Christophe; Stern, David L; Wincker, Patrick; Tagu, Denis

    2006-01-01

    Aphids are the leading pests in agricultural crops. A large-scale sequencing of 40,904 ESTs from the pea aphid Acyrthosiphon pisum was carried out to define a catalog of 12,082 unique transcripts. A strong AT bias was found, indicating a compositional shift between Drosophila melanogaster and A. pisum. An in silico profiling analysis characterized 135 transcripts specific to pea-aphid tissues (relating to bacteriocytes and parthenogenetic embryos). This project is the first to address the genetics of the Hemiptera and of a hemimetabolous insect. PMID:16542494

  11. Functional Analysis and Discovery of Microbial Genes Transforming Metallic and Organic Pollutants: Database and Experimental Tools

    SciTech Connect

    Lawrence P. Wackett; Lynda B.M. Ellis

    2004-12-09

    Microbial functional genomics is faced with a burgeoning list of genes which are denoted as unknown or hypothetical for lack of any knowledge about their function. The majority of microbial genes encode enzymes. Enzymes are the catalysts of metabolism: catabolism, anabolism, stress responses, and many other cell functions. A major problem facing microbial functional genomics is proposed here to derive from the breadth of microbial metabolism, much of which remains undiscovered. The breadth of microbial metabolism has been surveyed by the PIs and represented according to reaction types on the University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD): http://umbbd.ahc.umn.edu/search/FuncGrps.html. The database depicts metabolism of 49 chemical functional groups, representing most of current knowledge. Twice that number of chemical groups are proposed here to be metabolized by microbes. Thus, at least 50% of the unique biochemical reactions catalyzed by microbes remain undiscovered. This further suggests that many unknown and hypothetical genes encode functions yet undiscovered. This gap will be partly filled by the current proposal. The UM-BBD will be greatly expanded as a resource for microbial functional genomics. Computational methods will be developed to predict microbial metabolism which is not yet discovered. Moreover, a concentrated effort to discover new microbial metabolism will be conducted. The research will focus on metabolism of direct interest to DOE, dealing with the transformation of metals, metalloids, organometallics and toxic organics. This is precisely the type of metabolism which has been characterized most poorly to date. Moreover, these studies will directly impact functional genomic analysis of DOE-relevant genomes.

  12. Transcriptomics Analysis of Crassostrea hongkongensis for the Discovery of Reproduction-Related Genes

    PubMed Central

    Tong, Ying; Zhang, Yang; Huang, Jiaomei; Xiao, Shu; Zhang, Yuehuan; Li, Jun; Chen, Jinhui; Yu, Ziniu

    2015-01-01

    Background: The reproductive mechanisms of mollusk species have been interesting targets in biological research because of the diverse reproductive strategies observed in this phylum. These species have also been studied for the development of fishery technologies in molluscan aquaculture. Although the molecular mechanisms underlying the reproductive process have been well studied in animal models, the relevant information from mollusks remains limited, particularly in species of great commercial interest. Crassostrea hongkongensis is the dominant oyster species that is distributed along the coast of the South China Sea and little genomic information on this species is available. Currently, high-throughput sequencing techniques have been widely used for investigating the basis of physiological processes and facilitating the establishment of adequate genetic selection programs. Results: The C. hongkongensis transcriptome included a total of 1,595,855 reads, which were generated by 454 sequencing and were assembled into 41,472 contigs using de novo methods. Contigs were clustered into 33,920 isotigs and further grouped into 22,829 isogroups. Approximately 77.6% of the isogroups were successfully annotated by the Nr database. More than 1,910 genes were identified as being related to reproduction. Some key genes involved in germline development, sex determination and differentiation were identified for the first time in C. hongkongensis (nanos, piwi, ATRX, FoxL2, β-catenin, etc.). Gene expression analysis indicated that vasa, nanos, piwi, ATRX, FoxL2, β-catenin and SRD5A1 were highly or specifically expressed in C. hongkongensis gonads. Additionally, 94,056 single nucleotide polymorphisms (SNPs) and 1,699 simple sequence repeats (SSRs) were compiled. Conclusions: Our study significantly increased C. hongkongensis genomic information based on transcriptomics analysis. The group of reproduction-related genes identified in the present study constitutes a new tool for research

  13. Gene Discovery and Molecular Marker Development, Based on High-Throughput Transcript Sequencing of Paspalum dilatatum Poir

    PubMed Central

    Giordano, Andrea; Cogan, Noel O. I.; Kaur, Sukhjiwan; Drayton, Michelle; Mouradov, Aidyn; Panter, Stephen; Schrauf, Gustavo E.; Mason, John G.; Spangenberg, German C.

    2014-01-01

    Background Paspalum dilatatum Poir. (common name dallisgrass) is a native grass species of South America, with special relevance to dairy and red meat production. P. dilatatum exhibits higher forage quality than other C4 forage grasses and is tolerant to frost and water stress. This species is predominantly cultivated in an apomictic monoculture, with an inherent high risk that biotic and abiotic stresses could potentially devastate productivity. Therefore, advanced breeding strategies that characterise and use available genetic diversity, or assess germplasm collections effectively are required to deliver advanced cultivars for production systems. However, there are limited genomic resources available for this forage grass species. Results Transcriptome sequencing using second-generation sequencing platforms has been employed using pooled RNA from different tissues (stems, roots, leaves and inflorescences) at the final reproductive stage of P. dilatatum cultivar Primo. A total of 324,695 sequence reads were obtained, corresponding to c. 102 Mbp. The sequences were assembled, generating 20,169 contigs of a combined length of 9,336,138 nucleotides. The contigs were BLAST analysed against the fully sequenced grass species of Oryza sativa subsp. japonica, Brachypodium distachyon, the closely related Sorghum bicolor and foxtail millet (Setaria italica) genomes as well as against the UniRef 90 protein database allowing a comprehensive gene ontology analysis to be performed. The contigs generated from the transcript sequencing were also analysed for the presence of simple sequence repeats (SSRs). A total of 2,339 SSR motifs were identified within 1,989 contigs and corresponding primer pairs were designed. Empirical validation of a cohort of 96 SSRs was performed, with 34% being polymorphic between sexual and apomictic biotypes. Conclusions The development of genetic and genomic resources for P. dilatatum will contribute to gene discovery and expression studies

  14. Cultivation of hard-to-culture subsurface mercury-resistant bacteria and discovery of new merA gene sequences.

    PubMed

    Rasmussen, L D; Zawadsky, C; Binnerup, S J; Oregaard, G; Sørensen, S J; Kroer, N

    2008-06-01

    Mercury-resistant bacteria may be important players in mercury biogeochemistry. To assess the potential for mercury reduction by two subsurface microbial communities, resistant subpopulations and their merA genes were characterized by a combined molecular and cultivation-dependent approach. The cultivation method simulated natural conditions by using polycarbonate membranes as a growth support and a nonsterile soil slurry as a culture medium. Resistant bacteria were pregrown to microcolony-forming units (mCFU) before being plated on standard medium. Compared to direct plating, culturability was increased up to 2,800 times and numbers of mCFU were similar to the total number of mercury-resistant bacteria in the soils. Denaturing gradient gel electrophoresis analysis of DNA extracted from membranes suggested stimulation of growth of hard-to-culture bacteria during the preincubation. A total of 25 different 16S rRNA gene sequences were observed, including Alpha-, Beta-, and Gammaproteobacteria; Actinobacteria; Firmicutes; and Bacteroidetes. The diversity of isolates obtained by direct plating included eight different 16S rRNA gene sequences (Alpha- and Betaproteobacteria and Actinobacteria). Partial sequencing of merA of selected isolates led to the discovery of new merA sequences. With phylum-specific merA primers, PCR products were obtained for Alpha- and Betaproteobacteria and Actinobacteria but not for Bacteroidetes and Firmicutes. The similarity to known sequences ranged between 89 and 95%. One of the sequences did not result in a match in the BLAST search. The results illustrate the power of integrating advanced cultivation methodology with molecular techniques for the characterization of the diversity of mercury-resistant populations and assessing the potential for mercury reduction in contaminated environments.

  15. High-Throughput, Motility-Based Sorter for Microswimmers and Gene Discovery Platform

    NASA Astrophysics Data System (ADS)

    Yuan, Jinzhou; Raizen, David; Bau, Haim

    2015-11-01

    Animal motility varies with genotype, disease progression, aging, and environmental conditions. In many studies, it is desirable to carry out high throughput motility-based sorting to isolate rare animals for, among other things, forward genetic screens to identify genetic pathways that regulate phenotypes of interest. Many commonly used screening processes are labor-intensive, lack sensitivity, and require extensive investigator training. Here, we describe a sensitive, high throughput, automated, motility-based method for sorting nematodes. Our method was implemented in a simple microfluidic device capable of sorting many thousands of animals per hour per module, and is amenable to parallelism. The device successfully enriched for known C. elegans motility mutants. Furthermore, using this device, we isolated low-abundance mutants capable of suppressing the somnogenic effects of the flp-13 gene, which regulates sleep-like quiescence in C. elegans. Subsequent genomic sequencing led to the identification of a flp-13-suppressor gene. This research was supported, in part, by NIH NIA Grant 5R03AG042690-02.

  16. A large-scale gene discovery for the red palm weevil Rhynchophorus ferrugineus (Coleoptera: Curculionidae).

    PubMed

    Wang, Lei; Zhang, Xiao-Wei; Pan, Lin-Lin; Liu, Wan-Fei; Wang, Da-Peng; Zhang, Guang-Yu; Yin, Yu-Xin; Yin, An; Jia, Shan-Gang; Yu, Xiao-Guang; Sun, Gao-Yuan; Hu, Song-Nian; Al-Mssallem, Ibrahim S; Yu, Jun

    2013-12-01

    The red palm weevil (RPW; Rhynchophorus ferrugineus) is a devastating pest of palms, prevalent in the Middle East as well as many other regions of the world. Here, we report a large-scale de novo complementary DNA (cDNA) sequencing effort that acquired ∼5 million reads and assembled them into 26 765 contigs from 12 libraries made from samples of different RPW developmental stages based on the Roche/454 GS FLX platform. We annotated these contigs based on the publicly available known insect genes and the Tribolium castaneum genome assembly. We find that over 80% of coding sequences (CDS) from the RPW contigs have high-identity homologs to known proteins with complete CDS. Gene expression analysis shows that the pupa and larval stages have the highest and lowest expression levels, respectively. In addition, we identified more than 60 000 single nucleotide polymorphisms and 1 200 simple sequence repeat markers. This study provides the first large-scale cDNA dataset for RPW, a much-needed resource for future molecular studies.

  17. Discovery of genes coding for carbohydrate-active enzyme by metagenomic analysis of lignocellulosic biomasses

    PubMed Central

    Montella, Salvatore; Ventorino, Valeria; Lombard, Vincent; Henrissat, Bernard; Pepe, Olimpia; Faraco, Vincenza

    2017-01-01

    In this study, a high-throughput sequencing approach was applied to discover novel biocatalysts for lignocellulose hydrolysis from three dedicated energy crops, Arundo donax, Eucalyptus camaldulensis and Populus nigra, after natural biodegradation. The microbiomes of the three lignocellulosic biomasses were dominated by bacterial species (approximately 90%) with the highest representation by the Streptomyces genus both in the total microbial community composition and in the microbial diversity related to GH families of predicted ORFs. Moreover, the functional clustering of the predicted ORFs showed a prevalence of poorly characterized genes, suggesting these lignocellulosic biomasses are potential sources of as yet unknown genes. 1.2%, 0.6% and 3.4% of the total ORFs detected in A. donax, E. camaldulensis and P. nigra, respectively, were putative Carbohydrate-Active Enzymes (CAZymes). Interestingly, the glycoside hydrolases abundance in P. nigra (1.8%) was higher than that detected in the other biomasses investigated in this study. Moreover, a high percentage of (hemi)cellulases with different activities and accessory enzymes (mannanases, polygalacturonases and feruloyl esterases) was detected, confirming that the three analyzed samples were a reservoir of diversified biocatalysts required for an effective lignocellulose saccharification. PMID:28198423

  18. Gene discovery in the Antarctic fur seal (Arctocephalus gazella) skin transcriptome.

    PubMed

    Hoffman, Joseph I

    2011-07-01

    Next-generation sequencing provides a powerful new approach for developing functional genomic tools for nonmodel species, helping to narrow the gap between studies of model organisms and those of natural populations. Consequently, massively parallel 454 sequencing was used to characterize a normalized cDNA library derived from skin biopsy samples of twelve Antarctic fur seal (Arctocephalus gazella) individuals. Over 412 Mb of sequence data were generated, comprising 1.4 million reads of average length 286 bp. De novo assembly using Newbler 2.3 yielded 156 contigs plus 22,869 isotigs, which in turn clustered into 18,576 isogroups. Almost half of the assembled transcript sequences showed significant similarity to the nr database, revealing a functionally diverse array of genes. Moreover, 97.9% of these mapped to the dog (Canis lupus familiaris) genome, with a strong positive relationship between the number of sequences locating to a given chromosome and the length of that chromosome in the dog indicating a broad genomic distribution. Average depth of coverage was also almost 20-fold, sufficient to detect several thousand putative microsatellite loci and single nucleotide polymorphisms. This study constitutes an important step towards developing genomic resources with which to address consequential questions in pinniped ecology and evolution. It also supports an earlier but smaller study showing that skin tissue can be a rich source of expressed genes, with important implications for studying the genomics not only of marine mammals, but also more generally of species that cannot be destructively sampled.

  19. Positive selection systems for discovery of novel polyester biosynthesis genes based on fatty acid detoxification.

    PubMed Central

    Kranz, R G; Gabbert, K K; Madigan, M T

    1997-01-01

    The photosynthetic bacterium Rhodobacter capsulatus can grow with short- to long-chain fatty acids as the sole carbon source (R. G. Kranz, K. K. Gabbert, T. A. Locke, and M. T. Madigan, Appl. Environ. Microbiol. 63:3003-3009, 1997). Concomitant with growth on fatty acids is the production to high levels of the polyester storage compounds called polyhydroxyalkanoates (PHAs). Here, we describe colony screening and selection systems to analyze the production of PHAs in R. capsulatus. A screen with Nile red dissolved in acetone distinguishes between PHA producers and nonproducers. Unlike the wild type, an R. capsulatus PhaC- strain with the gene encoding PHA synthase deleted is unable to grow on solid media containing high concentrations of certain fatty acids. It is proposed that this deficiency is due to the inability of the PhaC- strain to detoxify the surrounding medium by consumption of fatty acids and their incorporation into PHAs. This fatty acid toxicity phenotype is used in selection for the cloning and characterization of heterologous phaC genes. PMID:9251190

  20. Hexosamine template. A platform for modulating gene expression and for sugar-based drug discovery.

    PubMed

    Elmouelhi, Noha; Aich, Udayanath; Paruchuri, Venkata D P; Meledeo, M Adam; Campbell, Christopher T; Wang, Jean J; Srinivas, Raja; Khanna, Hargun S; Yarema, Kevin J

    2009-04-23

    This study investigates the breadth of cellular responses engendered by short chain fatty acid (SCFA)-hexosamine hybrid molecules, a class of compounds long used in "metabolic glycoengineering" that are now emerging as drug candidates. First, a "mix and match" strategy showed that different SCFA (n-butyrate and acetate) appended to the same core sugar altered biological activity, complementing previous results [Campbell et al. J. Med. Chem. 2008, 51, 8135-8147] where a single type of SCFA elicited distinct responses. Microarray profiling then compared transcriptional responses engendered by regioisomerically modified ManNAc, GlcNAc, and GalNAc analogues in MDA-MB-231 cells. These data, which were validated by qRT-PCR or Western analysis for ID1, TP53, HPSE, NQO1, EGR1, and VEGFA, showed a two-pronged response where a core set of genes was coordinately regulated by all analogues while each analogue simultaneously uniquely regulated a larger number of genes. Finally, AutoDock modeling supported a mechanism where the analogues directly interact with elements of the NF-kappaB pathway. Together, these results establish the SCFA-hexosamine template as a versatile platform for modulating biological activity and developing new therapeutics.

  1. RNA-seq analysis for plant carnivory gene discovery in Nepenthes × ventrata

    PubMed Central

    Wan Zakaria, Wan Nor Adibah; Loke, Kok-Keong; Goh, Hoe-Han; Mohd Noor, Normah

    2015-01-01

    Carnivorous plants have the ability to capture and digest insects for nutrients, which allows them to survive in land deprived of nitrogenous nutrients. Nepenthes spp. are carnivorous plants that uniquely produce a pitcher from the tip of an elongated leaf. This study provides the first transcriptome resource from the pitcher of a Nepenthes ventricosa × Nepenthes alata hybrid, Nepenthes × ventrata, to understand the carnivory mechanism in Nepenthes spp., as well as in other carnivorous species. Raw reads and the transcriptome assembly project have been deposited in the SRA database with the accession numbers SRX1389337 (day 0 control), SRX1389392 (day 3 longevity), and SRX1389395 (day 3 chitin-treated). PMID:26981352

  2. The long (and winding) road to gene discovery for canine hip dysplasia.

    PubMed

    Zhu, Lan; Zhang, Zhiwu; Friedenberg, Steven; Jung, Seung-Woo; Phavaphutanon, Janjira; Vernier-Singer, Margaret; Corey, Elizabeth; Mateescu, Raluca; Dykes, Nathan; Sandler, Jody; Acland, Gregory; Lust, George; Todhunter, Rory

    2009-08-01

    Hip dysplasia is a common inherited trait of dogs that results in secondary osteoarthritis. In this article the methods used to uncover the mutations contributing to this condition are reviewed, beginning with hip phenotyping. Coarse, genome-wide, microsatellite-based screens of pedigrees of greyhounds and dysplastic Labrador retrievers were used to identify linked quantitative trait loci (QTL). Fine-mapping across two chromosomes (CFA11 and 29) was employed using single nucleotide polymorphism (SNP) genotyping. Power analyses and preferential selection of dogs for ongoing SNP-based genotyping is described with the aim of refining the QTL intervals to 1-2 megabases on these and several additional chromosomes prior to candidate gene screening. The review considers how a mutation or a genetic marker such as a SNP or haplotype of SNPs might be combined with pedigree and phenotype information to create a 'breeding value' that could improve the accuracy of predicting a dog's hip conformation.

  3. Discovery and characterization of a novel CCND1/MRCK gene fusion in mantle cell lymphoma.

    PubMed

    Masamha, Chioniso Patience; Albrecht, Todd R; Wagner, Eric J

    2016-03-29

    The t(11;14) translocation resulting in constitutive cyclin D1 expression is an early event in mantle cell lymphoma (MCL) transformation. Patients with a highly proliferative phenotype produce cyclin D1 transcripts with truncated 3'UTRs that evade miRNA regulation. Here, we report the recurrence of a novel gene fusion in MCL cell lines and MCL patient isolates that consists of the full protein coding region of cyclin D1 (CCND1) and a 3'UTR consisting of sequences from both the CCND1 3'UTR and myotonic dystrophy kinase-related Cdc42-binding kinase's (MRCK) intron one. The resulting CCND1/MRCK mRNA is resistant to CCND1-targeted miRNA regulation, and targeting the MRCK region of the chimeric 3'UTR with siRNA results in decreased CCND1 levels.

  4. Microfluidic droplet-based PCR instrumentation for high-throughput gene expression profiling and biomarker discovery

    PubMed Central

    Hayes, Christopher J.; Dalton, Tara M.

    2015-01-01

    PCR is a common and often indispensable technique used in medical and biological research labs for a variety of applications. Real-time quantitative PCR (RT-qPCR) has become a definitive technique for quantitating differences in gene expression levels between samples. Yet, in spite of this importance, reliable methods to quantitate nucleic acid amounts in a higher throughput remain elusive. In the following paper, a unique design to quantify gene expression levels at the nanoscale in a continuous flow system is presented. Fully automated, high-throughput, low volume amplification of deoxynucleotides (DNA) in a droplet based microfluidic system is described. Unlike some conventional qPCR instrumentation that uses integrated fluidic circuits or plate arrays, the instrument performs qPCR in a continuous, micro-droplet flowing process with droplet generation, distinctive reagent mixing, thermal cycling and optical detection platforms all combined on one complete instrument. Detailed experimental profiling of reactions of less than 300 nl total volume is achieved using the platform, demonstrating a dynamic range of four orders of magnitude and consistent instrument sensitivity. Furthermore, pipetting steps reduced by as much as 90% and a unique degree of hands-free automation make the analytical possibilities for this instrumentation far reaching. In conclusion, a discussion of the first demonstrations of this approach to perform novel, continuous high-throughput biological screens is presented. The results generated from the instrument, when compared with commercial instrumentation, demonstrate the instrument reliability and robustness to carry out further studies of clinical significance with added throughput and economic benefits. PMID:27077035

  5. Discovery of Gene Sources for Economic Traits in Hanwoo by Whole-genome Resequencing

    PubMed Central

    Shin, Younhee; Jung, Ho-jin; Jung, Myunghee; Yoo, Seungil; Subramaniyam, Sathiyamoorthy; Markkandan, Kesavan; Kang, Jun-Mo; Rai, Rajani; Park, Junhyung; Kim, Jong-Joo

    2016-01-01

    Hanwoo, a Korean native cattle (Bos taurus coreana), has great economic value due to high meat quality. Also, the breed has genetic variations that are associated with production traits such as health, disease resistance, reproduction, growth as well as carcass quality. In this study, next generation sequencing technologies and the availability of an appropriate reference genome were applied to discover a large amount of single nucleotide polymorphisms (SNPs) in ten Hanwoo bulls. Analysis of whole-genome resequencing generated a total of 26.5 Gb data, of which 594,716,859 and 592,990,750 reads covered 98.73% and 93.79% of the bovine reference genomes of UMD 3.1 and Btau 4.6.1, respectively. In total, 2,473,884 and 2,402,997 putative SNPs were discovered, of which 1,095,922 (44.3%) and 982,674 (40.9%) novel SNPs were discovered against UMD3.1 and Btau 4.6.1, respectively. Among the SNPs, the 46,301 (UMD 3.1) and 28,613 SNPs (Btau 4.6.1) that were identified as Hanwoo-specific SNPs were included in the functional genes that may be involved in the mechanisms of milk production, tenderness, juiciness, marbling of Hanwoo beef and yellow hair. Most of the Hanwoo-specific SNPs were identified in the promoter region, suggesting that the SNPs influence differential expression of the regulated genes relative to the relevant traits. In particular, the non-synonymous (ns) SNPs found in CORIN, which is a negative regulator of Agouti, might be a causal variant to determine yellow hair of Hanwoo. Our results will provide abundant genetic sources of variation to characterize Hanwoo genetics and for subsequent breeding. PMID:26954201

  6. Discovery of molecular associations among aging, stem cells, and cancer based on gene expression profiling.

    PubMed

    Wang, Xiaosheng

    2013-04-01

    The emergence of a huge volume of "omics" data enables a computational approach to the investigation of the biology of cancer. The cancer informatics approach is a useful supplement to the traditional experimental approach. I reviewed several reports that used a bioinformatics approach to analyze the associations among aging, stem cells, and cancer by microarray gene expression profiling. The high expression of aging- or human embryonic stem cell-related molecules in cancer suggests that certain important mechanisms are commonly underlying aging, stem cells, and cancer. These mechanisms are involved in cell cycle regulation, metabolic process, DNA damage response, apoptosis, p53 signaling pathway, immune/inflammatory response, and other processes, suggesting that cancer is a developmental and evolutional disease that is strongly related to aging. Moreover, these mechanisms demonstrate that the initiation, proliferation, and metastasis of cancer are associated with the deregulation of stem cells. These findings provide insights into the biology of cancer. Certainly, the findings that are obtained by the informatics approach should be justified by experimental validation. This review also noted that next-generation sequencing data provide enriched sources for cancer informatics study.

  7. Identification and Validation of HCC-specific Gene Transcriptional Signature for Tumor Antigen Discovery

    PubMed Central

    Petrizzo, Annacarmen; Caruso, Francesca Pia; Tagliamonte, Maria; Tornesello, Maria Lina; Ceccarelli, Michele; Costa, Valerio; Aprile, Marianna; Esposito, Roberta; Ciliberto, Gennaro; Buonaguro, Franco M.; Buonaguro, Luigi

    2016-01-01

    A novel two-step bioinformatics strategy was applied for identification of signatures with therapeutic implications in hepatitis-associated HCC. Transcriptional profiles from HBV- and HCV-associated HCC samples were compared with non-tumor liver controls. Resulting HCC modulated genes were subsequently compared with different non-tumor tissue samples. Two related signatures were identified, namely “HCC-associated” and “HCC-specific”. Expression data were validated by RNA-Seq analysis carried out on unrelated HCC samples and protein expression was confirmed according to The Human Protein Atlas (http://proteinatlas.org/), a public repository of immunohistochemistry data. Among all, aldo-keto reductase family 1 member B10 and IGF2 mRNA-binding protein 3 were found to be strictly HCC-specific, with no expression in 18/20 normal tissues. Target peptides for vaccine design were predicted for both proteins associated with the most prevalent HLA-class I and II alleles. The described novel strategy proved feasible for identification of HCC-specific proteins as high-potential targets for HCC immunotherapy. PMID:27387388

  8. Discovery, evaluation and distribution of haplotypes of the wheat Ppd-D1 gene.

    PubMed

    Guo, Zhiai; Song, Yanxia; Zhou, Ronghua; Ren, Zhenglong; Jia, Jizeng

    2010-02-01

    Ppd-D1 is one of the most potent genes affecting the photoperiod response of wheat (Triticum aestivum). Only two alleles, insensitive Ppd-D1a and sensitive Ppd-D1b, were known previously, and these did not adequately explain the broad adaptation of wheat to photoperiod variation. In this study, five diagnostic molecular markers were employed to identify Ppd-D1 haplotypes in 492 wheat varieties from diverse geographic locations and 55 accessions of Aegilops tauschii, the D genome donor species of wheat. Six Ppd-D1 haplotypes, designated I-VI, were identified. Types II, V and VI were considered to be more ancient and types I, III and IV were considered to be derived from type II. The transcript abundances of the Ppd-D1 haplotypes showed continuous variation, being highest for haplotype I, lowest for haplotype III, and correlating negatively with varietal differences in heading time. These haplotypes also significantly affected other agronomic traits. The distribution frequency of Ppd-D1 haplotypes showed partial correlations with both latitudes and altitudes of wheat cultivation regions. The evidence on the evolution, expression and distribution of Ppd-D1 haplotypes was mutually consistent. What was regarded as a pair of alleles in the past can now be considered a series of alleles leading to continuous variation.

  9. The long (and winding) road to gene discovery for canine hip dysplasia

    PubMed Central

    Zhu, Lan; Zhang, Zhiwu; Friedenberg, Steven; Jung, Seung-Woo; Phavaphutanon, Janjira; Vernier-Singer, Margaret; Corey, Elizabeth; Mateescu, Raluca; Dykes, Nathan; Sandler, Jody; Acland, Gregory; Lust, George; Todhunter, Rory

    2009-01-01

    Hip dysplasia is a common inherited trait of dogs that results in secondary osteoarthritis. In this article the methods used to uncover the mutations contributing to this condition are reviewed, beginning with hip phenotyping. Coarse, genome-wide, microsatellite-based screens of pedigrees of greyhounds and dysplastic Labrador retrievers were used to identify linked quantitative trait loci (QTL). Fine-mapping across two chromosomes (CFA11 and 29) was employed using single nucleotide polymorphism (SNP) genotyping. Power analyses and preferential selection of dogs for ongoing SNP-based genotyping is described with the aim of refining the QTL intervals to 1–2 megabases on these and several additional chromosomes prior to candidate gene screening. The review considers how a mutation or a genetic marker such as a SNP or haplotype of SNPs might be combined with pedigree and phenotype information to create a ‘breeding value’ that could improve the accuracy of predicting a dog’s hip conformation. PMID:19297220

  10. The Fragile X Mental Retardation Syndrome 20 Years After the FMR1 Gene Discovery: an Expanding Universe of Knowledge

    PubMed Central

    Rousseau, François; Labelle, Yves; Bussières, Johanne; Lindsay, Carmen

    2011-01-01

    The fragile X mental retardation (FXMR) syndrome is one of the most frequent causes of mental retardation. Affected individuals display a wide range of additional characteristic features including behavioural and physical phenotypes, and the extent to which individuals are affected is highly variable. For these reasons, elucidation of the pathophysiology of this disease has been an important challenge to the scientific community. 1991 marks the year of the discovery of both the FMR1 gene mutations involved in this disease, and of their dynamic nature. Although a mouse model for the disease has been available for 16 years and extensive research has been performed on the FMR1 protein (FMRP), we still understand little about how the disease develops, and no treatment has yet been shown to be effective. In this review, we summarise current knowledge on FXMR with an emphasis on the technical challenges of molecular diagnostics, on its prevalence and dynamics among populations, and on the potential of screening for FMR1 mutations. PMID:21912443

  11. De Novo Regulatory Motif Discovery Identifies Significant Motifs in Promoters of Five Classes of Plant Dehydrin Genes

    PubMed Central

    Zolotarov, Yevgen; Strömvik, Martina

    2015-01-01

    Plants accumulate dehydrins in response to osmotic stresses. Dehydrins are divided into five different classes, which are thought to be regulated in different manners. To better understand differences in transcriptional regulation of the five dehydrin classes, de novo motif discovery was performed on 350 dehydrin promoter sequences from a total of 51 plant genomes. Overrepresented motifs were identified in the promoters of five dehydrin classes. The Kn dehydrin promoters contain motifs linked with meristem specific expression, as well as motifs linked with cold/dehydration and abscisic acid response. KS dehydrin promoters contain a motif with a GATA core. SKn and YnSKn dehydrin promoters contain motifs that match elements connected with cold/dehydration, abscisic acid and light response. YnKn dehydrin promoters contain motifs that match abscisic acid and light response elements, but not cold/dehydration response elements. Conserved promoter motifs are present in the dehydrin classes and across different plant lineages, indicating that dehydrin gene regulation is likely also conserved. PMID:26114291

  12. The fragile x mental retardation syndrome 20 years after the FMR1 gene discovery: an expanding universe of knowledge.

    PubMed

    Rousseau, François; Labelle, Yves; Bussières, Johanne; Lindsay, Carmen

    2011-08-01

    The fragile X mental retardation (FXMR) syndrome is one of the most frequent causes of mental retardation. Affected individuals display a wide range of additional characteristic features including behavioural and physical phenotypes, and the extent to which individuals are affected is highly variable. For these reasons, elucidation of the pathophysiology of this disease has been an important challenge to the scientific community. 1991 marks the year of the discovery of both the FMR1 gene mutations involved in this disease, and of their dynamic nature. Although a mouse model for the disease has been available for 16 years and extensive research has been performed on the FMR1 protein (FMRP), we still understand little about how the disease develops, and no treatment has yet been shown to be effective. In this review, we summarise current knowledge on FXMR with an emphasis on the technical challenges of molecular diagnostics, on its prevalence and dynamics among populations, and on the potential of screening for FMR1 mutations.

  13. EST-SNP discovery and dense genetic mapping in lentil (Lens culinaris Medik.) enable candidate gene selection for boron tolerance.

    PubMed

    Kaur, Sukhjiwan; Cogan, Noel O I; Stephens, Amber; Noy, Dianne; Butsch, Mirella; Forster, John W; Materne, Michael

    2014-03-01

    Large-scale SNP discovery and dense genetic mapping in a lentil intraspecific cross permitted identification of a single chromosomal region controlling tolerance to boron toxicity, an important breeding objective. Lentil (Lens culinaris Medik.) is a highly nutritious food legume crop that is cultivated world-wide. Until recently, lentil has been considered a genomic 'orphan' crop, limiting the feasibility of marker-assisted selection strategies in breeding programs. The present study reports on the identification of single-nucleotide polymorphisms (SNPs) from transcriptome sequencing data, utilisation of expressed sequence tag (EST)-derived simple sequence repeat (SSR) and SNP markers for construction of a gene-based genetic linkage map, and identification of markers in close linkage to major QTLs for tolerance to boron (B) toxicity. A total of 2,956 high-quality SNP markers were identified from a lentil EST database. Sub-sets of 546 SSRs and 768 SNPs were further used for genetic mapping of an intraspecific mapping population (Cassab × ILL2024) that exhibits segregation for B tolerance. Comparative analysis of the lentil linkage map with the sequenced genomes of Medicago truncatula Gaertn., soybean (Glycine max [L.] Merr.) and Lotus japonicus L. indicated blocks of conserved macrosynteny, as well as a number of rearrangements. A single genomic region was found to be associated with variation for B tolerance in lentil, based on evaluation performed over 2 years. Comparison of flanking markers to genome sequences of model species (M. truncatula, soybean and Arabidopsis thaliana) identified candidate genes that are functionally associated with B tolerance, and could potentially be used for diagnostic marker development in lentil.

  14. Detection, distribution, and organohalogen compound discovery implications of the reduced flavin adenine dinucleotide-dependent halogenase gene in major filamentous actinomycete taxonomic groups.

    PubMed

    Gao, Peng; Huang, Ying

    2009-07-01

    Halogenases have been shown to play a significant role in biosynthesis and introducing the bioactivity of many halogenated secondary metabolites. In this study, 54 reduced flavin adenine dinucleotide (FADH(2))-dependent halogenase gene-positive strains were identified after the PCR screening of a large collection of 228 reference strains encompassing all major families and genera of filamentous actinomycetes. The wide distribution of this gene was observed to extend to some rare lineages with higher occurrences and large sequence diversity. Subsequent phylogenetic analyses revealed that strains containing highly homologous halogenases tended to produce halometabolites with similar structures, and halogenase genes are likely to propagate by horizontal gene transfer as well as vertical inheritance within actinomycetes. Higher percentages of halogenase gene-positive strains than those of halogenase gene-negative ones contained polyketide synthase genes and/or nonribosomal peptide synthetase genes or displayed antimicrobial activities in the tests applied, indicating their genetic and physiological potentials for producing secondary metabolites. The robustness of this halogenase gene screening strategy for the discovery of particular biosynthetic gene clusters in rare actinomycetes besides streptomycetes was further supported by genome-walking analysis. The described distribution and phylogenetic implications of the FADH(2)-dependent halogenase gene present a guide for strain selection in the search for novel organohalogen compounds from actinomycetes.

  15. The Discovery Method in Training.

    ERIC Educational Resources Information Center

    Belbin, R. M.

    In the form of a discussion between faceless people, this booklet concerns discovery learning and its advantages. Subjects covered in the discussions are: Introducing the Discovery Method; An Experiment with British Railways; The OECD Research Projects in U.S.A., Austria, and Sweden; How the Discovery Method Differs from Other Methods; Discovery…

  16. Analysis of Gene Expression Profiles in Leaf Tissues of Cultivated Peanuts and Development of EST-SSR Markers and Gene Discovery

    PubMed Central

    Guo, Baozhu; Chen, Xiaoping; Hong, Yanbin; Liang, Xuanqiang; Dang, Phat; Brenneman, Tim; Holbrook, Corley; Culbreath, Albert

    2009-01-01

    Peanut is vulnerable to a range of foliar diseases such as spotted wilt caused by Tomato spotted wilt virus (TSWV), early (Cercospora arachidicola) and late (Cercosporidium personatum) leaf spots, southern stem rot (Sclerotium rolfsii), and sclerotinia blight (Sclerotinia minor). In this study, we report the generation of 17,376 peanut expressed sequence tags (ESTs) from leaf tissues of a peanut cultivar (Tifrunner, resistant to TSWV and leaf spots) and a breeding line (GT-C20, susceptible to TSWV and leaf spots). After trimming vector and discarding low-quality sequences, a total of 14,432 high-quality ESTs were selected for further analysis and deposition to GenBank. Sequence clustering resulted in 6,888 unique ESTs composed of 1,703 tentative consensus (TC) sequences and 5,185 singletons. A large number of ESTs (5,717) representing genes of unknown function were also identified. Among the unique sequences, 856 EST-SSRs were identified. A total of 290 new EST-based SSR markers were developed and examined for amplification and polymorphism in cultivated peanut and wild species. Resequencing of selected amplified alleles revealed that allelic diversity could be attributed mainly to differences in repeat type and length in the SSR regions. A few additional INDEL mutations and substitutions were also observed in the regions flanking the microsatellites. In addition, some defense-related transcripts were identified, such as putative oxalate oxidase (EU024476) and NBS-LRR domains. The EST data in this study provide a new source of information for gene discovery and development of SSR markers in cultivated peanut. A total of 16,931 ESTs have been deposited to the NCBI GenBank database with accession numbers ES751523 to ES768453. PMID:19584933
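
    The abstract does not say which tool or thresholds were used to find the 856 EST-SSRs. As a minimal sketch only, perfect microsatellites can be located with a back-referencing regular expression; the function name `find_ssrs` and the minimum repeat counts below are assumptions for illustration, not the authors' settings.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    # Hypothetical minimum repeat counts per motif length (di- to hexanucleotide).
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in min_repeats.items():
        # Group 2 is the motif; \2 requires at least (min_rep - 1) further copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            # Skip motifs that are a shorter unit repeated (e.g. "AA", "AGAG"),
            # since those are reported under the shorter motif length instead.
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((m.start(), motif, len(m.group(1)) // motif_len))
    return sorted(hits)


# Toy sequence: an (AG)8 repeat and a (GAT)5 repeat separated by spacer bases.
example = "GGC" + "AG" * 8 + "TTCA" + "GAT" * 5 + "C"
ssrs = find_ssrs(example)
```

    Each hit gives the 0-based start position, the repeat motif, and the number of copies; primers for a marker would then be designed in the flanking sequence.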

  17. Genomic resources for gene discovery, functional genome annotation, and evolutionary studies of maize and its close relatives.

    PubMed

    Wang, Chao; Shi, Xue; Liu, Lin; Li, Haiyan; Ammiraju, Jetty S S; Kudrna, David A; Xiong, Wentao; Wang, Hao; Dai, Zhaozhao; Zheng, Yonglian; Lai, Jinsheng; Jin, Weiwei; Messing, Joachim; Bennetzen, Jeffrey L; Wing, Rod A; Luo, Meizhong

    2013-11-01

    Maize is one of the most important food crops and a key model for genetics and developmental biology. A genetically anchored and high-quality draft genome sequence of maize inbred B73 has been obtained to serve as a reference sequence. To facilitate evolutionary studies in maize and its close relatives, much like the Oryza Map Alignment Project (OMAP) (www.OMAP.org) bacterial artificial chromosome (BAC) resource did for the rice community, we constructed BAC libraries for maize inbred lines Zheng58, Chang7-2, and Mo17 and maize wild relatives Zea mays ssp. parviglumis and Tripsacum dactyloides. Furthermore, to extend functional genomic studies to maize and sorghum, we also constructed binary BAC (BIBAC) libraries for the maize inbred B73 and the sorghum landrace Nengsi-1. The BAC/BIBAC vectors facilitate transfer of large intact DNA inserts from BAC clones to the BIBAC vector and functional complementation of large DNA fragments. These seven Zea Map Alignment Project (ZMAP) BAC/BIBAC libraries have average insert sizes ranging from 92 to 148 kb, organellar DNA from 0.17 to 2.3%, empty vector rates between 0.35 and 5.56%, and genome equivalents of 4.7- to 8.4-fold. The usefulness of the Parviglumis and Tripsacum BAC libraries was demonstrated by mapping clones to the reference genome. Novel genes and alleles present in these ZMAP libraries can now be used for functional complementation studies and positional or homology-based cloning of genes for translational genomics.
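
    The genome-equivalent figures quoted above are conventionally estimated as usable clones times average insert size divided by genome size. The abstract does not give the formula or the clone counts, so the sketch below is an assumption-laden illustration: the function name, the clone count, and the roughly 2,300 Mb genome size are all hypothetical inputs.

```python
def genome_equivalents(n_clones, avg_insert_kb, genome_size_mb,
                       empty_rate=0.0, organellar_rate=0.0):
    """Estimate library coverage in genome equivalents (fold coverage)."""
    # Discount clones with no insert or with organellar (non-nuclear) DNA.
    usable_clones = n_clones * (1.0 - empty_rate - organellar_rate)
    # Convert the genome size from Mb to kb so the units cancel.
    return usable_clones * avg_insert_kb / (genome_size_mb * 1000.0)


# Hypothetical library: 100,000 clones, 140 kb average insert,
# 2% empty vectors, 1% organellar DNA, ~2,300 Mb genome.
coverage = genome_equivalents(100_000, 140, 2300,
                              empty_rate=0.02, organellar_rate=0.01)
```

    With these made-up inputs the estimate is about 5.9-fold, which happens to fall inside the 4.7- to 8.4-fold range reported for the ZMAP libraries.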

  18. Guided Discoveries.

    ERIC Educational Resources Information Center

    Ehrlich, Amos

    1991-01-01

    Presented are four mathematical discoveries made by students on an arithmetical function using the Fibonacci sequence. Discussed is the nature of the role of the teacher in directing the students' discovery activities. (KR)

  19. Gene discovery in EST sequences from the wheat leaf rust fungus Puccinia triticina sexual spores, asexual spores and haustoria, compared to other rust and corn smut fungi

    PubMed Central

    2011-01-01

    Background: Rust fungi are biotrophic basidiomycete plant pathogens that cause major diseases on plants and trees world-wide, affecting agriculture and forestry. Their biotrophic nature precludes many established molecular genetic manipulations and lines of research. The generation of genomic resources for these microbes is leading to novel insights into biology such as interactions with the hosts and guiding directions for breakthrough research in plant pathology. Results: To support gene discovery and gene model verification in the genome of the wheat leaf rust fungus, Puccinia triticina (Pt), we have generated Expressed Sequence Tags (ESTs) by sampling several life cycle stages. We focused on several spore stages and isolated haustorial structures from infected wheat, generating 17,684 ESTs. We produced sequences from both the sexual (pycniospores, aeciospores and teliospores) and asexual (germinated urediniospores) stages of the life cycle. From pycniospores and aeciospores, produced by infecting the alternate host, meadow rue (Thalictrum speciosissimum), 4,869 and 1,292 reads were generated, respectively. We generated 3,703 ESTs from teliospores produced on the senescent primary wheat host. Finally, we generated 6,817 reads from haustoria isolated from infected wheat as well as 1,003 sequences from germinated urediniospores. Along with 25,558 previously generated ESTs, we compiled a database of 13,328 non-redundant sequences (4,506 singlets and 8,822 contigs). Fungal genes were predicted using the EST version of the self-training GeneMarkS algorithm. To refine the EST database, we compared EST sequences by BLASTN to a set of 454 pyrosequencing-generated contigs and Sanger BAC-end sequences derived both from the Pt genome, and to ESTs and genome reads from wheat. A collection of 6,308 fungal genes was identified and compared to sequences of the cereal rusts, Puccinia graminis f. sp. tritici (Pgt) and stripe rust, P. striiformis f. sp. tritici (Pst), and poplar

  20. A novel approach to the discovery of survival biomarkers in glioblastoma using a joint analysis of DNA methylation and gene expression.

    PubMed

    Smith, Ashley A; Huang, Yen-Tsung; Eliot, Melissa; Houseman, E Andres; Marsit, Carmen J; Wiencke, John K; Kelsey, Karl T

    2014-06-01

    Glioblastoma multiforme (GBM) is the most aggressive of all brain tumors, with a median survival of less than 1.5 years. Recently, epigenetic alterations were found to play key roles in both glioma genesis and clinical outcome, demonstrating the need to integrate genetic and epigenetic data in predictive models. To enhance current models through discovery of novel predictive biomarkers, we employed a genome-wide, agnostic strategy to specifically capture both methylation-directed changes in gene expression and alternative associations of DNA methylation with disease survival in glioma. Human GBM-associated DNA methylation, gene expression, IDH1 mutation status, and survival data were obtained from The Cancer Genome Atlas. DNA methylation loci and expression probes were paired by gene, and their subsequent association with survival was determined by applying an accelerated failure time model to previously published alternative and expression-based association equations. Significant associations were seen in 27 unique methylation/expression pairs with expression-based, alternative, and combinatorial associations observed (10, 13, and 4 pairs, respectively). The majority of the predictive DNA methylation loci were located within CpG islands, and all but three of the locus pairs were negatively correlated with survival. This finding suggests that for most loci, methylation/expression pairs are inversely related, consistent with methylation-associated gene regulatory action. Our results indicate that changes in DNA methylation are associated with altered survival outcome through both coordinated changes in gene expression and alternative mechanisms. Furthermore, our approach offers an alternative method of biomarker discovery using a priori gene pairing and precise targeting to identify novel sites for locus-specific therapeutic intervention.

  1. A novel approach to the discovery of survival biomarkers in glioblastoma using a joint analysis of DNA methylation and gene expression

    PubMed Central

    Smith, Ashley A; Huang, Yen-Tsung; Eliot, Melissa; Houseman, E Andres; Marsit, Carmen J; Wiencke, John K; Kelsey, Karl T

    2014-01-01

    Glioblastoma multiforme (GBM) is the most aggressive of all brain tumors, with a median survival of less than 1.5 years. Recently, epigenetic alterations were found to play key roles in both glioma genesis and clinical outcome, demonstrating the need to integrate genetic and epigenetic data in predictive models. To enhance current models through discovery of novel predictive biomarkers, we employed a genome-wide, agnostic strategy to specifically capture both methylation-directed changes in gene expression and alternative associations of DNA methylation with disease survival in glioma. Human GBM-associated DNA methylation, gene expression, IDH1 mutation status, and survival data were obtained from The Cancer Genome Atlas. DNA methylation loci and expression probes were paired by gene, and their subsequent association with survival was determined by applying an accelerated failure time model to previously published alternative and expression-based association equations. Significant associations were seen in 27 unique methylation/expression pairs with expression-based, alternative, and combinatorial associations observed (10, 13, and 4 pairs, respectively). The majority of the predictive DNA methylation loci were located within CpG islands, and all but three of the locus pairs were negatively correlated with survival. This finding suggests that for most loci, methylation/expression pairs are inversely related, consistent with methylation-associated gene regulatory action. Our results indicate that changes in DNA methylation are associated with altered survival outcome through both coordinated changes in gene expression and alternative mechanisms. Furthermore, our approach offers an alternative method of biomarker discovery using a priori gene pairing and precise targeting to identify novel sites for locus-specific therapeutic intervention. PMID:24670968

  2. Rule of five in 2015 and beyond: Target and ligand structural limitations, ligand chemistry structure and drug discovery project decisions.

    PubMed

    Lipinski, Christopher A

    2016-06-01

    The rule of five (Ro5), based on physicochemical profiles of phase II drugs, is consistent with structural limitations in protein targets and the drug target ligands. Three of four parameters in Ro5 are fundamental to the structure of both target and drug binding sites. The chemical structure of the drug ligand depends on the ligand chemistry and design philosophy. Two extremes of chemical structure and design philosophy exist; ligands constructed in the medicinal chemistry synthesis laboratory without input from natural selection and natural product (NP) metabolites biosynthesized based on evolutionary selection. Exceptions to Ro5 are found mostly among NPs. Chemistry chameleon-like behavior of some NPs due to intra-molecular hydrogen bonding as exemplified by cyclosporine A is a strong contributor to NP Ro5 outliers. The fragment derived, drug Navitoclax is an example of the extensive expertise, resources, time and key decisions required for the rare discovery of a non-NP Ro5 outlier.
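
    The abstract refers to the four Ro5 parameters without listing them. As commonly cited, they are molecular weight over 500, calculated logP over 5, more than 5 hydrogen-bond donors, and more than 10 hydrogen-bond acceptors. The sketch below only illustrates such a check: the class and function names are hypothetical, and the descriptor values for the two example compounds are approximate.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    mol_weight: float   # daltons
    logp: float         # calculated octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def ro5_violations(c):
    """Return the names of the Ro5 parameters that the compound exceeds."""
    checks = {
        "mol_weight": c.mol_weight > 500,
        "logp": c.logp > 5,
        "h_donors": c.h_donors > 5,
        "h_acceptors": c.h_acceptors > 10,
    }
    return [name for name, failed in checks.items() if failed]


# Approximate, illustrative descriptor values (assumptions, not curated data).
aspirin = Compound("aspirin", 180.2, 1.2, 1, 4)
cyclosporine_a = Compound("cyclosporine A", 1202.6, 3.0, 5, 12)

aspirin_flags = ro5_violations(aspirin)
cyclosporine_flags = ro5_violations(cyclosporine_a)
```

    A compound is usually treated as an Ro5 outlier when two or more parameters are exceeded, which is the case for the cyclosporine A example here.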

  3. An integrative data analysis platform for gene set analysis and knowledge discovery in a data warehouse framework

    PubMed Central

    Chen, Yi-An; Tripathi, Lokesh P.; Mizuguchi, Kenji

    2016-01-01

    Data analysis is one of the most critical and challenging steps in drug discovery and disease biology. A user-friendly resource to visualize and analyse high-throughput data provides a powerful medium for both experimental and computational biologists to understand vastly different biological data types and obtain a concise, simplified and meaningful output for better knowledge discovery. We have previously developed TargetMine, an integrated data warehouse optimized for target prioritization. Here we describe how upgraded and newly modelled data types in TargetMine can now survey the wider biological and chemical data space, relevant to drug discovery and development. To enhance the scope of TargetMine from target prioritization to broad-based knowledge discovery, we have also developed a new auxiliary toolkit to assist with data analysis and visualization in TargetMine. This toolkit features interactive data analysis tools to query and analyse the biological data compiled within the TargetMine data warehouse. The enhanced system enables users to discover new hypotheses interactively by performing complicated searches with no programming and obtaining the results in an easy to comprehend output format. Database URL: http://targetmine.mizuguchilab.org PMID:26989145

  4. An integrative data analysis platform for gene set analysis and knowledge discovery in a data warehouse framework.

    PubMed

    Chen, Yi-An; Tripathi, Lokesh P; Mizuguchi, Kenji

    2016-01-01

    Data analysis is one of the most critical and challenging steps in drug discovery and disease biology. A user-friendly resource to visualize and analyse high-throughput data provides a powerful medium for both experimental and computational biologists to understand vastly different biological data types and obtain a concise, simplified and meaningful output for better knowledge discovery. We have previously developed TargetMine, an integrated data warehouse optimized for target prioritization. Here we describe how upgraded and newly modelled data types in TargetMine can now survey the wider biological and chemical data space, relevant to drug discovery and development. To enhance the scope of TargetMine from target prioritization to broad-based knowledge discovery, we have also developed a new auxiliary toolkit to assist with data analysis and visualization in TargetMine. This toolkit features interactive data analysis tools to query and analyse the biological data compiled within the TargetMine data warehouse. The enhanced system enables users to discover new hypotheses interactively by performing complicated searches with no programming and obtaining the results in an easy to comprehend output format. Database URL: http://targetmine.mizuguchilab.org.

  5. The Genetics of Obsessive-Compulsive Disorder and Tourette Syndrome: An Epidemiological and Pathway-Based Approach for Gene Discovery

    ERIC Educational Resources Information Center

    Grados, Marco A.

    2010-01-01

    Objective: To provide a contemporary perspective on genetic discovery methods applied to obsessive-compulsive disorder (OCD) and Tourette syndrome (TS). Method: A review of research trends in genetics research in OCD and TS is conducted, with emphasis on novel approaches. Results: Genome-wide association studies (GWAS) are now in progress in OCD…

  6. Gene discovery for Mendelian conditions via social networking: de novo variants in KDM1A cause developmental delay and distinctive facial features

    PubMed Central

    Chong, Jessica X.; Yu, Joon-Ho; Lorentzen, Peter; Park, Karen M.; Jamal, Seema M.; Tabor, Holly K.; Rauch, Anita; Saenz, Margarita Sifuentes; Boltshauser, Eugen; Patterson, Karynne E.; Nickerson, Deborah A.; Bamshad, Michael J.

    2015-01-01

    Purpose: The pace of Mendelian gene discovery is slowed by the “n-of-1 problem” – the difficulty of establishing causality of a putatively pathogenic variant in a single person or family. Identification of an unrelated person with an overlapping phenotype and suspected pathogenic variant in the same gene can overcome this barrier but is often impeded by lack of a convenient or widely-available way to share data on candidate variants/genes among families, clinicians and researchers. Methods: Social networking among families, clinicians and researchers was used to identify three children with variants of unknown significance in KDM1A and similar phenotypes. Results: De novo variants in KDM1A underlie a new syndrome characterized by developmental delay and distinctive facial features. Conclusion: Social networking is a potentially powerful strategy to discover genes for rare Mendelian conditions, particularly those with non-specific phenotypic features. To facilitate the efforts of families to share phenotypic and genomic information with each other, clinicians, and researchers, we developed the Repository for Mendelian Genomics Family Portal (RMD-FP). Design and development of a web-based tool, MyGene2, that enables families, clinicians and researchers to search for gene matches based on analysis of phenotype and exome data deposited into the RMD-FP is underway. PMID:26656649

  7. Design of the DISCovery project: tailored work-oriented interventions to improve employee health, well-being, and performance-related outcomes in hospital care

    PubMed Central

    2013-01-01

    Background: It is well-known that health care workers in today’s general hospitals have to deal with high levels of job demands, which could have negative effects on their health, well-being, and job performance. A way to reduce job-related stress reactions and to optimize positive work-related outcomes is to raise the level of specific job resources and opportunities to recover from work. However, the question remains how to translate the optimization of the balance between job demands, job resources, and recovery opportunities into effective workplace interventions. The aim of the DISCovery project is to develop and implement tailored work-oriented interventions to improve health, well-being, and performance of health care personnel. Methods/Design: A quasi-experimental field study with a non-equivalent control group pretest-posttest design will be conducted in a top general hospital. Four existing organizational departments will provide both an intervention and a comparison group. Two types of research methods are used: (1) a longitudinal web-based survey study, and (2) a longitudinal daily diary study. After base-line measures of both methods, existing and yet to be developed interventions will be implemented within the experimental groups. Follow-up measurements will be taken one and two years after the base-line measures to analyze short-term and long-term effects of the interventions. Additionally, a process evaluation and a cost-effectiveness analysis will be carried out. Discussion: The DISCovery project fulfills a strong need for theory-driven and scientifically well-performed research on job stress and performance interventions. It will provide insight into (1) how a balance between job demands, job resources, and recovery from work can be optimized, (2) the short-term and long-term effects of tailored work-oriented interventions, and (3) indicators for successful or unsuccessful implementation of interventions. PMID:23421647

  8. Discovery of a 29-gene panel in peripheral blood mononuclear cells for the detection of colorectal cancer and adenomas using high throughput real-time PCR.

    PubMed

    Ciarloni, Laura; Hosseinian, Sahar; Monnier-Benoit, Sylvain; Imaizumi, Natsuko; Dorta, Gian; Ruegg, Curzio

    2015-01-01

    Colorectal cancer (CRC) is the second leading cause of cancer-related death in developed countries. Early detection of CRC leads to decreased CRC mortality. A blood-based CRC screening test is highly desirable due to limited invasiveness and high acceptance rate among patients compared to currently used fecal occult blood testing and colonoscopy. Here we describe the discovery and validation of a 29-gene panel in peripheral blood mononuclear cells (PBMC) for the detection of CRC and adenomatous polyps (AP). Blood samples were prospectively collected from a multicenter, case-control clinical study. First, we profiled 93 samples with 667 candidate and 3 reference genes by high throughput real-time PCR (OpenArray system). After analysis, 160 genes were retained and tested again on 51 additional samples. Low expressed and unstable genes were discarded resulting in a final dataset of 144 samples profiled with 140 genes. To define which genes, alone or in combinations had the highest potential to discriminate AP and/or CRC from controls, data were analyzed by a combination of univariate and multivariate methods. A list of 29 potentially discriminant genes was compiled and evaluated for its predictive accuracy by penalized logistic regression and bootstrap. This method discriminated AP >1cm and CRC from controls with a sensitivity of 59% and 75%, respectively, with 91% specificity. The behavior of the 29-gene panel was validated with a LightCycler 480 real-time PCR platform, commonly adopted by clinical laboratories. In this work we identified a 29-gene panel expressed in PBMC that can be used for developing a novel minimally-invasive test for accurate detection of AP and CRC using a standard real-time PCR platform.
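
    The sensitivity and specificity figures above follow from a standard confusion-matrix calculation. The sketch below illustrates only that arithmetic: the cohort is hypothetical (it is not the study data), and the penalized logistic regression and bootstrap steps are not reproduced.

```python
def sensitivity_specificity(truth, predicted):
    """Compute (sensitivity, specificity) from paired boolean labels."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    # Sensitivity: fraction of true cases flagged by the test.
    # Specificity: fraction of controls correctly left unflagged.
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical cohort: 100 cases (75 flagged) and 100 controls (9 flagged).
truth = [True] * 100 + [False] * 100
predicted = [True] * 75 + [False] * 25 + [True] * 9 + [False] * 91
sens, spec = sensitivity_specificity(truth, predicted)
```

    With these invented counts the result is 75% sensitivity at 91% specificity, matching the form in which the CRC figures are reported.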

  9. How the Serotonin Story is Being Rewritten By New Gene-Based Discoveries Principally Related to SLC6A4, the Serotonin Transporter Gene, Which Functions To Influence All Cellular Serotonin Systems

    PubMed Central

    Murphy, Dennis L.; Fox, Meredith A.; Timpano, Kiara R.; Moya, Pablo; Ren-Patterson, Renee; Andrews, Anne M.; Holmes, Andrew; Lesch, Klaus-Peter; Wendland, Jens R.

    2009-01-01

    Discovered and crystallized over sixty years ago, serotonin's important functions in the brain and body were identified over the ensuing years by neurochemical, physiological and pharmacological investigations. This 2008 M. Rapport Memorial Serotonin Review focuses on some of the most recent discoveries in serotonin that are based on genetic methodologies. These include examples of the consequences that result from direct serotonergic gene manipulation (gene deletion or overexpression) in mice and other species; an evaluation of some phenotypes related to functional human serotonergic gene variants, particularly in SLC6A4, the serotonin transporter gene; and finally, a consideration of the pharmacogenomics of serotonergic drugs with respect to both their therapeutic actions and side effects. The serotonin transporter (SERT) has been the most comprehensively studied of the serotonin system molecular components, and will be the primary focus of this review. We provide in-depth examples of gene-based discoveries primarily related to SLC6A4 that have clarified serotonin's many important homeostatic functions in humans, non-human primates, mice and other species. PMID:18824000

  10. Discovery and identification of candidate sex-related genes based on transcriptome sequencing of Russian sturgeon (Acipenser gueldenstaedtii) gonads.

    PubMed

    Chen, Yadong; Xia, Yongtao; Shao, Changwei; Han, Lei; Chen, Xuejie; Yu, Mengjun; Sha, Zhenxia

    2016-07-01

    As the Russian sturgeon (Acipenser gueldenstaedtii) is an important food and is the main source of caviar, it is necessary to discover the genes associated with its sex differentiation. However, the complicated life and maturity cycles of the Russian sturgeon restrict the accurate identification of sex in early development. To generate a first look at specific sex-related genes, we sequenced the transcriptome of gonads in different development stages (1, 2, and 5 yr old stages) with next-generation RNA sequencing. We generated >60 million raw reads, and the filtered reads were assembled into 263,341 contigs, which produced 38,505 unigenes. Genes involved in signal transduction mechanisms were the most abundant, suggesting that development of sturgeon gonads is under control of signal transduction mechanisms. Differentially expressed gene analysis suggests that more genes for protein synthesis, cytochrome c oxidase subunits, and ribosomal proteins were expressed in female gonads than in male. Meanwhile, male gonads expressed more transposable element transposase, reverse transcriptase, and transposase-related genes than female. In total, 342, 782, and 7,845 genes were detected in intersex, male, and female transcriptomes, respectively. The female gonad expressed more genes than the male gonad, and more genes were involved in female gonadal development. Genes (sox9, foxl2) are differentially expressed in different sexes and may be important sex-related genes in Russian sturgeon. Sox9 genes are responsible for the development of male gonads and foxl2 for female gonads.

  11. In-depth cDNA library sequencing provides quantitative gene expression profiling in cancer biomarker discovery.

    PubMed

    Yang, Wanling; Ying, Dingge; Lau, Yu-Lung

    2009-06-01

    Quantitative gene expression analysis plays an important role in identifying differentially expressed genes in various pathological states, gene expression regulation and co-regulation, shedding light on gene functions. Although microarray is widely used as a powerful tool in this regard, it is suboptimal quantitatively and unable to detect unknown gene variants. Here we demonstrate effective detection of differential expression and co-regulation of certain genes by expressed sequence tag analysis using a selected subset of cDNA libraries. We discuss the issues of sequencing depth and library preparation, and propose that increased sequencing depth and improved preparation procedures may allow detection of many expression features for less abundant gene variants. With the reduction of sequencing cost and the emergence of new-generation sequencing technology, in-depth sequencing of cDNA pools or libraries may represent a better and more powerful tool in gene expression profiling and cancer biomarker detection. We also propose using sequence-specific subtraction to remove hundreds of the most abundant housekeeping genes to increase sequencing depth without affecting the relative expression ratios of other genes, as transcripts from as few as the 300 most abundantly expressed genes constitute about 20% of the total transcriptome. In-depth sequencing also has the unique advantage of detecting unknown forms of transcripts, such as alternative splicing variants, fusion genes, and regulatory RNAs, as well as detecting mutations and polymorphisms that may play important roles in disease pathogenesis.
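
    The subtraction argument is simple arithmetic: if a fraction f of all transcripts is removed, the same read budget covers the remaining genes 1/(1 - f) times more deeply, and ratios between remaining genes are unchanged. A minimal sketch, with hypothetical function names, gene names, and read counts:

```python
def depth_gain(removed_fraction):
    """Fold increase in depth for remaining genes after removing a fraction of transcripts."""
    if not 0.0 <= removed_fraction < 1.0:
        raise ValueError("removed_fraction must be in [0, 1)")
    return 1.0 / (1.0 - removed_fraction)


def reads_after_subtraction(reads, subtracted, total_reads):
    """Redistribute a fixed read budget over the genes that remain."""
    kept = {gene: n for gene, n in reads.items() if gene not in subtracted}
    scale = total_reads / sum(kept.values())
    return {gene: n * scale for gene, n in kept.items()}


# Hypothetical library of 1,000,000 reads in which housekeeping genes take 20%.
before = {"housekeeping": 200_000, "gene_a": 600, "gene_b": 200, "other": 799_200}
after = reads_after_subtraction(before, {"housekeeping"}, 1_000_000)
gain = depth_gain(0.20)
```

    Removing the 20% housekeeping share gives a 1.25-fold gain: the invented gene_a goes from 600 to 750 reads and gene_b from 200 to 250, so their 3:1 ratio is preserved.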

  12. Natural product proteomining, a quantitative proteomics platform, allows rapid discovery of biosynthetic gene clusters for different classes of natural products.

    PubMed

    Gubbens, Jacob; Zhu, Hua; Girard, Geneviève; Song, Lijiang; Florea, Bogdan I; Aston, Philip; Ichinose, Koji; Filippov, Dmitri V; Choi, Young H; Overkleeft, Herman S; Challis, Gregory L; van Wezel, Gilles P

    2014-06-19

    Information on gene clusters for natural product biosynthesis is accumulating rapidly because of the current boom of available genome sequencing data. However, linking a natural product to a specific gene cluster remains challenging. Here, we present a widely applicable strategy for the identification of gene clusters for specific natural products, which we name natural product proteomining. The method is based on using fluctuating growth conditions that ensure differential biosynthesis of the bioactivity of interest. Subsequent combination of metabolomics and quantitative proteomics establishes correlations between abundance of natural products and concomitant changes in the protein pool, which allows identification of the relevant biosynthetic gene cluster. We used this approach to elucidate gene clusters for different natural products in Bacillus and Streptomyces, including a novel juglomycin-type antibiotic. Natural product proteomining does not require prior knowledge of the gene cluster or secondary metabolite and therefore represents a general strategy for identification of all types of gene clusters.
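
    The core idea, correlating natural product abundance with protein abundance across growth conditions, can be sketched as a ranking by Pearson correlation. This is not the authors' pipeline: the function names, protein names, and abundance values below are invented for illustration, and the actual statistical treatment may differ.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def rank_proteins(metabolite, proteins):
    """Sort proteins by correlation with metabolite abundance, highest first."""
    scores = {name: pearson(metabolite, levels) for name, levels in proteins.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical abundances across four growth conditions.
metabolite = [1.0, 2.0, 4.0, 8.0]
proteins = {
    "protein_A": [2.0, 4.0, 8.0, 16.0],   # tracks the metabolite
    "protein_B": [5.0, 5.1, 4.9, 5.0],    # essentially constant
    "protein_C": [8.0, 4.0, 2.0, 1.0],    # moves in the opposite direction
}
ranked = rank_proteins(metabolite, proteins)
```

    Proteins at the top of the ranking would be candidates for membership in the biosynthetic gene cluster; the design relies on growth conditions that actually vary production of the compound.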

  13. Toxins and drug discovery.

    PubMed

    Harvey, Alan L

    2014-12-15

    Components from venoms have stimulated many drug discovery projects, with some notable successes. These are briefly reviewed, from captopril to ziconotide. However, there have been many more disappointments on the road from toxin discovery to approval of a new medicine. Drug discovery and development is an inherently risky business, and the main causes of failure during development programmes are outlined in order to highlight steps that might be taken to increase the chances of success with toxin-based drug discovery. These include having a clear focus on unmet therapeutic needs, concentrating on targets that are well-validated in terms of their relevance to the disease in question, making use of phenotypic screening rather than molecular-based assays, and working with development partners with the resources required for the long and expensive development process.

  14. Erratum: The Extragalactic Distance Scale Key Project. III. The Discovery of Cepheids and a New Distance to M101 Using the Hubble Space Telescope

    NASA Astrophysics Data System (ADS)

    Kelson, Daniel D.; Illingworth, Garth D.; Freedman, Wendy F.; Graham, John A.; Hill, Robert; Madore, Barry F.; Saha, Abhijit; Stetson, Peter B.; Kennicutt, Robert C., Jr.; Mould, Jeremy R.; Hughes, Shaun M.; Ferrarese, Laura; Phelps, Randy; Turner, Anne; Cook, Kem H.; Ford, Holland; Hoessel, John G.; Huchra, John

    1997-03-01

    In the paper "The Extragalactic Distance Scale Key Project. III. The Discovery of Cepheids and a New Distance to M101 Using the Hubble Space Telescope" by Daniel D. Kelson, Garth D. Illingworth, Wendy F. Freedman, John A. Graham, Robert Hill, Barry F. Madore, Abhijit Saha, Peter B. Stetson, Robert C. Kennicutt, Jr., Jeremy R. Mould, Shaun M. Hughes, Laura Ferrarese, Randy Phelps, Anne Turner, Kem H. Cook, Holland Ford, John G. Hoessel, and John Huchra (ApJ, 463, 26 [1996]), two of the tables are in error. The magnitudes in Tables B1 and B2, in Appendix B, are ordered incorrectly. As a result, the Julian dates are not associated with their correct Cepheid magnitudes. We have now corrected these data, and updated versions of the tables are available on the World Wide Web. The tables are available in ASCII format at our Key Project site (http://www.ipac.caltech.edu/H0kp/) and will appear in volume 7 of the AAS CDROM. PostScript and paper copies are also available from the first author (http://www.ucolick.org/~kelson/H0/home.html or kelson@ucolick.org).

  15. Discovery and functional identification of fecundity-related genes in the brown planthopper by large-scale RNA interference.

    PubMed

    Qiu, J; He, Y; Zhang, J; Kang, K; Li, T; Zhang, W

    2016-12-01

    Recently, transcriptome and proteome data have increasingly been used to identify potential novel genes related to insect phenotypes. However, there are few studies reporting the large-scale functional identification of such genes in insects. To identify novel genes related to fecundity in the brown planthopper (BPH), Nilaparvata lugens, 115 genes were selected from the transcriptomic and proteomic data previously obtained from high- and low-fecundity populations in our laboratory. The results of RNA interference (RNAi) feeding experiments showed that 91.21% of the genes were involved in the regulation of vitellogenin (Vg) expression and may influence BPH fecundity. After RNAi injection experiments, 12 annotated genes were confirmed as fecundity-related genes and three novel genes were identified in the BPH. Finally, C-terminal binding protein (CtBP) was shown to play an important role in BPH fecundity. Knockdown of CtBP not only led to lower survival, underdeveloped ovaries and fewer eggs laid but also resulted in a reduction in Vg protein expression. The novel gene resources gained from this study will be useful for constructing a Vg regulation network and may provide potential target genes for RNAi-based pest control.

  16. The HUBBLE SPACE TELESCOPE Extragalactic Distance Scale Key Project. XXIII. The Discovery of Cepheids in NGC 3319

    NASA Astrophysics Data System (ADS)

    Sakai, Shoko; Ferrarese, Laura; Kennicutt, Robert C.; Graham, John A.; Silbermann, N. A.; Mould, Jeremy R.; Freedman, Wendy L.; Bresolin, Fabio; Ford, Holland C.; Gibson, Brad K.; Han, Mingsheng; Harding, Paul; Hoessel, John G.; Huchra, John P.; Hughes, Shaun M.; Illingworth, Garth D.; Kelson, Daniel; Macri, Lucas; Madore, Barry F.; Phelps, Randy L.; Saha, Abhijit; Sebo, Kim M.; Stetson, Peter B.; Turner, Anne

    1999-10-01

    The distance to NGC 3319 has been determined from Cepheid variable stars as part of the Hubble Space Telescope Key Project on the Extragalactic Distance Scale. Thirteen and four epochs of observations, using filters F555W (V) and F814W (I), respectively, were made with the Wide Field Planetary Camera 2. Thirty-three Cepheid variables between periods of 8 and 47 days were discovered. Adopting a Large Magellanic Cloud distance modulus of 18.50 +/- 0.10 mag and extinction of E(V-I)=0.13 mag, a true reddening-corrected distance modulus (based on an analysis employing the ALLFRAME software package) of 30.78 +/- 0.14 (random) +/- 0.10 (systematic) mag and the extinction of E(V-I)=0.06 mag were determined for NGC 3319. This galaxy is the last galaxy observed for the HST H0 Key Project. Based on observations with the NASA/ESA Hubble Space Telescope, obtained at the Space Telescope Science Institute, operated by AURA, Inc., under NASA contract NAS 5-26555.

  17. The Hubble Space Telescope Extragalactic Distance Scale Key Project. 1: The discovery of Cepheids and a new distance to M81

    NASA Technical Reports Server (NTRS)

    Freedman, Wendy L.; Hughes, Shaun M.; Madore, Barry F.; Mould, Jeremy R.; Lee, Myung Gyoon; Stetson, Peter; Kennicutt, Robert C.; Turner, Anne; Ferrarese, Laura; Ford, Holland

    1994-01-01

    We report on the discovery of 30 new Cepheids in the nearby galaxy M81 based on observations using the Hubble Space Telescope (HST). The periods of these Cepheids lie in the range of 10-55 days, based on 18 independent epochs using the HST wide-band F555W filter. The HST F555W and F785LP data have been transformed to the Cousins standard V and I magnitude system using a ground-based calibration. Apparent period-luminosity relations at V and I were constructed, from which apparent distance moduli were measured with respect to assumed values of mu_0 = 18.50 mag and E(B - V) = 0.10 mag for the Large Magellanic Cloud. The difference in the apparent V and I moduli yields a measure of the difference in the total mean extinction between the M81 and the LMC Cepheid samples. A low total mean extinction to the M81 sample of E(B - V) = 0.03 +/- 0.05 mag is obtained. The true distance modulus to M81 is determined to be 27.80 +/- 0.20 mag, corresponding to a distance of 3.63 +/- 0.34 Mpc. These data illustrate that with an optimal (power-law) sampling strategy, the HST provides a powerful tool for the discovery of extragalactic Cepheids and their application to the distance scale. M81 is the first calibrating galaxy in the target sample of the HST Key Project on the Extragalactic Distance Scale, the ultimate aim of which is to provide a value of the Hubble constant to 10% accuracy.
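
    The conversion from the quoted distance modulus to a distance can be checked with the standard relation d = 10^((mu + 5) / 5) pc. The sketch below is an illustration only, not the authors' procedure; the function names are hypothetical, and the uncertainty uses simple first-order error propagation, which is an assumption about how the quoted error might be approximated.

```python
import math

def modulus_to_mpc(mu):
    """Distance in Mpc for a true distance modulus mu (mag): d = 10^((mu + 5) / 5) pc."""
    return 10 ** ((mu + 5.0) / 5.0) / 1e6

def modulus_err_to_mpc(mu, sigma_mu):
    """First-order propagated distance error in Mpc: d * ln(10) / 5 * sigma_mu."""
    return modulus_to_mpc(mu) * math.log(10) / 5.0 * sigma_mu

# Values quoted in the abstract for M81.
d = modulus_to_mpc(27.80)
err = modulus_err_to_mpc(27.80, 0.20)
print(f"{d:.2f} +/- {err:.2f} Mpc")
```

    With the abstract's values this gives about 3.63 Mpc, and the first-order error of about 0.33 Mpc is close to the quoted 0.34 Mpc.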

  18. Scientific Discovery for All

    ERIC Educational Resources Information Center

    Zaikowski, Lori; Lichtman, Paul; Quarless, Duncan

    2007-01-01

    The scientific discovery process comes alive for 70 minority students each year at Uniondale High School in New York where students have won top awards for "in-house" projects. Uniondale High School is in a middle-income school district where over 95% of students are from minority groups. Founded in 2000, the Uniondale High School Research Program…

  19. Field of genes: the politics of science and identity in the Estonian Genome Project.

    PubMed

    Fletcher, Amy L

    2004-04-01

    This case study of the Estonian Genome Project (EGP) analyses the Estonian policy decision to construct a national human gene bank. Drawing upon qualitative data from newspaper articles and public policy documents, it focuses on how proponents use discourse to link the EGP to the broader political goal of securing Estonia's position within the Western/European scientific and cultural space. This dominant narrative is then situated within the analytical notion of the "brand state", which raises potentially negative political consequences for this type of market-driven genomic research. Considered against the increasing number of countries engaging in gene bank and/or gene database projects, this analysis of Estonia elucidates issues that cross national boundaries, while also illuminating factors specific to this small, post-Soviet state as it enters the global biocybernetic economy.

  20. IMG-ABC: new features for bacterial secondary metabolism analysis and targeted biosynthetic gene cluster discovery in thousands of microbial genomes

    PubMed Central

    Hadjithomas, Michalis; Chen, I-Min A.; Chu, Ken; Huang, Jinghua; Ratner, Anna; Palaniappan, Krishna; Andersen, Evan; Markowitz, Victor; Kyrpides, Nikos C.; Ivanova, Natalia N.

    2017-01-01

    Secondary metabolites produced by microbes have diverse biological functions, which makes them a great potential source of biotechnologically relevant compounds with antimicrobial, anti-cancer and other activities. The proteins needed to synthesize these natural products are often encoded by clusters of co-located genes called biosynthetic gene clusters (BCs). In order to advance the exploration of microbial secondary metabolism, we developed the largest publicly available database of experimentally verified and predicted BCs, the Integrated Microbial Genomes Atlas of Biosynthetic gene Clusters (IMG-ABC) (https://img.jgi.doe.gov/abc/). Here, we describe an update of IMG-ABC, which includes ClusterScout, a tool for targeted identification of custom biosynthetic gene clusters across 40 000 isolate microbial genomes, and a new search capability to query more than 700 000 BCs from isolate genomes for clusters with similar Pfam composition. Additional features enable fast exploration and analysis of BCs through two new interactive visualization features, a BC function heatmap and a BC similarity network graph. These new tools and features add to the value of IMG-ABC's vast body of BC data, facilitating their in-depth analysis and accelerating secondary metabolite discovery. PMID:27903896

  1. Immune gene discovery by expressed sequence tag (EST) analysis of hemocytes in the ridgetail white prawn Exopalaemon carinicauda

    PubMed Central

    Duan, Yafei; Liu, Ping; Li, Jitao; Li, Jian; Chen, Ping

    2013-01-01

    The ridgetail white prawn Exopalaemon carinicauda is one of the most important commercial species in eastern China. However, little information on immune genes in E. carinicauda has been reported. To identify distinctive genes associated with immunity, an expressed sequence tag (EST) library was constructed from hemocytes of E. carinicauda. A total of 3411 clones were sequenced, yielding 2853 ESTs with an average sequence length of 436 bp. The cluster and assembly analysis yielded 1053 unique sequences, including 329 contigs and 724 singletons. Blast analysis identified 593 (56.3%) of the unique sequences as orthologs of genes from other organisms (E-value < 1e-5). Based on the COG and Gene Ontology (GO), 593 unique sequences were classified. Through comparison with previous studies, 153 genes assembled from 367 ESTs have been identified as possibly involved in defense or immune functions. These genes are grouped into seven categories according to their putative functions in the shrimp immune system: antimicrobial peptides, prophenoloxidase activating system, antioxidant defense systems, chaperone proteins, clottable proteins, pattern recognition receptors and other immune-related genes. According to EST abundance, the major immune-related genes were thioredoxin (141, 4.94% of all ESTs) and calmodulin (14, 0.49% of all ESTs). The EST sequences of E. carinicauda hemocytes provide important information on the immune system and lay the groundwork for development of molecular markers related to disease resistance in prawn species. PMID:23092732

  2. Expressed sequence tags in cultivated peanut (Arachis hypogaea): discovery of genes in seed development and response to Ralstonia solanacearum challenge.

    PubMed

    Huang, Jiaquan; Yan, Liying; Lei, Yong; Jiang, Huifang; Ren, Xiaoping; Liao, Boshou

    2012-11-01

    Although an important oil crop, peanut has only 162,030 expressed sequence tags (ESTs) publicly available, 86,943 of which are from cultivated plants. More ESTs from cultivated peanuts are needed for isolation of stress-resistant, tissue-specific and developmentally important genes. Here, we generated 63,234 ESTs from our 5 constructed peanut cDNA libraries of Ralstonia solanacearum challenged roots, R. solanacearum challenged leaves, and unchallenged cultured peanut roots, leaves and developing seeds. Among these ESTs, there were 14,547 unique sequences with 7,961 tentative consensus sequences and 6,586 singletons. Putative functions for 47.8% of the sequences were identified, including transcription factors, tissue-specific genes, genes involved in fatty acid biosynthesis and oil formation regulation, and resistance gene analogue genes. Additionally, differentially expressed genes, including those involved in ethylene and jasmonic acid signal transduction pathways, from both peanut leaves and roots, were identified in R. solanacearum challenged samples. This large expression dataset from different peanut tissues will be a valuable source for marker development and gene expression analysis. It will also be helpful for finding candidate genes for fatty acid synthesis and oil formation regulation as well as for studying mechanisms of interactions between the peanut host and R. solanacearum pathogen.

  3. Coupled Transcriptome and Proteome Analysis of Human Lymphotropic Tumor Viruses: Insights on the Detection and Discovery of Viral Genes

    SciTech Connect

    Dresang, Lindsay R.; Teuton, Jeremy R.; Feng, Huichen; Jacobs, Jon M.; Camp, David G.; Purvine, Samuel O.; Gritsenko, Marina A.; Li, Zhihua; Smith, Richard D.; Sugden, Bill; Moore, Patrick S.; Chang, Yuan

    2011-12-20

    Kaposi's sarcoma-associated herpesvirus (KSHV) and Epstein-Barr virus (EBV) are related human tumor viruses that cause primary effusion lymphomas (PEL) and Burkitt's lymphomas (BL), respectively. Viral genes expressed in naturally-infected cancer cells contribute to disease pathogenesis; knowing which viral genes are expressed is critical in understanding how these viruses cause cancer. To evaluate the expression of viral genes, we used high-resolution separation and mass spectrometry coupled with custom tiling arrays to align the viral proteomes and transcriptomes of three PEL and two BL cell lines under latent and lytic culture conditions. Results: The majority of viral genes were efficiently detected at the transcript and/or protein level on manipulating the viral life cycle. Overall the correlation of expressed viral proteins and transcripts was highly complementary in both validating and providing orthogonal data with latent/lytic viral gene expression. Our approach also identified novel viral genes in both KSHV and EBV, and extends viral genome annotation. Several previously uncharacterized genes were validated at both transcript and protein levels. Conclusions: This systems biology approach coupling proteome and transcriptome measurements provides a comprehensive view of viral gene expression that could not have been attained using each methodology independently. Detection of viral proteins in combination with viral transcripts is a potentially powerful method for establishing virus-disease relationships.

  4. Combining SNP discovery from next-generation sequencing data with bulked segregant analysis (BSA) to fine-map genes in polyploid wheat

    PubMed Central

    2012-01-01

    Background: Next generation sequencing (NGS) technologies are providing new ways to accelerate fine-mapping and gene isolation in many species. To date, the majority of these efforts have focused on diploid organisms with readily available whole genome sequence information. In this study, as a proof of concept, we tested the use of NGS for SNP discovery in tetraploid wheat lines differing for the previously cloned grain protein content (GPC) gene GPC-B1. Bulked segregant analysis (BSA) was used to define a subset of putative SNPs within the candidate gene region, which were then used to fine-map GPC-B1. Results: We used Illumina paired-end technology to sequence mRNA (RNAseq) from near isogenic lines differing across a ~30-cM interval including the GPC-B1 locus. After discriminating for SNPs between the two homoeologous wheat genomes and additional quality filtering, we identified inter-varietal SNPs in wheat unigenes between the parental lines. The relative frequency of these SNPs was examined by RNAseq in two bulked samples made up of homozygous recombinant lines differing for their GPC phenotype. SNPs that were enriched at least 3-fold in the corresponding pool (6.5% of all SNPs) were further evaluated. Marker assays were designed for a subset of the enriched SNPs and mapped using DNA from individuals of each bulk. Thirty-nine new SNP markers, corresponding to 67% of the validated SNPs, mapped across a 12.2-cM interval including GPC-B1. This translated to 1 SNP marker per 0.31 cM, defining the GPC-B1 gene to within 13-18 genes in syntenic cereal genomes and to a 0.4 cM interval in wheat. Conclusions: This study exemplifies the use of RNAseq for SNP discovery in polyploid species and supports the use of BSA as an effective way to target SNPs to specific genetic intervals to fine-map genes in unsequenced genomes. PMID:22280551
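
    Two of the quantities in this abstract lend themselves to a short illustration: the marker density (39 markers over 12.2 cM) and the at-least-3-fold enrichment filter between bulks. The sketch below is not the authors' pipeline. The function names, the SNP identifiers and the allele frequencies are all hypothetical, and treating enrichment as a ratio of allele frequencies between the expected bulk and the opposite bulk is an assumption.

```python
def marker_spacing_cm(interval_cm, n_markers):
    """Average genetic distance per marker across an interval."""
    return interval_cm / n_markers

def enriched_snps(freqs, min_fold=3.0):
    """Return SNP ids whose allele frequency in the expected bulk is at least
    min_fold times the frequency in the opposite bulk.

    freqs maps snp_id -> (freq_in_expected_bulk, freq_in_opposite_bulk).
    A SNP absent from the opposite bulk counts as enriched if present at all.
    """
    keep = []
    for snp_id, (expected, opposite) in freqs.items():
        if opposite == 0:
            if expected > 0:
                keep.append(snp_id)
        elif expected / opposite >= min_fold:
            keep.append(snp_id)
    return keep

# Hypothetical allele frequencies for four SNPs (illustration only).
toy_freqs = {
    "snp_a": (0.90, 0.10),
    "snp_b": (0.55, 0.45),
    "snp_c": (0.75, 0.25),
    "snp_d": (1.00, 0.00),
}

print(round(marker_spacing_cm(12.2, 39), 2))  # figures quoted in the abstract
print(enriched_snps(toy_freqs))
```

    The first printed value reproduces the quoted density of roughly one marker per 0.31 cM; the filter output depends entirely on the invented toy frequencies.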

  5. The discovery of the two types of small subunit ribosomal RNA gene in Eimeria mitis contests the existence of E. mivati as an independent species.

    PubMed

    Vrba, Vladimir; Poplstein, Martin; Pakandl, Michal

    2011-12-29

    Although the validity of the coccidian species Eimeria mivati has been questioned by many researchers for a long time, there has not been any molecular analysis that would help resolve this issue. Here we report on the discovery of two types of small ribosomal subunit (18S) gene within the Eimeria mitis genome that correspond to the known 18S sequences of E. mitis and E. mivati, which conflicts with the existence of E. mivati as an independent species. We have carried out five single oocyst isolations to obtain five single-oocyst-derived strains of E. mitis, and these were analyzed by the sequencing of 18S and mitochondrial cytochrome c oxidase subunit I genes. The two types of 18S gene were found to be present in each strain in roughly equal ratios. This indicates that if strains carrying only one or the other 18S type exist, they will likely cross-breed and still represent a single species. However, the more probable explanation is that all strains of E. mitis contain two types of 18S gene and that the occasional detection of only one or the other type by sequencing might be caused by insufficient sampling. This is also the first report of two types of 18S gene in Eimeria, a feature that has already been described in some other apicomplexan species, most notably Plasmodium. We also found that these two types of ribosomal RNA differ significantly in their secondary structure. The biological significance of the two 18S gene variants in E. mitis is not known; however, we hypothesize that these variants might be used in different stages of the parasite's life-cycle, as they are in other apicomplexan species investigated so far.

  6. Normalized lmQCM: An Algorithm for Detecting Weak Quasi-Cliques in Weighted Graph with Applications in Gene Co-Expression Module Discovery in Cancers

    PubMed Central

    Zhang, Jie; Huang, Kun

    2014-01-01

    In this paper, we present a new approach for mining weighted networks to identify densely connected modules such as quasi-cliques. Quasi-cliques are densely connected subnetworks in a network. Detecting quasi-cliques is an important topic in data mining, with applications such as social network study and biomedicine. Our approach has two major improvements upon previous work. The first is the use of local maximum edges to initialize the search in order to avoid excessive overlaps among the modules, thereby greatly reducing the computing time. The second is the inclusion of a weight normalization procedure to enable discovery of “subtle” modules with more balanced sizes. We carried out careful tests on multiple parameters and settings using two large cancer datasets. This approach allowed us to identify a large number of gene modules enriched in both biological functions and chromosomal bands in cancer data, suggesting potential roles of copy number variations (CNVs) in cancer development. We then tested the genes in selected modules with enriched chromosomal bands using The Cancer Genome Atlas data, and the results strongly support our hypothesis that the coexpression in these modules is associated with CNVs. While gene coexpression network analyses have been widely adopted in disease studies, most of them focus on the functional relationships of coexpressed genes. The relationship between coexpression gene modules and CNVs is much less investigated, despite the potential advantage that such a relationship can be inferred without genotyping data. Our new approach thus provides a means to carry out deep mining of the gene coexpression network to obtain both functional and genetic information from the expression data. PMID:27486298
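
    The two ideas named in this abstract, weight normalization and seeding the search from local maximum edges, can be sketched on a toy weighted graph. This is a hedged illustration, not the published lmQCM implementation: the abstract does not define either step, so the sketch assumes a symmetric degree normalization w_ij / sqrt(d_i * d_j) and assumes that an edge is a local maximum when it is the heaviest edge at both of its endpoints. The function names and the toy matrix are hypothetical.

```python
import math

def normalize_weights(w):
    """Assumed normalization: divide each weight by sqrt(d_i * d_j),
    where d_i is the sum of weights incident to node i."""
    n = len(w)
    deg = [sum(row) for row in w]
    return [
        [
            w[i][j] / math.sqrt(deg[i] * deg[j]) if deg[i] > 0 and deg[j] > 0 else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]

def local_max_edges(w):
    """Edges (i, j), i < j, whose weight is the largest at both endpoints
    (assumed definition of a local maximum edge)."""
    n = len(w)
    row_max = [
        max((w[i][j] for j in range(n) if j != i), default=0.0) for i in range(n)
    ]
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if w[i][j] > 0 and w[i][j] == row_max[i] and w[i][j] == row_max[j]
    ]

# Toy symmetric co-expression weights for four genes (illustration only).
toy_w = [
    [0.0, 0.9, 0.2, 0.0],
    [0.9, 0.0, 0.3, 0.0],
    [0.2, 0.3, 0.0, 0.5],
    [0.0, 0.0, 0.5, 0.0],
]

seeds = local_max_edges(toy_w)
print(seeds)
```

    Each seed edge would then serve as the starting point for growing one module, which is how seeding from local maxima can limit overlapping searches; the growth step itself is not described in the abstract and is omitted here.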

  7. Discovery of genes related to witches broom disease in Paulownia tomentosa × Paulownia fortunei by a De Novo assembled transcriptome.

    PubMed

    Liu, Rongning; Dong, Yanpeng; Fan, Guoqiang; Zhao, Zhenli; Deng, Minjie; Cao, Xibing; Niu, Suyan

    2013-01-01

    In spite of its economic importance, very little molecular genetics and genomic research has been targeted at the family Paulownia spp. The limited genetic information on this plant is a major obstacle to studying the mechanisms of its ability to resist Paulownia Witches' Broom (PaWB) disease. Analysis of the Paulownia transcriptome and its expression profile data is essential to extending the genetic resources on this species, and will thus greatly improve our studies on Paulownia. In the current study, we performed the de novo assembly of a transcriptome of P. tomentosa × P. fortunei using short-read sequencing technology (Illumina). 203,664 unigenes with a mean length of 1,328 bp were obtained. Of these unigenes, 32,976 (30% of all unigenes) containing complete structures were chosen. Eukaryotic clusters of orthologous groups, gene orthology, and Kyoto Encyclopedia of Genes and Genomes annotations were performed on these unigenes. Genes related to PaWB disease resistance were analyzed in detail. To our knowledge, this is the first study to elucidate the genetic makeup of Paulownia. This transcriptome provides a quick way to understand Paulownia, increases the number of gene sequences available for further functional genomics studies and provides clues to the identification of potential PaWB disease resistance genes. This study has provided a comprehensive insight into gene expression profiles at different states, which facilitates the study of each gene's roles in the developmental process and in PaWB disease resistance.

  8. Pigmentation in sand pear (Pyrus pyrifolia) fruit: biochemical characterization, gene discovery and expression analysis with exocarp pigmentation mutant.

    PubMed

    Wang, Yue-zhi; Zhang, Shujun; Dai, Mei-song; Shi, Ze-bin

    2014-05-01

    Exocarp color of sand pear is an important trait for fruit production and has long been of interest to us. Our previous study explored the differentially expressed genes between two genotypes contrasting in exocarp color, which indicated differences in suberin, cutin, wax and lignin biosynthesis between the russet and green exocarp. In this study, we carried out microscopic observation and Fourier transform infrared spectroscopy analysis to detect differences in tissue structure and biochemical composition between the russet and green exocarp of sand pear. The green exocarp was covered with epidermis and cuticle, which were replaced by a cork layer on the surface of the russet exocarp, and the chemicals of the russet exocarp were characterized by lignin, cellulose and hemicellulose. We explored differential gene expression between the russet exocarp of 'Niitaka' and its green exocarp mutant cv. 'Suisho' using Illumina RNA-sequencing. A total of 559 unigenes showed differential expression between the two types of exocarp, and 123 of them were common to the previous study. The quantitative real time-PCR analysis supported the RNA-seq-derived differentially expressed genes and revealed preferential expression of these genes in exocarp rather than in mesocarp and fruit core. Gene ontology enrichment analysis revealed divergent expression of lipid metabolic process genes, transport genes, stress responsive genes and other biological process genes in the two types of exocarp. Expression changes in lignin metabolism-related genes were consistent with the different pigmentation of russet and green exocarp. Increased transcripts of putative genes involved in suberin, cutin and wax biosynthesis in 'Suisho' exocarp could facilitate deposition of the chemicals and play a role in the mutant trait responsible for the green exocarp. In addition, the divergent expression of ATP-binding cassette transporters involved in the trans

  9. Biomarker discovery to improve prediction of breast cancer survival: using gene expression profiling, meta-analysis, and tissue validation

    PubMed Central

    Meng, Liwei; Xu, Yingchun; Xu, Chaoyang; Zhang, Wei

    2016-01-01

    Purpose: Breast cancer is the leading cause of cancer death worldwide in women. The molecular mechanism for human breast cancer is unknown. Gene microarray has been widely used in breast cancer research to identify clinically relevant molecular subtypes as well as to predict prognosis survival. So far, the valuable multigene signatures in clinical practice are unclear, and the biological importance of individual genes is difficult to detect, as the described signatures virtually do not overlap. Early prognosis of this disease, breast invasive ductal carcinoma (IDC) and breast ductal carcinoma in situ (DCIS), is vital in breast surgery. Methods: Thus, this study reports gene expression profiling in large breast cancer cohorts from Gene Expression Omnibus, including GSE29044 (N=138) and GSE10780 (N=185) test series and four independent validation series GSE21653 (N=266), GSE20685 (N=327), GSE26971 (N=276), and GSE12776 (N=204). Significantly differentially expressed genes in human breast IDC and breast DCIS were detected by transcriptome microarray analysis. Results: We created a set of three genes (MAMDC2, TSHZ2, and CLDN11) that were significantly correlated with disease-free survival of breast cancer patients using a univariate Cox regression model (significance level P<0.01) in a meta-analysis. Based on the risk score of the three genes, the test series patients could be separated into low-risk and high-risk groups with significantly different survival times. This signature was validated in the other three cohorts. The prognostic value of this three-gene signature was confirmed in the internal validation series and another four independent breast cancer data sets. The prognostic impact of one of the three genes, CLDN11, was confirmed by immunohistochemistry. CLDN11 was significantly overexpressed in human breast IDC as compared with normal breast tissues and breast DCIS. Conclusion: Using novel gene expression profiling together with a meta-analysis validation

  10. De novo assembly and discovery of genes that are involved in drought tolerance in Tibetan Sophora moorcroftiana.

    PubMed

    Li, Huie; Yao, Weijie; Fu, Yaru; Li, Shaoke; Guo, Qiqiang

    2015-01-01

    Sophora moorcroftiana, a Leguminosae shrub species that is restricted to the arid and semi-arid regions of the Qinghai-Tibet Plateau, is an ecologically important foundation species and exhibits substantial drought tolerance in the Plateau. There are no functional genomics resources in public databases for understanding the molecular mechanism underlying the drought tolerance of S. moorcroftiana. Therefore, we performed a large-scale transcriptome sequencing of this species under drought stress using the Illumina sequencing technology. A total of 62,348,602 clean reads were obtained. The assembly of the clean reads resulted in 146,943 transcripts, including 66,026 unigenes. In the assembled sequences, 1534 transcription factors were identified and classified into 23 different common families, and 9040 SSR loci, from di- to hexa-nucleotides, whose repeat number is greater than five, were presented. In addition, we performed a gene expression profiling analysis upon dehydration treatment. The results indicated significant differences in the gene expression profiles among the control, mild stress and severe stress. In total, 4687, 5648 and 5735 genes were identified from the comparison of mild versus control, severe versus control and severe versus mild stress, respectively. Based on the differentially expressed genes, a Gene Ontology annotation analysis indicated many dehydration-relevant categories, including 'response to water stimulus' and 'response to water deprivation'. Meanwhile, the Kyoto Encyclopedia of Genes and Genomes pathway analysis uncovered some important pathways, such as 'metabolic pathways' and 'plant hormone signal transduction'. In addition, the expression patterns of 25 putative genes that are involved in drought tolerance resulting from quantitative real-time PCR were consistent with their transcript abundance changes as identified by RNA-seq. The globally sequenced genes covered a considerable proportion of the S. moorcroftiana transcriptome

  11. Discovery of Genes Related to Insecticide Resistance in Bactrocera dorsalis by Functional Genomic Analysis of a De Novo Assembled Transcriptome

    PubMed Central

    Hsu, Ju-Chun; Wu, Wen-Jer; Feng, Hai-Tung; Haymer, David S.; Chen, Chien-Yu

    2012-01-01

    Insecticide resistance has recently become a critical concern for control of many insect pest species. Genome sequencing and global quantization of gene expression through analysis of the transcriptome can provide useful information relevant to this challenging problem. The oriental fruit fly, Bactrocera dorsalis, is one of the world's most destructive agricultural pests, and recently it has been used as a target for studies of genetic mechanisms related to insecticide resistance. However, prior to this study, the molecular data available for this species was largely limited to genes identified through homology. To provide a broader pool of gene sequences of potential interest with regard to insecticide resistance, this study uses whole transcriptome analysis developed through de novo assembly of short reads generated by next-generation sequencing (NGS). The transcriptome of B. dorsalis was initially constructed using Illumina's Solexa sequencing technology. Qualified reads were assembled into contigs and potential splicing variants (isotigs). A total of 29,067 isotigs have putative homologues in the non-redundant (nr) protein database from NCBI, and 11,073 of these correspond to distinct D. melanogaster proteins in the RefSeq database. Approximately 5,546 isotigs contain coding sequences that are at least 80% complete and appear to represent B. dorsalis genes. We observed a strong correlation between the completeness of the assembled sequences and the expression intensity of the transcripts. The assembled sequences were also used to identify large numbers of genes potentially belonging to families related to insecticide resistance. A total of 90 P450-, 42 GST- and 37 COE-related genes, representing three major enzyme families involved in insecticide metabolism and resistance, were identified. In addition, 36 isotigs were discovered to contain target site sequences related to four classes of resistance genes. Identified sequence motifs were also analyzed to

  12. De Novo Assembly, Gene Annotation, and Marker Discovery in Stored-Product Pest Liposcelis entomophila (Enderlein) Using Transcriptome Sequences

    PubMed Central

    Wei, Dan-Dan; Chen, Er-Hu; Ding, Tian-Bo; Chen, Shi-Chun; Dou, Wei; Wang, Jin-Jun

    2013-01-01

    Background: As a major stored-product pest insect, Liposcelis entomophila has developed high levels of resistance to various insecticides in grain storage systems. However, the molecular mechanisms underlying resistance and environmental stress have not been characterized. To date, there is a lack of genomic information for this species. Therefore, studies aimed at profiling the L. entomophila transcriptome would provide a better understanding of the biological functions at the molecular level. Methodology/Principal Findings: We applied Illumina sequencing technology to sequence the transcriptome of L. entomophila. A total of 54,406,328 clean reads were obtained and de novo assembled into 54,220 unigenes, with an average length of 571 bp. Through a similarity search, 33,404 (61.61%) unigenes were matched to known proteins in the NCBI non-redundant (Nr) protein database. These unigenes were further functionally annotated with gene ontology (GO), cluster of orthologous groups of proteins (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. A large number of genes potentially involved in insecticide resistance were manually curated, including 68 putative cytochrome P450 genes, 37 putative glutathione S-transferase (GST) genes, 19 putative carboxyl/cholinesterase (CCE) genes, and 126 other transcripts containing target site sequences or encoding detoxification genes representing eight types of resistance enzymes. Furthermore, to gain insight into the molecular basis of the response of L. entomophila to thermal stresses, 25 heat shock protein (Hsp) genes were identified. In addition, 1,100 SSRs and 57,757 SNPs were detected and 231 pairs of SSR primers were designed for investigating genetic diversity in the future. Conclusions/Significance: We developed a comprehensive transcriptomic database for L. entomophila. These sequences and putative molecular markers would further promote our understanding of the molecular mechanisms underlying insecticide resistance

  13. Constrained randomization and multivariate effect projections improve information extraction and biomarker pattern discovery in metabolomics studies involving dependent samples.

    PubMed

    Jonsson, Pär; Wuolikainen, Anna; Thysell, Elin; Chorell, Elin; Stattin, Pär; Wikström, Pernilla; Antti, Henrik

    Analytical drift is a major source of bias in mass spectrometry-based metabolomics, confounding interpretation and biomarker detection. So far, standard protocols for sample and data analysis have not been able to fully resolve this. We present a combined approach for minimizing the influence of analytical drift on multivariate comparisons of matched or dependent samples in mass spectrometry-based metabolomics studies. The approach builds on a randomization procedure for sample run order, constrained to independent randomizations between and within dependent sample pairs (e.g. pre/post intervention). This is followed by a novel multivariate statistical analysis strategy allowing paired or dependent analyses of individual effects, named OPLS-effect projections (OPLS-EP). We show, using simulated data, that OPLS-EP gives improved interpretation over existing methods and that constrained randomization of sample run order in combination with an appropriate dependent statistical test increases the accuracy and sensitivity and decreases the false omission rate in biomarker detection. We verify these findings and prove the strength of the suggested approach in a clinical data set consisting of LC/MS data of blood plasma samples from patients before and after radical prostatectomy. Here OPLS-EP compared to traditional (independent) OPLS-discriminant analysis (OPLS-DA) on constrained randomized data gives a less complex model (3 versus 5 components) as well as a higher predictive ability (Q2 = 0.80 versus Q2 = 0.55). We explain this by showing that paired statistical analysis detects 37 unique significant metabolites that were masked for the independent test due to bias, including analytical drift and inter-individual variation.
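
    The constrained randomization of run order described in this abstract might be sketched as follows. This is only an illustration under stated assumptions, not the authors' procedure: it assumes that the two samples of a dependent pair are run back to back, that the order of pairs is shuffled, and that the order within each pair is shuffled independently. The function name and the sample labels are hypothetical.

```python
import random


def constrained_run_order(pairs, seed=None):
    """Return a run order in which each dependent pair stays adjacent.

    Assumption (not stated in the abstract): pair members are analysed
    back to back. The order of the pairs and the order within each pair
    are randomized independently of each other.
    """
    rng = random.Random(seed)

    # Randomize between pairs: shuffle the order in which pairs are run.
    shuffled_pairs = list(pairs)
    rng.shuffle(shuffled_pairs)

    run_order = []
    for pair in shuffled_pairs:
        # Randomize within the pair: e.g. pre/post order is shuffled.
        members = list(pair)
        rng.shuffle(members)
        run_order.extend(members)
    return run_order


# Hypothetical sample labels: four subjects, each sampled pre and post.
pairs = [(f"S{i}_pre", f"S{i}_post") for i in range(1, 5)]
order = constrained_run_order(pairs, seed=0)
print(order)
```

    Keeping pair members adjacent means that slow instrument drift affects both samples of a pair almost equally, which is the property a paired analysis relies on.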

  14. Sleeping Beauty Transposon Mutagenesis as a Tool for Gene Discovery in the NOD Mouse Model of Type 1 Diabetes

    PubMed Central

    Elso, Colleen M.; Chu, Edward P. F.; Alsayb, May A.; Mackin, Leanne; Ivory, Sean T.; Ashton, Michelle P.; Bröer, Stefan; Silveira, Pablo A.; Brodnicki, Thomas C.

    2015-01-01

    A number of different strategies have been used to identify genes for which genetic variation contributes to type 1 diabetes (T1D) pathogenesis. Genetic studies in humans have identified >40 loci that affect the risk for developing T1D, but the underlying causative alleles are often difficult to pinpoint or have subtle biological effects. A complementary strategy to identifying “natural” alleles in the human population is to engineer “artificial” alleles within inbred mouse strains and determine their effect on T1D incidence. We describe the use of the Sleeping Beauty (SB) transposon mutagenesis system in the nonobese diabetic (NOD) mouse strain, which harbors a genetic background predisposed to developing T1D. Mutagenesis in this system is random, but a green fluorescent protein (GFP)-polyA gene trap within the SB transposon enables early detection of mice harboring transposon-disrupted genes. The SB transposon also acts as a molecular tag to, without additional breeding, efficiently identify mutated genes and prioritize mutant mice for further characterization. We show here that the SB transposon is functional in NOD mice and can produce a null allele in a novel candidate gene that increases diabetes incidence. We propose that SB transposon mutagenesis could be used as a complementary strategy to traditional methods to help identify genes that, when disrupted, affect T1D pathogenesis. PMID:26438296

  15. Discovery of Genes Related to Witches Broom Disease in Paulownia tomentosa × Paulownia fortunei by a De Novo Assembled Transcriptome

    PubMed Central

    Liu, Rongning; Dong, Yanpeng; Fan, Guoqiang; Zhao, Zhenli; Deng, Minjie; Cao, Xibing; Niu, Suyan

    2013-01-01

    In spite of its economic importance, very little molecular genetics and genomic research has been targeted at the family Paulownia spp. The little genetic information on this plant is a big obstacle to studying the mechanisms of its ability to resist Paulownia Witches’ Broom (PaWB) disease. Analysis of the Paulownia transcriptome and its expression profile data is essential to extending the genetic resources on this species, and thus will greatly improve our studies on Paulownia. In the current study, we performed the de novo assembly of a transcriptome on P. tomentosa × P. fortunei using short-read sequencing technology (Illumina). A total of 203,664 unigenes with a mean length of 1,328 bp were obtained. Of these unigenes, 32,976 (30% of all unigenes) containing complete structures were chosen. Eukaryotic clusters of orthologous groups, gene ontology, and Kyoto Encyclopedia of Genes and Genomes annotations were performed on these unigenes. Genes related to PaWB disease resistance were analyzed in detail. To our knowledge, this is the first study to elucidate the genetic makeup of Paulownia. This transcriptome provides a quick way to understand Paulownia, increases the number of gene sequences available for further functional genomics studies and provides clues to the identification of potential PaWB disease resistance genes. This study has provided a comprehensive insight into gene expression profiles at different states, which facilitates the study of each gene’s roles in the developmental process and in PaWB disease resistance. PMID:24278262

  16. Discovery of gene networks regulating cytokine-induced dysfunction and apoptosis in insulin-producing INS-1 cells.

    PubMed

    Kutlu, Burak; Cardozo, Alessandra K; Darville, Martine I; Kruhøffer, Mogens; Magnusson, Nils; Ørntoft, Torben; Eizirik, Décio L

    2003-11-01

    Locally released cytokines contribute to beta-cell dysfunction and apoptosis in type 1 diabetes. In vitro exposure of insulin-producing INS-1E cells to the cytokines interleukin (IL)-1beta + interferon (IFN)-gamma leads to a significant increase in apoptosis. To characterize the genetic networks implicated in beta-cell dysfunction and apoptosis and its dependence on nitric oxide (NO) production, we performed a time-course microarray analysis of cytokine-induced genes in insulin-producing INS-1E cells. INS-1E cells were exposed in duplicate to IL-1beta + IFN-gamma for six different time points (1, 2, 4, 8, 12, and 24 h) with or without the inducible NO synthase (iNOS) blocker N(G)-monomethyl-L-arginine (NMA). The microarray analysis identified 698 genes as cytokine modified (≥2.5-fold change compared with control) in at least one time point. Based on their temporal pattern of variation, the cytokine-regulated genes were classified into 15 clusters by the k-means method. These genes were further classified into 14 different groups according to their putative function. Changes in the expression of genes related to metabolism, signal transduction, and transcription factors at all time points studied indicate beta-cell attempts to adapt to the effects of continuous cytokine exposure. Notably, several apoptosis-related genes were modified at early time points (2-4 h) preceding iNOS expression. On the other hand, 46% of the genes modified by cytokines after 8-24 h were NO dependent, indicating the important role of this radical for the late effects of cytokines. The present results increase by more than twofold the number of known cytokine-modified genes in insulin-producing cells and yield comprehensive information on the role of NO for these modifications in gene expression. These data provide novel and detailed insights into the gene networks activated in beta-cells facing a prolonged immune assault.
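
    The selection rule in this abstract (a gene counts as cytokine modified if it changes at least 2.5-fold versus control in at least one time point) could be sketched roughly as below. This is a minimal illustration, not the authors' pipeline. It assumes that a fold change is a simple treated/control ratio and that both increases (ratio of 2.5 or more) and decreases (ratio of 1/2.5 or less) qualify. The gene names and expression values are hypothetical toy data.

```python
def cytokine_modified(treated, control, threshold=2.5):
    """Return genes whose treated/control ratio reaches the threshold
    (up or down) in at least one time point.

    Assumption: down-regulation is counted symmetrically, i.e. a ratio
    of 1/threshold or lower also qualifies.
    """
    selected = []
    for gene, values in treated.items():
        ratios = [t / c for t, c in zip(values, control[gene])]
        if any(r >= threshold or r <= 1.0 / threshold for r in ratios):
            selected.append(gene)
    return selected


# Hypothetical expression values at the six time points (1, 2, 4, 8, 12, 24 h).
control = {
    "geneA": [100, 100, 100, 100, 100, 100],
    "geneB": [100, 100, 100, 100, 100, 100],
    "geneC": [100, 100, 100, 100, 100, 100],
}
treated = {
    "geneA": [100, 120, 300, 110, 100, 90],   # 3.0-fold up at 4 h
    "geneB": [100, 105, 120, 110, 95, 100],   # never reaches 2.5-fold
    "geneC": [100, 90, 50, 35, 60, 80],       # 0.35-fold (down) at 8 h
}

modified = cytokine_modified(treated, control)
print(modified)  # ['geneA', 'geneC']
```

    In the study itself the genes passing such a filter were then grouped by temporal pattern with k-means; that clustering step is not reproduced here.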

  17. Genes and Environment in Multiple Sclerosis project: A platform to investigate multiple sclerosis risk.

    PubMed

    Xia, Zongqi; White, Charles C; Owen, Emily K; Von Korff, Alina; Clarkson, Sarah R; McCabe, Cristin A; Cimpean, Maria; Winn, Phoebe A; Hoesing, Ashley; Steele, Sonya U; Cortese, Irene C M; Chitnis, Tanuja; Weiner, Howard L; Reich, Daniel S; Chibnik, Lori B; De Jager, Philip L

    2016-02-01

    The Genes and Environment in Multiple Sclerosis project establishes a platform to investigate the events leading to multiple sclerosis (MS) in at-risk individuals. It has recruited 2,632 first-degree relatives from across the USA. Using an integrated genetic and environmental risk score, we identified subjects with twice the MS risk when compared to the average family member, and we report an initial incidence rate in these subjects that is 30 times greater than that of sporadic MS. We discuss the feasibility of large-scale studies of asymptomatic at-risk subjects that leverage modern tools of subject recruitment to execute collaborative projects.

  18. Natural product discovery: past, present, and future.

    PubMed

    Katz, Leonard; Baltz, Richard H

    2016-03-01

    Microorganisms have provided abundant sources of natural products which have been developed as commercial products for human medicine, animal health, and plant crop protection. In the early years of natural product discovery from microorganisms (The Golden Age), new antibiotics were found with relative ease from low-throughput fermentation and whole cell screening methods. Later, molecular genetic and medicinal chemistry approaches were applied to modify and improve the activities of important chemical scaffolds, and more sophisticated screening methods were directed at target disease states. In the 1990s, the pharmaceutical industry moved to high-throughput screening of synthetic chemical libraries against many potential therapeutic targets, including new targets identified from the human genome sequencing project, largely to the exclusion of natural products, and discovery rates dropped dramatically. Nonetheless, natural products continued to provide key scaffolds for drug development. In the current millennium, it was discovered from genome sequencing that microbes with large genomes have the capacity to produce about ten times as many secondary metabolites as was previously recognized. Indeed, the most gifted actinomycetes have the capacity to produce around 30-50 secondary metabolites. With the precipitous drop in cost for genome sequencing, it is now feasible to sequence thousands of actinomycete genomes to identify the "biosynthetic dark matter" as sources for the discovery of new and novel secondary metabolites. Advances in bioinformatics, mass spectrometry, proteomics, transcriptomics, metabolomics and gene expression are driving the new field of microbial genome mining for applications in natural product discovery and development.

  19. High-Throughput Sequence Analysis of Turbot (Scophthalmus maximus) Transcriptome Using 454-Pyrosequencing for the Discovery of Antiviral Immune Genes

    PubMed Central

    Pereiro, Patricia; Balseiro, Pablo; Romero, Alejandro; Dios, Sonia; Forn-Cuni, Gabriel; Fuste, Berta; Planas, Josep V.; Beltran, Sergi; Novoa, Beatriz; Figueras, Antonio

    2012-01-01

    Background Turbot (Scophthalmus maximus L.) is an important aquacultural resource both in Europe and Asia. However, there is little information on gene sequences available in public databases. Currently, one of the main problems affecting the culture of this flatfish is mortality due to several pathogens, especially viral diseases which are not treatable. In order to identify new genes involved in immune defense, we conducted 454-pyrosequencing of the turbot transcriptome after different immune stimulations. Methodology/Principal Findings Turbot were injected with viral stimuli to increase the expression level of immune-related genes. High-throughput deep sequencing using 454-pyrosequencing technology yielded 915,256 high-quality reads. These sequences were assembled into 55,404 contigs that were subjected to annotation steps. Intriguingly, 55.16% of the deduced proteins were not significantly similar to any sequences in the databases used for the annotation and only 0.85% of the BLASTx top-hits matched S. maximus protein sequences. This relatively low level of annotation is possibly due to the limited information for this species and other flatfish in the database. These results suggest the identification of a large number of new genes in turbot and in fish in general. A more detailed analysis showed the presence of putative members of several innate and specific immune pathways. Conclusions/Significance To our knowledge, this study is the first transcriptome analysis using 454-pyrosequencing for turbot. Previously, there were only 12,471 ESTs and fewer than 1,500 nucleotide sequences for S. maximus in the NCBI database. Our results provide a rich source of data (55,404 contigs and 181,845 singletons) for discovering and identifying new genes, which will serve as a basis for microarray construction, gene expression characterization and for identification of genetic markers to be used in several applications. Immune stimulation in turbot was very effective, obtaining an

  20. Anti-cancer Parasporin Toxins are Associated with Different Environments: Discovery of Two Novel Parasporin 5-like Genes.

    PubMed

    Ammons, David R; Short, John D; Bailey, Jeffery; Hinojosa, Gabriela; Tavarez, Lourdes; Salazar, Martha; Rampersad, Joanne N

    2016-02-01

    Cry toxins are primarily a family of insecticidal toxins produced by the bacterium Bacillus thuringiensis (Bt). However, some Cry toxins, called parasporins (PSs), are non-insecticidal and have been shown to differentially kill human cancer cells. Based on amino acid homology, there are currently six different classes of parasporins (PS1-6). It is not known what role parasporins play in nature, nor if certain PSs are associated with Bt found in particular environments. Herein, we present ten parasporin-containing isolates of Bt from the Caribbean island of Trinidad. Genes coding for PS1 and PS6 were found in isolates associated mainly with artificial aquatic environments (e.g., barrels with rain water), while Bt possessing two novel PS5-like genes (ps5-1 and ps5-2) were isolated from manure collected directly from the rectum of cattle. The amino acid sequences inferred from the two PS5-like genes were 51% homologous to each other, while being only 41 or 45% similar to PS5Aa1/Cry64Aa, the only reported member of the parasporin 5 class. The low level of amino acid homology between the two PS5-like genes and PS5Aa1 indicates that the two PS5-like genes may represent a new class of parasporins, or greatly expand the level of diversity within the current parasporin 5 class.

  1. A Population of Deletion Mutants and an Integrated Mapping and Exome-seq Pipeline for Gene Discovery in Maize

    PubMed Central

    Jia, Shangang; Li, Aixia; Morton, Kyla; Avoles-Kianian, Penny; Kianian, Shahryar F.; Zhang, Chi; Holding, David

    2016-01-01

    To better understand maize endosperm filling and maturation, we used γ-irradiation of the B73 maize reference line to generate mutants with opaque endosperm and reduced kernel fill phenotypes, and created a population of 1788 lines including 39 Mo17 × F2s showing stable, segregating, and viable kernel phenotypes. For molecular characterization of the mutants, we developed a novel functional genomics platform that combined bulked segregant RNA and exome sequencing (BSREx-seq) to map causative mutations and identify candidate genes within mapping intervals. To exemplify the utility of the mutants and provide proof-of-concept for the bioinformatics platform, we present detailed characterization of line 937, an opaque mutant harboring a 6203 bp in-frame deletion covering six exons within the Opaque-1 gene. In addition, we describe mutant line 146, which contains a 4.8 kb intragene deletion within the Sugary-1 gene, and line 916, in which an 8.6 kb deletion knocks out a Cyclin A2 gene. The publicly available algorithm developed in this work improves the identification of causative deletions and their corresponding gaps within mapping peaks. This study demonstrates the utility of γ-irradiation for forward genetics in large nondense genomes such as maize since deletions often affect single genes. Furthermore, we show how this classical mutagenesis method becomes applicable for functional genomics when combined with state-of-the-art genomics tools. PMID:27261000

  2. Comparative transcriptome analysis of testes and ovaries for the discovery of novel genes from Amur sturgeon (Acipenser schrenckii).

    PubMed

    Jin, S B; Zhang, Y; Dong, X L; Xi, Q K; Song, D; Fu, H T; Sun, D J

    2015-12-29

    Sturgeons (Acipenser schrenckii) are of high evolutionary, economic, and conservation value, and caviar is one of the most valuable animal food products in the world. The Illumina HiSeq2000 sequencing platform was used to construct testicular and ovarian transcriptomes to identify genes involved in reproduction and sex determination in A. schrenckii. A total of 122,381 and 114,527 unigenes were obtained in the testicular and ovarian transcriptomes, respectively, with average lengths of 748 and 697 bp. A total of 46,179 genes were matched to the non-redundant nr database. GO (31,266), KEGG (39,712), and COG analyses (20,126) were performed to identify potential genes and their functions. Twenty-six gene families involved in reproduction and sex determination were identified from the A. schrenckii testicular and ovarian transcriptomes based on functional annotation of non-redundant transcripts and comparisons with the published literature. Furthermore, 1309 unigenes showed significant differences between the testes and ovaries, including 782 genes that were up-regulated in the testes and 527 that were up-regulated in the ovaries. Eleven genes were involved in reproduction and sex determination mechanisms. Furthermore, 19,065 simple sequence repeats (SSRs) were identified in the expressed sequence tagged dataset, and 190,863 and 193,258 single nucleotide polymorphisms (SNPs) were obtained from the testicular and ovarian transcriptomic databases, respectively. This study provides new sequence information about A. schrenckii, which will provide a basis for the further study of reproduction and sex determination mechanisms in Acipenser species. The potential SSR and SNP markers isolated from the transcriptome may shed light on the evolution and molecular ecology of Acipenser species.

  3. Novel enabling technologies of gene isolation and plant transformation for improved crop protection

    SciTech Connect

    Torok, Tamas

    2013-02-04

    Meeting the needs of agricultural producers requires the continued development of improved transgenic crop protection products. The completed project focused on developing novel enabling technologies of gene discovery and plant transformation to facilitate the generation of such products.

  4. Discovery of gene-gene interactions across multiple independent datasets of Late Onset Alzheimer Disease from the Alzheimer Disease Genetics Consortium

    PubMed Central

    Hohman, Timothy J.; Bush, William S.; Jiang, Lan; Brown-Gentry, Kristin D.; Torstenson, Eric S.; Dudek, Scott M.; Mukherjee, Shubhabrata; Naj, Adam; Kunkle, Brian W.; Ritchie, Marylyn D.; Martin, Eden R.; Schellenberg, Gerard D.; Mayeux, Richard; Farrer, Lindsay A.; Pericak-Vance, Margaret A.; Haines, Jonathan L.; Thornton-Wells, Tricia A.

    2015-01-01

    Late-onset Alzheimer disease (LOAD) has a complex genetic etiology, involving locus heterogeneity, polygenic inheritance and gene-gene interactions; however, the investigation of interactions in recent GWAS has been limited. We used a biological knowledge-driven approach to evaluate gene-gene interactions for consistency across thirteen datasets from the Alzheimer Disease Genetics Consortium. Fifteen SNP-SNP pairs within three gene-gene combinations were identified: SIRT1 x ABCB1, PSAP x PEBP4, and GRIN2B x ADRA1A. Additionally, we extend a previously identified interaction from an endophenotype analysis between RYR3 x CACNA1C. Finally, post hoc gene expression analyses of the implicated SNPs further implicate SIRT1 and ABCB1, and implicate CDH23 which was most recently identified as an AD risk locus in an epigenetic analysis of AD. The observed interactions in this manuscript highlight ways in which genotypic variation related to disease may depend on the genetic context in which it occurs. Further, our results highlight the utility of evaluating genetic interactions to explain additional variance in AD risk and identify novel molecular mechanisms of AD pathogenesis. PMID:26827652

  5. Discovery of gene-gene interactions across multiple independent data sets of late onset Alzheimer disease from the Alzheimer Disease Genetics Consortium.

    PubMed

    Hohman, Timothy J; Bush, William S; Jiang, Lan; Brown-Gentry, Kristin D; Torstenson, Eric S; Dudek, Scott M; Mukherjee, Shubhabrata; Naj, Adam; Kunkle, Brian W; Ritchie, Marylyn D; Martin, Eden R; Schellenberg, Gerard D; Mayeux, Richard; Farrer, Lindsay A; Pericak-Vance, Margaret A; Haines, Jonathan L; Thornton-Wells, Tricia A

    2016-02-01

    Late-onset Alzheimer disease (AD) has a complex genetic etiology, involving locus heterogeneity, polygenic inheritance, and gene-gene interactions; however, the investigation of interactions in recent genome-wide association studies has been limited. We used a biological knowledge-driven approach to evaluate gene-gene interactions for consistency across 13 data sets from the Alzheimer Disease Genetics Consortium. Fifteen single nucleotide polymorphism (SNP)-SNP pairs within 3 gene-gene combinations were identified: SIRT1 × ABCB1, PSAP × PEBP4, and GRIN2B × ADRA1A. In addition, we extend a previously identified interaction from an endophenotype analysis between RYR3 × CACNA1C. Finally, post hoc gene expression analyses of the implicated SNPs further implicate SIRT1 and ABCB1, and implicate CDH23 which was most recently identified as an AD risk locus in an epigenetic analysis of AD. The observed interactions in this article highlight ways in which genotypic variation related to disease may depend on the genetic context in which it occurs. Further, our results highlight the utility of evaluating genetic interactions to explain additional variance in AD risk and identify novel molecular mechanisms of AD pathogenesis.

  6. Transcriptome Analysis of the White Body of the Squid Euprymna tasmanica with Emphasis on Immune and Hematopoietic Gene Discovery

    PubMed Central

    Salazar, Karla A.; Joffe, Nina R.; Dinguirard, Nathalie; Houde, Peter; Castillo, Maria G.

    2015-01-01

    In the mutualistic relationship between the squid Euprymna tasmanica and the bioluminescent bacterium Vibrio fischeri, several host factors, including immune-related proteins, are known to interact and respond specifically and exclusively to the presence of the symbiont. In squid and octopus, the white body is considered to be an immune organ mainly due to the fact that blood cells, or hemocytes, are known to be present in high numbers and in different developmental stages. Hence, the white body has been described as the site of hematopoiesis in cephalopods. However, to our knowledge, there are no studies showing any molecular evidence of such functions. In this study, we performed a transcriptomic analysis of white body tissue of the Southern dumpling squid, E. tasmanica. Our primary goal was to gain insights into the functions of this tissue and to test for the presence of gene transcripts associated with hematopoietic and immune processes. Several hematopoiesis genes including CPSF1, GATA 2, TFIID, and FGFR2 were found to be expressed in the white body. In addition, transcripts associated with immune-related signal transduction pathways, such as the toll-like receptor/NF-κB and MAPK pathways, were also found, as well as other immune genes previously identified in E. tasmanica’s sister species, E. scolopes. This study is the first to analyze an immune organ within cephalopods, and to provide gene expression data supporting the white body as a hematopoietic tissue. PMID:25775132

  7. The discovery of the virulence gene ToxA in the wheat and barley pathogen Bipolaris sorokiniana.

    PubMed

    McDonald, Megan C; Ahren, Dag; Simpfendorfer, Steven; Milgate, Andrew; Solomon, Peter S

    2017-01-17

    Bipolaris sorokiniana is the causal agent of multiple diseases on wheat and barley and is the primary constraint to cereal production throughout South Asia. Despite its significance, the molecular basis of disease is poorly understood. To address this, the genomes of three Australian isolates of B. sorokiniana were sequenced and screened for known pathogenicity genes. Sequence analysis revealed that the isolate BRIP10943 harboured the ToxA gene, which has been associated previously with disease in the wheat pathogens Parastagonospora nodorum and Pyrenophora tritici-repentis. Analysis of the regions flanking ToxA within B. sorokiniana revealed that it was embedded within a 12-kb genomic element nearly identical to the corresponding regions in P. nodorum and P. tritici-repentis. A screen of 35 Australian B. sorokiniana isolates confirmed that ToxA was present in 12 isolates. Sequencing of the ToxA genes within these isolates revealed two haplotypes, which differed by a single non-synonymous nucleotide substitution. Pathogenicity assays showed that a B. sorokiniana isolate harbouring ToxA was more virulent on wheat lines that contained the sensitivity gene when compared with a non-ToxA isolate. This work demonstrates that proteins that confer host-specific virulence can be horizontally acquired across multiple species. This acquisition can dramatically increase the virulence of pathogenic strains on susceptible cultivars, which, in an agricultural setting, can have devastating economic and social impacts.

  8. Discovery of single nucleotide polymorphisms in candidate genes associated with fertility and production traits in Holstein cattle

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Identification of single nucleotide polymorphisms (SNPs) for specific genes involved in reproduction might improve reliability of genomic estimates for these low-heritability traits. Semen from 550 Holstein bulls of high (>= 1.7; n = 288) or low (<= -2; n = 262) daughter pregnancy rate (DPR) was geno...

  9. Transcriptome analysis of the white body of the squid Euprymna tasmanica with emphasis on immune and hematopoietic gene discovery.

    PubMed

    Salazar, Karla A; Joffe, Nina R; Dinguirard, Nathalie; Houde, Peter; Castillo, Maria G

    2015-01-01

    In the mutualistic relationship between the squid Euprymna tasmanica and the bioluminescent bacterium Vibrio fischeri, several host factors, including immune-related proteins, are known to interact and respond specifically and exclusively to the presence of the symbiont. In squid and octopus, the white body is considered to be an immune organ mainly due to the fact that blood cells, or hemocytes, are known to be present in high numbers and in different developmental stages. Hence, the white body has been described as the site of hematopoiesis in cephalopods. However, to our knowledge, there are no studies showing any molecular evidence of such functions. In this study, we performed a transcriptomic analysis of white body tissue of the Southern dumpling squid, E. tasmanica. Our primary goal was to gain insights into the functions of this tissue and to test for the presence of gene transcripts associated with hematopoietic and immune processes. Several hematopoiesis genes including CPSF1, GATA 2, TFIID, and FGFR2 were found to be expressed in the white body. In addition, transcripts associated with immune-related signal transduction pathways, such as the toll-like receptor/NF-κB and MAPK pathways, were also found, as well as other immune genes previously identified in E. tasmanica's sister species, E. scolopes. This study is the first to analyze an immune organ within cephalopods, and to provide gene expression data supporting the white body as a hematopoietic tissue.

  10. A population of deletion mutants and an integrated mapping and Exome-seq pipeline for gene discovery in maize

    Technology Transfer Automated Retrieval System (TEKTRAN)

    To better understand maize endosperm filling and maturation, we developed a novel functional genomics platform that combined Bulked Segregant RNA and Exome sequencing (BSREx-seq) to map causative mutations and identify candidate genes within mapping intervals. Using gamma-irradiation of B73 maize to...

  11. Discovery of single nucleotide polymorphisms in candidate genes associated with fertility and production traits in Holstein cattle

    PubMed Central

    2013-01-01

    Background Identification of single nucleotide polymorphisms (SNPs) for specific genes involved in reproduction might improve reliability of genomic estimates for these low-heritability traits. Semen from 550 Holstein bulls of high (≥ 1.7; n = 288) or low (≤ −2; n = 262) daughter pregnancy rate (DPR) was genotyped for 434 candidate SNPs using the Sequenom MassARRAY® system. Three types of SNPs were evaluated: SNPs previously reported to be associated with reproductive traits or physically close to genetic markers for reproduction, SNPs in genes that are well known to be involved in reproductive processes, and SNPs in genes that are differentially expressed between physiological conditions in a variety of tissues associated with reproductive function. Eleven reproduction and production traits were analyzed. Results A total of 40 SNPs were associated (P < 0.05) with DPR. Among these were genes involved in the endocrine system, cell signaling, immune function and inhibition of apoptosis. A total of 10 genes were regulated by estradiol. In addition, 22 SNPs were associated with heifer conception rate, 33 with cow conception rate, 36 with productive life, 34 with net merit, 23 with milk yield, 19 with fat yield, 13 with fat percent, 19 with protein yield, 22 with protein percent, and 13 with somatic cell score. The allele substitution effects for SNPs associated with heifer conception rate, cow conception rate, productive life and net merit were in the same direction as for DPR. Allele substitution effects for several SNPs associated with production traits were in the opposite direction to those for DPR. Nonetheless, there were 29 SNPs associated with DPR that were not negatively associated with production traits. Conclusion SNPs in a total of 40 genes associated with DPR were identified as well as SNPs for other traits. It might be feasible to include these SNPs into genomic tests of reproduction and other traits. The genes associated with DPR are likely to be

  12. Transcriptome profiling of the testis reveals genes involved in spermatogenesis and marker discovery in the oriental fruit fly, Bactrocera dorsalis.

    PubMed

    Wei, D; Li, H-M; Yang, W-J; Wei, D-D; Dou, W; Huang, Y; Wang, J-J

    2015-02-01

    The testis is a highly specialized tissue that plays a vital role in ensuring fertility by producing spermatozoa, which are transferred to the female during mating. Spermatogenesis is a complex process, resulting in the production of mature sperm, and involves significant structural and biochemical changes in the seminiferous epithelium of the adult testis. The identification of genes involved in spermatogenesis of Bactrocera dorsalis (Hendel) is critical for a better understanding of its reproductive development. In this study, we constructed a cDNA library of testes from male B. dorsalis adults at different ages, and performed de novo transcriptome sequencing to produce a comprehensive transcript data set, using Illumina sequencing technology. The analysis yielded 52 016 732 clean reads, including a total of 4.65 Gb of nucleotides. These reads were assembled into 47 677 contigs (average 443 bp) and then clustered into 30 516 unigenes (average 756 bp). Based on BLAST hits with known proteins in different databases, 20 921 unigenes were annotated with a cut-off E-value of 10^-5. The transcriptome sequences were further annotated using the Clusters of Orthologous Groups, Gene Ontology and the Kyoto Encyclopedia of Genes and Genomes databases. Functional genes involved in spermatogenesis were analysed, including cell cycle proteins, metalloproteins, actin, and ubiquitin and antihyperthermia proteins. Several testis-specific genes were also identified. The transcripts database will help us to understand the molecular mechanisms underlying spermatogenesis in B. dorsalis. Furthermore, 2913 simple sequence repeats and 151 431 single nucleotide polymorphisms were identified, which will be useful for investigating the genetic diversity of B. dorsalis in the future.

  13. Discovery of new risk loci for IgA nephropathy implicates genes involved in immunity against intestinal pathogens.

    PubMed

    Kiryluk, Krzysztof; Li, Yifu; Scolari, Francesco; Sanna-Cherchi, Simone; Choi, Murim; Verbitsky, Miguel; Fasel, David; Lata, Sneh; Prakash, Sindhuri; Shapiro, Samantha; Fischman, Clara; Snyder, Holly J; Appel, Gerald; Izzi, Claudia; Viola, Battista Fabio; Dallera, Nadia; Del Vecchio, Lucia; Barlassina, Cristina; Salvi, Erika; Bertinetto, Francesca Eleonora; Amoroso, Antonio; Savoldi, Silvana; Rocchietti, Marcella; Amore, Alessandro; Peruzzi, Licia; Coppo, Rosanna; Salvadori, Maurizio; Ravani, Pietro; Magistroni, Riccardo; Ghiggeri, Gian Marco; Caridi, Gianluca; Bodria, Monica; Lugani, Francesca; Allegri, Landino; Delsante, Marco; Maiorana, Mariarosa; Magnano, Andrea; Frasca, Giovanni; Boer, Emanuela; Boscutti, Giuliano; Ponticelli, Claudio; Mignani, Renzo; Marcantoni, Carmelita; Di Landro, Domenico; Santoro, Domenico; Pani, Antonello; Polci, Rosaria; Feriozzi, Sandro; Chicca, Silvana; Galliani, Marco; Gigante, Maddalena; Gesualdo, Loreto; Zamboli, Pasquale; Battaglia, Giovanni Giorgio; Garozzo, Maurizio; Maixnerová, Dita; Tesar, Vladimir; Eitner, Frank; Rauen, Thomas; Floege, Jürgen; Kovacs, Tibor; Nagy, Judit; Mucha, Krzysztof; Pączek, Leszek; Zaniew, Marcin; Mizerska-Wasiak, Małgorzata; Roszkowska-Blaim, Maria; Pawlaczyk, Krzysztof; Gale, Daniel; Barratt, Jonathan; Thibaudin, Lise; Berthoux, Francois; Canaud, Guillaume; Boland, Anne; Metzger, Marie; Panzer, Ulf; Suzuki, Hitoshi; Goto, Shin; Narita, Ichiei; Caliskan, Yasar; Xie, Jingyuan; Hou, Ping; Chen, Nan; Zhang, Hong; Wyatt, Robert J; Novak, Jan; Julian, Bruce A; Feehally, John; Stengel, Benedicte; Cusi, Daniele; Lifton, Richard P; Gharavi, Ali G

    2014-11-01

    We performed a genome-wide association study (GWAS) of IgA nephropathy (IgAN), the most common form of glomerulonephritis, with discovery and follow-up in 20,612 individuals of European and East Asian ancestry. We identified six new genome-wide significant associations, four in ITGAM-ITGAX, VAV3 and CARD9 and two new independent signals at HLA-DQB1 and DEFA. We replicated the nine previously reported signals, including known SNPs in the HLA-DQB1 and DEFA loci. The cumulative burden of risk alleles is strongly associated with age at disease onset. Most loci are either directly associated with risk of inflammatory bowel disease (IBD) or maintenance of the intestinal epithelial barrier and response to mucosal pathogens. The geospatial distribution of risk alleles is highly suggestive of multi-locus adaptation, and genetic risk correlates strongly with variation in local pathogens, particularly helminth diversity, suggesting a possible role for host-intestinal pathogen interactions in shaping the genetic landscape of IgAN.

  14. Discovery of new risk loci for IgA nephropathy implicates genes involved in immunity against intestinal pathogens

    PubMed Central

    Kiryluk, Krzysztof; Li, Yifu; Scolari, Francesco; Sanna-Cherchi, Simone; Choi, Murim; Verbitsky, Miguel; Fasel, David; Lata, Sneh; Prakash, Sindhuri; Shapiro, Samantha; Fischman, Clara; Snyder, Holly J.; Appel, Gerald; Izzi, Claudia; Viola, Battista Fabio; Dallera, Nadia; Vecchio, Lucia Del; Barlassina, Cristina; Salvi, Erika; Bertinetto, Francesca Eleonora; Amoroso, Antonio; Savoldi, Silvana; Rocchietti, Marcella; Amore, Alessandro; Peruzzi, Licia; Coppo, Rosanna; Salvadori, Maurizio; Ravani, Pietro; Magistroni, Riccardo; Ghiggeri, Gian Marco; Caridi, Gianluca; Bodria, Monica; Lugani, Francesca; Allegri, Landino; Delsante, Marco; Maiorana, Mariarosa; Magnano, Andrea; Frasca, Giovanni; Boer, Emanuela; Boscutti, Giuliano; Ponticelli, Claudio; Mignani, Renzo; Marcantoni, Carmelita; Di Landro, Domenico; Santoro, Domenico; Pani, Antonello; Polci, Rosaria; Feriozzi, Sandro; Chicca, Silvana; Galliani, Marco; Gigante, Maddalena; Gesualdo, Loreto; Zamboli, Pasquale; Maixnerová, Dita; Tesar, Vladimir; Eitner, Frank; Rauen, Thomas; Floege, Jürgen; Kovacs, Tibor; Nagy, Judit; Mucha, Krzysztof; Pączek, Leszek; Zaniew, Marcin; Mizerska-Wasiak, Małgorzata; Roszkowska-Blaim, Maria; Pawlaczyk, Krzysztof; Gale, Daniel; Barratt, Jonathan; Thibaudin, Lise; Berthoux, Francois; Canaud, Guillaume; Boland, Anne; Metzger, Marie; Panzer, Ulf; Suzuki, Hitoshi; Goto, Shin; Narita, Ichiei; Caliskan, Yasar; Xie, Jingyuan; Hou, Ping; Chen, Nan; Zhang, Hong; Wyatt, Robert J.; Novak, Jan; Julian, Bruce A.; Feehally, John; Stengel, Benedicte; Cusi, Daniele; Lifton, Richard P.; Gharavi, Ali G.

    2014-01-01

    We performed a genome-wide association study (GWAS) of IgA nephropathy (IgAN), the most common form of glomerulonephritis, with discovery and follow-up in 20,612 individuals of European and East Asian ancestry. We identified six novel genome-wide significant associations, four in ITGAM-ITGAX, VAV3 and CARD9 and two new independent signals at HLA-DQB1 and DEFA. We replicated the nine previously reported signals, including known SNPs in the HLA-DQB1 and DEFA loci. The cumulative burden of risk alleles is strongly associated with age at disease onset. Most loci are either directly associated with risk of inflammatory bowel disease (IBD) or maintenance of the intestinal epithelial barrier and response to mucosal pathogens. The geo-spatial distribution of risk alleles is highly suggestive of multi-locus adaptation and the genetic risk correlates strongly with variation in local pathogens, particularly helminth diversity, suggesting a possible role for host-intestinal pathogen interactions in shaping the genetic landscape of IgAN. PMID:25305756

  15. De Novo Transcriptome Analysis of an Aerial Microalga Trentepohlia jolithus: Pathway Description and Gene Discovery for Carbon Fixation and Carotenoid Biosynthesis

    PubMed Central

    Li, Qianqian; Liu, Jianguo; Zhang, Litao; Liu, Qian

    2014-01-01

    Background: Algae in the order Trentepohliales have a broad geographic distribution and are generally characterized by the presence of abundant β-carotene. The many monographs published to date have mainly focused on their morphology, taxonomy, phylogeny, distribution and reproduction; molecular studies of this order are still rare. High-throughput RNA sequencing (RNA-Seq) technology provides a powerful and efficient method for transcript analysis and gene discovery in Trentepohlia jolithus. Methods/Principal Findings: Illumina HiSeq 2000 sequencing generated 55,007,830 Illumina PE raw reads, which were assembled into 41,328 unigenes. Based on NR annotation, 53.28% of the unigenes (22,018) could be assigned to gene ontology classes with 54 subcategories and 161,451 functional terms. A total of 26,217 (63.44%) assembled unigenes were mapped to 128 KEGG pathways. Furthermore, a set of 5,798 SSRs in 5,206 unigenes and 131,478 putative SNPs were identified. Moreover, the fact that all of the C4 photosynthesis genes exist in T. jolithus suggests a complex carbon acquisition and fixation system. Similarities and differences between T. jolithus and other algae in carotenoid biosynthesis are also described in depth. Conclusions/Significance: This is the first broad transcriptome survey for T. jolithus, increasing the amount of molecular data available for the class Ulvophyceae. As well as providing resources for functional genomics studies, the functional genes and putative pathways identified here will contribute to a better understanding of carbon fixation and fatty acid and carotenoid biosynthesis in T. jolithus. PMID:25254555

  16. An integration of genome-wide association study and gene expression profiling to prioritize the discovery of novel susceptibility Loci for osteoporosis-related traits.

    PubMed

    Hsu, Yi-Hsiang; Zillikens, M Carola; Wilson, Scott G; Farber, Charles R; Demissie, Serkalem; Soranzo, Nicole; Bianchi, Estelle N; Grundberg, Elin; Liang, Liming; Richards, J Brent; Estrada, Karol; Zhou, Yanhua; van Nas, Atila; Moffatt, Miriam F; Zhai, Guangju; Hofman, Albert; van Meurs, Joyce B; Pols, Huibert A P; Price, Roger I; Nilsson, Olle; Pastinen, Tomi; Cupples, L Adrienne; Lusis, Aldons J; Schadt, Eric E; Ferrari, Serge; Uitterlinden, André G; Rivadeneira, Fernando; Spector, Timothy D; Karasik, David; Kiel, Douglas P

    2010-06-10

    Osteoporosis is a complex disorder and commonly leads to fractures in elderly persons. Genome-wide association studies (GWAS) have become an unbiased approach to identify variations in the genome that potentially affect health. However, the genetic variants identified so far only explain a small proportion of the heritability for complex traits. Due to the modest genetic effect size and inadequate power, true association signals may not be revealed based on a stringent genome-wide significance threshold. Here, we take advantage of SNP and transcript arrays and integrate GWAS and expression signature profiling relevant to the skeletal system in cellular and animal models to prioritize the discovery of novel candidate genes for osteoporosis-related traits, including bone mineral density (BMD) at the lumbar spine (LS) and femoral neck (FN), as well as geometric indices of the hip (femoral neck-shaft angle, NSA; femoral neck length, NL; and narrow-neck width, NW). A two-stage meta-analysis of GWAS from 7,633 Caucasian women and 3,657 men revealed three novel loci associated with osteoporosis-related traits, including chromosome 1p13.2 (RAP1A, p = 3.6 × 10^-8), 2q11.2 (TBC1D8), and 18q11.2 (OSBPL1A), and confirmed a previously reported region near the TNFRSF11B/OPG gene. We also prioritized 16 suggestive genome-wide significant candidate genes based on their potential involvement in skeletal metabolism. Among them, 3 candidate genes were associated with BMD in women. Notably, 2 out of these 3 genes (GPR177, p = 2.6 × 10^-13; SOX6, p = 6.4 × 10^-10) associated with BMD in women have been successfully replicated in a large-scale meta-analysis of BMD, but none of the non-prioritized candidates (associated with BMD) did. Our results support the concept of our prioritization strategy. In the absence of direct biological support for identified genes, we highlighted the efficiency of subsequent functional characterization using publicly available expression profiling relevant to the

  17. Genetic Predictors of Adverse Radiotherapy Effects: The Gene-PARE project

    SciTech Connect

    Ho, Alice Y.; Atencio, David P.; Peters, Sheila; Stock, Richard G.; Cesaretti, Jamie A.; Green, Sheryl; Formenti, Silvia C.; Haffty, Bruce; Drumea, Karen; Leitzin, Larisa M.D.; Kuten, Abraham; Azria, David; Ozsahin, Mahmut; Overgaard, Jens; Andreassen, Christian N.; Trop, Cynthia S.; Park, Janelle; Rosenstein, Barry S.

    2006-07-01

    Purpose: The development of adverse effects resulting from the radiotherapy of cancer limits the use of this treatment modality. The validation of a test capable of predicting which patients would be most likely to develop adverse responses to radiation treatment, based on the possession of specific genetic variants, would therefore be of value. The purpose of the Genetic Predictors of Adverse Radiotherapy Effects (Gene-PARE) project is to help achieve this goal. Methods and Materials: A continuously expanding biorepository has been created consisting of frozen lymphocytes and DNA isolated from patients treated with radiotherapy. In conjunction with this biorepository, a database is maintained with detailed clinical information pertaining to diagnosis, treatment, and outcome. The DNA samples are screened using denaturing high performance liquid chromatography (DHPLC) and the Surveyor nuclease assay for variants in ATM, TGFB1, XRCC1, XRCC3, SOD2, and hHR21. It is anticipated that additional genes that control the biologic response to radiation will be screened in future work. Results: Evidence has been obtained that possession of variants in genes, the products of which play a role in radiation response, is predictive for the development of adverse effects after radiotherapy. Conclusions: It is anticipated that the Gene-PARE project will yield information that will allow radiation oncologists to use genetic data to optimize treatment on an individual basis.

  18. Gene expression profiling of coelomic cells and discovery of immune-related genes in the earthworm, Eisenia andrei, using expressed sequence tags.

    PubMed

    Tak, Eun Sik; Cho, Sung-Jin; Park, Soon Cheol

    2015-01-01

    The coelomic cells of the earthworm consist of leukocytes, chloragocytes, and coelomocytes, which play an important role in innate immunity reactions. To gain insight into the expression profiles of coelomic cells of the earthworm, Eisenia andrei, we analyzed 1151 expressed sequence tags (ESTs) derived from the cDNA library of the coelomic cells. Among the 1151 ESTs analyzed, 493 ESTs (42.8%) showed a significant similarity to known genes and represented 164 unique genes, of which 93 were singletons and 71 were represented by two or more ESTs. From the 164 unique genes sequenced, we found 24 immune-related and cell defense genes. Furthermore, real-time PCR analysis showed that levels of lysenin-related protein mRNA in coelomic cells of E. andrei were upregulated after the injection of Bacillus subtilis bacteria. This EST data set should provide a valuable resource for future research on the earthworm immune system.

  19. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools.

    PubMed

    Quast, Christian; Pruesse, Elmar; Yilmaz, Pelin; Gerken, Jan; Schweer, Timmy; Yarza, Pablo; Peplies, Jörg; Glöckner, Frank Oliver

    2013-01-01

    SILVA (from Latin silva, forest, http://www.arb-silva.de) is a comprehensive web resource for up to date, quality-controlled databases of aligned ribosomal RNA (rRNA) gene sequences from the Bacteria, Archaea and Eukaryota domains and supplementary online services. The referred database release 111 (July 2012) contains 3 194 778 small subunit and 288 717 large subunit rRNA gene sequences. Since the initial description of the project, substantial new features have been introduced, including advanced quality control procedures, an improved rRNA gene aligner, online tools for probe and primer evaluation and optimized browsing, searching and downloading on the website. Furthermore, the extensively curated SILVA taxonomy and the new non-redundant SILVA datasets provide an ideal reference for high-throughput classification of data from next-generation sequencing approaches.

  20. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools

    PubMed Central

    Quast, Christian; Pruesse, Elmar; Yilmaz, Pelin; Gerken, Jan; Schweer, Timmy; Yarza, Pablo; Peplies, Jörg; Glöckner, Frank Oliver

    2013-01-01

    SILVA (from Latin silva, forest, http://www.arb-silva.de) is a comprehensive web resource for up to date, quality-controlled databases of aligned ribosomal RNA (rRNA) gene sequences from the Bacteria, Archaea and Eukaryota domains and supplementary online services. The referred database release 111 (July 2012) contains 3 194 778 small subunit and 288 717 large subunit rRNA gene sequences. Since the initial description of the project, substantial new features have been introduced, including advanced quality control procedures, an improved rRNA gene aligner, online tools for probe and primer evaluation and optimized browsing, searching and downloading on the website. Furthermore, the extensively curated SILVA taxonomy and the new non-redundant SILVA datasets provide an ideal reference for high-throughput classification of data from next-generation sequencing approaches. PMID:23193283

  1. The Complete Genome Sequence of Plodia Interpunctella Granulovirus: Evidence for Horizontal Gene Transfer and Discovery of an Unusual Inhibitor-of-Apoptosis Gene

    PubMed Central

    Harrison, Robert L.; Rowley, Daniel L.; Funk, C. Joel

    2016-01-01

    The Indianmeal moth, Plodia interpunctella (Lepidoptera: Pyralidae), is a common pest of stored goods with a worldwide distribution. The complete genome sequence for a larval pathogen of this moth, the baculovirus Plodia interpunctella granulovirus (PiGV), was determined by next-generation sequencing. The PiGV genome was found to be 112,536 bp in length with a 44.2% G+C nucleotide distribution. A total of 123 open reading frames (ORFs) and seven homologous regions (hrs) were identified and annotated. Phylogenetic inference using concatenated alignments of 36 baculovirus core genes placed PiGV in the “b” clade of viruses from genus Betabaculovirus with a branch length suggesting that PiGV represents a distinct betabaculovirus species. In addition to the baculovirus core genes and orthologues of other genes found in other betabaculovirus genomes, the PiGV genome sequence contained orthologues of the bidensovirus NS3 gene, as well as ORFs that occur in alphabaculoviruses but not betabaculoviruses. While PiGV contained an orthologue of inhibitor of apoptosis-5 (iap-5), an orthologue of inhibitor of apoptosis-3 (iap-3) was not present. Instead, the PiGV sequence contained an ORF (PiGV ORF81) encoding an IAP homologue with sequence similarity to insect cellular IAPs, but not to viral IAPs. Phylogenetic analysis of baculovirus and insect IAP amino acid sequences suggested that the baculovirus IAP-3 genes and the PiGV ORF81 IAP homologue represent different lineages arising from more than one acquisition event. The presence of genes from other sources in the PiGV genome highlights the extent to which baculovirus gene content is shaped by horizontal gene transfer. PMID:27472489

  2. Gene Discovery and Expression Profiling in the Toxin-Producing Marine Diatom, Pseudo-nitzschia Multiseries (Hasle) Hasle

    DTIC Science & Technology

    2004-09-01

    measuring mRNAs. Comparing the amount of specific mRNAs between two samples provides a mechanism to screen for genes that are turned on or off under defined ...were based on a comparison of different criteria and software packages that revealed that Seqman (DNAStar) yielded the most consistent results using...branches of the eukaryotic tree and to further define the root of the tree. Multiple copies of actin, beta-tubulin, and EF- 1 alpha were identified in P

  3. Functional gene-based discovery of phenazines from the actinobacteria associated with marine sponges in the South China Sea.

    PubMed

    Karuppiah, Valliappan; Li, Yingxin; Sun, Wei; Feng, Guofang; Li, Zhiyong

    2015-07-01

    Phenazines represent a large group of nitrogen-containing heterocyclic compounds produced by a diverse group of bacteria, including actinobacteria. In this study, a total of 197 actinobacterial strains were isolated from seven different marine sponge species in the South China Sea using five different culture media. Eighty-seven morphologically different actinobacterial strains were selected and grouped into 13 genera, including Actinoalloteichus, Kocuria, Micrococcus, Micromonospora, Mycobacterium, Nocardiopsis, Prauserella, Rhodococcus, Saccharopolyspora, Salinispora, Serinicoccus, and Streptomyces, by phylogenetic analysis of the 16S rRNA gene. Based on the screening of phzE genes, ten strains, including five Streptomyces, two Nocardiopsis, one Salinispora, one Micrococcus, and one Serinicoccus, were found to have potential for phenazine production. The phzE gene was highly expressed in Nocardiopsis sp. 13-33-15, 13-12-13, and Serinicoccus sp. 13-12-4 on the fifth day of fermentation. Finally, 1,6-dihydroxy phenazine (1) from Nocardiopsis sp. 13-33-15 and 13-12-13, and 1,6-dimethoxy phenazine (2) from Nocardiopsis sp. 13-33-15 were isolated and identified successfully based on ESI-MS and NMR analysis. Compounds 1 and 2 showed antibacterial activity against Bacillus mycoides SJ14, Staphylococcus aureus SJ51, Escherichia coli SJ42, and Micrococcus luteus SJ47. This study suggests that the integrated approach of gene screening and chemical analysis is an effective strategy to find the target compounds and lays the basis for the production of phenazine from the sponge-associated actinobacteria.

  4. Gene Discovery and Advances in Finger Millet [Eleusine coracana (L.) Gaertn.] Genomics—An Important Nutri-Cereal of Future

    PubMed Central

    Sood, Salej; Kumar, Anil; Babu, B. Kalyana; Gaur, Vikram S.; Pandey, Dinesh; Kant, Lakshmi; Pattnayak, Arunava

    2016-01-01

    The rapid strides in molecular marker technologies followed by genomics, and next generation sequencing advancements in three major crops (rice, maize and wheat) of the world have given opportunities for their use in the orphan, but highly valuable future crops, including finger millet [Eleusine coracana (L.) Gaertn.]. Finger millet has many special agronomic and nutritional characteristics, which make it an indispensable crop in arid, semi-arid, hilly and tribal areas of India and Africa. The crop has proven its adaptability in harsh conditions and has shown resilience to climate change. The adaptability traits of finger millet have shown the advantage over major cereal grains under stress conditions, revealing it as a storehouse of important genomic resources for crop improvement. Although new technologies for genomic studies are now available, progress in identifying and tapping these important alleles or genes is lacking. RAPDs were the default choice for genetic diversity studies in the crop until the last decade, but the subsequent development of SSRs and comparative genomics paved the way for the marker assisted selection in finger millet. Resistance gene homologs from NBS-LRR region of finger millet for blast and sequence variants for nutritional traits from other cereals have been developed and used invariably. Population structure analysis studies exhibit 2–4 sub-populations in the finger millet gene pool with separate grouping of Indian and exotic genotypes. Recently, the omics technologies have been efficiently applied to understand the nutritional variation, drought tolerance and gene mining. Progress has also occurred with respect to transgenics development. This review presents the current biotechnological advancements along with research gaps and future perspective of genomic research in finger millet. PMID:27881984

  5. Discovery of consensus gene signature and intermodular connectivity defining self-renewal of human embryonic stem cells

    PubMed Central

    Kim, Jeffrey J.; Khalid, Omar; Namazi, AmirHosien; Tu, Thanh G.; Elie, Omid; Lee, Connie; Kim, Yong

    2014-01-01

    Molecular markers defining self-renewing pluripotent embryonic stem cells (ESCs) have been identified by relative comparisons between undifferentiated and differentiated cells. Most analyses have been done under a specific differentiation condition that may present significantly different molecular changes over others. Therefore, it is currently unclear if there are true consensus markers defining undifferentiated hESCs. To identify a set of key genes consistently altered during differentiation of hESCs regardless of differentiation conditions, we have performed microarray analysis on undifferentiated hESCs (H1 and H9) and differentiated EBs and validated our results using publicly available expression array data sets. We constructed consensus modules by Weighted Gene Coexpression Network Analysis (WGCNA) and discovered novel markers that are consistently present in undifferentiated hESCs under various differentiation conditions. We have validated top markers (downregulated: LCK, KLKB1 and SLC7A3; upregulated: RhoJ, Zeb2 and Adam12) upon differentiation. Functional validation analysis of LCK in self-renewal of hESCs by using LCK inhibitor or gene silencing with siLCK resulted in a loss of undifferentiated characteristics (morphological change, reduced alkaline phosphatase activity and pluripotency gene expression), demonstrating a potential functional role of LCK in self-renewal of hESCs. We have designated hESC markers to interactive networks in the genome, identifying possible interacting partners and showing how new markers relate to each other. Furthermore, comparison of these data sets with available datasets from iPSCs revealed that the levels of these newly identified markers were correlated to the establishment of iPSCs, which may imply a potential role of these markers in gaining of cellular potency. PMID:24519983

  6. Discovery of consensus gene signature and intermodular connectivity defining self-renewal of human embryonic stem cells.

    PubMed

    Kim, Jeffrey J; Khalid, Omar; Namazi, AmirHosien; Tu, Thanh G; Elie, Omid; Lee, Connie; Kim, Yong

    2014-06-01

    Molecular markers defining self-renewing pluripotent embryonic stem cells (ESCs) have been identified by relative comparisons between undifferentiated and differentiated cells. Most analyses have been done under a specific differentiation condition that may present significantly different molecular changes over others. Therefore, it is currently unclear if there are true consensus markers defining undifferentiated human ESCs (hESCs). To identify a set of key genes consistently altered during differentiation of hESCs regardless of differentiation conditions, we have performed microarray analysis on undifferentiated hESCs (H1 and H9) and differentiated EBs and validated our results using publicly available expression array datasets. We constructed consensus modules by Weighted Gene Coexpression Network Analysis and discovered novel markers that are consistently present in undifferentiated hESCs under various differentiation conditions. We have validated top markers (downregulated: LCK, KLKB1, and SLC7A3; upregulated: RhoJ, Zeb2, and Adam12) upon differentiation. Functional validation analysis of LCK in self-renewal of hESCs using LCK inhibitor or gene silencing with siLCK resulted in a loss of undifferentiated characteristics (morphological change, reduced alkaline phosphatase activity, and pluripotency gene expression), demonstrating a potential functional role of LCK in self-renewal of hESCs. We have designated hESC markers to interactive networks in the genome, identifying possible interacting partners and showing how new markers relate to each other. Furthermore, comparison of these datasets with available datasets from induced pluripotent stem cells (iPSCs) revealed that the level of these newly identified markers was correlated to the establishment of iPSCs, which may imply a potential role of these markers in gaining of cellular potency.

  7. Transcriptome Analysis of the Portunus trituberculatus: De Novo Assembly, Growth-Related Gene Identification and Marker Discovery

    PubMed Central

    Lv, Jianjian; Liu, Ping; Gao, Baoquan; Wang, Yu; Wang, Zheng; Chen, Ping; Li, Jian

    2014-01-01

    Background: The swimming crab, Portunus trituberculatus, is an important farmed species in China and has attracted extensive study, which increasingly requires genomic background knowledge. To date, its whole-genome sequence is unavailable and transcriptomic information is also scarce for this species. In the present study, we performed de novo transcriptome sequencing to produce a comprehensive transcript dataset for major tissues of Portunus trituberculatus using Illumina paired-end sequencing technology. Results: Total RNA was isolated from eyestalk, gill, heart, hepatopancreas and muscle. Equal quantities of RNA from each tissue were pooled to construct a cDNA library. Using the Illumina paired-end sequencing technology, we generated a total of 120,137 transcripts with an average length of 1037 bp. Further assembly analysis showed that all contigs contributed to 87,100 unigenes; of these, 16,029 unigenes (18.40% of the total) could be matched in the GenBank non-redundant database. Potential genes and their functions were predicted by GO, KEGG pathway mapping and COG analysis. Based on our sequence analysis and published literature, many putative genes with fundamental roles in growth and muscle development, including actin, myosin, tropomyosin, troponin and other potentially important candidate genes, were identified for the first time in this species. Furthermore, 22,673 SSRs and 66,191 high-confidence SNPs were identified in this EST dataset. Conclusion: The transcriptome provides invaluable new data for a functional genomics resource and future biological research in Portunus trituberculatus. The data will also instruct future functional studies to manipulate or select for genes influencing growth that should find practical applications in aquaculture breeding programs. The molecular markers identified in this study will provide a material basis for future genetic linkage and quantitative trait loci analyses, and will be essential for accelerating

  8. Annotation of genes involved in glycerolipid biosynthesis in Chlamydomonas reinhardtii: discovery of the betaine lipid synthase BTA1Cr.

    PubMed

    Riekhof, Wayne R; Sears, Barbara B; Benning, Christoph

    2005-02-01

    Lipid metabolism in flowering plants has been intensely studied, and knowledge regarding the identities of genes encoding components of the major fatty acid and membrane lipid biosynthetic pathways is very extensive. We now present an in silico analysis of fatty acid and glycerolipid metabolism in an algal model, enabled by the recent availability of expressed sequence tag and genomic sequences of Chlamydomonas reinhardtii. Genes encoding proteins involved in membrane biogenesis were predicted on the basis of similarity to proteins with confirmed functions and were organized so as to reconstruct the major pathways of glycerolipid synthesis in Chlamydomonas. This analysis accounts for the majority of genes predicted to encode enzymes involved in anabolic reactions of membrane lipid biosynthesis and compares and contrasts these pathways in Chlamydomonas and flowering plants. As an important result of the bioinformatics analysis, we identified and isolated the C. reinhardtii BTA1 (BTA1Cr) gene and analyzed the bifunctional protein that it encodes; we predicted this protein to be sufficient for the synthesis of the betaine lipid diacylglyceryl-N,N,N-trimethylhomoserine (DGTS), a major membrane component in Chlamydomonas. Heterologous expression of BTA1Cr led to DGTS accumulation in Escherichia coli, which normally lacks this lipid, and allowed in vitro analysis of the enzymatic properties of BTA1Cr. In contrast, in the bacterium Rhodobacter sphaeroides, two separate proteins, BtaARs and BtaBRs, are required for the biosynthesis of DGTS. Site-directed mutagenesis of the active sites of the two domains of BTA1Cr allowed us to study their activities separately, demonstrating directly their functional homology to the bacterial orthologs BtaARs and BtaBRs.

  9. Discovery of genes implicated in whirling disease infection and resistance in rainbow trout using genome-wide expression profiling

    PubMed Central

    Baerwald, Melinda R; Welsh, Amy B; Hedrick, Ronald P; May, Bernie

    2008-01-01

    Background: Whirling disease, caused by the pathogen Myxobolus cerebralis, afflicts several salmonid species. Rainbow trout are particularly susceptible and may suffer high mortality rates. The disease is persistent and spreading in hatcheries and natural waters of several countries, including the U.S.A., and the economic losses attributed to whirling disease are substantial. In this study, genome-wide expression profiling using cDNA microarrays was conducted for resistant Hofer and susceptible Trout Lodge rainbow trout strains following pathogen exposure with the primary objective of identifying specific genes implicated in whirling disease resistance. Results: Several genes were significantly up-regulated in skin following pathogen exposure for both the resistant and susceptible rainbow trout strains. For both strains, response to infection appears to be linked with the interferon system. Expression profiles for three genes identified with microarrays were confirmed with qRT-PCR. Ubiquitin-like protein 1 was up-regulated over 100-fold and interferon regulating factor 1 was up-regulated over 15-fold following pathogen exposure for both strains. Expression of metallothionein B, which has known roles in inflammation and immune response, was up-regulated over 5-fold in the resistant Hofer strain but was unchanged in the susceptible Trout Lodge strain following pathogen exposure. Conclusion: The present study has provided an initial view into the genetic basis underlying immune response and resistance of rainbow trout to the whirling disease parasite. The identified genes have allowed us to gain insight into the molecular mechanisms implicated in salmonid immune response and resistance to whirling disease infection. PMID:18218127

  10. De Novo Transcriptomic Analysis of an Oleaginous Microalga: Pathway Description and Gene Discovery for Production of Next-Generation Biofuels

    PubMed Central

    Wan, LingLin; Han, Juan; Sang, Min; Li, AiFen; Wu, Hong; Yin, ShunJi; Zhang, ChengWu

    2012-01-01

    Background: Eustigmatos cf. polyphem is a yellow-green unicellular soil microalga belonging to the eustigmatophytes. Its high biomass and considerable production of triacylglycerols (TAGs) for biofuels make it an oleaginous microalga. The paucity of microalgal genome sequences, however, limits development of gene-based biofuel feedstock optimization studies. Here we describe the sequencing and de novo transcriptome assembly for a non-model microalgal species, E. cf. polyphem, and identify pathways and genes of importance related to biofuel production. Results: We performed the de novo assembly of the E. cf. polyphem transcriptome using Illumina paired-end sequencing technology. In a single run, we produced 29,199,432 sequencing reads corresponding to 2.33 Gb total nucleotides. These reads were assembled into 75,632 unigenes with a mean size of 503 bp and an N50 of 663 bp, ranging from 100 bp to >3,000 bp. Assembled unigenes were subjected to BLAST similarity searches and annotated with Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology identifiers. These analyses identified the majority of carbohydrate, fatty acid, TAG and carotenoid biosynthesis and catabolism pathways in E. cf. polyphem. Conclusions: Our data allow the construction of metabolic pathways involved in the biosynthesis and catabolism of carbohydrates, fatty acids, TAGs and carotenoids in E. cf. polyphem and provide a foundation for the molecular genetics and functional genomics required to direct metabolic engineering efforts that seek to enhance the quantity and character of microalgae-based biofuel feedstock. PMID:22536352

  11. Discovery of bacterial polyhydroxyalkanoate synthase (PhaC)-encoding genes from seasonal Baltic Sea ice and cold estuarine waters.

    PubMed

    Pärnänen, Katariina; Karkman, Antti; Virta, Marko; Eronen-Rasimus, Eeva; Kaartokallio, Hermanni

    2015-01-01

    Polyhydroxyalkanoates (PHAs) are macromolecules produced by bacteria as a means of storing carbon and energy in intracellular granules. PHAs have physical properties similar to those of plastics and have become of interest to industry as materials for environmentally friendly bioplastic production. There is an ongoing search for new PHA-producing bacterial strains and PHA-synthesizing enzymes that tolerate extreme conditions, to find ways of producing PHAs at cold temperatures and high solute concentrations. Moreover, the study of PHA producers in the sea-ice biome can aid in understanding the microbial ecology of carbon cycling in ice-associated ecosystems. In this study, PHA producers and PHA synthase genes were examined under the extreme environmental conditions of sea ice and cold seawater to find evidence of PHA production in an environment requiring adaptation to high salinity and cold temperatures. Sea ice and cold estuarine water samples were collected from the northern Baltic Sea and evidence of PHA production was gathered, using microscopy with Nile Blue A staining of PHA granules and PCR assays detecting PHA-synthesis genes. The PHA granules and PHA synthases were found at all sampling locations, in both sea ice and water, and throughout the sampling period spanning over 10 years. Our study shows, for the first time, that PHA synthesis occurs in Baltic Sea cold-adapted bacteria in their natural environment, which makes the Baltic Sea and its cold environments an interesting choice in the quest for PHA-synthesizing bacteria and synthesis genes.

  12. Tissue-specific laser microdissection of the Brassica napus funiculus improves gene discovery and spatial identification of biological processes

    PubMed Central

    Chan, Ainsley C.; Khan, Deirdre; Girard, Ian J.; Becker, Michael G.; Millar, Jenna L.; Sytnik, David; Belmonte, Mark F.

    2016-01-01

    The three primary tissue systems of the funiculus each undergo unique developmental programs to support the growth and development of the filial seed. To understand the underlying transcriptional mechanisms that orchestrate development of the funiculus at the globular embryonic stage of seed development, we used laser microdissection coupled with RNA-sequencing to produce a high-resolution dataset of the mRNAs present in the epidermis, cortex, and vasculature of the Brassica napus (canola) funiculus. Using this technology, we identified 7761 additional genes in these tissues compared with the whole funiculus organ alone. Differential expression and enrichment analyses were used to identify several biological processes associated with each tissue system. Our data show that cell wall modification and lipid metabolism are prominent in the epidermis, cell growth and modification occur in the cortex, and vascular tissue proliferation and differentiation occur in the central vascular strand. We provide further evidence that each of the three tissue systems of the globular stage funiculus is involved in specific biological processes that all co-ordinate to support seed development. The identification of genes and gene regulators responsible for tissue-specific developmental processes of the canola funiculus now serves as a valuable resource for seed improvement research. PMID:27194740

  13. Discovery of genetic susceptibility factors for human birth defects: an opportunity for a National Agenda.

    PubMed

    Olshan, Andrew F; Hobbs, Charlotte A; Shaw, Gary M

    2011-08-01

    A recent workshop highlighted the current challenges and new opportunities for studying the role of genetic factors in the etiology of human birth defects. The workshop provided a series of recommendations pertaining to the use of animal models, key elements of population-based designs, the need for national collaborative projects, biorepositories, and consortia, investigation of new types of structural genetic variants, examination of gene-exposure interactions, and a strategy for gene variant discovery. A key reason to hold the recent workshop and contribute this concise communication to the literature is to draw attention to and initiate action toward advancing discoveries about the genetic etiologies of birth defects.

  14. Candidate Gene Discovery Procedure after Follow-Up Confirmatory Analyses of Candidate Regions of Interests for Alzheimer’s Disease in the NIMH Sibling Dataset

    PubMed Central

    Baye, Tesfaye M.; Perry, Rodney T.; Wiener, Howard W.; Chen, Zuomin; Harrell, Lindy E.; Go, Rodney C. P.

    2008-01-01

    The objective of this research was to develop a procedure to identify candidate genes under linkage peaks confirmed in a follow-up of candidate regions of interest (CRIs) identified in our original genome scan in the NIMH Alzheimer’s disease (AD) Initiative families (Blacker et al. [1]). Six CRIs were identified that met the threshold of a multipoint lod score (MLS) of ≥ 2.0 in the original scan. The most significant peak (MLS = 7.7) was at 19q13, which was attributed to APOE. The remaining CRIs with ‘suggestive’ evidence for linkage were identified at 9q22, 6q27, 14q22, 11q25, and 3p26. We have followed up and narrowed the 9q22 CRI signal using simple tandem repeat (STR) markers (Perry et al. [2]). In this confirmatory project, we have followed up the 6q27, 14q22, 11q25, and 3p26 CRIs with a total of 24 additional flanking STRs, reducing the mean interval marker distance (MID) in each CRI and substantially increasing the information content (IC). The linkage signals at 6q27, 14q22 and 11q25 remain ‘suggestive’, indicating that these CRIs are promising and worthy of detailed fine mapping and assessment of candidate genes associated with AD. We have developed a bioinformatics approach for identifying candidate genes in these confirmed regions based on the Gene Ontology terms that are annotated and enriched among the systematic meta-analyzed genes, confirmed by at least three case-control samples, and cataloged in the “AlzGene database” as potential Alzheimer disease susceptibility genes (http://www.alzgene.org). PMID:18688078

  15. Gene Discovery for Enzymes Involved in Limonene Modification or Utilization by the Mountain Pine Beetle-Associated Pathogen Grosmannia clavigera

    PubMed Central

    Wang, Ye; Lim, Lynette; Madilao, Lina; Lah, Ljerka; Bohlmann, Joerg

    2014-01-01

    To successfully colonize and eventually kill pine trees, Grosmannia clavigera (Gs cryptic species), the main fungal pathogen associated with the mountain pine beetle (Dendroctonus ponderosae), has developed multiple mechanisms to overcome host tree chemical defenses, of which terpenoids are a major component. In addition to a monoterpene efflux system mediated by a recently discovered ABC transporter, Gs has genes that are highly induced by monoterpenes and that encode enzymes that modify or utilize monoterpenes [especially (+)-limonene]. We showed that pine-inhabiting Ophiostomale fungi are tolerant to monoterpenes, but only a few, including Gs, are known to utilize monoterpenes as a carbon source. Gas chromatography-mass spectrometry (GC-MS) revealed that Gs can modify (+)-limonene through various oxygenation pathways, producing carvone, p-mentha-2,8-dienol, perillyl alcohol, and isopiperitenol. It can also degrade (+)-limonene through the C-1-oxygenated pathway, producing limonene-1,2-diol as the most abundant intermediate. Transcriptome sequencing (RNA-seq) data indicated that Gs may utilize limonene 1,2-diol through beta-oxidation and then valine and tricarboxylic acid (TCA) metabolic pathways. The data also suggested that at least two gene clusters, located in genome contigs 108 and 161, were highly induced by monoterpenes and may be involved in monoterpene degradation processes. Further, gene knockouts indicated that limonene degradation required two distinct Baeyer-Villiger monooxygenases (BVMOs), an epoxide hydrolase and an enoyl coenzyme A (enoyl-CoA) hydratase. Our work provides information on enzyme-mediated limonene utilization or modification and a more comprehensive understanding of the interaction between an economically important fungal pathogen and its host's defense chemicals. PMID:24837377

  16. Transcriptome Analysis of the Oriental River Prawn, Macrobrachium nipponense Using 454 Pyrosequencing for Discovery of Genes and Markers

    PubMed Central

    Ma, Keyi; Qiu, Gaofeng; Feng, Jianbin; Li, Jiale

    2012-01-01

    Background: The oriental river prawn, Macrobrachium nipponense, is an economically and nutritionally important species of the Palaemonidae family of decapod crustaceans. As a non-model organism, its whole-genome sequence is not yet available. Transcriptomic information is also scarce for this species. In this study, we performed de novo transcriptome sequencing to produce the first comprehensive expressed sequence tag (EST) dataset for M. nipponense using high-throughput sequencing technologies. Methodology and Principal Findings: Total RNA was isolated from eyestalk, gill, heart, ovary, testis, hepatopancreas, muscle, and embryos at the cleavage, gastrula, nauplius and zoea stages. Equal quantities of RNA from each tissue and stage were pooled to construct a cDNA library. Using 454 pyrosequencing technology, we generated a total of 984,204 high-quality reads (338.59 Mb) with an average length of 344 bp. Clustering and assembly of these reads produced a non-redundant set of 81,411 unique sequences, comprising 42,551 contigs and 38,860 singletons. GO analysis assigned the unique sequences to the molecular function (30,425), cellular component (44,112) and biological process (67,679) categories. Potential genes and their functions were predicted by KEGG pathway mapping and COG analysis. Based on our sequence analysis and published literature, many putative genes involved in sex determination, including DMRT1, FTZ-F1, FOXL2, FEM1 and other potentially important candidate genes, were identified for the first time in this prawn. Furthermore, 6,689 SSRs and 18,107 high-confidence SNPs were identified in this EST dataset. Conclusions: The transcriptome provides invaluable new data as a functional genomics resource for future biological research in M. nipponense. The molecular markers identified in this study will provide a material basis for future genetic linkage and quantitative trait loci analyses, and will be essential for accelerating

  17. De novo transcriptome assembly of Ipomoea nil using Illumina sequencing for gene discovery and SSR marker identification.

    PubMed

    Wei, Changhe; Tao, Xiang; Li, Ming; He, Bin; Yan, Lang; Tan, Xuemei; Zhang, Yizheng

    2015-10-01

    Ipomoea nil is widely used as an ornamental plant because of its wide range of flower colors, but limited transcriptome and genomic data hinder research on it. Using the Illumina platform, transcriptome profiling of I. nil was performed through high-throughput sequencing, which has proven to be a rapid and cost-effective means of characterizing gene content. Our goal is to use the resulting information to facilitate research on flowering and flower color formation in I. nil. In total, 268 million unique Illumina RNA-Seq reads were produced and used in the transcriptome assembly. These reads were assembled into 220,117 contigs, of which 137,307 contigs were annotated using the GO and KEGG databases. Based on the results of functional annotation, a total of 89,781 contigs were assigned 455,335 GO term annotations. Meanwhile, 17,418 contigs were identified with pathway annotation and were functionally assigned to 144 KEGG pathways. Our transcriptome revealed at least 55 contigs as probable flowering-related genes in I. nil, and we also identified 25 contigs that encode key enzymes in the phenylpropanoid biosynthesis pathway. Based on the analysis of gene expression profiles, in the phenylpropanoid biosynthesis pathway of I. nil, the repression of lignin biosynthesis might redirect the metabolic flux into anthocyanin biosynthesis. This may be the most likely reason that I. nil has a high anthocyanin content, especially in its flowers. Additionally, 15,537 simple sequence repeats (SSRs) were detected using the MISA software, and these SSRs should benefit future breeding work. Moreover, the information uncovered in this study will also serve as a valuable resource for understanding the flowering and flower color formation mechanisms in I. nil.

  18. Insights into shell deposition in the Antarctic bivalve Laternula elliptica: gene discovery in the mantle transcriptome using 454 pyrosequencing

    PubMed Central

    2010-01-01

    Background: The Antarctic clam, Laternula elliptica, is an infaunal stenothermal bivalve mollusc with a circumpolar distribution. It plays a significant role in bentho-pelagic coupling and hence has been proposed as a sentinel species for climate change monitoring. Previous studies have shown that this mollusc displays a high level of plasticity with regard to shell deposition and damage repair against a background of genetic homogeneity. The Southern Ocean has amongst the lowest present-day CaCO3 saturation rates of any ocean region, and is predicted to be among the first to become undersaturated under current ocean acidification scenarios. Hence, this species presents as an ideal candidate for studies into the processes of calcium regulation and shell deposition in our changing ocean environments. Results: 454 sequencing of L. elliptica mantle tissue generated 18,290 contigs with an average size of 535 bp (ranging from 142 bp to 5.591 kb). BLAST sequence similarity searching assigned putative function to 17% of the data set, with a significant proportion of these transcripts being involved in binding and potentially of a secretory nature, as defined by GO molecular function and biological process classifications. These results indicated that the mantle is a transcriptionally active tissue which is actively proliferating. All transcripts were screened against an in-house database of genes shown to be involved in extracellular matrix formation and calcium homeostasis in metazoans. Putative identifications were made for a number of classical shell deposition genes, such as tyrosinase, carbonic anhydrase and metalloprotease 1, along with novel members of the family 2 G-Protein Coupled Receptors (GPCRs). A membrane transport protein (SEC61) was also characterised and this demonstrated the utility of the clam sequence data as a resource for examining cold adapted amino acid substitutions. The sequence data contained 46,235 microsatellites and 13,084 Single Nucleotide

  19. Macro optical projection tomography for large scale 3D imaging of plant structures and gene activity.

    PubMed

    Lee, Karen J I; Calder, Grant M; Hindle, Christopher R; Newman, Jacob L; Robinson, Simon N; Avondo, Jerome J H Y; Coen, Enrico S

    2016-12-26

    Optical projection tomography (OPT) is a well-established method for visualising gene activity in plants and animals. However, a limitation of conventional OPT is that the specimen upper size limit precludes its application to larger structures. To address this problem we constructed a macro version called Macro OPT (M-OPT). We apply M-OPT to 3D live imaging of gene activity in growing whole plants and to visualise structural morphology in large optically cleared plant and insect specimens up to 60 mm tall and 45 mm deep. We also show how M-OPT can be used to image gene expression domains in 3D within fixed tissue and to visualise gene activity in 3D in clones of growing young whole Arabidopsis plants. A further application of M-OPT is to visualise plant-insect interactions. Thus M-OPT provides an effective 3D imaging platform that allows the study of gene activity, internal plant structures and plant-insect interactions at a macroscopic scale.

  20. Lignification in sugarcane: biochemical characterization, gene discovery, and expression analysis in two genotypes contrasting for lignin content.

    PubMed

    Bottcher, Alexandra; Cesarino, Igor; Santos, Adriana Brombini dos; Vicentini, Renato; Mayer, Juliana Lischka Sampaio; Vanholme, Ruben; Morreel, Kris; Goeminne, Geert; Moura, Jullyana Cristina Magalhães Silva; Nobile, Paula Macedo; Carmello-Guerreiro, Sandra Maria; Anjos, Ivan Antonio dos; Creste, Silvana; Boerjan, Wout; Landell, Marcos Guimarães de Andrade; Mazzafera, Paulo

    2013-12-01

    Sugarcane (Saccharum spp.) is currently one of the most efficient crops in the production of first-generation biofuels. However, the bagasse represents an additional abundant lignocellulosic resource that has the potential to increase the ethanol production per plant. To achieve a more efficient conversion of bagasse into ethanol, a better understanding of the main factors affecting biomass recalcitrance is needed. Because several studies have shown a negative effect of lignin on saccharification yield, the characterization of lignin biosynthesis, structure, and deposition in sugarcane is an important goal. Here, we present, to our knowledge, the first systematic study of lignin deposition during sugarcane stem development, using histological, biochemical, and transcriptional data derived from two sugarcane genotypes with contrasting lignin contents. Lignin amount and composition were determined in rind (outer) and pith (inner) tissues throughout stem development. In addition, the phenolic metabolome was analyzed by ultra-high-performance liquid chromatography-mass spectrometry, which allowed the identification of 35 compounds related to the phenylpropanoid pathway and monolignol biosynthesis. Furthermore, the Sugarcane EST Database was extensively surveyed to identify lignin biosynthetic gene homologs, and the expression of all identified genes during stem development was determined by quantitative reverse transcription-polymerase chain reaction. Our data provide, to our knowledge, the first in-depth characterization of lignin biosynthesis in sugarcane and form the baseline for the rational metabolic engineering of sugarcane feedstock for bioenergy purposes.

  1. Low-coverage, whole-genome sequencing of Artocarpus camansi (Moraceae) for phylogenetic marker development and gene discovery

    PubMed Central

    Gardner, Elliot M.; Johnson, Matthew G.; Ragone, Diane; Wickett, Norman J.; Zerega, Nyree J. C.

    2016-01-01

    Premise of the study: We used moderately low-coverage (17×) whole-genome sequencing of Artocarpus camansi (Moraceae) to develop genomic resources for Artocarpus and Moraceae. Methods and Results: A de novo assembly of Illumina short reads (251,378,536 pairs, 2 × 100 bp) accounted for 93% of the predicted genome size. Predicted coding regions were used in a three-way orthology search with published genomes of Morus notabilis and Cannabis sativa. Phylogenetic markers for Moraceae were developed from 333 inferred single-copy exons. Ninety-eight putative MADS-box genes were identified. Analysis of all predicted coding regions resulted in preliminary annotation of 49,089 genes. An analysis of synonymous substitutions for pairs of orthologs (Ks analysis) in M. notabilis and A. camansi strongly suggested a lineage-specific whole-genome duplication in Artocarpus. Conclusions: This study substantially increases the genomic resources available for Artocarpus and Moraceae and demonstrates the value of low-coverage de novo assemblies for nonmodel organisms with moderately large genomes. PMID:27437173

  2. Array-Based Gene Discovery with Three Unrelated Subjects Shows SCARB2/LIMP-2 Deficiency Causes Myoclonus Epilepsy and Glomerulosclerosis

    PubMed Central

    Berkovic, Samuel F.; Dibbens, Leanne M.; Oshlack, Alicia; Silver, Jeremy D.; Katerelos, Marina; Vears, Danya F.; Lüllmann-Rauch, Renate; Blanz, Judith; Zhang, Ke Wei; Stankovich, Jim; Kalnins, Renate M.; Dowling, John P.; Andermann, Eva; Andermann, Frederick; Faldini, Enrico; D'Hooge, Rudi; Vadlamudi, Lata; Macdonell, Richard A.; Hodgson, Bree L.; Bayly, Marta A.; Savige, Judy; Mulley, John C.; Smyth, Gordon K.; Power, David A.; Saftig, Paul; Bahlo, Melanie

    2008-01-01

    Action myoclonus-renal failure syndrome (AMRF) is an autosomal-recessive disorder with the remarkable combination of focal glomerulosclerosis, frequently with glomerular collapse, and progressive myoclonus epilepsy associated with storage material in the brain. Here, we employed a novel combination of molecular strategies to find the responsible gene and show its effects in an animal model. Utilizing only three unrelated affected individuals and their relatives, we used homozygosity mapping with single-nucleotide polymorphism chips to localize AMRF. We then used microarray-expression analysis to prioritize candidates prior to sequencing. The disorder was mapped to 4q13-21, and microarray-expression analysis identified SCARB2/Limp2, which encodes a lysosomal-membrane protein, as the likely candidate. Mutations in SCARB2/Limp2 were found in all three families used for mapping and subsequently confirmed in two other unrelated AMRF families. The mutations were associated with lack of SCARB2 protein. Reanalysis of an existing Limp2 knockout mouse showed intracellular inclusions in cerebral and cerebellar cortex, and the kidneys showed subtle glomerular changes. This study highlights that recessive genes can be identified with a very small number of subjects. The ancestral lysosomal-membrane protein SCARB2/LIMP-2 is responsible for AMRF. The heterogeneous pathology in the kidney and brain suggests that SCARB2/Limp2 has pleiotropic effects that may be relevant to understanding the pathogenesis of other forms of glomerulosclerosis or collapse and myoclonic epilepsies. PMID:18308289

  3. High-throughput discovery of mutations in tef semi-dwarfing genes by next-generation sequencing analysis.

    PubMed

    Zhu, Qihui; Smith, Shavannor M; Ayele, Mulu; Yang, Lixing; Jogi, Ansuya; Chaluvadi, Srinivasa R; Bennetzen, Jeffrey L

    2012-11-01

    Tef (Eragrostis tef) is a major cereal crop in Ethiopia. Lodging is the primary constraint to increasing productivity in this allotetraploid species, accounting for losses of ∼15-45% in yield each year. As a first step toward identifying semi-dwarf varieties that might have improved lodging resistance, an ∼6× fosmid library was constructed and used to identify both homeologues of the dw3 semi-dwarfing gene of Sorghum bicolor. An EMS mutagenized population, consisting of ∼21,210 tef plants, was planted and leaf materials were collected into 23 superpools. Two dwarfing candidate genes, homeologues of dw3 of sorghum and rht1 of wheat, were sequenced directly from each superpool with 454 technology, and 120 candidate mutations were identified. Out of 10 candidates tested, six independent mutations were validated by Sanger sequencing, including two predicted detrimental mutations in both dw3 homeologues with a potential to improve lodging resistance in tef through further breeding. This study demonstrates that high-throughput sequencing can identify potentially valuable mutations in under-studied plant species like tef and has provided mutant lines that can now be combined and tested in breeding programs for improved lodging resistance.

  4. The Hexosamine Template – A Platform for Modulating Gene Expression and for Sugar-based Drug Discovery

    PubMed Central

    Elmouelhi, Noha; Aich, Udayanath; Paruchuri, Venkata D.P.; Meledeo, M. Adam; Campbell, Christopher T.; Wang, Jean J.; Srinivas, Raja; Khanna, Hargun S.; Yarema, Kevin J.

    2009-01-01

    This study investigates the breadth of cellular responses engendered by short chain fatty acid (SCFA)-hexosamine hybrid molecules, a class of compounds long used in ‘metabolic glycoengineering’ that are now emerging as drug candidates. First, a ‘mix-and-match’ strategy showed that different SCFA (n-butyrate and acetate) appended to the same core sugar altered biological activity, complementing previous results [Campbell et al., (2008) J. Med. Chem. 51, 8135–8147] where a single type of SCFA elicited distinct responses. Microarray profiling then compared transcriptional responses engendered by regioisomerically-modified ManNAc, GlcNAc, and GalNAc analogs in MDA-MB-231 cells. These data – which were validated by qRT-PCR or Western analysis for ID1, TP53, HPSE, NQO1, EGR1 and VEGFA – showed a two-pronged response where a core set of genes was coordinately regulated by all analogs while each analog simultaneously uniquely regulated a larger number of genes. Finally, AutoDock modeling supported a mechanism where the analogs directly interact with elements of the NF-κB pathway. Together, these results establish the SCFA-hexosamine template as a versatile platform for modulating biological activity and developing new therapeutics. PMID:19326913

  5. Lignification in Sugarcane: Biochemical Characterization, Gene Discovery, and Expression Analysis in Two Genotypes Contrasting for Lignin Content

    PubMed Central

    Bottcher, Alexandra; Cesarino, Igor; Brombini dos Santos, Adriana; Vicentini, Renato; Mayer, Juliana Lischka Sampaio; Vanholme, Ruben; Morreel, Kris; Goeminne, Geert; Moura, Jullyana Cristina Magalhães Silva; Nobile, Paula Macedo; Carmello-Guerreiro, Sandra Maria; Antonio dos Anjos, Ivan; Creste, Silvana; Boerjan, Wout; Landell, Marcos Guimarães de Andrade; Mazzafera, Paulo

    2013-01-01

    Sugarcane (Saccharum spp.) is currently one of the most efficient crops in the production of first-generation biofuels. However, the bagasse represents an additional abundant lignocellulosic resource that has the potential to increase the ethanol production per plant. To achieve a more efficient conversion of bagasse into ethanol, a better understanding of the main factors affecting biomass recalcitrance is needed. Because several studies have shown a negative effect of lignin on saccharification yield, the characterization of lignin biosynthesis, structure, and deposition in sugarcane is an important goal. Here, we present, to our knowledge, the first systematic study of lignin deposition during sugarcane stem development, using histological, biochemical, and transcriptional data derived from two sugarcane genotypes with contrasting lignin contents. Lignin amount and composition were determined in rind (outer) and pith (inner) tissues throughout stem development. In addition, the phenolic metabolome was analyzed by ultra-high-performance liquid chromatography-mass spectrometry, which allowed the identification of 35 compounds related to the phenylpropanoid pathway and monolignol biosynthesis. Furthermore, the Sugarcane EST Database was extensively surveyed to identify lignin biosynthetic gene homologs, and the expression of all identified genes during stem development was determined by quantitative reverse transcription-polymerase chain reaction. Our data provide, to our knowledge, the first in-depth characterization of lignin biosynthesis in sugarcane and form the baseline for the rational metabolic engineering of sugarcane feedstock for bioenergy purposes. PMID:24144790

  6. Chronicles in drug discovery.

    PubMed

    Davies, Shelley L; Moral, Maria Angels; Bozzo, Jordi

    2007-03-01

    Chronicles in Drug Discovery features special interest reports on advances in drug discovery. This month we highlight agents that target and deplete immunosuppressive regulatory T cells, which are produced by tumor cells to hinder innate immunity against, or chemotherapies targeting, tumor-associated antigens. Antiviral treatments for respiratory syncytial virus, a severe and prevalent infection in children, are limited due to their side effect profiles and cost. New strategies currently under clinical development include monoclonal antibodies, siRNAs, vaccines and oral small molecule inhibitors. Recent therapeutic lines for Huntington's disease include gene therapies that target the mutated human huntingtin gene or deliver neuroprotective growth factors and cellular transplantation in apoptotic regions of the brain. Finally, we highlight the antiinflammatory and antinociceptive properties of new compounds targeting the somatostatin receptor subtype sst4, which warrant further study for their potential application as clinical analgesics.

  7. Diversity of human tRNA genes from the 1000-genomes project.

    PubMed

    Parisien, Marc; Wang, Xiaoyun; Pan, Tao

    2013-12-01

    The sequence diversity of individual human genomes has been extensively analyzed for variations and phenotypic implications for mRNA, miRNA, and long non-coding RNA genes. Transfer RNA (tRNA) also exhibits large sequence diversity in the human genome, but tRNA gene sequence variation and potential functional implications in individual human genomes have not been investigated. Here we capitalize on the sequencing data from the 1000-genomes project to examine the diversity of tRNA genes in the human population. Previous analysis of the reference human genome indicated an unexpectedly large number of diverse tRNA genes beyond the necessity of translation, suggesting that some tRNA transcripts may perform non-canonical functions. We found 24 new tRNA sequences in >1% and 76 new tRNA sequences in >0.2% of all individuals, indicating that tRNA genes are also subject to evolutionary changes in the human population. Unexpectedly, two abundant new tRNA genes contain base-pair mismatches in the anticodon stem. We experimentally determined that these two new tRNAs have altered structures in vitro; however, one new tRNA is not aminoacylated but is extremely stable in HeLa cells, suggesting that this new tRNA can be used for non-canonical functions. Our results show that at the scale of the human population, tRNA genes are more diverse than conventionally understood, and some new tRNAs may perform non-canonical, extra-translational functions that may be linked to human health and disease.

  8. Discovery of seven novel Mammalian and avian coronaviruses in the genus deltacoronavirus supports bat coronaviruses as the gene source of alphacoronavirus and betacoronavirus and avian coronaviruses as the gene source of gammacoronavirus and deltacoronavirus.

    PubMed

    Woo, Patrick C Y; Lau, Susanna K P; Lam, Carol S F; Lau, Candy C Y; Tsang, Alan K L; Lau, John H N; Bai, Ru; Teng, Jade L L; Tsang, Chris C C; Wang, Ming; Zheng, Bo-Jian; Chan, Kwok-Hung; Yuen, Kwok-Yung

    2012-04-01

    Recently, we reported the discovery of three novel coronaviruses, bulbul coronavirus HKU11, thrush coronavirus HKU12, and munia coronavirus HKU13, which were identified as representatives of a novel genus, Deltacoronavirus, in the subfamily Coronavirinae. In this territory-wide molecular epidemiology study involving 3,137 mammals and 3,298 birds, we discovered seven additional novel deltacoronaviruses in pigs and birds, which we named porcine coronavirus HKU15, white-eye coronavirus HKU16, sparrow coronavirus HKU17, magpie robin coronavirus HKU18, night heron coronavirus HKU19, wigeon coronavirus HKU20, and common moorhen coronavirus HKU21. Complete genome sequencing and comparative genome analysis showed that the avian and mammalian deltacoronaviruses have similar genome characteristics and structures. They all have relatively small genomes (25.421 to 26.674 kb), the smallest among all coronaviruses. They all have a single papain-like protease domain in the nsp3 gene; an accessory gene, NS6 open reading frame (ORF), located between the M and N genes; and a variable number of accessory genes (up to four) downstream of the N gene. Moreover, they all have the same putative transcription regulatory sequence of ACACCA. Molecular clock analysis showed that the most recent common ancestor of all coronaviruses was estimated at approximately 8100 BC, and those of Alphacoronavirus, Betacoronavirus, Gammacoronavirus, and Deltacoronavirus were at approximately 2400 BC, 3300 BC, 2800 BC, and 3000 BC, respectively. From our studies, it appears that bats and birds, the warm blooded flying vertebrates, are ideal hosts for the coronavirus gene source, bats for Alphacoronavirus and Betacoronavirus and birds for Gammacoronavirus and Deltacoronavirus, to fuel coronavirus evolution and dissemination.

  9. Space Discovery.

    ERIC Educational Resources Information Center

    Blackman, Joan

    1998-01-01

    Describes one teacher's experience taking Space Discovery courses that were sponsored by the United States Space Foundation (USSF). These courses examine the history of space science, theory of orbits and rocketry, the effects of living in outer space on humans, and space weather. (DDR)

  10. [New discoveries about the fragile X syndrome complicate genetic counseling. More symptoms than earlier known caused by the disease gene].

    PubMed

    Kristoffersson, Ulf; Wahlström, Jan; Lynöe, Niels

    In October 2004, the Swedish Medical Society's Delegation for Medical Ethics held a workshop on the new ethical implications for genetic counselling in families where a premutation or mutation in the FMR1 gene has been found. New research has revealed that premutation carrier women have an increased risk of premature ovarian failure, and, thus, their fertile sisters may be mutation carriers with an increased risk of having a child with the fragile X syndrome. After the age of 50, premutation carrier males have an increased risk of developing ataxia and cognitive dysfunction. Accordingly, their daughters have a high risk of having a child with the fragile X syndrome. The ethical aspects of these issues were discussed at the workshop with suggestions on the way forward.

  11. Cynomolgus monkey testicular cDNAs for discovery of novel human genes in the human genome sequence

    PubMed Central

    Osada, Naoki; Hida, Munetomo; Kusuda, Jun; Tanuma, Reiko; Hirata, Makoto; Suto, Yumiko; Hirai, Momoki; Terao, Keiji; Sugano, Sumio; Hashimoto, Katsuyuki

    2002-01-01

    Background In order to contribute to the establishment of a complete map of transcribed regions of the human genome, we constructed a testicular cDNA library for the cynomolgus monkey, and attempted to find novel transcripts for identification of their human homologues. Result The full-insert sequences of 512 cDNA clones were determined. Ultimately we found 302 non-redundant cDNAs carrying open reading frames of 300 bp-length or longer. Among them, 89 cDNAs were found not to be annotated previously in the Ensembl human database. After searching against the Ensembl mouse database, we also found 69 putative coding sequences have no homologous cDNAs in the annotated human and mouse genome sequences in Ensembl. We subsequently designed a DNA microarray including 396 non-redundant cDNAs (with and without open reading frames) to examine the expression of the full-sequenced genes. With the testicular probe and a mixture of probes of 10 other tissues, 316 of 332 effective spots showed intense hybridized signals and 75 cDNAs were shown to be expressed very highly in the cynomolgus monkey testis, but not ubiquitously. Conclusions In this report, we determined 302 full-insert sequences of cynomolgus monkey cDNAs with enough length of open reading frames to discover novel transcripts as human homologues. Among 302 cDNA sequences, human homologues of 89 cDNAs have not been predicted in the annotated human genome sequence in the Ensembl. Additionally, we identified 75 dominantly expressed genes in testis among the full-sequenced clones by using a DNA microarray. Our cDNA clones and analytical results will be valuable resources for future functional genomic studies. PMID:12498619

  12. Discovery of miRNAs and Their Corresponding miRNA Genes in Atlantic Cod (Gadus morhua): Use of Stable miRNAs as Reference Genes Reveals Subgroups of miRNAs That Are Highly Expressed in Particular Organs

    PubMed Central

    Andreassen, Rune; Rangnes, Fredrik; Sivertsen, Maria; Chiang, Michelle; Tran, Michelle; Worren, Merete Molton

    2016-01-01

    Background Atlantic cod (Gadus morhua) is among the economically most important species in the northern Atlantic Ocean and a model species for studying development of the immune system in vertebrates. MicroRNAs (miRNAs) are an abundant class of small RNA molecules that regulate fundamental biological processes at the post-transcriptional level. Detailed knowledge about a species' miRNA repertoire is necessary to study how the miRNA transcriptome modulates gene expression. We have therefore discovered and characterized mature miRNAs and their corresponding miRNA genes in Atlantic cod. We have also performed a validation study to identify suitable reference genes for RT-qPCR analysis of miRNA expression in Atlantic cod. Finally, we utilized the newly characterized miRNA repertoire and the dedicated RT-qPCR method to reveal miRNAs that are highly expressed in certain organs. Results The discovery analysis revealed 490 mature miRNAs (401 unique sequences) along with precursor sequences and genomic location of the miRNA genes. Twenty-six of these were novel miRNA genes. Validation studies ranked gmo-miR-17-1-5p or the two-gene combination gmo-miR25-3p and gmo-miR210-5p as most suitable qPCR reference genes. Analysis by RT-qPCR revealed 45 miRNAs with significantly higher expression in tissues from one or a few organs. Comparisons to other vertebrates indicate that some of these miRNAs may regulate processes like growth, lipid metabolism, immune response to microbial infections and scar damage repair. Three teleost-specific and three novel Atlantic cod miRNAs were among the differentially expressed miRNAs. Conclusions The number of known mature miRNAs was considerably increased by our identification of miRNAs and miRNA genes in Atlantic cod. This will benefit further functional studies of miRNA expression using deep sequencing methods. The validation study showed that stable miRNAs are suitable reference genes for RT-qPCR analysis of miRNA expression. Applying RT-qPCR we

  13. A late 17α-hydroxylase deficiency diagnosis that leads to the discovery of a new CYP17 gene mutation.

    PubMed

    Guenego, Agathe; Morel, Yves; Ionesco, Oana; Mallet, Delphine; Priou-Guesdon, Melanie

    2015-02-01

    17α-Hydroxylase deficiency is a rare form of congenital adrenal hyperplasia. It leads to a reduced production of cortisol and sex steroids and thus an increase in adrenocorticotrophic hormone and gonadotrophin levels. High adrenocorticotrophic hormone levels result in an accumulation of 17-deoxysteroids, such as deoxycorticosterone and corticosterone. Deoxycorticosterone and corticosterone have an important mineralocorticoid activity. We report the case of a 66-year-old woman who presented with hypertension and symptomatic hypokalaemia. Primary hyperaldosteronism was suspected and a right adrenal mass was removed. After surgery, the patient was referred to the endocrinology department for persistent hypokalaemia. She in fact presented some signs of hypogonadism (impuberism, primary amenorrhea, infertility). Cortisol and 17OH-progesterone serum levels were low. Deoxycorticosterone and corticosterone were markedly elevated. The hypothesis of 17α-hydroxylase deficiency was considered and confirmed by genetic exploration. A non-sense mutation c.938G>A (p.Trp313X) in exon 5 of the CYP17 gene was found which, to our knowledge, had not been reported previously. Moreover, the patient's karyotype revealed mosaic Turner syndrome. This case is particularly interesting because of the delay in diagnosis. The 17α-hydroxylase deficiency diagnosis is to be considered when hypertension is associated with hypokalaemia and hypogonadism, even in adult patients.

  14. Cys-loop ligand-gated ion channel gene discovery in the Locusta migratoria manilensis through the neuron transcriptome.

    PubMed

    Wang, Xin; Meng, Xiangkun; Liu, Chuanjun; Gao, Hongli; Zhang, Yixi; Liu, Zewen

    2015-05-01

    As an ideal model, Locusta migratoria manilensis (Meyen) has been widely used in the study of endocrinological and neurobiological processes. Here we created a large transcriptome of the locust neurons, which enriched ion channels whose potential for functional genetic experiments is currently limited. With high-throughput Illumina sequencing technology, we obtained more than 50 million raw reads, which were assembled into 61,056 unique sequences with an average size of 737 bp. Among the unigenes, a total of 24,884 sequences had significant similarities with proteins in the five public databases (NR, SwissProt, GO, COG and KEGG) with a cut-off E-value of 10^-5 using BLASTx. Moreover, the potential genes of the cys-loop ligand-gated ion channels (LGICs) were manually curated, including 39 putative nicotinic acetylcholine receptors (nAChRs), 6 putative γ-aminobutyric acid (GABA) gated anion channels, 21 putative glutamate-gated chloride channels (GluCls) and 1 histamine-gated chloride channel (HisCl). In addition, the full-length sequences of 11 nAChR subunits (9 alpha and 2 beta) were obtained by the RACE technique, which should be helpful for further studies on nAChR neurochemistry and pharmacology. To our knowledge, this is the first study to characterize the locust neuron transcriptome, which will provide a useful resource especially for future studies on the neuro-function and behavior of the locust.

  15. Exploring divergent antibiotic resistance genes in ancient metagenomes and discovery of a novel beta-lactamase family.

    PubMed

    Rascovan, Nicolás; Telke, Amar; Raoult, Didier; Rolain, Jean Marc; Desnues, Christelle

    2016-08-12

    Antibiotic resistance in pathogenic bacteria is a major problem for human health. We analyzed metagenomic datasets from ancient and remote samples from diverse environmental sources and observed the presence of all the eleven antibiotic resistance genes (ARG) groups evaluated. Since ancient samples are not subjected to modern effects of antibiotic misuse, they represent a clean model to explore the natural diversity of ARG in the environment. Most sequences showed high divergence compared with known ARG, representing a much larger universe than the currently known and characterized ARGs. We explored whether proteins within the "divergent resistome" may correspond to functional ARG by characterizing a beta-lactamase hit with very low similarity to any known sequence (<45% to best BLAST hit in NCBI). By starting from purely in-silico data, we revived a new family of class B beta-lactamases from ancient medieval samples, which exhibited a very high penicillinase activity. In this work, we explored ancient resistomes and added novel support to previous works showing that the universe of ARG is naturally vast and diverse in microbial communities. Our results bring a new perspective to the exploration of environmental ARG and indicate that this gigantic reservoir represents a natural endless source of emerging resistances.

  16. Generalized and specific neurocognitive deficits in psychotic disorders: utility for evaluating pharmacological treatment effects and as intermediate phenotypes for gene discovery.

    PubMed

    Reilly, James L; Sweeney, John A

    2014-05-01

    A growing body of research suggests that schizophrenia and bipolar disorder share overlapping clinical, neurobiological, and genetic features, raising important questions about the boundaries and distinctiveness of these 2 major psychiatric disorders. A generalized cognitive impairment has long been understood to be a core feature of schizophrenia. More recently, it has become apparent that cognitive impairment also occurs in bipolar disorder, particularly in those patients with a history of psychotic symptoms. Whether a generalized deficit exists across a spectrum of psychotic disorders is less clearly established. Additionally, in the context of a broad impairment, it remains a significant challenge to identify deficits in specific cognitive processes that may have distinct neurochemical or regional brain substrates and linkages to particular risk-associated genetic factors. In this article, we review the findings from neuropsychological studies across a spectrum that includes schizophrenia, schizoaffective and bipolar disorders, and conclude the available evidence strongly supports that a generalized deficit is present across psychotic disorders that differs in severity more so than form. We then consider the implications of generalized and specific deficits in psychosis for 2 areas of research--the evaluation of pharmacological treatments targeting cognitive deficits, and the investigation of cognitive intermediate phenotypes in family genetic studies. Examples from the literature that touch on the relevance of the generalized deficit in these contexts are provided, as well as consideration for the continued need to identify specific impairments that are separable from the generalized deficit in order to advance drug and gene discovery.

  17. Discovery of precursor and mature microRNAs and their putative gene targets using high-throughput sequencing in pineapple (Ananas comosus var. comosus).

    PubMed

    Yusuf, Noor Hydayaty Md; Ong, Wen Dee; Redwan, Raimi Mohamed; Latip, Mariam Abd; Kumar, S Vijay

    2015-10-15

    MicroRNAs (miRNAs) are a class of small, endogenous non-coding RNAs that negatively regulate gene expression, resulting in the silencing of target mRNA transcripts through mRNA cleavage or translational inhibition. MiRNAs play significant roles in various biological and physiological processes in plants. However, the miRNA-mediated gene regulatory network in pineapple, the model tropical non-climacteric fruit, remains largely unexplored. Here, we report a complete list of pineapple mature miRNAs obtained from high-throughput small RNA sequencing and precursor miRNAs (pre-miRNAs) obtained from ESTs. Two small RNA libraries were constructed from pineapple fruits and leaves, respectively, using Illumina's Solexa technology. Sequence similarity analysis using miRBase revealed 579,179 reads homologous to 153 miRNAs from 41 miRNA families. In addition, a pineapple fruit transcriptome library consisting of approximately 30,000 EST contigs constructed using Solexa sequencing was used for the discovery of pre-miRNAs. In all, four pre-miRNAs were identified (MIR156, MIR399, MIR444 and MIR2673). Furthermore, the same pineapple transcriptome was used to dissect the function of the miRNAs in pineapple by predicting their putative targets in conjunction with their regulatory networks. In total, 23 metabolic pathways were found to be regulated by miRNAs in pineapple. The use of high-throughput sequencing in pineapples to unveil the presence of miRNAs and their regulatory pathways provides insight into the repertoire of miRNA regulation used exclusively in this non-climacteric model plant.

  18. Developmental gene discovery in a hemimetabolous insect: de novo assembly and annotation of a transcriptome for the cricket Gryllus bimaculatus.

    PubMed

    Zeng, Victor; Ewen-Campen, Ben; Horch, Hadley W; Roth, Siegfried; Mito, Taro; Extavour, Cassandra G

    2013-01-01

    Most genomic resources available for insects represent the Holometabola, which are insects that undergo complete metamorphosis like beetles and flies. In contrast, the Hemimetabola (direct developing insects), representing the basal branches of the insect tree, have very few genomic resources. We have therefore created a large and publicly available transcriptome for the hemimetabolous insect Gryllus bimaculatus (cricket), a well-developed laboratory model organism whose potential for functional genetic experiments is currently limited by the absence of genomic resources. cDNA was prepared using mRNA obtained from adult ovaries containing all stages of oogenesis, and from embryo samples on each day of embryogenesis. Using 454 Titanium pyrosequencing, we sequenced over four million raw reads, and assembled them into 21,512 isotigs (predicted transcripts) and 120,805 singletons with an average coverage per base pair of 51.3. We annotated the transcriptome manually for over 400 conserved genes involved in embryonic patterning, gametogenesis, and signaling pathways. BLAST comparison of the transcriptome against the NCBI non-redundant protein database (nr) identified significant similarity to nr sequences for 55.5% of transcriptome sequences, and suggested that the transcriptome may contain 19,874 unique transcripts. For predicted transcripts without significant similarity to known sequences, we assessed their similarity to other orthopteran sequences, and determined that these transcripts contain recognizable protein domains, largely of unknown function. We created a searchable, web-based database to allow public access to all raw, assembled and annotated data. This database is to our knowledge the largest de novo assembled and annotated transcriptome resource available for any hemimetabolous insect. We therefore anticipate that these data will contribute significantly to more effective and higher-throughput deployment of molecular analysis tools in Gryllus.

  19. Developmental Gene Discovery in a Hemimetabolous Insect: De Novo Assembly and Annotation of a Transcriptome for the Cricket Gryllus bimaculatus

    PubMed Central

    Zeng, Victor; Ewen-Campen, Ben; Horch, Hadley W.; Roth, Siegfried; Mito, Taro; Extavour, Cassandra G.

    2013-01-01

    Most genomic resources available for insects represent the Holometabola, which are insects that undergo complete metamorphosis like beetles and flies. In contrast, the Hemimetabola (direct developing insects), representing the basal branches of the insect tree, have very few genomic resources. We have therefore created a large and publicly available transcriptome for the hemimetabolous insect Gryllus bimaculatus (cricket), a well-developed laboratory model organism whose potential for functional genetic experiments is currently limited by the absence of genomic resources. cDNA was prepared using mRNA obtained from adult ovaries containing all stages of oogenesis, and from embryo samples on each day of embryogenesis. Using 454 Titanium pyrosequencing, we sequenced over four million raw reads, and assembled them into 21,512 isotigs (predicted transcripts) and 120,805 singletons with an average coverage per base pair of 51.3. We annotated the transcriptome manually for over 400 conserved genes involved in embryonic patterning, gametogenesis, and signaling pathways. BLAST comparison of the transcriptome against the NCBI non-redundant protein database (nr) identified significant similarity to nr sequences for 55.5% of transcriptome sequences, and suggested that the transcriptome may contain 19,874 unique transcripts. For predicted transcripts without significant similarity to known sequences, we assessed their similarity to other orthopteran sequences, and determined that these transcripts contain recognizable protein domains, largely of unknown function. We created a searchable, web-based database to allow public access to all raw, assembled and annotated data. This database is to our knowledge the largest de novo assembled and annotated transcriptome resource available for any hemimetabolous insect. We therefore anticipate that these data will contribute significantly to more effective and higher-throughput deployment of molecular analysis tools in Gryllus.

  20. Discovery, evaluation and distribution of haplotypes and new alleles of the Photoperiod-A1 gene in wheat.

    PubMed

    Muterko, Alexandr; Kalendar, Ruslan; Cockram, James; Balashova, Irina

    2015-05-01

    Photoperiod response in wheat is determined to a large extent by the homoeologous series of Photoperiod 1 (Ppd1) genes. In this study, Ppd-A1 genomic sequences from the 5' UTR and promoter region were analysed in 104 accessions of six tetraploid wheat species (Triticum dicoccoides, T. dicoccum, T. turgidum, T. polonicum, T. carthlicum, T. durum) and 102 accessions of six hexaploid wheat species (T. aestivum, T. compactum, T. sphaerococcum, T. spelta, T. macha, T. vavilovii). These data were supplemented with in silico analysis of publicly available sequences from 46 and 193 accessions of diploid and tetraploid wheat, respectively. Analysis of a region of the Ppd-A1 promoter identified thirteen haplotypes, which were divided into two haplogroups. Distribution of the Ppd-A1 haplogroups and haplotypes in wheat species, and their geographical distributions were analysed. Polymerase chain reaction combined with a heteroduplex mobility assay was subsequently used to efficiently discriminate between Ppd-A1 alleles, allowing identification of the Ppd-A1b haplotypes and haplogroups. The causes of anomalous migration of Ppd-A1 heteroduplexes in gels were found to be the localization of mismatches relative to the center of the fragment, the cumulative effect of neighbouring polymorphic sites, and the location of mismatches within A/T-tracts. Analysis of the Ppd-A1 5' UTR in hexaploid wheat revealed a novel mutation within the "photoperiod critical" region in a subset of T. compactum accessions. This putative photoperiod insensitive allele (designated Ppd-A1a.4) includes a 684 bp deletion which spans a region in common with deletions previously identified in other photoperiod insensitive Ppd1 alleles.

  1. Discovery of a small molecule activator of the human ether-a-go-go-related gene (HERG) cardiac K+ channel.

    PubMed

    Kang, Jiesheng; Chen, Xiao-Liang; Wang, Hongge; Ji, Junzhi; Cheng, Hsien; Incardona, Josephine; Reynolds, William; Viviani, Fabrice; Tabart, Michel; Rampe, David

    2005-03-01

    Many drugs inhibit the human ether-a-go-go-related gene (HERG) cardiac K+ channel. This leads to action potential prolongation on the cellular level, a prolongation of the QT interval on the electrocardiogram, and sometimes cardiac arrhythmia. To date, no activators of this channel have been reported. Here, we describe the in vitro electrophysiological effects of (3R,4R)-4-[3-(6-methoxyquinolin-4-yl)-3-oxo-propyl]-1-[3-(2,3,5-trifluoro-phenyl)-prop-2-ynyl]-piperidine-3-carboxylic acid (RPR260243), a novel activator of HERG. Using patch-clamp electrophysiology, we found that RPR260243 dramatically slowed current deactivation when applied to cells stably expressing HERG. The effects of RPR260243 on HERG channel deactivation were temperature- and voltage-dependent and occurred over the concentration range of 1 to 30 microM. RPR260243-modified HERG currents were inhibited by dofetilide (IC50 = 58 nM). RPR260243 had little effect on HERG current amplitude and no significant effects on steady-state activation parameters or on channel inactivation processes. RPR260243 displayed no activator-like effects on other voltage-dependent ion channels, including the closely related erg3 K+ channel. RPR260243 enhanced the delayed rectifier current in guinea pig myocytes but, when administered alone, had little effect on action potential parameters in these cells. However, RPR260243 completely reversed the action potential-prolonging effects of dofetilide in this preparation. Using the Langendorff heart method, we found that 5 microM RPR260243 increased T-wave amplitude, prolonged the PR interval, and shortened the QT interval. We believe RPR260243 represents the first known HERG channel activator and that the drug works primarily by inhibiting channel closure, leading to a persistent HERG channel current upon repolarization. Compounds like RPR260243 will be useful for studying the physiological role of HERG and may one day find use in treating cardiac disease.

  2. Hunting for genes for hypertension: the Millennium Genome Project for Hypertension.

    PubMed

    Tabara, Yasuharu; Kohara, Katsuhiko; Miki, Tetsuro

    2012-06-01

    The Millennium Genome Project for Hypertension was started in 2000 to identify genetic variants conferring susceptibility to hypertension, with the aim of furthering the understanding of the pathogenesis of this condition and realizing genome-based personalized medical care. Two different approaches were launched, genome-wide association analysis using single-nucleotide polymorphisms (SNPs) and microsatellite markers, and systematic candidate gene analysis, under the hypothesis that common variants have an important role in the etiology of common diseases. These multilateral approaches identified ATP2B1 as a gene responsible for hypertension in not only Japanese but also Caucasians. The high blood pressure susceptibility conferred by certain alleles of ATP2B1 has been widely replicated in various populations. Ex vivo mRNA expression analysis in umbilical artery smooth muscle cells indicated that reduced expression of this gene associated with the risk allele may be an underlying mechanism relating the ATP2B1 variant to hypertension. However, the effect size of a SNP was too small to clarify the entire picture of the genetic basis of hypertension. Further, dense genome analysis with accurate phenotype data may be required.

  3. Variation analysis and gene annotation of eight MHC haplotypes: the MHC Haplotype Project.

    PubMed

    Horton, Roger; Gibson, Richard; Coggill, Penny; Miretti, Marcos; Allcock, Richard J; Almeida, Jeff; Forbes, Simon; Gilbert, James G R; Halls, Karen; Harrow, Jennifer L; Hart, Elizabeth; Howe, Kevin; Jackson, David K; Palmer, Sophie; Roberts, Anne N; Sims, Sarah; Stewart, C Andrew; Traherne, James A; Trevanion, Steve; Wilming, Laurens; Rogers, Jane; de Jong, Pieter J; Elliott, John F; Sawcer, Stephen; Todd, John A; Trowsdale, John; Beck, Stephan

    2008-01-01

    The human major histocompatibility complex (MHC) is contained within about 4 Mb on the short arm of chromosome 6 and is recognised as the most variable region in the human genome. The primary aim of the MHC Haplotype Project was to provide a comprehensively annotated reference sequence of a single, human leukocyte antigen-homozygous MHC haplotype and to use it as a basis against which variations could be assessed from seven other similarly homozygous cell lines, representative of the most common MHC haplotypes in the European population. Comparison of the haplotype sequences, including four haplotypes not previously analysed, resulted in the identification of >44,000 variations, both substitutions and indels (insertions and deletions), which have been submitted to the dbSNP database. The gene annotation uncovered haplotype-specific differences and confirmed the presence of more than 300 loci, including over 160 protein-coding genes. Combined analysis of the variation and annotation datasets revealed 122 gene loci with coding substitutions of which 97 were non-synonymous. The haplotype (A3-B7-DR15; PGF cell line) designated as the new MHC reference sequence, has been incorporated into the human genome assembly (NCBI35 and subsequent builds), and constitutes the largest single-haplotype sequence of the human genome to date. The extensive variation and annotation data derived from the analysis of seven further haplotypes have been made publicly available and provide a framework and resource for future association studies of all MHC-associated diseases and transplant medicine.

  4. Discovery Guide.

    ERIC Educational Resources Information Center

    Edwards, Claudia

    This guide describes a project (Teamwork Approach to Better Schools) developed to promote the establishment of a formal teacher support network in a variety of schools within a local support district. The model is a guide to newcomers to the project, helping eliminate startup problems and providing a sound base of experiences. The program began…

  5. A rapid alternative to X-ray crystallography for chiral determination: case studies of vibrational circular dichroism (VCD) to advance drug discovery projects.

    PubMed

    Wesolowski, Steven S; Pivonka, Don E

    2013-07-15

    The absolute stereochemistry of chiral drugs is usually established via X-ray crystallography. However, vibrational circular dichroism (VCD) spectroscopy coupled with quantum mechanics simulations offers a rapid alternative to crystallography and is readily applied to both crystalline and non-crystalline samples. VCD is an effective complement to X-ray analysis of drug candidates, and it can be used as a high-throughput means of assessing absolute stereochemistry at all phases of the discovery process (hundreds of assignments per year). The practical implementation (or fee-for-service outsourcing) of VCD and selected case studies are illustrated with an emphasis on providing utility and impact to pharmaceutical discovery programs.

  6. Characterisation of the wheat (Triticum aestivum L.) transcriptome by de novo assembly for the discovery of phosphate starvation-responsive genes: gene expression in Pi-stressed wheat

    PubMed Central

    2013-01-01

    Background Phosphorus (P) is an essential macronutrient for plant growth and development. To modulate their P homeostasis, plants must balance P uptake, mobilisation, and partitioning to various organs. Despite the worldwide importance of wheat as a cultivated food crop, molecular mechanisms associated with phosphate (Pi) starvation in wheat remain unclear. To elucidate these mechanisms, we used RNA-Seq methods to generate transcriptome profiles of the wheat variety ‘Chinese Spring’ responding to 10 days of Pi starvation. Results We carried out de novo assembly on 73.8 million high-quality reads generated from RNA-Seq libraries. We then constructed a transcript dataset containing 29,617 non-redundant wheat transcripts, comprising 15,047 contigs and 14,570 non-redundant full-length cDNAs from the TriFLDB database. When compared with barley full-length cDNAs, 10,656 of the 15,047 contigs were unalignable, suggesting that many might be distinct from barley transcripts. The average expression level of the contigs was lower than that of the known cDNAs, implying that these contigs included transcripts that were rarely represented in the full-length cDNA library. Within the non-redundant transcript set, we identified 892–2,833 responsive transcripts in roots and shoots, corresponding on average to 23.4% of the contigs not covered by cDNAs in TriFLDB under Pi starvation. The relative expression level of the wheat IPS1 (Induced by Phosphate Starvation 1) homologue, TaIPS1, was 341-fold higher in roots and 13-fold higher in shoots; this finding was further confirmed by qRT-PCR analysis. A comparative analysis of the wheat- and rice-responsive transcripts for orthologous genes under Pi-starvation revealed commonly upregulated transcripts, most of which appeared to be involved in a general response to Pi starvation, namely, an IPS1-mediated signalling cascade and its downstream functions such as Pi remobilisation, Pi uptake, and changes in Pi metabolism. Conclusions Our

  7. Computational methods in drug discovery

    PubMed Central

    Leelananda, Sumudu P

    2016-01-01

    The process for drug discovery and development is challenging, time consuming and expensive. Computer-aided drug discovery (CADD) tools can act as a virtual shortcut, assisting in the expedition of this long process and potentially reducing the cost of research and development. Today CADD has become an effective and indispensable tool in therapeutic development. The human genome project has made available a substantial amount of sequence data that can be used in various drug discovery projects. Additionally, increasing knowledge of biological structures, as well as increasing computer power have made it possible to use computational methods effectively in various phases of the drug discovery and development pipeline. The importance of in silico tools is greater than ever before and has advanced pharmaceutical research. Here we present an overview of computational methods used in different facets of drug discovery and highlight some of the recent successes. In this review, both structure-based and ligand-based drug discovery methods are discussed. Advances in virtual high-throughput screening, protein structure prediction methods, protein–ligand docking, pharmacophore modeling and QSAR techniques are reviewed. PMID:28144341

  8. Computational methods in drug discovery.

    PubMed

    Leelananda, Sumudu P; Lindert, Steffen

    2016-01-01

    The process for drug discovery and development is challenging, time consuming and expensive. Computer-aided drug discovery (CADD) tools can act as a virtual shortcut, assisting in the expedition of this long process and potentially reducing the cost of research and development. Today CADD has become an effective and indispensable tool in therapeutic development. The human genome project has made available a substantial amount of sequence data that can be used in various drug discovery projects. Additionally, increasing knowledge of biological structures, as well as increasing computer power have made it possible to use computational methods effectively in various phases of the drug discovery and development pipeline. The importance of in silico tools is greater than ever before and has advanced pharmaceutical research. Here we present an overview of computational methods used in different facets of drug discovery and highlight some of the recent successes. In this review, both structure-based and ligand-based drug discovery methods are discussed. Advances in virtual high-throughput screening, protein structure prediction methods, protein-ligand docking, pharmacophore modeling and QSAR techniques are reviewed.

  9. The Extragalactic Distance Scale Key Project III. The Discovery of Cepheids and a New Distance to M101 Using the Hubble Space Telescope

    NASA Technical Reports Server (NTRS)

    Kelson, Daniel D.; Madore, Barry

    1994-01-01

    We report the discovery of 29 Cepheid variables in the galaxy M101, using the original Wide Field Camera (WFC 1) and the new Wide Field and Planetary Camera (WFPC 2) on the Hubble Space Telescope (HST) to observe a field in M101 at 14 independent epochs in F555W.

  10. Characterization of Capsicum annuum genetic diversity and population structure based on parallel polymorphism discovery with a 30K unigene Pepper GeneChip.

    PubMed

    Hill, Theresa A; Ashrafi, Hamid; Reyes-Chin-Wo, Sebastian; Yao, JiQiang; Stoffel, Kevin; Truco, Maria-Jose; Kozik, Alexander; Michelmore, Richard W; Van Deynze, Allen

    2013-01-01

    The widely cultivated pepper, Capsicum spp., important as a vegetable and spice crop world-wide, is one of the most diverse crops. To enhance breeding programs, a detailed characterization of Capsicum diversity including morphological, geographical and molecular data is required. Currently, molecular data characterizing Capsicum genetic diversity is limited. The development and application of high-throughput genome-wide markers in Capsicum will facilitate more detailed molecular characterization of germplasm collections, genetic relationships, and the generation of ultra-high density maps. We have developed the Pepper GeneChip® array from Affymetrix for polymorphism detection and expression analysis in Capsicum. Probes on the array were designed from 30,815 unigenes assembled from expressed sequence tags (ESTs). Our array design provides a maximum redundancy of 13 probes per base pair position allowing integration of multiple hybridization values per position to detect single position polymorphism (SPP). Hybridization of genomic DNA from 40 diverse C. annuum lines, used in breeding and research programs, and a representative from three additional cultivated species (C. frutescens, C. chinense and C. pubescens) detected 33,401 SPP markers within 13,323 unigenes. Among the C. annuum lines, 6,426 SPPs covering 3,818 unigenes were identified. An estimated three-fold reduction in diversity was detected in non-pungent compared with pungent lines; however, we were able to detect 251 highly informative markers across these C. annuum lines. In addition, an 8.7 cM region without polymorphism was detected around Pun1 in non-pungent C. annuum. An analysis of genetic relatedness and diversity using the software Structure revealed clustering of the germplasm which was confirmed with statistical support by principal components analysis (PCA) and phylogenetic analysis. This research demonstrates the effectiveness of parallel high-throughput discovery and application of genome

  11. Effective Dimension Reduction Using Sequential Projection Pursuit On Gene Expression Data for Cancer Classification

    SciTech Connect

    Webb-Robertson, Bobbie-Jo M.; Havre, Susan L.

    2004-06-23

    Motivation: Classification is a powerful tool for uncovering interesting phenomena, for example classes of cancer, in microarray data. Due to the small number of observations (n) in comparison to the number of variables (p), genes, classification on microarray data is challenging. Thus, multivariate dimension reduction techniques are commonly used as a precursor to classification of microarray data; typically this is principal component analysis (PCA) or singular value decomposition (SVD). Since PCA and SVD are concerned with explaining the variance-covariance structure of the data, they may not be the best choice when the between-cluster variance is smaller than the within-cluster variance. Recently an attractive alternative to PCA, sequential projection pursuit (SPP), has been introduced which is designed to elicit clustering tendencies in the data. Thus, in some cases SPP may be more appropriate when performing clustering or classification analysis. Results: We compare the performance of SPP to PCA on two cancer gene expression datasets related to leukemia and colon cancer. Using PCA and SPP to reduce the dimensionality of the data to m<

  12. Lowell Observatory's Discovery Channel Telescope

    NASA Astrophysics Data System (ADS)

    Hall, Jeffrey C.

    2017-01-01

    Lowell Observatory broke ground on its 4.3-meter Discovery Channel Telescope (DCT) in July 2005 and celebrated first light for the telescope in July 2012. In this overview to this special session, I will discuss the origin and development of the project, the telescope's general specifications and performance, its current operating status, and the initial instrument suite.

  13. DISCOVERY IN THE URBAN SPRAWL.

    ERIC Educational Resources Information Center

    HYMOVITZ, LEON

    FOR A CULTURAL ENRICHMENT PROJECT ("DISCOVERY") IN A DISADVANTAGED PHILADELPHIA HIGH SCHOOL, ATTENDANCE AT MUSIC, ART, AND THEATER EVENTS EARNED POINTS TOWARD A CERTIFICATE. THE STUDENTS ELECTED THE EVENTS FROM A PREPARED LIST OF ACTIVITIES, WHICH OFTEN WERE MADE PART OF THE ACADEMIC PROGRAM AND THE SCHOOL ASSEMBLIES. AS WELL AS OFFERING…

  14. Harry Stottlemeier's Discovery [Revised Edition].

    ERIC Educational Resources Information Center

    Lipman, Matthew

    "Harry Stottlemeier's Discovery" is the student book for the project in philosophical thinking described in SO 008 123-126. It offers a model of dialogue -- both of children with one another and of children with adults. The story is set among a classroom of children who begin to understand the basics of logical reasoning when Harry, who isn't…

  15. Fear conditioning leads to alteration in specific genes expression in cortical and thalamic neurons that project to the lateral amygdala.

    PubMed

    Katz, Ira K; Lamprecht, Raphael

    2015-02-01

    RNA transcription is needed for memory formation. However, the ability to identify genes whose expression is altered by learning is greatly impaired because of methodological difficulties in profiling gene expression in specific neurons involved in memory formation. Here, we report a novel approach to monitor the expression of genes after learning in neurons in specific brain pathways needed for memory formation. In this study, we aimed to monitor gene expression after fear learning. We retrogradely labeled discrete thalamic neurons that project to the lateral amygdala (LA) of rats. The labeled neurons were dissected, using laser microdissection microscopy, after fear conditioning learning or unpaired training. The RNAs from the dissected neurons were subjected to microarray analysis. The levels of selected RNAs detected by the microarray analysis to be altered by fear conditioning were also assessed by nanostring analysis. We observed that the expression of genes involved in the regulation of translation, maturation and degradation of proteins was increased 6 h after fear conditioning compared to unpaired or naïve trained rats. These genes were not expressed 24 h after training or in cortical neurons that project to the LA. The expression of genes involved in transcription regulation and neuronal development was altered after fear conditioning learning in the cortical-LA pathway. The present study provides key information on the identity of genes expressed in discrete thalamic and cortical neurons that project to the LA after fear conditioning. Such an approach could also serve to identify gene products as targets for the development of a new generation of therapeutic agents that could be aimed to functionally identified brain circuits to treat memory-related disorders.

  16. Construction and Evaluation of Normalized cDNA Libraries Enriched with Full-Length Sequences for Rapid Discovery of New Genes from Sisal (Agave sisalana Perr.) Different Developmental Stages

    PubMed Central

    Zhou, Wen-Zhao; Zhang, Yan-Mei; Lu, Jun-Ying; Li, Jun-Feng

    2012-01-01

    To provide a resource of sisal-specific expressed sequence data and facilitate this powerful approach in new gene research, the preparation of normalized cDNA libraries enriched with full-length sequences is necessary. Four libraries were produced with RNA pooled from Agave sisalana multiple tissues to increase efficiency of normalization and maximize the number of independent genes by SMART™ method and the duplex-specific nuclease (DSN). This procedure kept the proportion of full-length cDNAs in the subtracted/normalized libraries and dramatically enhanced the discovery of new genes. Sequencing of 3875 cDNA clones of libraries revealed 3320 unigenes with an average insert length about 1.2 kb, indicating that the non-redundancy of libraries was about 85.7%. These unigene functions were predicted by comparing their sequences to functional domain databases and extensively annotated with Gene Ontology (GO) terms. Comparative analysis of sisal unigenes and other plant genomes revealed that four putative MADS-box genes and knotted-like homeobox (knox) gene were obtained from a total of 1162 full-length transcripts. Furthermore, real-time PCR showed that the characteristics of their transcripts mainly depended on the tight expression regulation of a number of genes during the leaf and flower development. Analysis of individual library sequence data indicated that the pooled-tissue approach was highly effective in discovering new genes and preparing libraries for efficient deep sequencing. PMID:23202944
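
    The non-redundancy figure quoted above follows directly from the two clone counts in the abstract. A minimal sketch of that arithmetic, assuming non-redundancy is defined as unigenes divided by sequenced clones (the function and variable names are illustrative, not from the paper):

```python
def non_redundancy(unigenes: int, clones_sequenced: int) -> float:
    """Fraction of sequenced clones that represent distinct unigenes."""
    if clones_sequenced <= 0:
        raise ValueError("clones_sequenced must be positive")
    return unigenes / clones_sequenced


# Counts taken from the abstract: 3320 unigenes from 3875 sequenced cDNA clones.
sisal_non_redundancy = non_redundancy(3320, 3875)
print(f"{sisal_non_redundancy:.1%}")
```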

  17. Construction and evaluation of normalized cDNA libraries enriched with full-length sequences for rapid discovery of new genes from Sisal (Agave sisalana Perr.) different developmental stages.

    PubMed

    Zhou, Wen-Zhao; Zhang, Yan-Mei; Lu, Jun-Ying; Li, Jun-Feng

    2012-10-12

    To provide a resource of sisal-specific expressed sequence data and facilitate this powerful approach in new gene research, the preparation of normalized cDNA libraries enriched with full-length sequences is necessary. Four libraries were produced with RNA pooled from Agave sisalana multiple tissues to increase efficiency of normalization and maximize the number of independent genes by SMART™ method and the duplex-specific nuclease (DSN). This procedure kept the proportion of full-length cDNAs in the subtracted/normalized libraries and dramatically enhanced the discovery of new genes. Sequencing of 3875 cDNA clones of libraries revealed 3320 unigenes with an average insert length about 1.2 kb, indicating that the non-redundancy of libraries was about 85.7%. These unigene functions were predicted by comparing their sequences to functional domain databases and extensively annotated with Gene Ontology (GO) terms. Comparative analysis of sisal unigenes and other plant genomes revealed that four putative MADS-box genes and knotted-like homeobox (knox) gene were obtained from a total of 1162 full-length transcripts. Furthermore, real-time PCR showed that the characteristics of their transcripts mainly depended on the tight expression regulation of a number of genes during the leaf and flower development. Analysis of individual library sequence data indicated that the pooled-tissue approach was highly effective in discovering new genes and preparing libraries for efficient deep sequencing.

  18. Wildlife Discovery.

    ERIC Educational Resources Information Center

    Silverman, Beth; And Others

    This pocket folder of instructional materials is designed to introduce youths aged 9 to 12 to the nature and needs of wildlife and to give children the opportunity to search for wildlife and their signs. The document includes a member's guide, a leader's guide, field record forms, and wildlife project materials. The illustrated 4-H member's guide…

  19. TOXICOGENOMICS DRUG DISCOVERY AND THE PATHOLOGIST

    EPA Science Inventory

    Toxicogenomics, drug discovery, and pathologist.

    The field of toxicogenomics, which currently focuses on the application of large-scale differential gene expression (DGE) data to toxicology, is starting to influence drug discovery and development in the pharmaceutical indu...

  20. The Discovery Channel Telescope

    NASA Astrophysics Data System (ADS)

    Millis, R. L.; Dunham, E. W.; Sebring, T. A.; Smith, B. W.; de Kock, M.; Wiecha, O.

    2004-11-01

    The Discovery Channel Telescope (DCT) is a 4.2-m telescope to be built at a new site near Happy Jack, Arizona. The DCT features a large prime focus mosaic CCD camera with a 2-degree-diameter field of view especially designed for surveys of KBOs, Centaurs, NEAs and other moving or time-variable targets. The telescope can be switched quickly to a Ritchey-Chretien configuration for optical/IR spectroscopy or near-IR imaging. This flexibility allows timely follow-up physical studies of high priority objects discovered in survey mode. The ULE (ultra-low-expansion) meniscus primary and secondary mirror blanks for the telescope are currently in fabrication by Corning Glass. Goodrich Aerospace, Vertex RSI, M3 Engineering and Technology Corp., and e2v Technologies have recently completed in-depth conceptual design studies of the optics, mount, enclosure, and mosaic focal plane, respectively. The results of these studies were subjected to a formal design review in July, 2004. Site testing at the 7760-ft altitude Happy Jack site began in 2001. Differential image motion observations from 117 nights since January 1, 2003 gave median seeing of 0.84 arcsec FWHM, and the average of the first quartile was 0.62 arcsec. The National Environmental Policy Act (NEPA) process for securing long-term access to this site on the Coconino National Forest is nearing completion and ground breaking is expected in the spring of 2005. The Discovery Channel Telescope is a project of the Lowell Observatory with major financial support from Discovery Communications, Inc. (DCI). DCI plans ongoing television programming featuring the construction of the telescope and the research ultimately undertaken with the DCT. An additional partner can be accommodated in the project. Interested parties should contact the lead author.

  1. The Status, Quality, and Expansion of the NIH Full-Length cDNA Project: The Mammalian Gene Collection (MGC)

    PubMed Central

    2004-01-01

    The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5′-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline. PMID:15489334

  2. Open PHACTS: semantic interoperability for drug discovery.

    PubMed

    Williams, Antony J; Harland, Lee; Groth, Paul; Pettifer, Stephen; Chichester, Christine; Willighagen, Egon L; Evelo, Chris T; Blomberg, Niklas; Ecker, Gerhard; Goble, Carole; Mons, Barend

    2012-11-01

    Open PHACTS is a public-private partnership between academia, publishers, small and medium sized enterprises and pharmaceutical companies. The goal of the project is to deliver and sustain an 'open pharmacological space' using and enhancing state-of-the-art semantic web standards and technologies. It is focused on practical and robust applications to solve specific questions in drug discovery research. OPS is intended to facilitate improvements in drug discovery in academia and industry and to support open innovation and in-house non-public drug discovery research. This paper lays out the challenges and how the Open PHACTS project is hoping to address these challenges technically and socially.

  3. Physical activity and sex modulate obesity risk linked to 3111T/C gene variant of the CLOCK gene in an elderly population: the SUN Project.

    PubMed

    Galbete, Cecilia; Contreras, Rafael; Martínez, J Alfredo; Martínez-González, Miguel Ángel; Guillén-Grima, Francisco; Marti, Amelia

    2012-12-01

    Genetic factors may interact with physical activity levels to modify obesity risk. Our aim was to explore the influence of rs1801260 single-nucleotide polymorphism (SNP) (3111T/C) of CLOCK gene on obesity risk, and to examine its potential interaction with lifestyle factors in an elderly population within the SUN ("Seguimiento Universidad de Navarra") Project. Subjects (n = 903, aged 69 ± 6 yrs) were recruited from the SUN Project. DNA was obtained from saliva, whereas lifestyle and dietary data were collected by validated self-report questionnaires. Genotype was assessed by reverse transcriptase-polymerase chain reaction (RT-PCR) plus allele discrimination. A significant interaction was observed between the 3111T/C SNP of CLOCK gene and sex for overweight/obesity risk (p for sex × CLOCK interaction <.001). Our results showed that women carrying the C allele of CLOCK gene had a marginally significant lower risk of overweight/obesity compared with noncarrier-TT-subjects (odds ratio [OR]: .61, 95% confidence interval [CI]: .36-1.04; p = .069). Moreover, this association of the C allele with a decreased overweight/obesity risk might be enhanced in those women with a high physical activity level. Women practicing more than 16.8 metabolic equivalent tasks (hours per week) had a significantly lower overweight/obesity risk (OR: .36, 95% CI: .17-.79; p = .011). Furthermore, a significant interaction between the 3111T/C gene variant and physical activity (PA) for overweight/obesity risk was observed but only in women (p for PA × CLOCK interaction <.050). In conclusion, it appears that physical activity levels may act by modifying the association of the 3111T/C SNP (rs1801260) of the CLOCK gene with overweight/obesity risk in elderly women in the SUN Project.

  4. Modern drug discovery technologies: opportunities and challenges in lead discovery.

    PubMed

    Guido, Rafael V C; Oliva, Glaucius; Andricopulo, Adriano D

    2011-12-01

    The identification of promising hits and the generation of high quality leads are crucial steps in the early stages of drug discovery projects. The definition and assessment of both chemical and biological space have revitalized the screening process model and emphasized the importance of exploring the intrinsic complementary nature of classical and modern methods in drug research. In this context, the widespread use of combinatorial chemistry and sophisticated screening methods for the discovery of lead compounds has created a large demand for small organic molecules that act on specific drug targets. Modern drug discovery involves the employment of a wide variety of technologies and expertise in multidisciplinary research teams. The synergistic effects between experimental and computational approaches on the selection and optimization of bioactive compounds emphasize the importance of the integration of advanced technologies in drug discovery programs. These technologies (VS, HTS, SBDD, LBDD, QSAR, and so on) are complementary in the sense that they have mutual goals, thereby the combination of both empirical and in silico efforts is feasible at many different levels of lead optimization and new chemical entity (NCE) discovery. This paper provides a brief perspective on the evolution and use of key drug design technologies, highlighting opportunities and challenges.

  5. Discovery of rare variants for complex phenotypes.

    PubMed

    Kosmicki, Jack A; Churchhouse, Claire L; Rivas, Manuel A; Neale, Benjamin M

    2016-06-01

    With the rise of sequencing technologies, it is now feasible to assess the role rare variants play in the genetic contribution to complex trait variation. While some of the earlier targeted sequencing studies successfully identified rare variants of large effect, unbiased gene discovery using exome sequencing has experienced limited success for complex traits. Nevertheless, rare variant association studies have demonstrated that rare variants do contribute to phenotypic variability, but sample sizes will likely have to be even larger than those of common variant association studies to be powered for the detection of genes and loci. Large-scale sequencing efforts of tens of thousands of individuals, such as the UK10K Project and aggregation efforts such as the Exome Aggregation Consortium, have made great strides in advancing our knowledge of the landscape of rare variation, but there remain many considerations when studying rare variation in the context of complex traits. We discuss these considerations in this review, presenting a broad range of topics at a high level as an introduction to rare variant analysis in complex traits including the issues of power, study design, sample ascertainment, de novo variation, and statistical testing approaches. Ultimately, as sequencing costs continue to decline, larger sequencing studies will yield clearer insights into the biological consequence of rare mutations and may reveal which genes play a role in the etiology of complex traits.

  6. Role of common and rare variants in SCN10A: results from the Brugada syndrome QRS locus gene discovery collaborative study

    PubMed Central

    Behr, Elijah R.; Savio-Galimberti, Eleonora; Barc, Julien; Holst, Anders G.; Petropoulou, Evmorfia; Prins, Bram P.; Jabbari, Javad; Torchio, Margherita; Berthet, Myriam; Mizusawa, Yuka; Yang, Tao; Nannenberg, Eline A.; Dagradi, Federica; Weeke, Peter; Bastiaenan, Rachel; Ackerman, Michael J.; Haunso, Stig; Leenhardt, Antoine; Kääb, Stefan; Probst, Vincent; Redon, Richard; Sharma, Sanjay; Wilde, Arthur; Tfelt-Hansen, Jacob; Schwartz, Peter; Roden, Dan M.; Bezzina, Connie R.; Olesen, Morten; Darbar, Dawood; Guicheney, Pascale; Crotti, Lia; Jamshidi, Yalda

    2015-01-01

    Aims Brugada syndrome (BrS) remains genetically heterogeneous and is associated with slowed cardiac conduction. We aimed to identify genetic variation in BrS cases at loci associated with QRS duration. Methods and results A multi-centre study sequenced seven candidate genes (SCN10A, HAND1, PLN, CASQ2, TKT, TBX3, and TBX5) in 156 Caucasian SCN5A mutation-negative BrS patients (80% male; mean age 48) with symptoms (64%) and/or a family history of sudden death (47%) or BrS (18%). Forty-nine variants were identified: 18 were rare (MAF <1%) and non-synonymous; and 11/18 (61.1%), mostly in SCN10A, were predicted as pathogenic using multiple bioinformatics tools. Allele frequencies were compared with the Exome Sequencing and UK10K Projects. SKAT methods tested rare variation in SCN10A finding no statistically significant difference between cases and controls. Co-segregation analysis was possible for four of seven probands carrying a novel pathogenic variant. Only one pedigree (I671V/G1299A in SCN10A) showed co-segregation. The SCN10A SNP V1073 was, however, associated strongly with BrS [66.9 vs. 40.1% (UK10K) OR (95% CI) = 3.02 (2.35–3.87), P = 8.07 × 10⁻¹⁹]. Voltage-clamp experiments for NaV1.8 were performed for SCN10A common variants V1073, A1073, and rare variants of interest: A200V and I671V. V1073, A200V and I671V demonstrated significant reductions in peak INa compared with ancestral allele A1073 (rs6795970). Conclusion Rare variants in the screened QRS-associated genes (including SCN10A) are not responsible for a significant proportion of SCN5A mutation negative BrS. The common SNP SCN10A V1073 was strongly associated with BrS and demonstrated loss of NaV1.8 function, as did rare variants in isolated patients. PMID:25691538
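
    The odds ratio reported for V1073 can be reproduced from the two frequencies given in the abstract. A minimal sketch, assuming the OR is the ratio of the odds implied by 66.9% (cases) and 40.1% (UK10K); the confidence interval and P value need the underlying counts, which the abstract does not give, so they are not recomputed here (function and variable names are illustrative):

```python
def odds_ratio(p_cases: float, p_controls: float) -> float:
    """Odds ratio from two proportions, each strictly between 0 and 1."""
    for p in (p_cases, p_controls):
        if not 0.0 < p < 1.0:
            raise ValueError("proportions must lie strictly between 0 and 1")
    odds_cases = p_cases / (1.0 - p_cases)
    odds_controls = p_controls / (1.0 - p_controls)
    return odds_cases / odds_controls


# Frequencies from the abstract: 66.9% in BrS cases vs. 40.1% in UK10K.
or_v1073 = odds_ratio(0.669, 0.401)
print(round(or_v1073, 2))
```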

  7. Multi-frequency survey of background radiations of the Universe. The "Cosmological Gene" project. First results

    NASA Astrophysics Data System (ADS)

    Parijskij, Yu. N.; Mingaliev, M. G.; Nizhel'Skii, N. A.; Bursov, N. N.; Berlin, A. B.; Grechkin, A. A.; Zharov, V. I.; Zhekanis, G. V.; Majorova, E. K.; Semenova, T. A.; Stolyarov, V. A.; Tsybulev, P. G.; Kratov, D. V.; Udovitskii, R. Yu.; Khaikin, V. B.

    2011-10-01

    The results of the first stage of the "Cosmological Gene" project of the Russian Academy of Sciences are reported. These results consist in the accumulation of multi-frequency data in 31 frequency channels in the wavelength interval 1-55 cm with maximum achievable statistical sensitivity limited by the noise of background radio sources at all wavelengths exceeding 1.38 cm. The survey region is determined by constraints 00 h < RA < 24 h and 40°30' < DEC < 42°30'. The scientific goals of the project are refined in view of recent proposals to use cosmological background radiation data for the development of a unified physical theory. Experimental data obtained with the RATAN-600 radio telescope are used to refine the contribution of the main "screens" located between the observer and the formation epoch of cosmic background radiation (z = 1100). Experimental data for synchrotron radiation and free-free noise on scales that are of interest for the anisotropy of cosmic microwave background are reported as well as the contribution of these noise components in millimeter-wave experiments to be performed in the coming years. The role of dipole radio emission of fullerene-type dust nanostructures is shown to be small. The most precise estimates of the role of background radio sources with inverted spectra are given and these sources are shown to create no serious interference in experiments. The average spectral indices of the weakest sources of the NVSS and FIRST catalogs are estimated. The "saturation" data for all wavelengths allowed a constraint to be imposed on the Sunyaev-Zeldovich noise (the SZ noise) at all wavelengths, and made it possible to obtain independent estimates of the average sky temperature from sources, substantially weaker than those listed in the NVSS catalog. These estimates are inconsistent with the existence of powerful extragalactic synchrotron background associated with radio sources. Appreciable "quadrupole" anisotropy is detected in the

  8. Discovery and Precise Characterization by the MEarth Project of LP 661-13, an Eclipsing Binary Consisting of Two Fully Convective Low-mass Stars

    NASA Astrophysics Data System (ADS)

    Dittmann, Jason A.; Irwin, Jonathan M.; Charbonneau, David; Berta-Thompson, Zachory K.; Newton, Elisabeth R.; Latham, David W.; Latham, Christian A.; Esquerdo, Gilbert; Berlind, Perry; Calkins, Michael L.

    2017-02-01

    We report the detection of stellar eclipses in the LP 661-13 system. We present the discovery and characterization of this system, including high-resolution spectroscopic radial velocities and a photometric solution spanning two observing seasons. LP 661-13 is a low-mass binary system with an orbital period of $4.7043512^{+0.0000013}_{-0.0000010}$ days at a distance of 24.9 ± 1.3 parsecs. LP 661-13A is a 0.30795 ± 0.00084 M⊙ star, while LP 661-13B is a 0.19400 ± 0.00034 M⊙ star. The radii of the two components are 0.3226 ± 0.0033 R⊙ and 0.2174 ± 0.0023 R⊙, respectively. We detect out-of-eclipse modulations at a period slightly shorter than the orbital period, implying that at least one of the components is not rotating synchronously. We find that each component is slightly inflated compared to stellar models, and that this cannot be reconciled through age or metallicity effects. As a nearby eclipsing binary system, where both components are near or below the full-convection limit, LP 661-13 will be a valuable test of models for the structure of cool dwarf stars.

  9. De novo Transcriptome Assembly of Common Wild Rice (Oryza rufipogon Griff.) and Discovery of Drought-Response Genes in Root Tissue Based on Transcriptomic Data

    PubMed Central

    Zhang, Jing-wen; Wang, Yan-yan; Li, Wei-min; Peng, Yu-fa; Yuan, Qian-hua; Pei, Xin-wu

    2015-01-01

    Background The perennial O. rufipogon (common wild rice), which is considered to be the ancestor of Asian cultivated rice species, contains many useful genetic resources, including drought resistance genes. However, few studies have identified the drought resistance and tissue-specific genes in common wild rice. Results In this study, transcriptome sequencing libraries were constructed, including drought-treated roots (DR) and control leaves (CL) and roots (CR). Using Illumina sequencing technology, we generated 16.75 million bases of high-quality sequence data for common wild rice and conducted de novo assembly and annotation of genes without prior genome information. These reads were assembled into 119,332 unigenes with an average length of 715 bp. A total of 88,813 distinct sequences (74.42% of unigenes) significantly matched known genes in the NCBI NT database. Differentially expressed gene (DEG) analysis showed that 3617 genes were up-regulated and 4171 genes were down-regulated in the CR library compared with the CL library. Among the DEGs, 535 genes were expressed in roots but not in shoots. A similar comparison between the DR and CR libraries showed that 1393 genes were up-regulated and 315 genes were down-regulated in the DR library compared with the CR library. Finally, 37 genes that were specifically expressed in roots were screened after comparing the DEGs identified in the above-described analyses. Conclusion This study provides a transcriptome sequence resource for common wild rice plants and establishes a digital gene expression profile of wild rice plants under drought conditions using the assembled transcriptome data as a reference. Several tissue-specific and drought-stress-related candidate genes were identified, representing a fully characterized transcriptome and providing a valuable resource for genetic and genomic studies in plants. PMID:26134138

  10. Elements of discovery.

    PubMed

    Toledo-Pereyra, Luis H

    2008-01-01

    I understand discovery as the essence of thinking man, or to paraphrase the notable French philosopher René Descartes, "I think, therefore I discover." In this study, I introduce discovery as the foundation of modern science. Discovery consists of six stages or elements, including: concept, belief, ability, support, proof, and protection. Each element is discussed within the context of the whole discovery enterprise. Fundamental tenets for understanding discovery are given throughout the paper, and a few examples illustrate the significance of some of the most important elements. I invite clinicians, researchers, and/or clinical researchers to integrate themselves into the active process of discovery. Remember--I think, therefore I discover.

  11. THE K2-ESPRINT PROJECT. I. DISCOVERY OF THE DISINTEGRATING ROCKY PLANET K2-22b WITH A COMETARY HEAD AND LEADING TAIL

    SciTech Connect

    Sanchis-Ojeda, R.; Rappaport, S.; Dai, F.; Pallè, E.; Delrez, L.; DeVore, J.; Gandolfi, D.; Fukui, A.; Ribas, I.; Stassun, K. G.; Albrecht, S.; Gaidos, E.; Gillon, M.; Hirano, T.; and others

    2015-10-20

    We present the discovery of a transiting exoplanet candidate in the K2 Field-1 with an orbital period of 9.1457 hr: K2-22b. The highly variable transit depths, ranging from ∼0% to 1.3%, are suggestive of a planet that is disintegrating via the emission of dusty effluents. We characterize the host star as an M-dwarf with T_eff ≃ 3800 K. We have obtained ground-based transit measurements with several 1-m class telescopes and with the GTC. These observations (1) improve the transit ephemeris; (2) confirm the variable nature of the transit depths; (3) indicate variations in the transit shapes; and (4) demonstrate clearly that at least on one occasion the transit depths were significantly wavelength dependent. The latter three effects tend to indicate extinction of starlight by dust rather than by any combination of solid bodies. The K2 observations yield a folded light curve with lower time resolution but with substantially better statistical precision compared with the ground-based observations. We detect a significant “bump” just after the transit egress, and a less significant bump just prior to transit ingress. We interpret these bumps in the context of a planet that is not only likely streaming a dust tail behind it, but also has a more prominent leading dust trail that precedes it. This effect is modeled in terms of dust grains that can escape to beyond the planet's Hill sphere and effectively undergo “Roche lobe overflow,” even though the planet's surface is likely underfilling its Roche lobe by a factor of 2.

  12. Evolution of hedgehog and hedgehog-related genes, their origin from Hog proteins in ancestral eukaryotes and discovery of a novel Hint motif

    PubMed Central

    Bürglin, Thomas R

    2008-01-01

    Background The Hedgehog (Hh) signaling pathway plays important roles in human and animal development as well as in carcinogenesis. Hh molecules have been found in both protostomes and deuterostomes, but curiously the nematode Caenorhabditis elegans lacks a bona-fide Hh. Instead a series of Hh-related proteins are found, which share the Hint/Hog domain with Hh, but have distinct N-termini. Results We performed extensive genome searches such as the cnidarian Nematostella vectensis and several nematodes to gain further insights into Hh evolution. We found six genes in N. vectensis with a relationship to Hh: two Hh genes, one gene with a Hh N-terminal domain fused to a Willebrand factor type A domain (VWA), and three genes containing Hint/Hog domains with distinct novel N-termini. In the nematode Brugia malayi we find the same types of hh-related genes as in C. elegans. In the more distantly related Enoplea nematodes Xiphinema and Trichinella spiralis we find a bona-fide Hh. In addition, T. spiralis also has a quahog gene like C. elegans, and there are several additional hh-related genes, some of which have secreted N-terminal domains of only 15 to 25 residues. Examination of other Hh pathway components revealed that T. spiralis - like C. elegans - lacks some of these components. Extending our search to all eukaryotes, we recovered genes containing a Hog domain similar to Hh from many different groups of protists. In addition, we identified a novel Hint gene family present in many eukaryote groups that encodes a VWA domain fused to a distinct Hint domain we call Vint. Further members of a poorly characterized Hint family were also retrieved from bacteria. Conclusion In Cnidaria and nematodes the evolution of hh genes occurred in parallel to the evolution of other genes that contain a Hog domain but have different N-termini. The fact that Hog genes comprising a secreted N-terminus and a Hog domain are found in many protists indicates that this gene family must have

  13. THE BIOCATALYTIC DESULFURIZATION PROJECT

    SciTech Connect

    Steven E. Bonde; David Nunn

    2003-01-01

    During the first quarter of the Biological Desulfurization project several activities were pursued. A project kickoff meeting was held at the Diversa facility in San Diego, CA. Activities that were in process before the meeting and begun afterwards by Diversa Corporation and Petro Star Inc. include: Technology transfer in the form of information generated by Enchira to Diversa, the purchase and installation of equipment by Diversa, development of synthetic methods and preparation of organo-sulfur substrates for use in determining enzyme activities, production of extract via Petro Star's CED process, detailed analysis of Petro Star Inc. diesel and CED extract, and several activities in molecular biology. Diversa Corporation, in the area of molecular biology, engaged in several activities in support of the task list of the contract. These included: construction of a genomic library; development and utilization of a sequence-based gene discovery effort; a parallel discovery approach based on functional expression of enzymes with the ability to oxidize organosulfur compounds. Biodesulfurization genes have already been identified and are being sequenced and subcloned for expression in heterologous biological hosts. Diversa has evaluated and adapted assays developed by Enchira used to assess the activities of DBT and DBTO2 monooxygenases. Finally, Diversa personnel have developed two novel selection/screen strategies for the improvement of biocatalyst strains by directed evolution.

  14. Crossed and uncrossed projections to the cat sacrocaudal spinal cord: III. Axons expressing calcitonin gene-related peptide immunoreactivity.

    PubMed

    Ritz, L A; Murray, C R; Foli, K

    2001-10-01

    We have investigated the projection patterns of peptidergic small-diameter primary afferent fibers to the cat sacrocaudal spinal cord, a region associated with midline structures of the lower urogenital system and of the tail. Calcitonin gene-related peptide (CGRP)-immunoreactive (CGRP-IR) primary afferent fibers were observed within the superficial laminae, rostrally as the typical inverted U-shaped band that capped the separate dorsal horns (S1 to rostral S2) and caudally as a broad band that spanned the entire mediolateral extent of the fused dorsal horns (caudal S2 and caudal). Within the dorsal gray commissure, labeling was seen as a periodic vertical, midline band. CGRP-IR labeling was prevalent in an extensive mediolateral distribution at the base of the dorsal horn, originating from both lateral and medial collateral bundles that extend from the superficial dorsal horn. Some bundles, in part traveling within the dorsal commissure, conspicuously crossed the midline. In addition to the robust projection to the superficial dorsal horn, there was a more extensive distribution of CGRP-IR fibers within the deeper portions of the cat sacrocaudal dorsal horn than has been reported for other regions of the cat spinal cord. Presumably, these deep projections convey visceral information to projection or segmental neurons at the neck of the dorsal horn and in the region of the central canal. This deep distribution overlaps the reported projections of the pelvic and pudendal nerves. In addition, the contralateral projections of CGRP-IR fibers may form an anatomical substrate of the bilateral receptive fields for selective dorsal horn neurons. The density and variety of CGRP-IR projection patterns is a reflection of the functional attributes of the innervated structures.

  15. VARIABLES IN "DISCOVERY LEARNING."

    ERIC Educational Resources Information Center

    GLASER, ROBERT

    A PRESENTATION WAS MADE OF THE ANALYSIS OF BEHAVIOR THAT IS REQUIRED AS A FIRST STEP IN THE PROCESS OF DEVELOPING PROCEDURES AND MATERIALS FOR "DISCOVERY LEARNING." TEACHING BY THE DISCOVERY METHOD IS DESCRIBED AS REQUIRING THAT A MINIMUM OF STRUCTURED INSTRUCTIONAL SEQUENCE BE IMPOSED TO ALLOW THE CHILD TO (1) LEARN BY DISCOVERY AND (2)…

  16. Evaluating HapMap SNP data transferability in a large-scale genotyping project involving 175 cancer-associated genes.

    PubMed

    Ribas, Gloria; González-Neira, Anna; Salas, Antonio; Milne, Roger L; Vega, Ana; Carracedo, Begoña; González, Emilio; Barroso, Eva; Fernández, Lara P; Yankilevich, Patricio; Robledo, Mercedes; Carracedo, Angel; Benítez, Javier

    2006-02-01

    One of the many potential uses of the HapMap project is its application to the investigation of complex disease aetiology among a wide range of populations. This study aims to assess the transferability of HapMap SNP data to the Spanish population in the context of cancer research. We have carried out a genotyping study in Spanish subjects involving 175 candidate cancer genes using an indirect gene-based approach and compared results with those for HapMap CEU subjects. Allele frequencies were very consistent between the two samples, with a high positive correlation (R) of 0.91 (P<1×10⁻⁶). Linkage disequilibrium patterns and block structures across each gene were also very similar, with disequilibrium coefficient (r²) highly correlated (R=0.95, P<1×10⁻⁶). We found that of the 21 genes that contained at least one block larger than 60 kb, nine (ATM, ATR, BRCA1, ERCC6, FANCC, RAD17, RAD50, RAD54B and XRCC4) belonged to the GO category "DNA repair". Haplotype frequencies per gene were also highly correlated (mean R=0.93), as was haplotype diversity (R=0.91, P<1×10⁻⁶). "Yin yang" haplotypes were observed for 43% of the genes analysed and 18% of those were identical to the ancestral haplotype (identified in chimpanzee). Finally, the portability of tagSNPs identified in the HapMap CEU data using pairwise r² thresholds of 0.8 and 0.5 was assessed by applying these to the Spanish and current HapMap data for 66 genes. In general, the HapMap tagSNPs performed very well. Our results show generally high concordance with HapMap data in allele frequencies and haplotype distributions and confirm the applicability of HapMap SNP data to the study of complex diseases among the Spanish population.

  17. Discovery of Western European R1b1a2 Y chromosome variants in 1000 genomes project data: an online community approach.

    PubMed

    Rocca, Richard A; Magoon, Gregory; Reynolds, David F; Krahn, Thomas; Tilroe, Vincent O; Op den Velde Boots, Peter M; Grierson, Andrew J

    2012-01-01

    The authors have used an online community approach, and tools that were readily available via the Internet, to discover genealogically and therefore phylogenetically relevant Y-chromosome polymorphisms within core haplogroup R1b1a2-L11/S127 (rs9786076). Presented here is the analysis of 135 unrelated L11 derived samples from the 1000 Genomes Project. We were able to discover new variants and build a much more complex phylogenetic relationship for L11 sub-clades. Many of the variants were further validated using PCR amplification and Sanger sequencing. The identification of these new variants will help further the understanding of population history including patrilineal migrations in Western and Central Europe where R1b1a2 is the most frequent haplogroup. The fine-grained phylogenetic tree we present here will also help to refine historical genetic dating studies. Our findings demonstrate the power of citizen science for analysis of whole genome sequence data.

  18. Code-Assisted Discovery of TAL Effector Targets in Bacterial Leaf Streak of Rice Reveals Contrast with Bacterial Blight and a Novel Susceptibility Gene

    PubMed Central

    Cernadas, Raul A.; Doyle, Erin L.; Niño-Liu, David O.; Wilkins, Katherine E.; Bancroft, Timothy; Wang, Li; Schmidt, Clarice L.; Caldo, Rico; Yang, Bing; White, Frank F.; Nettleton, Dan; Wise, Roger P.; Bogdanove, Adam J.

    2014-01-01

    Bacterial leaf streak of rice, caused by Xanthomonas oryzae pv. oryzicola (Xoc) is an increasingly important yield constraint in this staple crop. A mesophyll colonizer, Xoc differs from X. oryzae pv. oryzae (Xoo), which invades xylem to cause bacterial blight of rice. Both produce multiple distinct TAL effectors, type III-delivered proteins that transactivate effector-specific host genes. A TAL effector finds its target(s) via a partially degenerate code whereby the modular effector amino acid sequence identifies nucleotide sequences to which the protein binds. Virulence contributions of some Xoo TAL effectors have been shown, and their relevant targets, susceptibility (S) genes, identified, but the role of TAL effectors in leaf streak is uncharacterized. We used host transcript profiling to compare leaf streak to blight and to probe functions of Xoc TAL effectors. We found that Xoc and Xoo induce almost completely different host transcriptional changes. Roughly one in three genes upregulated by the pathogens is preceded by a candidate TAL effector binding element. Experimental analysis of the 44 such genes predicted to be Xoc TAL effector targets verified nearly half, and identified most others as false predictions. None of the Xoc targets is a known bacterial blight S gene. Mutational analysis revealed that Tal2g, which activates two genes, contributes to lesion expansion and bacterial exudation. Use of designer TAL effectors discriminated a sulfate transporter gene as the S gene. Across all targets, basal expression tended to be higher than genome-average, and induction moderate. Finally, machine learning applied to real vs. falsely predicted targets yielded a classifier that recalled 92% of the real targets with 88% precision, providing a tool for better target prediction in the future. Our study expands the number of known TAL effector targets, identifies a new class of S gene, and improves our ability to predict functional targeting. PMID:24586171

  19. Code-assisted discovery of TAL effector targets in bacterial leaf streak of rice reveals contrast with bacterial blight and a novel susceptibility gene.

    PubMed

    Cernadas, Raul A; Doyle, Erin L; Niño-Liu, David O; Wilkins, Katherine E; Bancroft, Timothy; Wang, Li; Schmidt, Clarice L; Caldo, Rico; Yang, Bing; White, Frank F; Nettleton, Dan; Wise, Roger P; Bogdanove, Adam J

    2014-02-01

    Bacterial leaf streak of rice, caused by Xanthomonas oryzae pv. oryzicola (Xoc) is an increasingly important yield constraint in this staple crop. A mesophyll colonizer, Xoc differs from X. oryzae pv. oryzae (Xoo), which invades xylem to cause bacterial blight of rice. Both produce multiple distinct TAL effectors, type III-delivered proteins that transactivate effector-specific host genes. A TAL effector finds its target(s) via a partially degenerate code whereby the modular effector amino acid sequence identifies nucleotide sequences to which the protein binds. Virulence contributions of some Xoo TAL effectors have been shown, and their relevant targets, susceptibility (S) genes, identified, but the role of TAL effectors in leaf streak is uncharacterized. We used host transcript profiling to compare leaf streak to blight and to probe functions of Xoc TAL effectors. We found that Xoc and Xoo induce almost completely different host transcriptional changes. Roughly one in three genes upregulated by the pathogens is preceded by a candidate TAL effector binding element. Experimental analysis of the 44 such genes predicted to be Xoc TAL effector targets verified nearly half, and identified most others as false predictions. None of the Xoc targets is a known bacterial blight S gene. Mutational analysis revealed that Tal2g, which activates two genes, contributes to lesion expansion and bacterial exudation. Use of designer TAL effectors discriminated a sulfate transporter gene as the S gene. Across all targets, basal expression tended to be higher than genome-average, and induction moderate. Finally, machine learning applied to real vs. falsely predicted targets yielded a classifier that recalled 92% of the real targets with 88% precision, providing a tool for better target prediction in the future. Our study expands the number of known TAL effector targets, identifies a new class of S gene, and improves our ability to predict functional targeting.

  20. The Bacterial Phosphotransferase System: New frontiers 50 years after its discovery

    PubMed Central

    Saier, Milton H.

    2015-01-01

    Summary In 1964, Kundig, Ghosh and Roseman reported the discovery of the phosphoenolpyruvate:sugar phosphotransferase system (PTS), which they subsequently proposed might catalyze sugar transport as well as sugar phosphorylation. What we have learned in the past 50 years, since its discovery, is that in addition to these primary functions, the PTS serves as a complex protein kinase system that regulates a wide variety of transport, metabolic and mutagenic processes as well as the expression of numerous genes. Recent operon- and genome-sequencing projects have revealed novel PTS protein-encoding genes, many of which have yet to be functionally defined. The current picture of the PTS is that of a complex system with ramifications in all aspects of cellular physiology. Moreover, its mosaic evolutionary history is unusual and intriguing. The PTS can be considered to serve many prokaryotes in capacities of communication and coordination as do the nervous systems of animals. PMID:26159069

  1. [My accidental discovery].

    PubMed

    Nakamura, Tatsuya

    2008-10-01

    We wonder what we should do in medical care besides daily routine work as a laboratory technician. I made a discovery in my routine laboratory work, which gave me a theme for my research. This led to me successfully completing a number of scientific research projects, and these experiences have enabled me to be able to give advice on appropriate treatments for infectious diseases in medical care. It was March 1999 when I identified Escherichia coli (E. coli) in an intra-abdominal abscess resistant to antibacterial agents. The E. coli was producing an enzyme, extended-spectrum-beta-lactamase (ESBL), that breaks down cephem-group antibiotics often used in Japan. Therefore, it was resistant to those antimicrobial agents. Detailed analysis was performed by us and researchers of the National Institute of Infectious Diseases, which revealed that the E. coli had a SHV12 genotype of ESBL. It was the first case report of this type of ESBL-producing E. coli infection in Japan. After this experience, I became interested in searching for the mechanism of resistance to antibiotics with various kinds of approaches, such as a method involving genomic analysis by the polymerase-chain reaction (PCR), therapeutic management of drug-resistant bacterial infection, and so on, through which I learned a series of investigative approaches. Since I had plenty of data and experiences generated from routine work, I could perform novel studies and obtained many interesting findings. I am feeding back these findings to routine work in order to improve my performance. From my experience, we should look for the seeds for research from routine work as much as possible, and knowledge and experience generated by resolving problems teaches us how to perform in a clinical setting. This may lead to the further development of our research, which, in turn, promotes the accumulation of knowledge and experience. This feed-forward cycle enables laboratory technicians to improve their quality of work.

  2. A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus

    PubMed Central

    Marques, M Carmen; Alonso-Cantabrana, Hugo; Forment, Javier; Arribas, Raquel; Alamar, Santiago; Conejero, Vicente; Perez-Amador, Miguel A

    2009-01-01

    Background Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation. Results We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis. Conclusion The new EST collection denotes an

  3. OPEN DATA FOR DISCOVERY SCIENCE.

    PubMed

    Payne, Philip R O; Huang, Kun; Shah, Nigam H; Tenenbaum, Jessica

    2016-01-01

    The modern healthcare and life sciences ecosystem is moving towards an increasingly open and data-centric approach to discovery science. This evolving paradigm is predicated on a complex set of information needs related to our collective ability to share, discover, reuse, integrate, and analyze open biological, clinical, and population level data resources of varying composition, granularity, and syntactic or semantic consistency. Such an evolution is further impacted by a concomitant growth in the size of data sets that can and should be employed for both hypothesis discovery and testing. When such open data can be accessed and employed for discovery purposes, a broad spectrum of high impact end-points is made possible. These span the spectrum from identification of de novo biomarker complexes that can inform precision medicine, to the repositioning or repurposing of extant agents for new and cost-effective therapies, to the assessment of population level influences on disease and wellness. Of note, these types of uses of open data can be either primary, wherein open data is the substantive basis for inquiry, or secondary, wherein open data is used to augment or enrich project-specific or proprietary data that is not open in and of itself. This workshop is concerned with the key challenges, opportunities, and methodological best practices whereby open data can be used to drive the advancement of discovery science in all of the aforementioned capacities.

  4. Transcriptome Analysis and Discovery of Genes Involved in Immune Pathways from Coelomocytes of Sea Cucumber (Apostichopus japonicus) after Vibrio splendidus Challenge.

    PubMed

    Gao, Qiong; Liao, Meijie; Wang, Yingeng; Li, Bin; Zhang, Zheng; Rong, Xiaojun; Chen, Guiping; Wang, Lan

    2015-07-17

    Vibrio splendidus is identified as one of the major pathogenic factors for the skin ulceration syndrome in sea cucumber (Apostichopus japonicus), which has vastly limited the development of the sea cucumber culture industry. In order to screen the immune genes involved in the response to Vibrio splendidus challenge in sea cucumber and explore the molecular mechanism of this process, the related transcriptome and gene expression profiling of resistant and susceptible biotypes of sea cucumber with Vibrio splendidus challenge were collected for analysis. A total of 319,455,942 trimmed reads were obtained, which were assembled into 186,658 contigs. After that, 89,891 representative contigs (without isoform) were clustered. The analysis of the gene expression profiling identified 358 differentially expressed genes (DEGs) in the bacterial-resistant group, and 102 DEGs in the bacterial-susceptible group, compared with the control group. According to the reported references and annotation information from BLAST, GO and KEGG, 30 putative bacterial-resistant genes and 19 putative bacterial-susceptible genes were identified from DEGs. The qRT-PCR results were consistent with the RNA-Seq results. Furthermore, many DEGs were involved in immune signaling related pathways, such as Endocytosis, Lysosome, MAPK, Chemokine and the ERBB signaling pathway.

  5. Transcriptome Analysis and Discovery of Genes Involved in Immune Pathways from Coelomocytes of Sea Cucumber (Apostichopus japonicus) after Vibrio splendidus Challenge

    PubMed Central

    Gao, Qiong; Liao, Meijie; Wang, Yingeng; Li, Bin; Zhang, Zheng; Rong, Xiaojun; Chen, Guiping; Wang, Lan

    2015-01-01

    Vibrio splendidus is identified as one of the major pathogenic factors for the skin ulceration syndrome in sea cucumber (Apostichopus japonicus), which has vastly limited the development of the sea cucumber culture industry. In order to screen the immune genes involved in the response to Vibrio splendidus challenge in sea cucumber and explore the molecular mechanism of this process, the related transcriptome and gene expression profiling of resistant and susceptible biotypes of sea cucumber with Vibrio splendidus challenge were collected for analysis. A total of 319,455,942 trimmed reads were obtained, which were assembled into 186,658 contigs. After that, 89,891 representative contigs (without isoform) were clustered. The analysis of the gene expression profiling identified 358 differentially expressed genes (DEGs) in the bacterial-resistant group, and 102 DEGs in the bacterial-susceptible group, compared with the control group. According to the reported references and annotation information from BLAST, GO and KEGG, 30 putative bacterial-resistant genes and 19 putative bacterial-susceptible genes were identified from DEGs. The qRT-PCR results were consistent with the RNA-Seq results. Furthermore, many DEGs were involved in immune signaling related pathways, such as Endocytosis, Lysosome, MAPK, Chemokine and the ERBB signaling pathway. PMID:26193268

  6. Environmental Regulation of Plant Gene Expression: An RT-qPCR Laboratory Project for an Upper-Level Undergraduate Biochemistry or Molecular Biology Course

    ERIC Educational Resources Information Center

    Eickelberg, Garrett J.; Fisher, Alison J.

    2013-01-01

    We present a novel laboratory project employing "real-time" RT-qPCR to measure the effect of environment on the expression of the "FLOWERING LOCUS C" gene, a key regulator of floral timing in "Arabidopsis thaliana" plants. The project requires four 3-hr laboratory sessions and is aimed at upper-level undergraduate…

  7. Discoveries of nicotinamide riboside as a nutrient and conserved NRK genes establish a Preiss-Handler independent route to NAD+ in fungi and humans.

    PubMed

    Bieganowski, Pawel; Brenner, Charles

    2004-05-14

    NAD+ is essential for life in all organisms, both as a coenzyme for oxidoreductases and as a source of ADPribosyl groups used in various reactions, including those that retard aging in experimental systems. Nicotinic acid and nicotinamide were defined as the vitamin precursors of NAD+ in Elvehjem's classic discoveries of the 1930s. The accepted view of eukaryotic NAD+ biosynthesis, that all anabolism flows through nicotinic acid mononucleotide, was challenged experimentally and revealed that nicotinamide riboside is an unanticipated NAD+ precursor in yeast. Nicotinamide riboside kinases from yeast and humans essential for this pathway were identified and found to be highly specific for phosphorylation of nicotinamide riboside and the cancer drug tiazofurin. Nicotinamide riboside was discovered as a nutrient in milk, suggesting that nicotinamide riboside is a useful compound for elevation of NAD+ levels in humans.

  8. Analysis of expressed sequence tags from Actinidia: applications of a cross species EST database for gene discovery in the areas of flavor, health, color and ripening

    PubMed Central

    Crowhurst, Ross N; Gleave, Andrew P; MacRae, Elspeth A; Ampomah-Dwamena, Charles; Atkinson, Ross G; Beuning, Lesley L; Bulley, Sean M; Chagne, David; Marsh, Ken B; Matich, Adam J; Montefiori, Mirco; Newcomb, Richard D; Schaffer, Robert J; Usadel, Björn; Allan, Andrew C; Boldingh, Helen L; Bowen, Judith H; Davy, Marcus W; Eckloff, Rheinhart; Ferguson, A Ross; Fraser, Lena G; Gera, Emma; Hellens, Roger P; Janssen, Bart J; Klages, Karin; Lo, Kim R; MacDiarmid, Robin M; Nain, Bhawana; McNeilage, Mark A; Rassam, Maysoon; Richardson, Annette C; Rikkerink, Erik HA; Ross, Gavin S; Schröder, Roswitha; Snowden, Kimberley C; Souleyre, Edwige JF; Templeton, Matt D; Walton, Eric F; Wang, Daisy; Wang, Mindy Y; Wang, Yanming Y; Wood, Marion; Wu, Rongmei; Yauk, Yar-Khing; Laing, William A

    2008-01-01

    Background Kiwifruit (Actinidia spp.) are a relatively new, but economically important crop grown in many different parts of the world. Commercial success is driven by the development of new cultivars with novel consumer traits including flavor, appearance, healthful components and convenience. To increase our understanding of the genetic diversity and gene-based control of these key traits in Actinidia, we have produced a collection of 132,577 expressed sequence tags (ESTs). Results The ESTs were derived mainly from four Actinidia species (A. chinensis, A. deliciosa, A. arguta and A. eriantha) and fell into 41,858 non redundant clusters (18,070 tentative consensus sequences and 23,788 EST singletons). Analysis of flavor and fragrance-related gene families (acyltransferases and carboxylesterases) and pathways (terpenoid biosynthesis) is presented in comparison with a chemical analysis of the compounds present in Actinidia including esters, acids, alcohols and terpenes. ESTs are identified for most genes in color pathways controlling chlorophyll degradation and carotenoid biosynthesis. In the health area, data are presented on the ESTs involved in ascorbic acid and quinic acid biosynthesis showing not only that genes for many of the steps in these pathways are represented in the database, but that genes encoding some critical steps are absent. In the convenience area, genes related to different stages of fruit softening are identified. Conclusion This large EST resource will allow researchers to undertake the tremendous challenge of understanding the molecular basis of genetic diversity in the Actinidia genus as well as provide an EST resource for comparative fruit genomics. The various bioinformatics analyses we have undertaken demonstrate the extent of coverage of ESTs for genes encoding different biochemical pathways in Actinidia. PMID:18655731

  9. Net present value approaches for drug discovery.

    PubMed

    Svennebring, Andreas M; Wikberg, Jarl Es

    2013-12-01

    Three dedicated approaches to the calculation of the risk-adjusted net present value (rNPV) in drug discovery projects under different assumptions are suggested. The probability of finding a candidate drug suitable for clinical development and the time to the initiation of the clinical development are assumed to be flexible, in contrast to previously used models. The rNPV of the post-discovery cash flows is calculated as the probability-weighted average of the rNPV at each potential time of initiation of clinical development. Practical considerations for setting probability rates, in particular during the initiation and termination of a project, are discussed.
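
    The probability-weighted average mentioned in this abstract can be illustrated with a minimal sketch. The function name, the example probabilities and the example values are hypothetical and are not taken from the paper, and the three approaches the authors propose are not reproduced here. The sketch assumes each rNPV value is already expressed in present-value terms.

```python
def weighted_rnpv(start_probs, rnpv_by_start):
    """Probability-weighted average of rNPV over possible start times.

    start_probs[t]   : assumed probability that clinical development starts in period t
    rnpv_by_start[t] : assumed rNPV of the post-discovery cash flows for a start in period t

    The probabilities may sum to less than 1; the remainder is the
    probability that no candidate drug is found (contributing zero value).
    """
    if len(start_probs) != len(rnpv_by_start):
        raise ValueError("inputs must have the same length")
    if any(p < 0 for p in start_probs) or sum(start_probs) > 1.0 + 1e-9:
        raise ValueError("probabilities must be non-negative and sum to at most 1")
    return sum(p * v for p, v in zip(start_probs, rnpv_by_start))


# Hypothetical example: three possible start years with illustrative values.
example = weighted_rnpv([0.1, 0.2, 0.1], [100.0, 90.0, 80.0])
```

    With these illustrative inputs the weighted value is 0.1 x 100 + 0.2 x 90 + 0.1 x 80 = 36.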

  10. Final report on LDRD project : elucidating performance of proton-exchange-membrane fuel cells via computational modeling with experimental discovery and validation.

    SciTech Connect

    Wang, Chao Yang (Pennsylvania State University, University Park, PA); Pasaogullari, Ugur (Pennsylvania State University, University Park, PA); Noble, David R.; Siegel, Nathan P.; Hickner, Michael A.; Chen, Ken Shuang

    2006-11-01

    In this report, we document the accomplishments in our Laboratory Directed Research and Development project in which we employed a technical approach of combining experiments with computational modeling and analyses to elucidate the performance of hydrogen-fed proton exchange membrane fuel cells (PEMFCs). In the first part of this report, we document our focused efforts on understanding water transport in and removal from a hydrogen-fed PEMFC. Using a transparent cell, we directly visualized the evolution and growth of liquid-water droplets at the gas diffusion layer (GDL)/gas flow channel (GFC) interface. We further carried out a detailed experimental study to observe, via direct visualization, the formation, growth, and instability of water droplets at the GDL/GFC interface using a specially designed apparatus, which simulates the cathode operation of a PEMFC. We developed a simplified model, based on our experimental observation and data, for predicting the onset of water-droplet instability at the GDL/GFC interface. Using a state-of-the-art neutron imaging instrument available at NIST (National Institute of Standards and Technology), we probed liquid-water distribution inside an operating PEMFC under a variety of operating conditions and investigated effects of evaporation due to local heating by waste heat on water removal. Moreover, we developed computational models for analyzing the effects of micro-porous layer on net water transport across the membrane and GDL anisotropy on the temperature and water distributions in the cathode of a PEMFC. We further developed a two-phase model based on the multiphase mixture formulation for predicting the liquid saturation, pressure drop, and flow maldistribution across the PEMFC cathode channels. In the second part of this report, we document our efforts on modeling the electrochemical performance of PEMFCs. We developed a constitutive model for predicting proton conductivity in polymer electrolyte membranes and compared

  11. Environmental regulation of plant gene expression: an RT-qPCR laboratory project for an upper-level undergraduate biochemistry or molecular biology course.

    PubMed

    Eickelberg, Garrett J; Fisher, Alison J

    2013-01-01

    We present a novel laboratory project employing "real-time" RT-qPCR to measure the effect of environment on the expression of the FLOWERING LOCUS C gene, a key regulator of floral timing in Arabidopsis thaliana plants. The project requires four 3-hr laboratory sessions and is aimed at upper-level undergraduate students in biochemistry or molecular biology courses. The project provides students with hands-on experience with RT-qPCR, the current "gold standard" for gene expression analysis, including detailed data analysis using the common 2^(-ΔΔCT) method. Moreover, it provides a convenient starting point for many inquiry-driven projects addressing diverse questions concerning ecological biochemistry, naturally occurring genetic variation, developmental biology, and the regulation of gene expression in nature.
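
    For readers unfamiliar with the ΔΔCT calculation named in this abstract, the following is a minimal sketch of the standard arithmetic. The function name and the example CT values are hypothetical and not taken from the paper. The method assumes roughly 100% amplification efficiency, so that each PCR cycle doubles the product.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression of a target gene by the 2^(-ddCT) method.

    Each argument is a threshold cycle (CT) value:
    target gene vs. reference (housekeeping) gene,
    in the treated sample vs. the control sample.
    """
    # Normalize the target gene to the reference gene within each sample.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the control sample.
    delta_delta = delta_treated - delta_control
    # A higher CT means less template, hence the negative exponent.
    return 2.0 ** (-delta_delta)


# Hypothetical CT values: the target crosses threshold 2 cycles later
# (relative to the reference gene) in the treated sample.
ratio = fold_change_ddct(26.0, 18.0, 24.0, 18.0)
```

    With these illustrative values, ΔΔCT is 2 and the treated sample expresses the target at one quarter of the control level.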

  12. Then and now: use of 16S rDNA gene sequencing for bacterial identification and discovery of novel bacteria in clinical microbiology laboratories.

    PubMed

    Woo, P C Y; Lau, S K P; Teng, J L L; Tse, H; Yuen, K-Y

    2008-10-01

    In the last decade, as a result of the widespread use of PCR and DNA sequencing, 16S rDNA sequencing has played a pivotal role in the accurate identification of bacterial isolates and the discovery of novel bacteria in clinical microbiology laboratories. For bacterial identification, 16S rDNA sequencing is particularly important in the case of bacteria with unusual phenotypic profiles, rare bacteria, slow-growing bacteria, uncultivable bacteria and culture-negative infections. Not only has it provided insights into aetiologies of infectious disease, but it also helps clinicians in choosing antibiotics and in determining the duration of treatment and infection control procedures. With the use of 16S rDNA sequencing, 215 novel bacterial species, 29 of which belong to novel genera, have been discovered from human specimens in the past 7 years of the 21st century (2001-2007). One hundred of the 215 novel species, 15 belonging to novel genera, have been found in four or more subjects. The largest number of novel species discovered were of the genera Mycobacterium (n = 12) and Nocardia (n = 6). The oral cavity/dental-related specimens (n = 19) and the gastrointestinal tract (n = 26) were the most important sites for discovery and/or reservoirs of novel species. Among the 100 novel species, Streptococcus sinensis, Laribacter hongkongensis, Clostridium hathewayi and Borrelia spielmanii have been most thoroughly characterized, with the reservoirs and routes of transmission documented, and S. sinensis, L. hongkongensis and C. hathewayi have been found globally. One of the greatest hurdles in putting 16S rDNA sequencing into routine use in clinical microbiology laboratories is automation of the technology. The only step that can be automated at the moment is input of the 16S rDNA sequence of the bacterial isolate for identification into one of the software packages that will generate the result of the identity of the isolate on the basis of its sequence database. However

  13. Code-assisted discovery of TAL effector targets in bacterial leaf streak of rice reveals contrast with bacterial blight and a novel susceptibility gene

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Transcription activator-like (TAL) effectors found in Xanthomonas spp. promote bacterial growth and plant susceptibility by binding specific DNA sequences, or effector-binding elements (EBEs), and inducing host gene expression. In this study, we have found substantially different transcriptional pro...

  14. Discovery, Annotation, and Functional Analysis of Long Noncoding RNAs Controlling Cell-Cycle Gene Expression and Proliferation in Breast Cancer Cells.

    PubMed

    Sun, Miao; Gadad, Shrikanth S; Kim, Dae-Seok; Kraus, W Lee

    2015-08-20

    We describe a computational approach that integrates GRO-seq and RNA-seq data to annotate long noncoding RNAs (lncRNAs), with increased sensitivity for low-abundance lncRNAs. We used this approach to characterize the lncRNA transcriptome in MCF-7 human breast cancer cells, including >700 previously unannotated lncRNAs. We then used information about the (1) transcription of lncRNA genes from GRO-seq, (2) steady-state levels of lncRNA transcripts in cell lines and patient samples from RNA-seq, and (3) histone modifications and factor binding at lncRNA gene promoters from ChIP-seq to explore lncRNA gene structure and regulation, as well as lncRNA transcript stability, regulation, and function. Functional analysis of selected lncRNAs with altered expression in breast cancers revealed roles in cell proliferation, regulation of an E2F-dependent cell-cycle gene expression program, and estrogen-dependent mitogenic growth. Collectively, our studies demonstrate the use of an integrated genomic and molecular approach to identify and characterize growth-regulating lncRNAs in cancers.

  15. Discovery of candidate disease genes in ENU-induced mouse mutants by large-scale sequencing, including a splice-site mutation in nucleoredoxin

    Technology Transfer Automated Retrieval System (TEKTRAN)

    An accurate and precisely annotated genome assembly is a fundamental requirement for functional genomic analysis. Here, the complete DNA sequence and gene annotation of mouse Chromosome 11 was used to test the efficacy of large-scale sequencing for mutation identification. We re-sequenced the 14,000...

  16. Targeted discovery of single-nucleotide polymorphisms in an unmarked wheat chromosomal region containing the Hessian fly resistance gene H33

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The highly effective Hessian fly-resistance gene, H33, was introgressed from durum wheat into common wheat and genetically mapped to chromosome 3AS, in previous research. However, H33 located to a region that is well-known to be devoid of molecular markers, with the closest flanking simple sequence ...

  17. Transcriptome sequencing and annotation of the microalgae Dunaliella tertiolecta: Pathway description and gene discovery for production of next-generation biofuels

    PubMed Central

    2011-01-01

    Background Biodiesel or ethanol derived from lipids or starch produced by microalgae may overcome many of the sustainability challenges previously ascribed to petroleum-based fuels and first generation plant-based biofuels. The paucity of microalgae genome sequences, however, limits gene-based biofuel feedstock optimization studies. Here we describe the sequencing and de novo transcriptome assembly for the non-model microalgae species, Dunaliella tertiolecta, and identify pathways and genes of importance related to biofuel production. Results Next generation DNA pyrosequencing technology applied to D. tertiolecta transcripts produced 1,363,336 high quality reads with an average length of 400 bases. Following quality and size trimming, ~45% of the high quality reads were assembled into 33,307 isotigs with a 31-fold coverage and 376,482 singletons. Assembled sequences and singletons were subjected to BLAST similarity searches and annotated with Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology (KO) identifiers. These analyses identified the majority of lipid and starch biosynthesis and catabolism pathways in D. tertiolecta. Conclusions The construction of metabolic pathways involved in the biosynthesis and catabolism of fatty acids, triacylglycerols, and starch in D. tertiolecta as well as the assembled transcriptome provide a foundation for the molecular genetics and functional genomics required to direct metabolic engineering efforts that seek to enhance the quantity and character of microalgae-based biofuel feedstock. PMID:21401935

  18. Using comparative genomics to drive new discoveries in microbiology.

    PubMed

    Haft, Daniel H

    2015-02-01

    Bioinformatics looks to many microbiologists like a service industry. In this view, annotation starts with what is known from experiments in the lab, makes reasonable inferences of which genes match other genes in function, builds databases to make all that we know accessible, but creates nothing truly new. Experiments lead, then biocuration and computational biology follow. But the astounding success of genome sequencing is changing the annotation paradigm. Every genome sequenced is an intercepted coded message from the microbial world, and as all cryptographers know, it is easier to decode a thousand messages than a single message. Some biology is best discovered not by phenomenology, but by decoding genome content, forming hypotheses, and doing the first few rounds of validation computationally. Through such reasoning, a role and function may be assigned to a protein with no sequence similarity to any protein yet studied. Experimentation can follow after the discovery to cement and to extend the findings. Unfortunately, this approach remains so unfamiliar to most bench scientists that lab work and comparative genomics typically segregate to different teams working on unconnected projects. This review will discuss several themes in comparative genomics as a discovery method, including highly derived data, use of patterns of design to reason by analogy, and in silico testing of computationally generated hypotheses.

  19. Discovery of miR-mRNA interactions via simultaneous Bayesian inference of gene networks and clusters using sequence-based predictions and expression data.

    PubMed

    Godsey, Brian

    2013-07-10

    MicroRNAs (miRs) are known to interfere with mRNA expression, and much work has been put into predicting and inferring miR-mRNA interactions. Both sequence-based interaction predictions and interaction inference based on expression data have proven somewhat successful; furthermore, models that combine the two methods have had even more success. In this paper, I further refine and enrich the methods of miR-mRNA interaction discovery by integrating a Bayesian clustering algorithm into a model of prediction-enhanced miR-mRNA target inference, creating an algorithm called PEACOAT, which is written in the R language. I show that PEACOAT improves the inference of miR-mRNA target interactions using both simulated data and a data set of microarrays from samples of multiple myeloma patients. In simulated networks of 25 miRs and mRNAs, our methods using clustering can improve inference in roughly two-thirds of cases, and in the multiple myeloma data set, KEGG pathway enrichment was found to be more significant with clustering than without. Our findings are consistent with previous work in clustering of non-miR genetic networks and indicate that there could be a significant advantage to clustering of miR and mRNA expression data as a part of interaction inference.

  20. STS-92 Discovery Launch

    NASA Technical Reports Server (NTRS)

    2000-01-01

    Viewed from across the waters of Banana Creek, clouds of smoke and steam are illuminated by the flames from Space Shuttle Discovery's perfect on-time launch at 7:17 p.m. EDT. Discovery carries a crew of seven on a construction flight to the International Space Station. Discovery also carries a payload that includes the Integrated Truss Structure Z-1, first of 10 trusses that will form the backbone of the Space Station, and the third Pressurized Mating Adapter that will provide a Shuttle docking port for solar array installation on the sixth Station flight and Lab installation on the seventh Station flight. Discovery's landing is expected Oct. 22 at 2:10 p.m. EDT.

  1. Computational drug discovery

    PubMed Central

    Ou-Yang, Si-sheng; Lu, Jun-yan; Kong, Xiang-qian; Liang, Zhong-jie; Luo, Cheng; Jiang, Hualiang

    2012-01-01

    Computational drug discovery is an effective strategy for accelerating and economizing drug discovery and development process. Because of the dramatic increase in the availability of biological macromolecule and small molecule information, the applicability of computational drug discovery has been extended and broadly applied to nearly every stage in the drug discovery and development workflow, including target identification and validation, lead discovery and optimization and preclinical tests. Over the past decades, computational drug discovery methods such as molecular docking, pharmacophore modeling and mapping, de novo design, molecular similarity calculation and sequence-based virtual screening have been greatly improved. In this review, we present an overview of these important computational methods, platforms and successful applications in this field. PMID:22922346

  2. Genomic instability, driver genes and cell selection: Projections from cancer to stem cells.

    PubMed

    Ben-David, Uri

    2015-04-01

    Cancer cells and stem cells share many traits, including a tendency towards genomic instability. Human cancers exhibit tumor-specific genomic aberrations, which often affect their malignancy and drug response. During their culture propagation, human pluripotent stem cells (hPSCs) also acquire characteristic genomic aberrations, which may have significant impact on their molecular and cellular phenotypes. These aberrations vary in size from single nucleotide alterations to copy number alterations to whole chromosome gains. A prominent challenge in both cancer and stem cell research is to identify "driver aberrations" that confer a selection advantage, and "driver genes" that underlie the recurrence of these aberrations. Following principles that are already well-established in cancer research, candidate driver genes have also been suggested in hPSCs. Experimental validation of the functional role of such candidates can uncover whether these are bona fide driver genes. The identification of driver genes may bring us closer to a mechanistic understanding of the genomic instability of stem cells. Guided by terminologies and methodologies commonly applied in cancer research, such understanding may have important ramifications for both stem cell and cancer biology. This article is part of a Special Issue entitled: Stress as a fundamental theme in cell plasticity.

  3. Mapping Our Genes--The Genome Projects: How Big, How Fast?

    ERIC Educational Resources Information Center

    Congress of the U.S., Washington, DC. Office of Technology Assessment.

    Scientific and technical journals in biology and medicine in recent years have extensively covered a debate about whether and how to determine the function and order of human genes on human chromosomes and when to determine the sequence of molecular building blocks that comprise DNA in those chromosomes. In 1987, these issues rose to become part…

  4. 454 pyrosequencing project identifying expressed genes from the horn fly, Haematobia irritans

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We used an EST approach to initiate a study of the genome of the horn fly, Haematobia irritans and have used 454 pyrosequencing techniques to sequence 73,512, 100,603, 71,550, and 85,769 expressed genes from the egg, first instar larvae, adult male, and adult female lifestages of the horn fly. cD...

  5. Radiation Detection Material Discovery Initiative at PNNL

    NASA Astrophysics Data System (ADS)

    Milbrath, Brian

    2006-05-01

    Today's security threats are being met with 30-year-old radiation technology. Discovery of new radiation detection materials is currently a slow and Edisonian process. With heightened concerns over nuclear proliferation, terrorism and unconventional warfare, an alternative strategy for identification and development of potential radiation detection materials must be adopted. Through the Radiation Detection Materials Discovery Initiative, PNNL focuses on the science-based discovery of next generation materials for radiation detection by addressing three "grand challenges": fundamental understanding of radiation detection, identification of new materials, and accelerating the discovery process. The new initiative has eight projects addressing these challenges, which will be described, including early work, paths forward and the opportunities for collaboration.

  6. Discovery of toxin-encoding genes from the false viper Macropisthodon rudis, a rear-fanged snake, by transcriptome analysis of venom gland.

    PubMed

    Zhang, Zhixiao; Zhang, Xi; Hu, Tingsong; Zhou, Weiguo; Cui, Qinghua; Tian, Jing; Zheng, Ying; Fan, Quanshui

    2015-11-01

    Although rear-fanged snakes are often considered as non-threatening to humans, some species are lethal or medically hazardous. The toxin components and bioactivities of front-fanged snakes have been extensively studied; however, only limited research has explored the venoms of rear-fanged snakes. The false viper, Macropisthodon rudis, is widespread in southern China, but little is known about the toxins that this snake produces. Here, we analyzed the transcriptome of the venom gland of M. rudis using high-throughput sequencing with an Illumina HiSeq 2000. The raw data were assembled and annotated using public databases. The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and gene ontology (GO) were analyzed. Using sequence comparisons, snake venom metalloproteinases (SVMPs) and a phosphodiesterase (PDE) were discovered in the venom gland of M. rudis.

  7. Whole-Genome Resequencing of Holstein Bulls for Indel Discovery and Identification of Genes Associated with Milk Composition Traits in Dairy Cattle

    PubMed Central

    Jiang, Jianping; Gao, Yahui; Hou, Yali; Li, Wenhui; Zhang, Shengli; Zhang, Qin

    2016-01-01

    The use of whole-genome resequencing to obtain more information on genetic variation could produce a range of benefits for the dairy cattle industry, especially with regard to increasing milk production and improving milk composition. In this study, we sequenced the genomes of eight Holstein bulls from four half- or full-sib families, with high and low estimated breeding values (EBVs) of milk protein percentage and fat percentage at an average effective depth of 10×, using Illumina sequencing. Over 0.9 million nonredundant short insertions and deletions (indels) [1–49 base pairs (bp)] were obtained. Among them, 3,625 indels that were polymorphic between the high and low groups of bulls were revealed and subjected to further analysis. The vast majority (76.67%) of these indels were novel. Follow-up validation assays confirmed that most (70%) of the randomly selected indels represented true variations. The indels that were polymorphic between the two groups were annotated based on the cattle genome sequence assembly (UMD3.1.69); as a result, nearly 1,137 of them were found to be located within 767 annotated genes, only 5 (0.138%) of which were located in exons. Then, by integrated analysis of the 767 genes with known quantitative trait loci (QTL); significant single-nucleotide polymorphisms (SNPs) previously identified by genome-wide association studies (GWASs) to be associated with bovine milk protein and fat traits; and the well-known pathways involved in protein, fat synthesis, and metabolism, we identified a total of 11 promising candidate genes potentially affecting milk composition traits. These were FCGR2B, CENPE, RETSAT, ACSBG2, NFKB2, TBC1D1, NLK, MAP3K1, SLC30A2, ANGPT1 and UGDH. Our findings provide a basis for further study and reveal key genes for milk composition traits in dairy cattle. PMID:28030618

  8. Insights into Hepatopancreatic Functions for Nutrition Metabolism and Ovarian Development in the Crab Portunus trituberculatus: Gene Discovery in the Comparative Transcriptome of Different Hepatopancreas Stages

    PubMed Central

    Liu, Zhijun; Zheng, Huajun; Cheng, Yongxu

    2014-01-01

    The crustacean hepatopancreas has different functions including absorption, storage of nutrients and vitellogenesis during growth, and ovarian development. However, genetic information on the biological functions of the crustacean hepatopancreas during such processes is limited. The swimming crab, Portunus trituberculatus, is a commercially important species for both aquaculture and fisheries in the Asia-Pacific region. This study compared the transcriptome in the hepatopancreas of female P. trituberculatus during the growth and ovarian maturation stages by 454 high-throughput pyrosequencing and bioinformatics. The goal was to discover genes in the hepatopancreas involved in food digestion, nutrition metabolism and ovarian development, and to identify patterns of gene expression during growth and ovarian maturation. Our transcriptome produced 303,450 reads with an average length of 351 bp, and the high quality reads were assembled into 21,635 contigs and 31,844 singlets. Based on BLASTP searches of the deduced protein sequences, there were 7,762 contigs and 4,098 singlets with functional annotation. Further analysis revealed 33,427 unigenes with ORFs, including 17,388 contigs and 16,039 singlets in the hepatopancreas, while only 7,954 unigenes (5,691 contigs and 2,263 singlets) with the predicted protein sequences were annotated with biological functions. The deduced protein sequences were assigned to 3,734 GO terms, 25 COG categories and 294 specific pathways. Furthermore, there were 14, 534, and 22 identified unigenes involved in food digestion, nutrition metabolism and ovarian development, respectively. 212 differentially expressed genes (DEGs) were found between the growth and endogenous stage of the hepatopancreas, while there were 382 DEGs between the endogenous and exogenous stage hepatopancreas. Our results not only enhance the understanding of crustacean hepatopancreatic functions during growth and ovarian development, but also represent a basis for

  9. Gene-centric view on the human proteome project: the example of the Russian roadmap for chromosome 18.

    PubMed

    Archakov, Alexander; Aseev, Alexander; Bykov, Victor; Grigoriev, Anatoly; Govorun, Vadim; Ivanov, Vadim; Khlunov, Alexander; Lisitsa, Andrey; Mazurenko, Sergey; Makarov, Alexander A; Ponomarenko, Elena; Sagdeev, Renad; Skryabin, Konstantin

    2011-05-01

    During the 2010 Human Proteome Organization Congress in Sydney, a gene-centric approach emerged as a feasible and tractable scaffold for assemblage of the Human Proteome Project. Bringing the gene-centric principle into practice, a roadmap for the 18th chromosome was drafted, postulating the limited sensitivity of analytical methods as a serious bottleneck in proteomics. In the context of the sensitivity problem, we refer to the "copy number of protein molecules" as a measurable assessment of protein abundance. The roadmap is focused on the development of technology to attain the low- and ultralow-"copied" portion of the proteome. The roadmap merges the genomic, transcriptomic and proteomic levels to identify the majority of the 285 proteins from the 18th chromosome, the master proteins. A master protein is the primary translation product of the coding sequence and resembles at least one of the known isoforms coded by the gene. The executive phase of the roadmap includes the expansion of the study of the master proteins with alternate splicing, single amino acid polymorphisms (SAPs) and post-translational modifications. In implementing the roadmap, Russian scientists are expecting to establish proteomic technologies for integrating MS and atomic force microscopy (AFM). These technologies are anticipated to unlock the value of new biomarkers at a detection limit of 10^-18 M, i.e. 1 protein copy per 1 μL of plasma. The roadmap plan is posted at www.proteome.ru/en/roadmap/ and a forum for discussion of the document is supported.
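
    The correspondence between a molar detection limit and a copy number per microliter, as quoted in this abstract, can be checked with a short calculation. The function name below is hypothetical; the only inputs are the concentration stated in the abstract and the Avogadro constant.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def copies_per_microliter(molar_concentration):
    """Convert a molar concentration (mol/L) to molecules per microliter."""
    molecules_per_liter = molar_concentration * AVOGADRO
    return molecules_per_liter * 1e-6  # 1 microliter = 1e-6 liter


# Detection limit quoted in the abstract: 1e-18 mol/L.
copies = copies_per_microliter(1e-18)
```

    The result is about 0.6 molecules per microliter, which agrees with the abstract's figure of roughly one protein copy per microliter to within an order of magnitude.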

  10. Discovery of a novel neuroprotective compound, AS1219164, by high-throughput chemical screening of a newly identified apoptotic gene marker.

    PubMed

    Yamazaki, Takao; Muramoto, Masakazu; Okitsu, Osamu; Morikawa, Noriyuki; Kita, Yasuhiro

    2011-11-01

    We have reported that tacrolimus (FK506), an immunosuppressive drug, and diclofenac, a non-steroidal anti-inflammatory drug, possess different modes of neuroprotective action. FK506 suppresses only thapsigargin-induced apoptosis in neuroblastoma SH-SY5Y cells while diclofenac reverses tunicamycin-induced as well as thapsigargin-induced apoptosis. The aim of this study is to discover novel compounds that exert neuroprotective properties by using the transcriptional response of a newly identified gene, which was regulated by both FK506 and diclofenac, as a surrogate screening marker in high-throughput chemical screening and characterize the compounds in comparison with FK506 and diclofenac. Using a microarray with 4504 human cDNAs and quantitative RT-PCR, two genes as apoptotic markers, transmembrane protein 100 (TMEM100) and limb-bud and heart (LBH), were identified because the thapsigargin-induced elevations in their mRNA levels were reversed by both FK506 and diclofenac. A luciferase reporter assay with a TMEM100 promoter region was applied to high-throughput chemical screening. AS1219164, {3-[(E)-2-{5-[(E)-2-pyridin-4-ylvinyl]pyridin-3-yl} vinyl]aniline}, suppressed thapsigargin-induced transactivation of the TMEM100 gene and reversed thapsigargin-induced increases in TMEM100 and LBH mRNA levels in SH-SY5Y cells, similar to the effects of FK506 and diclofenac. Furthermore, AS1219164 protected against SH-SY5Y cell death induced by four apoptotic agents including thapsigargin, similar to diclofenac, but was more potent than diclofenac, while FK506 only showed protective effects against thapsigargin-induced cell death. In conclusion, a novel neuroprotective compound, AS1219164, was discovered by high-throughput chemical screening using a reporter assay with the TMEM100 gene promoter regulated by both FK506 and diclofenac. Reporter assay using the promoter region of a gene under pharmacological and physiological transcriptional regulation would be well suited for use

  11. Pattern Discovery in Time-Ordered Data

    SciTech Connect

    CONRAD, GREGORY N.; BRITANIK, JOHN M.; DELAND, SHARON M.; JENKIN, CHRISTINA L.

    2002-02-01

    This report describes the results of a Laboratory-Directed Research and Development project on techniques for pattern discovery in discrete event time series data. In this project, we explored two different aspects of the pattern matching/discovery problem. The first aspect studied was the use of Dynamic Time Warping for pattern matching in continuous data. In essence, DTW is a technique for aligning time series along the time axis to optimize the similarity measure. The second aspect studied was techniques for discovering patterns in discrete event data. We developed a pattern discovery tool based on adaptations of the A-priori and GSP (Generalized Sequential Pattern mining) algorithms. We then used the tool on three different application areas--unattended monitoring system data from a storage magazine, computer network intrusion detection, and analysis of robot training data.
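
    The time-axis alignment that this abstract attributes to Dynamic Time Warping can be sketched with the textbook dynamic-programming formulation. This is not the report's implementation: the function name, the absolute-difference local cost and the example series are assumptions made for illustration only.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the minimum cumulative cost of aligning the first i
    points of `a` with the first j points of `b`.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local distance
            # Extend the cheapest of the three allowed predecessor alignments:
            # advance in `a` only, advance in `b` only, or advance in both.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


# The second series repeats one sample; warping absorbs the repeat.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
offset = dtw_distance([0, 0], [1, 1])
```

    In this sketch the first pair aligns with zero cost because the repeated sample is matched twice, whereas the second pair differs by a constant offset that warping cannot remove.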

  12. Stochastic Analysis of Gene Regulatory Networks using Finite State Projections and Singular Perturbation

    DTIC Science & Technology

    2006-01-01

    Khammash Department of Mechanical Engineering University of California Santa Barbara, CA 93106-5070 Abstract—Considerable recent experimental evidence sug...relevant statistics of the modeled processes. However, the complexity of gene regulatory networks poses serious computational difficulties and makes any...resulting slow manifold FSP algorithm on a simple example arising in the cellular heat shock response mechanism . I. INTRODUCTION Through evolution living

  13. Projecting 2D gene expression data into 3D and