Science.gov

Sample records for gene discovery project

  1. Peanut EST Project and Gene Discovery

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The overall goal of this project is to develop functional tools and resources for the peanut research community to better understand the disease resistance and seed protein synthesis and diversity in order to advance towards improving host resistance to diseases and mitigating food safety issues suc...

  2. Progress in maize gene discovery: a project update.

    PubMed

    Lunde, China F; Morrow, Darren J; Roy, Laura M; Walbot, Virginia

    2003-03-01

    The Maize Gene Discovery Project (MGDP) is a 5-year NSF-funded plant genome initiative that began in 1998. The MGDP collaboration involves researchers at six universities from diverse disciplines with the common goal of discovering new maize genes and developing tools for the phenotypic characterization of maize mutants. The project utilizes several approaches: EST sequencing, cDNA microarray production, and the discovery of gene function and genomic sequence through the use of a recombinant Mu1 transposon (RescueMu). Current achievements of the MGDP (NSF 98-72657) include the sequencing of over 120,000 maize ESTs from diverse cDNA libraries, and over 70,000 RescueMu flanking sequences, as well as the cataloguing of mutant seed and cob phenotypes of 23,000 maize ears, 6,200 families of maize seedlings, and 4,000 families of adult maize plants carrying MuDR/Mu and RescueMu insertion alleles. A consolidation of over 24,000 unique sequences from 19 libraries has been made into the first two of the planned set of four "Unigene" microarray slides. In addition, slides for four EST libraries have been produced. These microarray slides, EST clones, library plates of immortalized RescueMu bacterial cultures, and seed are all available online (http://www.zmdb.iastate.edu). The ZmDB website posts periodic assemblies of all maize EST and genomic sequences available from GenBank. ZmDB is also a portal for sequence analysis software designed to aid in gene discovery: MuSeqBox, GeneSeqer, and SplicePredictor. In addition, ZmDB contains links to other plant and genetics websites. PMID:12590340

  3. Alternative Gene Form Discovery and Candidate Gene Selection from Gene Indexing Projects

    PubMed Central

    Burke, John; Wang, Hui; Hide, Winston; Davison, Daniel B.

    1998-01-01

    Several efforts are under way to partition single-read expressed sequence tag (EST), as well as full-length transcript data, into large-scale gene indices, where transcripts are in common index classes if and only if they share a common progenitor gene. Accurate gene indexing facilitates gene expression studies, as well as inexpensive and early gene sequence discovery through assembly of ESTs that are derived from genes that have not been sequenced by classical methods. We extend, correct, and enhance the information obtained from index groups by splitting index classes into subclasses based on sequence dissimilarity (diversity). Two applications of this are highlighted in this report. First it is shown that our method can ameliorate the damage that artifacts, such as chimerism, inflict on index integrity. Additionally, we demonstrate how the organization imposed by an effective subpartition can greatly increase the sensitivity of gene expression studies by accounting for the existence and tissue- or pathology-specific regulation of novel gene isoforms and polymorphisms. We apply our subpartitioning treatment to the UniGene gene indexing project to measure a marked increase in information quality and abundance (in terms of assembly length and insertion/deletion error) after treatment and demonstrate cases where new levels of information concerning differential expression of alternate gene forms, such as regulated alternative splicing, are discovered. [Tables 2 and 3 can be viewed in their entirety as Online Supplements at http://www.genome.org.] PMID:9521931
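    The subpartitioning idea in the abstract above (splitting an index class into subclasses by sequence dissimilarity) can be illustrated with a minimal sketch. This is not the authors' method: the greedy grouping, the `difflib`-based dissimilarity measure, the 0.2 threshold, and the names `dissimilarity` and `subpartition` are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def dissimilarity(a, b):
    """Return 1 minus the difflib similarity ratio (0.0 means identical)."""
    return 1.0 - SequenceMatcher(None, a, b, autojunk=False).ratio()


def subpartition(index_class, max_dissimilarity=0.2):
    """Greedily split one index class into subclasses.

    Each sequence joins the first subclass whose representative (its
    first member) is within max_dissimilarity; otherwise it starts a
    new subclass. The threshold is a hypothetical value.
    """
    subclasses = []
    for seq in index_class:
        for group in subclasses:
            if dissimilarity(seq, group[0]) <= max_dissimilarity:
                group.append(seq)
                break
        else:
            # No existing subclass was close enough.
            subclasses.append([seq])
    return subclasses


if __name__ == "__main__":
    toy_class = ["ACGTACGTAC", "ACGTACGTAT", "TTTTGGGGCC"]
    for group in subpartition(toy_class):
        print(group)
```

    In this toy input the first two sequences differ at one position and stay together, while the third is unrelated and forms its own subclass, which is the kind of separation the abstract describes for chimeric or otherwise divergent members.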

  4. Pine Gene Discovery Project - Final Report - 08/31/1997 - 02/28/2001

    SciTech Connect

    Whetten, R. W.; Sederoff, R. R.; Kinlaw, C.; Retzel, E.

    2001-04-30

    Integration of pines into the large scope of plant biology research depends on study of pines in parallel with study of annual plants, and on availability of research materials from pine to plant biologists interested in comparing pine with annual plant systems. The objectives of the Pine Gene Discovery Project were to obtain 10,000 partial DNA sequences of genes expressed in loblolly pine, to determine which of those pine genes were similar to known genes from other organisms, and to make the DNA sequences and isolated pine genes available to plant researchers to stimulate integration of pines into the wider scope of plant biology research. Those objectives have been completed, and the results are available to the public. Requests for pine genes have been received from a number of laboratories that would otherwise not have included pine in their research, indicating that progress is being made toward the goal of integrating pine research into the larger molecular biology research community.

  5. FORGE Canada Consortium: outcomes of a 2-year national rare-disease gene-discovery project.

    PubMed

    Beaulieu, Chandree L; Majewski, Jacek; Schwartzentruber, Jeremy; Samuels, Mark E; Fernandez, Bridget A; Bernier, Francois P; Brudno, Michael; Knoppers, Bartha; Marcadier, Janet; Dyment, David; Adam, Shelin; Bulman, Dennis E; Jones, Steve J M; Avard, Denise; Nguyen, Minh Thu; Rousseau, Francois; Marshall, Christian; Wintle, Richard F; Shen, Yaoqing; Scherer, Stephen W; Friedman, Jan M; Michaud, Jacques L; Boycott, Kym M

    2014-06-01

    Inherited monogenic disease has an enormous impact on the well-being of children and their families. Over half of the children living with one of these conditions are without a molecular diagnosis because of the rarity of the disease, the marked clinical heterogeneity, and the reality that there are thousands of rare diseases for which causative mutations have yet to be identified. It is in this context that in 2010 a Canadian consortium was formed to rapidly identify mutations causing a wide spectrum of pediatric-onset rare diseases by using whole-exome sequencing. The FORGE (Finding of Rare Disease Genes) Canada Consortium brought together clinicians and scientists from 21 genetics centers and three science and technology innovation centers from across Canada. From nation-wide requests for proposals, 264 disorders were selected for study from the 371 submitted; disease-causing variants (including in 67 genes not previously associated with human disease; 41 of these have been genetically or functionally validated, and 26 are currently under study) were identified for 146 disorders over a 2-year period. Here, we present our experience with four strategies employed for gene discovery and discuss FORGE's impact in a number of realms, from clinical diagnostics to the broadening of the phenotypic spectrum of many diseases to the biological insight gained into both disease states and normal human development. Lastly, on the basis of this experience, we discuss the way forward for rare-disease genetic discovery both in Canada and internationally. PMID:24906018

  6. FORGE Canada Consortium: Outcomes of a 2-Year National Rare-Disease Gene-Discovery Project

    PubMed Central

    Beaulieu, Chandree L.; Majewski, Jacek; Schwartzentruber, Jeremy; Samuels, Mark E.; Fernandez, Bridget A.; Bernier, Francois P.; Brudno, Michael; Knoppers, Bartha; Marcadier, Janet; Dyment, David; Adam, Shelin; Bulman, Dennis E.; Jones, Steve J.M.; Avard, Denise; Nguyen, Minh Thu; Rousseau, Francois; Marshall, Christian; Wintle, Richard F.; Shen, Yaoqing; Scherer, Stephen W.; Friedman, Jan M.; Michaud, Jacques L.; Boycott, Kym M.

    2014-01-01

    Inherited monogenic disease has an enormous impact on the well-being of children and their families. Over half of the children living with one of these conditions are without a molecular diagnosis because of the rarity of the disease, the marked clinical heterogeneity, and the reality that there are thousands of rare diseases for which causative mutations have yet to be identified. It is in this context that in 2010 a Canadian consortium was formed to rapidly identify mutations causing a wide spectrum of pediatric-onset rare diseases by using whole-exome sequencing. The FORGE (Finding of Rare Disease Genes) Canada Consortium brought together clinicians and scientists from 21 genetics centers and three science and technology innovation centers from across Canada. From nation-wide requests for proposals, 264 disorders were selected for study from the 371 submitted; disease-causing variants (including in 67 genes not previously associated with human disease; 41 of these have been genetically or functionally validated, and 26 are currently under study) were identified for 146 disorders over a 2-year period. Here, we present our experience with four strategies employed for gene discovery and discuss FORGE’s impact in a number of realms, from clinical diagnostics to the broadening of the phenotypic spectrum of many diseases to the biological insight gained into both disease states and normal human development. Lastly, on the basis of this experience, we discuss the way forward for rare-disease genetic discovery both in Canada and internationally. PMID:24906018

  7. Independent Gene Discovery and Testing

    ERIC Educational Resources Information Center

    Palsule, Vrushalee; Coric, Dijana; Delancy, Russell; Dunham, Heather; Melancon, Caleb; Thompson, Dennis; Toms, Jamie; White, Ashley; Shultz, Jeffry

    2010-01-01

    A clear understanding of basic gene structure is critical when teaching molecular genetics, the central dogma and the biological sciences. We sought to create a gene-based teaching project to improve students' understanding of gene structure and to integrate this into a research project that can be implemented by instructors at the secondary level…

  8. Metagenomics and novel gene discovery

    PubMed Central

    Culligan, Eamonn P; Sleator, Roy D; Marchesi, Julian R; Hill, Colin

    2014-01-01

    Metagenomics provides a means of assessing the total genetic pool of all the microbes in a particular environment, in a culture-independent manner. It has revealed unprecedented diversity in microbial community composition, which is further reflected in the encoded functional diversity of the genomes, a large proportion of which consists of novel genes. Herein, we review both sequence-based and functional metagenomic methods to uncover novel genes and outline some of the associated problems of each type of approach, as well as potential solutions. Furthermore, we discuss the potential for metagenomic biotherapeutic discovery, with a particular focus on the human gut microbiome and finally, we outline how the discovery of novel genes may be used to create bioengineered probiotics. PMID:24317337

  9. Human brain evolution: From gene discovery to phenotype discovery

    PubMed Central

    Preuss, Todd M.

    2012-01-01

    The rise of comparative genomics and related technologies has added important new dimensions to the study of human evolution. Our knowledge of the genes that underwent expression changes or were targets of positive selection in human evolution is rapidly increasing, as is our knowledge of gene duplications, translocations, and deletions. It is now clear that the genetic differences between humans and chimpanzees are far more extensive than previously thought; their genomes are not 98% or 99% identical. Despite the rapid growth in our understanding of the evolution of the human genome, our understanding of the relationship between genetic changes and phenotypic changes is tenuous. This is true even for the most intensively studied gene, FOXP2, which underwent positive selection in the human terminal lineage and is thought to have played an important role in the evolution of human speech and language. In part, the difficulty of connecting genes to phenotypes reflects our generally poor knowledge of human phenotypic specializations, as well as the difficulty of interpreting the consequences of genetic changes in species that are not amenable to invasive research. On the positive side, investigations of FOXP2, along with genomewide surveys of gene-expression changes and selection-driven sequence changes, offer the opportunity for “phenotype discovery,” providing clues to human phenotypic specializations that were previously unsuspected. What is more, at least some of the specializations that have been proposed are amenable to testing with noninvasive experimental techniques appropriate for the study of humans and apes. PMID:22723367

  10. Rice mutant resources for gene discovery.

    PubMed

    Hirochika, Hirohiko; Guiderdoni, Emmanuel; An, Gynheung; Hsing, Yue-Ie; Eun, Moo Young; Han, Chang-Deok; Upadhyaya, Narayana; Ramachandran, Srinivasan; Zhang, Qifa; Pereira, Andy; Sundaresan, Venkatesan; Leung, Hei

    2004-02-01

    With the completion of genomic sequencing of rice, rice has been firmly established as a model organism for both basic and applied research. The next challenge is to uncover the functions of genes predicted by sequence analysis. Considering the amount of effort and the diversity of disciplines required for functional analyses, extensive international collaboration is needed for this next goal. The aims of this review are to summarize the current status of rice mutant resources, key tools for functional analysis of genes, and our perspectives on how to accelerate rice gene discovery through collaboration. PMID:15284490

  11. Gene discovery using the maize genome database ZmDB.

    PubMed

    Gai, X; Lal, S; Xing, L; Brendel, V; Walbot, V

    2000-01-01

    Zea mays DataBase (ZmDB) is a repository and analysis tool for sequence, expression and phenotype data of the major crop plant maize. The data accessible in ZmDB are mostly generated in a large collaborative project of maize gene discovery, sequencing and phenotypic analysis using a transposon tagging strategy and expressed sequence tag (EST) sequencing. ESTs constitute most of the current content. Database search tools, convenient links to external databases, and novel sequence analysis programs for spliced alignment are provided and together serve as an efficient protocol for gene discovery by sequence inspection. ZmDB can be accessed at http://zmdb.iastate.edu. ZmDB also provides web-based ordering of materials generated in the project, including EST and genomic DNA clones, seeds of mutant plants and microarrays of amplified EST and genomic DNA sequences. PMID:10592191

  12. Cancer gene discovery: exploiting insertional mutagenesis

    PubMed Central

    Ranzani, Marco; Annunziato, Stefano; Adams, David J.; Montini, Eugenio

    2013-01-01

    Insertional mutagenesis has been utilized as a functional forward genetics screen for the identification of novel genes involved in the pathogenesis of human cancers. Different insertional mutagens have been successfully used to reveal new cancer genes. For example, retroviruses (RVs) are integrating viruses with the capacity to induce the deregulation of genes in the neighborhood of the insertion site. RVs have been employed for more than 30 years to identify cancer genes in the hematopoietic system and mammary gland. Similarly, another tool that has revolutionized cancer gene discovery is the cut-and-paste transposons. These DNA elements have been engineered to contain strong promoters and stop cassettes that may function to perturb gene expression upon integration proximal to genes. In addition, complex mouse models characterized by tissue-restricted activity of transposons have been developed to identify oncogenes and tumor suppressor genes that control the development of a wide range of solid tumor types, extending beyond those tissues accessible using RV-based approaches. Most recently, lentiviral vectors (LVs) have appeared on the scene for use in cancer gene screens. LVs are replication defective integrating vectors that have the advantage of being able to infect non-dividing cells, in a wide range of cell types and tissues. In this review, we describe the various insertional mutagens focusing on their advantages/limitations and we discuss the new and promising tools that will improve the insertional mutagenesis screens of the future. PMID:23928056

  13. Gene expression module discovery using Gibbs sampling.

    PubMed

    Wu, Chang-Jiun; Fu, Yutao; Murali, T M; Kasif, Simon

    2004-01-01

    Recent advances in high-throughput profiling of gene expression have catalyzed an explosive growth in functional genomics aimed at the elucidation of genes that are differentially expressed in various tissue or cell types across a range of experimental conditions. These studies can lead to the identification of diagnostic genes, classification of genes into functional categories, association of genes with regulatory pathways, and clustering of genes into modules that are potentially co-regulated by a group of transcription factors. Traditional clustering methods such as hierarchical clustering or principal component analysis are difficult to deploy effectively for several of these tasks since genes rarely exhibit similar expression patterns across a wide range of conditions. Bi-clustering of gene expression data is a promising methodology for identification of gene groups that show a coherent expression profile across a subset of conditions. This methodology can be a first step towards the discovery of co-regulated and co-expressed genes or modules. Although bi-clustering (also called block clustering) was introduced in statistics in 1974, few robust and efficient solutions exist for extracting gene expression modules in microarray data. In this paper, we propose a simple but promising new approach for bi-clustering based on a Gibbs sampling paradigm. Our algorithm is implemented in the program GEMS (Gene Expression Module Sampler). GEMS has been tested on synthetic data generated to evaluate the effect of noise on the performance of the algorithm as well as on published leukemia datasets. In our preliminary studies comparing GEMS with other bi-clustering software we show that GEMS is a reliable, flexible and computationally efficient approach for bi-clustering gene expression data. PMID:15712126

  14. Pathway-driven discovery of epilepsy genes.

    PubMed

    Noebels, Jeffrey

    2015-03-01

    Epilepsy genes deliver critical insights into the molecular control of brain synchronization and are revolutionizing our understanding and treatment of the disease. The epilepsy-associated genome is rapidly expanding, and two powerful complementary approaches, isolation of de novo exome variants in patients and targeted mutagenesis in model systems, account for the steep increase. In sheer number, the tally of genes linked to seizures will likely match that of cancer and exceed it in biological diversity. The proteins act within most intracellular compartments and span the molecular determinants of firing and wiring in the developing brain. Every facet of neurotransmission, from dendritic spine to exocytotic machinery, is in play, and defects of synaptic inhibition are over-represented. The contributions of somatic mutations and noncoding microRNAs are also being explored. The functional spectrum of established epilepsy genes and the arrival of rapid, precise technologies for genome editing now provide a robust scaffold to prioritize hypothesis-driven discovery and further populate this genetic proto-map. Although each gene identified offers translational potential to stratify patient care, the complexity of individual variation and covert actions of genetic modifiers may confound single-gene solutions for the clinical disorder. In vivo genetic deconstruction of epileptic networks, ex vivo validation of variant profiles in patient-derived induced pluripotent stem cells, in silico variant modeling and modifier gene discovery, now in their earliest stages, will help clarify individual patterns. Because seizures stand at the crossroads of all neuronal synchronization disorders in the developing and aging brain, the neurobiological analysis of epilepsy-associated genes provides an extraordinary gateway to new insights into higher cortical function. PMID:25710836

  15. Pathway-driven discovery of epilepsy genes

    PubMed Central

    Noebels, Jeffrey

    2016-01-01

    Epilepsy genes deliver critical insights into the molecular control of brain synchronization and are revolutionizing our understanding and treatment of the disease. The epilepsy-associated genome is rapidly expanding, and two powerful complementary approaches, isolation of de novo exome variants in patients and targeted mutagenesis in model systems, account for the steep increase. In sheer number, the tally of genes linked to seizures will likely match that of cancer and exceed it in biological diversity. The proteins act within most intracellular compartments and span the molecular determinants of firing and wiring in the developing brain. Every facet of neurotransmission, from dendritic spine to exocytotic machinery, is in play, and defects of synaptic inhibition are over-represented. The contributions of somatic mutations and noncoding microRNAs are also being explored. The functional spectrum of established epilepsy genes and the arrival of rapid, precise technologies for genome editing now provide a robust scaffold to prioritize hypothesis-driven discovery and further populate this genetic proto-map. Although each gene identified offers translational potential to stratify patient care, the complexity of individual variation and covert actions of genetic modifiers may confound single-gene solutions for the clinical disorder. In vivo genetic deconstruction of epileptic networks, ex vivo validation of variant profiles in patient-derived induced pluripotent stem cells, in silico variant modeling and modifier gene discovery, now in their earliest stages, will help clarify individual patterns. Because seizures stand at the crossroads of all neuronal synchronization disorders in the developing and aging brain, the neurobiological analysis of epilepsy-associated genes provides an extraordinary gateway to new insights into higher cortical function. PMID:25710836

  16. The Helioviewer Project: Discovery For Everyone Everywhere

    NASA Astrophysics Data System (ADS)

    Ireland, Jack; Hughitt, K.; Müller, D.; Dimitoglou, G.; Schmiedel, P.; Fleck, B.

    2009-05-01

    There is an ever-increasing amount of solar and heliospheric data gathered from multiple sources such as space-based facilities and ground-based observatories. There are also multiple feature and event catalogs arising from human and computer-based detection methods. The Helioviewer Project is developing a suite of technologies to allow users around the world to visualize, browse and access these heterogeneous datasets in an intuitive and highly customizable fashion. Helioviewer technologies are based on the JPEG2000 file format, an extremely flexible format that allows for the efficient transfer of data (and meta-data, such as FITS keywords) between client and server. Rather than having to download an entire image and then examine the small portion (for example, an active region) that you are interested in, the JPEG2000 file format lets you preferentially download only those portions you are interested in. This dramatically reduces the amount of data transferred, making possible responsive and flexible scientific discovery applications that can browse populous archives of large images, such as those from the Solar Dynamics Observatory. In addition, the Helioviewer Project is designed to be flexible and extensible to data sources as they become available. Helioviewer.org (www.helioviewer.org) works seamlessly with the Virtual Solar Observatory (VSO) whilst an application programming interface (API) is being developed for interaction with the Solar Dynamics Observatory Heliophysics Event Knowledgebase. After a short introduction to the underlying technology, a live demonstration of the web application www.helioviewer.org will be given. We will also comment on other client applications (Jhelioviewer, a Java-based browse tool), and the application of Helioviewer technology to existing and future solar and heliospheric data and feature/event repositories. This project is funded by NASA VxO and LWS awards and an ESA science award.

  17. Inflammatory bowel disease gene discovery. CRADA final report

    SciTech Connect

    1997-09-09

    The ultimate goal of this project is to identify the human gene(s) responsible for the disorder known as IBD. The work was planned in two phases. The desired products resulting from Phase 1 were BAC clone(s) containing the genetic marker(s) identified by gene/Networks, Inc. as potentially linked to IBD, plasmid subclones of those BAC(s), and new genetic markers developed from these plasmid subclones. The newly developed markers would be genotyped by gene/Networks, Inc. to ascertain evidence for linkage or non-linkage of IBD to this region. If non-linkage was indicated, the project would move to investigation of other candidate chromosomal regions. Where linkage was indicated, the project would move to Phase 2, in which a physical map of the candidate region(s) would be developed. The products of this phase would be contig(s) of BAC clones in the region exhibiting linkage to IBD, as well as plasmid subclones of the BACs and further genetic marker development. There would also be continued genotyping with new polymorphic markers during this phase. It was anticipated that clones identified and developed during these two phases would provide the physical resources for eventual disease gene discovery.

  18. Peroxidase gene discovery from the horseradish transcriptome

    PubMed Central

    2014-01-01

    Background: Horseradish peroxidases (HRPs) from Armoracia rusticana have long been utilized as reporters in various diagnostic assays and histochemical stainings. Regardless of their increasing importance in the field of life sciences and suggested uses in medical applications, chemical synthesis and other industrial applications, the HRP isoenzymes, their substrate specificities and enzymatic properties are poorly characterized. Because sequence information for the natural isoenzymes is lacking and HRP expression levels in heterologous hosts are low, commercially available HRP is still extracted as a mixture of isoenzymes from the roots of A. rusticana. Results: In this study, a normalized, size-selected A. rusticana transcriptome library was sequenced using 454 Titanium technology. The resulting reads were assembled into 14871 isotigs with an average length of 1133 bp. Sequence databases, ORF finding and ORF characterization were utilized to identify peroxidase genes from the 14871 isotigs generated by de novo assembly. The sequences were manually reviewed and verified with Sanger sequencing of PCR amplified genomic fragments, resulting in the discovery of 28 secretory peroxidases, 23 of them previously unknown. A total of 22 isoenzymes including allelic variants were successfully expressed in Pichia pastoris and showed peroxidase activity with at least one of the substrates tested, thus enabling their development into commercial pure isoenzymes. Conclusions: This study demonstrates that transcriptome sequencing combined with sequence motif search is a powerful concept for the discovery and quick supply of new enzymes and isoenzymes from any plant or other eukaryotic organism. Identification and manual verification of the sequences of 28 HRP isoenzymes not only contribute a set of peroxidases for industrial, biological and biomedical applications, but also provide valuable information on the reliability of the approach in identifying and characterizing a large group of isoenzymes. PMID:24666710
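    The ORF-finding step named in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions, not the pipeline used in the study: it scans the three forward reading frames of a single DNA string for ATG-to-stop stretches, and the function names `find_orfs` and `longest_orf` and the `min_codons` cutoff are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end, frame) for each ATG...stop ORF on the forward strand.

    `end` is exclusive and includes the stop codon. `min_codons` is a
    hypothetical cutoff on the number of codons before the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close it and look for the next ATG
    return orfs


def longest_orf(seq):
    """Return the longest forward-strand ORF as a string, or None."""
    orfs = find_orfs(seq)
    if not orfs:
        return None
    start, end, _ = max(orfs, key=lambda o: o[1] - o[0])
    return seq.upper()[start:end]


if __name__ == "__main__":
    print(longest_orf("CCATGAAATTTGGGTAACC"))
```

    A real analysis would also scan the reverse complement and then compare candidate ORFs against peroxidase sequence motifs or databases, as the abstract indicates; those steps are omitted here.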

  19. Using transcriptomics to guide lead optimization in drug discovery projects: Lessons learned from the QSTAR project.

    PubMed

    Verbist, Bie; Klambauer, Günter; Vervoort, Liesbet; Talloen, Willem; Shkedy, Ziv; Thas, Olivier; Bender, Andreas; Göhlmann, Hinrich W H; Hochreiter, Sepp

    2015-05-01

    The pharmaceutical industry is faced with steadily declining R&D efficiency which results in fewer drugs reaching the market despite increased investment. A major cause for this low efficiency is the failure of drug candidates in late-stage development owing to safety issues or previously undiscovered side-effects. We analyzed to what extent gene expression data can help to de-risk drug development in early phases by detecting the biological effects of compounds across disease areas, targets and scaffolds. For eight drug discovery projects within a global pharmaceutical company, gene expression data were informative and able to support go/no-go decisions. Our studies show that gene expression profiling can detect adverse effects of compounds, and is a valuable tool in early-stage drug discovery decision making. PMID:25582842

  20. Genome-enabled Discovery of Carbon Sequestration Genes

    SciTech Connect

    Tuskan, Gerald A; Tschaplinski, Timothy J; Kalluri, Udaya C; Yin, Tongming; Yang, Xiaohan; Zhang, Xinye; Engle, Nancy L; Ranjan, Priya; Basu, Manojit M; Gunter, Lee E; Jawdy, Sara; Martin, Madhavi Z; Campbell, Alina S; DiFazio, Stephen P; Davis, John M; Hinchee, Maud; Pinnacchio, Christa; Meilan, R; Busov, V.; Strauss, S

    2009-01-01

    The fate of carbon below ground is likely to be a major factor determining the success of carbon sequestration strategies involving plants. Despite their importance, molecular processes controlling belowground C allocation and partitioning are poorly understood. This project is leveraging the Populus trichocarpa genome sequence to discover genes important to C sequestration in plants and soils. The focus is on the identification of genes that provide key control points for the flow and chemical transformations of carbon in roots, concentrating on genes that control the synthesis of chemical forms of carbon that result in slower turnover rates of soil organic matter (i.e., increased recalcitrance). We propose to enhance carbon allocation and partitioning to roots by 1) modifying the auxin signaling pathway, and the invertase family, which controls sucrose metabolism, and by 2) increasing root proliferation through transgenesis with genes known to control fine root proliferation (e.g., ANT), 3) increasing the production of recalcitrant C metabolites by identifying genes controlling secondary C metabolism by a major mQTL-based gene discovery effort, and 4) increasing aboveground productivity by enhancing drought tolerance to achieve maximum C sequestration. This broad, integrated approach is aimed at ultimately enhancing root biomass as well as root detritus longevity, providing the best prospects for significant enhancement of belowground C sequestration.

  1. Standardized Plant Disease Evaluations will Enhance Resistance Gene Discovery

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Gene discovery and marker development using DNA based tools require plant populations with well-documented phenotypes. Related crops such as apples and pears may share a number of genes, for example resistance to common diseases, and data mining in one crop may reveal genes for the other. However, u...

  2. Antibiotic resistance gene discovery in food-producing animals

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Numerous environmental reservoirs contribute to the widespread antibiotic resistance problem in human pathogens. One environmental reservoir of particular importance is the intestinal bacteria of food-producing animals. In this review I examine recent discoveries of antibiotic resistance genes in ...

  3. Discovery of Tumor Suppressor Gene Function.

    ERIC Educational Resources Information Center

    Oppenheimer, Steven B.

    1995-01-01

    This is an update of a 1991 review on tumor suppressor genes written at a time when understanding of how the genes work was limited. A recent major breakthrough in the understanding of the function of tumor suppressor genes is discussed. (LZ)

  4. SNP marker discovery in koala TLR genes.

    PubMed

    Cui, Jian; Frankham, Greta J; Johnson, Rebecca N; Polkinghorne, Adam; Timms, Peter; O'Meally, Denis; Cheng, Yuanyuan; Belov, Katherine

    2015-01-01

    Toll-like receptors (TLRs) play a crucial role in the early defence against invading pathogens, yet our understanding of TLRs in marsupial immunity is limited. Here, we describe the characterisation of nine TLRs from a koala immune tissue transcriptome and one TLR from a draft sequence of the koala genome and the subsequent development of an assay to study genetic diversity in these genes. We surveyed genetic diversity in 20 koalas from New South Wales, Australia and showed that one gene, TLR10, is monomorphic, while the other nine TLR genes have between two and 12 alleles. A total of 40 SNPs (16 non-synonymous) were identified across the ten TLR genes. These markers provide a springboard to future studies on innate immunity in the koala, a species under threat from two major infectious diseases. PMID:25799012

  5. SNP Marker Discovery in Koala TLR Genes

    PubMed Central

    Cui, Jian; Frankham, Greta J.; Johnson, Rebecca N.; Polkinghorne, Adam; Timms, Peter; O’Meally, Denis; Cheng, Yuanyuan; Belov, Katherine

    2015-01-01

    Toll-like receptors (TLRs) play a crucial role in the early defence against invading pathogens, yet our understanding of TLRs in marsupial immunity is limited. Here, we describe the characterisation of nine TLRs from a koala immune tissue transcriptome and one TLR from a draft sequence of the koala genome and the subsequent development of an assay to study genetic diversity in these genes. We surveyed genetic diversity in 20 koalas from New South Wales, Australia and showed that one gene, TLR10, is monomorphic, while the other nine TLR genes have between two and 12 alleles. A total of 40 SNPs (16 non-synonymous) were identified across the ten TLR genes. These markers provide a springboard to future studies on innate immunity in the koala, a species under threat from two major infectious diseases. PMID:25799012

  6. A Discovery Lab for Studying Gene Regulation.

    ERIC Educational Resources Information Center

    Moss, Robert

    1997-01-01

    Presents a laboratory in which students are provided with cultures of three bacterial strains. Using the results, students will determine which of the strains corresponds to a mutant lacking a particular functional gene. (DDR)

  7. GWATCH: a web platform for automated gene association discovery analysis

    PubMed Central

    2014-01-01

    Background As genome-wide sequence analyses for complex human disease determinants are expanding, it is increasingly necessary to develop strategies to promote discovery and validation of potential disease-gene associations. Findings Here we present a dynamic web-based platform – GWATCH – that automates and facilitates four steps in genetic epidemiological discovery: 1) Rapid gene association search and discovery analysis of large genome-wide datasets; 2) Expanded visual display of gene associations for genome-wide variants (SNPs, indels, CNVs), including Manhattan plots, 2D and 3D snapshots of any gene region, and a dynamic genome browser illustrating gene association chromosomal regions; 3) Real-time validation/replication of candidate or putative genes suggested from other sources, limiting Bonferroni genome-wide association study (GWAS) penalties; 4) Open data release and sharing by eliminating privacy constraints (The National Human Genome Research Institute (NHGRI) Institutional Review Board (IRB), informed consent, The Health Insurance Portability and Accountability Act (HIPAA) of 1996 etc.) on unabridged results, which allows for open access comparative and meta-analysis. Conclusions GWATCH is suitable for both GWAS and whole genome sequence association datasets. We illustrate the utility of GWATCH with three large genome-wide association studies for HIV-AIDS resistance genes screened in large multicenter cohorts; however, association datasets from any study can be uploaded and analyzed by GWATCH. PMID:25374661
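
    The abstract notes that validating a pre-specified set of candidate genes limits the Bonferroni penalty. The arithmetic behind that remark can be sketched as follows. This is a minimal illustration, not GWATCH code: the function name and both test counts are hypothetical and are not taken from the paper.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance cutoff that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def passes(p_value: float, cutoff: float) -> bool:
    """True when a single test's p-value survives the corrected cutoff."""
    return p_value <= cutoff


# Hypothetical test counts, for illustration only (not GWATCH figures).
genome_wide_tests = 1_000_000   # e.g. one test per variant in a genome-wide scan
candidate_tests = 50            # e.g. variants in a few pre-specified candidate genes

genome_wide_cutoff = bonferroni_threshold(0.05, genome_wide_tests)
candidate_cutoff = bonferroni_threshold(0.05, candidate_tests)

# The same hypothetical p-value fails the genome-wide cutoff
# but passes the much milder candidate-set cutoff.
example_p = 1e-5
print(passes(example_p, genome_wide_cutoff), passes(example_p, candidate_cutoff))
```

    Dividing alpha by the number of tests is the standard Bonferroni correction; restricting the test set to replication candidates shrinks the divisor, which is why replication of a short candidate list needs far less extreme p-values than a de novo genome-wide scan.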

  8. Implementation of Discovery Projects in Statistics

    ERIC Educational Resources Information Center

    Bailey, Brad; Spence, Dianna J.; Sinn, Robb

    2013-01-01

    Researchers and statistics educators consistently suggest that students will learn statistics more effectively by conducting projects through which they actively engage in a broad spectrum of tasks integral to statistical inquiry, in the authentic context of a real-world application. In keeping with these findings, we share an implementation of…

  10. Standardized plant disease evaluations will enhance resistance gene discovery

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Gene discovery and marker development using DNA-based tools require plant populations with well-documented phenotypes. If dissimilar phenotype evaluation methods or data scoring techniques are employed with different crops, or at different labs for the same crops, then data mining for genetic marker...

  11. Unsupervised fuzzy pattern discovery in gene expression data

    PubMed Central

    2011-01-01

    Background Discovering patterns from gene expression levels is regarded as a classification problem when the tissue classes of the samples are given, and is solved as a discrete-data problem by discretizing the expression levels of each gene into intervals that maximize the interdependence between that gene and the class labels. However, when class information is unavailable, discovering gene expression patterns becomes difficult. Methods For a gene pool with a large number of genes, we first cluster the genes into smaller groups. In each group, we use the representative gene, the one with the highest interdependence with others in the group, to drive the discretization of the gene expression levels of the other genes. By treating intervals as discrete events, association patterns of events can be discovered. If the gene groups obtained are crisp gene clusters, significant patterns overlapping different gene clusters cannot be found. This paper presents a new method of “fuzzifying” the crisp gene clusters to overcome this problem. Results To evaluate the effectiveness of our approach, we first apply the procedure described above to a synthetic data set and then to a gene expression data set with known class labels. The class labels are not used in either analysis but are used later as the ground truth in a classification problem for assessing the algorithm’s effectiveness in fuzzy gene clustering and discretization. The results show the efficacy of the proposed method. The existence of correlation among continuous-valued gene expression levels suggests that certain genes in the gene groups have high interdependence with other genes in the group. Fuzzification of a crisp gene cluster allows the cluster to take in genes from other clusters so that overlapping relationships among gene clusters can be uncovered. Hence, previously unknown patterns residing in overlapping gene clusters are discovered. From the experimental results, the high-order patterns discovered reveal multiple gene interaction patterns in cancerous tissues that are not found in normal tissues. It was also found that, for the colon cancer experiment, 70% of the top patterns and most of the discriminative patterns between cancerous and normal tissues are among those spanning different crisp gene clusters. Conclusions We show that the proposed method for analyzing error-prone microarray data is effective even without tissue class information. A unified framework is presented, allowing fast and accurate pattern discovery for gene expression data. For a large gene set, to discover a comprehensive set of patterns, gene clustering, gene expression discretization and gene cluster fuzzification are absolutely necessary. PMID:21989090

  12. Technology development for gene discovery and full-length sequencing

    SciTech Connect

    Marcelo Bento Soares

    2004-07-19

    In previous years, with support from the U.S. Department of Energy, we developed methods for construction of normalized and subtracted cDNA libraries, and constructed hundreds of high-quality libraries for production of Expressed Sequence Tags (ESTs). Our clones were made widely available to the scientific community through the IMAGE Consortium, and millions of ESTs were produced from our libraries either by collaborators or by our own sequencing laboratory at the University of Iowa. During this grant period, we focused on (1) the development of a method for preferential cloning of tissue-specific and/or rare transcripts, (2) its utilization to expedite EST-based gene discovery for the NIH Mouse Brain Molecular Anatomy Project, (3) further development and optimization of a method for construction of full-length-enriched cDNA libraries, and (4) modification of a plasmid vector to maximize efficiency of full-length cDNA sequencing by the transposon-mediated approach. It is noteworthy that the technology developed for preferential cloning of rare mRNAs enabled identification of over 2,000 mouse transcripts differentially expressed in the hippocampus. In addition, the method that we optimized for construction of full-length-enriched cDNA libraries was successfully utilized for the production of approximately fifty libraries from the developing mouse nervous system, from which over 2,500 full-ORF-containing cDNAs have been identified and accurately sequenced in their entirety either by our group or by the NIH-Mammalian Gene Collection Program Sequencing Team.

  13. Gene regulatory network discovery using pairwise Granger causality.

    PubMed

    Tam, Gary Hak Fui; Chang, Chunqi; Hung, Yeung Sam

    2013-10-01

    Discovery of gene regulatory networks from gene expression data can yield useful insight for drug development. Among the methods applied to time-series data, Granger causality (GC) has emerged as a powerful tool with several merits. Because gene expression data usually contain many more genes than time points, a full model cannot be applied in a straightforward manner, so GC is often applied to genes pairwise. In this study, the authors first investigate with synthetic data how spurious causalities (false discoveries) may arise because of the use of pairwise rather than full-model GC detection. Furthermore, spurious causalities may also arise if the order of the vector autoregressive model is not high enough. As a remedy, the authors demonstrate that model validation techniques can effectively reduce the number of false discoveries. Then, they apply pairwise GC with model validation to the real human HeLa cell-cycle dataset. They find that the Akaike information criterion is generally most suitable for determining model order, but caution should be taken with extremely short time series. With the authors' proposed implementation, degree distributions and network hubs are obtained and compared with existing results, giving a new observation that the hubs tend to act as sources rather than receivers of interactions. PMID:24067420
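
    As a rough illustration of pairwise GC with AIC-based order selection, the sketch below compares a restricted autoregression (the target's own lags only) with a full one (own lags plus the candidate regulator's lags) and reports the usual F statistic. It is a toy under stated assumptions, not the authors' implementation: the helper names, the synthetic two-series data, and the plain least-squares solver are all introduced here for illustration, and the model validation step described in the abstract is not included.

```python
import math
import random


def solve(a, b):
    """Solve the square linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def rss(rows, targets):
    """Residual sum of squares of an ordinary least-squares fit (normal equations)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2 for r, t in zip(rows, targets))


def granger_f(x, y, order):
    """Return (F statistic for 'x Granger-causes y', AIC of the full model) at one lag order."""
    times = range(order, len(y))
    targets = [y[t] for t in times]
    n = len(targets)
    # Restricted model: intercept plus the target's own lags.
    own = [[1.0] + [y[t - k] for k in range(1, order + 1)] for t in times]
    # Full model: restricted regressors plus lags of the candidate source series.
    full = [row + [x[t - k] for k in range(1, order + 1)] for row, t in zip(own, times)]
    rss_r, rss_f = rss(own, targets), rss(full, targets)
    n_params = 2 * order + 1
    f_stat = ((rss_r - rss_f) / order) / (rss_f / (n - n_params))
    aic = n * math.log(rss_f / n) + 2 * n_params
    return f_stat, aic


# Synthetic data (hypothetical): x drives y with a one-step delay, y does not drive x.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * random.gauss(0, 1) for t in range(1, 200)]

f_xy, _ = granger_f(x, y, order=1)   # expected to be large
f_yx, _ = granger_f(y, x, order=1)   # expected to be small
best_order = min(range(1, 5), key=lambda p: granger_f(x, y, p)[1])  # lag order with lowest AIC
print(f_xy, f_yx, best_order)
```

    In a gene network setting this test would be repeated for every ordered gene pair, which is where the spurious links discussed in the abstract can arise; a real analysis would more likely use an established statistics package than the hand-written solver shown here.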

  14. Beegle: from literature mining to disease-gene discovery.

    PubMed

    ElShal, Sarah; Tranchevent, Léon-Charles; Sifrim, Alejandro; Ardeshirdavani, Amin; Davis, Jesse; Moreau, Yves

    2016-01-29

    Disease-gene identification is a challenging process that has multiple applications within functional genomics and personalized medicine. Typically, this process involves both finding genes known to be associated with the disease (through literature search) and carrying out preliminary experiments or screens (e.g. linkage or association studies, copy number analyses, expression profiling) to determine a set of promising candidates for experimental validation. This requires extensive time and monetary resources. We describe Beegle, an online search and discovery engine that attempts to simplify this process by automating the typical approaches. It starts by mining the literature to quickly extract a set of genes known to be linked with a given query, then it integrates the learning methodology of Endeavour (a gene prioritization tool) to train a genomic model and rank a set of candidate genes to generate novel hypotheses. In a realistic evaluation setup, Beegle has an average recall of 84% in the top 100 returned genes as a search engine, which improves the discovery engine by 12.6% in the top 5% prioritized genes. Beegle is publicly available at http://beegle.esat.kuleuven.be/. PMID:26384564
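
    The "recall in the top 100 returned genes" metric quoted above can be sketched as follows. This is an illustrative definition only, assuming recall is computed as the share of known disease genes found among the first k ranked results; the function name, gene labels, and ranking are hypothetical and are not Beegle output.

```python
def recall_at_k(ranked_genes, known_genes, k):
    """Fraction of known disease genes that appear among the first k ranked genes."""
    known = set(known_genes)
    if not known:
        raise ValueError("known_genes must not be empty")
    top_k = set(ranked_genes[:k])
    return len(top_k & known) / len(known)


# Hypothetical ranking and reference gene set, for illustration only.
ranking = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E", "GENE_F"]
known = {"GENE_B", "GENE_E", "GENE_Z"}

recall_top3 = recall_at_k(ranking, known, k=3)   # only GENE_B is in the top 3
recall_top6 = recall_at_k(ranking, known, k=6)   # GENE_B and GENE_E are found; GENE_Z never is
print(recall_top3, recall_top6)
```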

  15. Beegle: from literature mining to disease-gene discovery

    PubMed Central

    ElShal, Sarah; Tranchevent, Léon-Charles; Sifrim, Alejandro; Ardeshirdavani, Amin; Davis, Jesse; Moreau, Yves

    2016-01-01

    Disease-gene identification is a challenging process that has multiple applications within functional genomics and personalized medicine. Typically, this process involves both finding genes known to be associated with the disease (through literature search) and carrying out preliminary experiments or screens (e.g. linkage or association studies, copy number analyses, expression profiling) to determine a set of promising candidates for experimental validation. This requires extensive time and monetary resources. We describe Beegle, an online search and discovery engine that attempts to simplify this process by automating the typical approaches. It starts by mining the literature to quickly extract a set of genes known to be linked with a given query, then it integrates the learning methodology of Endeavour (a gene prioritization tool) to train a genomic model and rank a set of candidate genes to generate novel hypotheses. In a realistic evaluation setup, Beegle has an average recall of 84% in the top 100 returned genes as a search engine, which improves the discovery engine by 12.6% in the top 5% prioritized genes. Beegle is publicly available at http://beegle.esat.kuleuven.be/. PMID:26384564

  16. Mitigating false-positive associations in rare disease gene discovery.

    PubMed

    Akle, Sebastian; Chun, Sung; Jordan, Daniel M; Cassa, Christopher A

    2015-10-01

    Clinical sequencing is expanding, but causal variants are still not identified in the majority of cases. These unsolved cases can aid in gene discovery when individuals with similar phenotypes are identified in systems such as the Matchmaker Exchange. We describe risks for gene discovery in this growing set of unsolved cases. In a set of rare disease cases with the same phenotype, it is not difficult to find two individuals who carry variants in the same gene. We quantify the risk of false-positive association in a cohort of individuals with the same phenotype, using the prior probability of observing a variant in each gene from over 60,000 individuals (Exome Aggregation Consortium). Based on the number of individuals with a genic variant, cohort size, specific gene, and mode of inheritance, we calculate a P value that the match represents a true association. A match in two of 10 patients in MECP2 is statistically significant (P = 0.0014), whereas a match in TTN would not reach significance, as expected (P > 0.999). Finally, we analyze the probability of matching in clinical exome cases to estimate the number of cases needed to identify genes related to different disorders. We offer Rare Disease Match, an online tool to mitigate the uncertainty of false-positive associations. PMID:26378430
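
    The idea of scoring a gene match against chance can be sketched with a simple binomial tail: the probability that at least a given number of patients in a cohort carry a variant in the same gene when each patient carries one independently with some background probability. This is a simplified stand-in, not the paper's model, which also accounts for mode of inheritance; the function name and both carrier probabilities are hypothetical and are not the ExAC-derived values.

```python
from math import comb


def chance_match_p_value(n_cases: int, n_carriers: int, carrier_prob: float) -> float:
    """P(at least n_carriers of n_cases carry a variant in the gene) under a binomial null."""
    return sum(
        comb(n_cases, k) * carrier_prob**k * (1 - carrier_prob) ** (n_cases - k)
        for k in range(n_carriers, n_cases + 1)
    )


# Hypothetical per-individual carrier probabilities, for illustration only.
rarely_variant_gene = chance_match_p_value(n_cases=10, n_carriers=2, carrier_prob=0.005)
often_variant_gene = chance_match_p_value(n_cases=10, n_carriers=2, carrier_prob=0.9)
print(rarely_variant_gene, often_variant_gene)
```

    Under these assumed inputs, two carriers among ten patients is an unlikely coincidence for a gene that rarely harbors variants and an almost certain one for a gene in which most individuals carry a variant, which mirrors the contrast between MECP2 and TTN drawn in the abstract.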

  17. Mouse models for the discovery of colorectal cancer driver genes.

    PubMed

    Clark, Christopher R; Starr, Timothy K

    2016-01-14

    Colorectal cancer (CRC) constitutes a major public health problem as the third most commonly diagnosed and third most lethal malignancy worldwide. The prevalence and the physical accessibility of colorectal tumors have made CRC an ideal model for the study of tumor genetics. Early research efforts using patient-derived CRC samples led to the discovery of several highly penetrant mutations (e.g., APC, KRAS, MMR genes) in both hereditary and sporadic CRC tumors. This knowledge has enabled researchers to develop genetically engineered and chemically induced tumor models of CRC, both of which have had a substantial impact on our understanding of the molecular basis of CRC. Despite these advances, the morbidity and mortality of CRC remain a cause for concern and highlight the need to uncover novel genetic drivers of CRC. This review focuses on mouse models of CRC with particular emphasis on a newly developed cancer gene discovery tool, the Sleeping Beauty transposon-based mutagenesis model of CRC. PMID:26811627

  18. Mouse models for the discovery of colorectal cancer driver genes

    PubMed Central

    Clark, Christopher R; Starr, Timothy K

    2016-01-01

    Colorectal cancer (CRC) constitutes a major public health problem as the third most commonly diagnosed and third most lethal malignancy worldwide. The prevalence and the physical accessibility of colorectal tumors have made CRC an ideal model for the study of tumor genetics. Early research efforts using patient-derived CRC samples led to the discovery of several highly penetrant mutations (e.g., APC, KRAS, MMR genes) in both hereditary and sporadic CRC tumors. This knowledge has enabled researchers to develop genetically engineered and chemically induced tumor models of CRC, both of which have had a substantial impact on our understanding of the molecular basis of CRC. Despite these advances, the morbidity and mortality of CRC remain a cause for concern and highlight the need to uncover novel genetic drivers of CRC. This review focuses on mouse models of CRC with particular emphasis on a newly developed cancer gene discovery tool, the Sleeping Beauty transposon-based mutagenesis model of CRC. PMID:26811627

  19. Gene discovery and gene function assignment in filamentous fungi

    PubMed Central

    Hamer, Lisbeth; Adachi, Kiichi; Montenegro-Chamorro, Maria V.; Tanzer, Matthew M.; Mahanty, Sanjoy K.; Lo, Clive; Tarpey, Rex W.; Skalchunes, Amy R.; Heiniger, Ryan W.; Frank, Sheryl A.; Darveaux, Blaise A.; Lampe, David J.; Slater, Ted M.; Ramamurthy, Lakshman; DeZwaan, Todd M.; Nelson, Grant H.; Shuster, Jeffrey R.; Woessner, Jeffrey; Hamer, John E.

    2001-01-01

    Filamentous fungi are a large group of diverse and economically important microorganisms. Large-scale gene disruption strategies developed in budding yeast are not applicable to these organisms because of their larger genomes and lower rate of targeted integration (TI) during transformation. We developed transposon-arrayed gene knockouts (TAGKO) to discover genes and simultaneously create gene disruption cassettes for subsequent transformation and mutant analysis. Transposons carrying a bacterial and fungal drug resistance marker are used to mutagenize individual cosmids or entire libraries in vitro. Cosmids are annotated by DNA sequence analysis at the transposon insertion sites, and cosmid inserts are liberated to direct insertional mutagenesis events in the genome. Based on saturation analysis of a cosmid insert and insertions in a fungal cosmid library, we show that TAGKO can be used to rapidly identify and mutate genes. We further show that insertions can create alterations in gene expression, and we have used this approach to investigate an amino acid oxidation pathway in two important fungal phytopathogens. PMID:11296265

  20. Nonlinear Dependence in the Discovery of Differentially Expressed Genes

    PubMed Central

    Deller, J. R.; Radha, Hayder; McCormick, J. Justin; Wang, Huiyan

    2012-01-01

    Microarray data are used to determine which genes are active in response to a changing cell environment. Genes are “discovered” when they are significantly differentially expressed in the microarray data collected under the differing conditions. In one prevalent approach, all genes are assumed to satisfy a null hypothesis, ℍ0, of no difference in expression. A false discovery (type 1 error) occurs when ℍ0 is incorrectly rejected. The quality of a detection algorithm is assessed by estimating its number of false discoveries, 𝔉. Work involving the second-moment modeling of the z-value histogram (representing gene expression differentials) has shown significantly deleterious effects of intergene expression correlation on the estimate of 𝔉. This paper suggests that nonlinear dependencies could likewise be important. With an applied emphasis, this paper extends the “moment framework” by including third-moment skewness corrections in an estimator of 𝔉. This estimator combines observed correlation (corrected for sampling fluctuations) with the information from easily identifiable null cases. Nonlinear-dependence modeling reduces the estimation error relative to that of linear estimation. Third-moment calculations involve empirical densities of 3 × 3 covariance matrices estimated using very few samples. The principle of entropy maximization is employed to connect estimated moments to 𝔉 inference. Model results are tested with BRCA and HIV data sets and with carefully constructed simulations. PMID:25937940

  1. Non-syndromic retinal ciliopathies: translating gene discovery into therapy.

    PubMed

    Estrada-Cuzcano, Alejandro; Roepman, Ronald; Cremers, Frans P M; den Hollander, Anneke I; Mans, Dorus A

    2012-10-15

    Homozygosity mapping and exome sequencing have accelerated the discovery of gene mutations and modifier alleles implicated in inherited retinal degeneration in humans. To date, 158 genes have been found to be mutated in individuals with retinal dystrophies. Approximately one-third of the gene defects underlying retinal degeneration affect the structure and/or function of the 'connecting cilium' in photoreceptors. This structure corresponds to the transition zone of a prototypic cilium, a region with increasing relevance for ciliary homeostasis. The connecting cilium connects the inner and outer segments of the photoreceptor, mediating bi-directional transport of phototransducing proteins required for vision. In fact, the outer segment, connecting cilium and associated basal body, forms a highly specialized sensory cilium, fully dedicated to photoreception and subsequent signal transduction to the brain. At least 21 genes that encode ciliary proteins are implicated in non-syndromic retinal dystrophies such as cone dystrophy, cone-rod dystrophy, Leber congenital amaurosis (LCA), macular degeneration or retinitis pigmentosa (RP). The generation and characterization of vertebrate retinal ciliopathy animal models have revealed insights into the molecular disease mechanism which are indispensable for the development and evaluation of therapeutic strategies. Gene augmentation therapy has proven to be safe and successful in restoring long-term sight in mice, dogs and humans suffering from LCA or RP. Here, we present a comprehensive overview of the genes, mutations and modifier alleles involved in non-syndromic retinal ciliopathies, review the progress in dissecting the associated retinal disease mechanisms and evaluate gene augmentation approaches to antagonize retinal degeneration in these ciliopathies. PMID:22843501

  2. Genome Enabled Discovery of Carbon Sequestration Genes in Poplar

    SciTech Connect

    Filichkin, Sergei; Etherington, Elizabeth; Ma, Caiping; Strauss, Steve

    2007-02-22

    The goals of the S.H. Strauss laboratory portion of 'Genome-enabled discovery of carbon sequestration genes in poplar' are (1) to explore the functions of candidate genes using Populus transformation by inserting genes provided by Oak Ridge National Laboratory (ORNL) and the University of Florida (UF) into poplar; (2) to expand the poplar transformation toolkit by developing transformation methods for important genotypes; and (3) to allow induced expression, and efficient gene suppression, in roots and other tissues. As part of the transformation improvement effort, OSU developed transformation protocols for the Populus trichocarpa 'Nisqually-1' clone and an early flowering P. alba clone, 6K10. Complete descriptions of the transformation systems were published (Ma et al. 2004, Meilan et al. 2004). Twenty-one 'Nisqually-1' and 622 6K10 transgenic plants were generated. To identify root-predominant promoters, a set of three promoters was tested for tissue-specific expression patterns in poplar and in Arabidopsis as a model system. A novel gene, ET304, was identified by analyzing a collection of poplar enhancer trap lines generated at OSU (Filichkin et al. 2006a, 2006b). Other promoters include the pGgMT1 root-predominant promoter from Casuarina glauca and the pAtPIN2 promoter from the Arabidopsis root-specific PIN2 gene. OSU tested two induction systems, alcohol- and estrogen-inducible, in multiple poplar transgenics. Ethanol proved to be the more efficient inducer when tested in tissue culture and greenhouse conditions. Two estrogen-inducible systems were evaluated in transgenic Populus, neither of which functioned reliably in tissue culture conditions. GATEWAY-compatible plant binary vectors were designed to compare the silencing efficiency of homologous (direct) RNAi vs. heterologous (transitive) RNAi inverted repeats. A set of genes was targeted for post-transcriptional silencing in the model Arabidopsis system; these include the floral meristem identity gene (APETALA1 or AP1), the auxin response factor gene (ETTIN), the gene encoding a transcription factor of the WD40 family (TRANSPARENT TESTA GLABRA1 or TTG1), and the auxin efflux carrier (PIN-FORMED2 or PIN2) gene. More than 220 transgenic lines of the 1st, 2nd and 3rd generations were analyzed for RNAi suppression phenotypes (Filichkin et al., manuscript submitted). A total of 108 constructs were supplied by ORNL, UF and OSU and used to generate over 1,881 PCR-verified transgenic Populus and over 300 PCR-verified transgenic Arabidopsis events. The Populus transgenics alone required Agrobacterium co-cultivations of 124,406 explants.

  3. Sugarcane Functional Genomics: Gene Discovery for Agronomic Trait Development

    PubMed Central

    Menossi, M.; Silva-Filho, M. C.; Vincentz, M.; Van-Sluys, M.-A.; Souza, G. M.

    2008-01-01

    Sugarcane is a highly productive crop used for centuries as the main source of sugar and recently to produce ethanol, a renewable bio-fuel energy source. There is increased interest in this crop due to the impending need to decrease fossil fuel usage. Sugarcane has a highly polyploid genome. Expressed sequence tag (EST) sequencing has significantly contributed to gene discovery and expression studies used to associate function with sugarcane genes. A significant amount of data exists on regulatory events controlling responses to herbivory, drought, and phosphate deficiency, which cause important constraints on yield and on endophytic bacteria, which are highly beneficial. The means to reduce drought, phosphate deficiency, and herbivory by the sugarcane borer have a negative impact on the environment. Improved tolerance for these constraints is being sought. Sugarcane's ability to accumulate sucrose up to 16% of its culm dry weight is a challenge for genetic manipulation. Genome-based technology such as cDNA microarray data indicates genes associated with sugar content that may be used to develop new varieties improved for sucrose content or for traits that restrict the expansion of the cultivated land. The genes can also be used as molecular markers of agronomic traits in traditional breeding programs. PMID:18273390

  4. Amyotrophic Lateral Sclerosis: An Emerging Era of Collaborative Gene Discovery

    PubMed Central

    Gwinn, Katrina; Corriveau, Roderick A.; Mitsumoto, Hiroshi; Bednarz, Kate; Brown, Robert H.; Cudkowicz, Merit; Gordon, Paul H.; Hardy, John; Kasarskis, Edward J.; Kaufmann, Petra; Miller, Robert; Sorenson, Eric; Tandan, Rup; Traynor, Bryan J.; Nash, Josefina; Sherman, Alex; Mailman, Matthew D.; Ostell, James; Bruijn, Lucie; Cwik, Valerie; Rich, Stephen S.; Singleton, Andrew; Refolo, Larry; Andrews, Jaime; Zhang, Ran; Conwit, Robin; Keller, Margaret A.

    2007-01-01

    Amyotrophic lateral sclerosis (ALS) is the most common form of motor neuron disease (MND). It is currently incurable and treatment is largely limited to supportive care. Family history is associated with an increased risk of ALS, and many Mendelian causes have been discovered. However, most forms of the disease are not obviously familial. Recent advances in human genetics have enabled genome-wide analyses of single nucleotide polymorphisms (SNPs) that make it possible to study complex genetic contributions to human disease. Genome-wide SNP analyses require a large sample size and thus depend upon collaborative efforts to collect and manage the biological samples and corresponding data. Public availability of biological samples (such as DNA), phenotypic and genotypic data further enhances research endeavors. Here we discuss a large collaboration among academic investigators, government, and non-government organizations which has created a public repository of human DNA, immortalized cell lines, and clinical data to further gene discovery in ALS. This resource currently maintains samples and associated phenotypic data from 2332 MND subjects and 4692 controls. This resource should facilitate genetic discoveries which we anticipate will ultimately provide a better understanding of the biological mechanisms of neurodegeneration in ALS. PMID:18060051

  5. The Matchmaker Exchange: a platform for rare disease gene discovery.

    PubMed

    Philippakis, Anthony A; Azzariti, Danielle R; Beltran, Sergi; Brookes, Anthony J; Brownstein, Catherine A; Brudno, Michael; Brunner, Han G; Buske, Orion J; Carey, Knox; Doll, Cassie; Dumitriu, Sergiu; Dyke, Stephanie O M; den Dunnen, Johan T; Firth, Helen V; Gibbs, Richard A; Girdea, Marta; Gonzalez, Michael; Haendel, Melissa A; Hamosh, Ada; Holm, Ingrid A; Huang, Lijia; Hurles, Matthew E; Hutton, Ben; Krier, Joel B; Misyura, Andriy; Mungall, Christopher J; Paschall, Justin; Paten, Benedict; Robinson, Peter N; Schiettecatte, François; Sobreira, Nara L; Swaminathan, Ganesh J; Taschner, Peter E; Terry, Sharon F; Washington, Nicole L; Züchner, Stephan; Boycott, Kym M; Rehm, Heidi L

    2015-10-01

    There are few better examples of the need for data sharing than in the rare disease community, where patients, physicians, and researchers must search for "the needle in a haystack" to uncover rare, novel causes of disease within the genome. Impeding the pace of discovery has been the existence of many small siloed datasets within individual research or clinical laboratory databases and/or disease-specific organizations, hoping for serendipitous occasions when two distant investigators happen to learn they have a rare phenotype in common and can "match" these cases to build evidence for causality. However, serendipity has never proven to be a reliable or scalable approach in science. As such, the Matchmaker Exchange (MME) was launched to provide a robust and systematic approach to rare disease gene discovery through the creation of a federated network connecting databases of genotypes and rare phenotypes using a common application programming interface (API). The core building blocks of the MME have been defined and assembled. Three MME services have now been connected through the API and are available for community use. Additional databases that support internal matching are anticipated to join the MME network as it continues to grow. PMID:26295439

  6. A New Omics Data Resource of Pleurocybella porrigens for Gene Discovery

    PubMed Central

    Dohra, Hideo; Someya, Takumi; Takano, Tomoyuki; Harada, Kiyonori; Omae, Saori; Hirai, Hirofumi; Yano, Kentaro; Kawagishi, Hirokazu

    2013-01-01

    Background Pleurocybella porrigens is a mushroom-forming fungus that has been consumed as a traditional food in Japan. In 2004, 55 people were poisoned by eating the mushroom and 17 of them died of acute encephalopathy. Since then, the Japanese government has been alerting Japanese people to take precautions against eating the P. porrigens mushroom. Unfortunately, despite efforts, the molecular mechanism of the encephalopathy remains elusive. The genome and transcriptome sequence data of P. porrigens and related species, however, are not stored in the public databases. To obtain omics data for P. porrigens, we sequenced the genome and transcriptome of its fruiting bodies and mycelia by next generation sequencing. Methodology/Principal Findings Short read sequences of genomic DNAs and mRNAs in P. porrigens were generated by Illumina Genome Analyzer. Genome short reads were de novo assembled into scaffolds using Velvet. Comparisons of genome signatures among Agaricales showed that P. porrigens has a unique genome signature. Transcriptome sequences were assembled into contigs (unigenes). Biological functions of unigenes were predicted by Gene Ontology and KEGG pathway analyses. The majority of unigenes would be novel genes without significant counterparts in the public omics databases. Conclusions Functional analyses of unigenes indicate the existence of numerous novel genes in the basidiomycetes division. The results mean that omics information such as genome, transcriptome and metabolome data for basidiomycetes is scarce in the current databases. The large-scale omics information on P. porrigens provided by this research will serve as a new data resource for gene discovery in basidiomycetes. PMID:23936076

  7. Canonical Correlation Analysis for Gene-Based Pleiotropy Discovery

    PubMed Central

    Seoane, Jose A.; Campbell, Colin; Day, Ian N. M.; Casas, Juan P.; Gaunt, Tom R.

    2014-01-01

    Genome-wide association studies have identified a wealth of genetic variants involved in complex traits and multifactorial diseases. There is now considerable interest in testing variants for association with multiple phenotypes (pleiotropy) and for testing multiple variants for association with a single phenotype (gene-based association tests). Such approaches can increase statistical power by combining evidence for association over multiple phenotypes or genetic variants respectively. Canonical Correlation Analysis (CCA) measures the correlation between two sets of multidimensional variables, and thus offers the potential to combine these two approaches. To apply CCA, we must restrict the number of attributes relative to the number of samples. Hence we consider modules of genetic variation that can comprise a gene, a pathway or another biologically relevant grouping, and/or a set of phenotypes. In order to do this, we use an attribute selection strategy based on a binary genetic algorithm. Applied to a UK-based prospective cohort study of 4286 women (the British Women's Heart and Health Study), we find improved statistical power in the detection of previously reported genetic associations, and identify a number of novel pleiotropic associations between genetic variants and phenotypes. New discoveries include gene-based association of NSF with triglyceride levels and several genes (ACSM3, ERI2, IL18RAP, IL23RAP and NRG1) with left ventricular hypertrophy phenotypes. In multiple-phenotype analyses we find association of NRG1 with left ventricular hypertrophy phenotypes, fibrinogen and urea and pleiotropic relationships of F7 and F10 with Factor VII, Factor IX and cholesterol levels. PMID:25329069

  8. Turning publicly available gene expression data into discoveries using gene set context analysis

    PubMed Central

    Ji, Zhicheng; Vokes, Steven A.; Dang, Chi V.; Ji, Hongkai

    2016-01-01

    Gene Set Context Analysis (GSCA) is an open source software package to help researchers use massive amounts of publicly available gene expression data (PED) to make discoveries. Users can interactively visualize and explore gene and gene set activities in 25,000+ consistently normalized human and mouse gene expression samples representing diverse biological contexts (e.g. different cells, tissues and disease types, etc.). By providing one or multiple genes or gene sets as input and specifying a gene set activity pattern of interest, users can query the expression compendium to systematically identify biological contexts associated with the specified gene set activity pattern. In this way, researchers with new gene sets from their own experiments may discover previously unknown contexts of gene set functions and hence increase the value of their experiments. GSCA has a graphical user interface (GUI). The GUI makes the analysis convenient and customizable. Analysis results can be conveniently exported as publication quality figures and tables. GSCA is available at https://github.com/zji90/GSCA. This software significantly lowers the bar for biomedical investigators to use PED in their daily research for generating and screening hypotheses, which was previously difficult because of the complexity, heterogeneity and size of the data. PMID:26350211

  9. Turning publicly available gene expression data into discoveries using gene set context analysis.

    PubMed

    Ji, Zhicheng; Vokes, Steven A; Dang, Chi V; Ji, Hongkai

    2016-01-01

    Gene Set Context Analysis (GSCA) is an open source software package to help researchers use massive amounts of publicly available gene expression data (PED) to make discoveries. Users can interactively visualize and explore gene and gene set activities in 25,000+ consistently normalized human and mouse gene expression samples representing diverse biological contexts (e.g. different cells, tissues and disease types, etc.). By providing one or multiple genes or gene sets as input and specifying a gene set activity pattern of interest, users can query the expression compendium to systematically identify biological contexts associated with the specified gene set activity pattern. In this way, researchers with new gene sets from their own experiments may discover previously unknown contexts of gene set functions and hence increase the value of their experiments. GSCA has a graphical user interface (GUI). The GUI makes the analysis convenient and customizable. Analysis results can be conveniently exported as publication quality figures and tables. GSCA is available at https://github.com/zji90/GSCA. This software significantly lowers the bar for biomedical investigators to use PED in their daily research for generating and screening hypotheses, which was previously difficult because of the complexity, heterogeneity and size of the data. PMID:26350211

  10. Discovery of gene regulation pattern in lung cancer by gene expression profiling using human tissues

    PubMed Central

    Fu, Junjie; Khaybullin, Ravil; Liang, Xiao; Morin, Madeleine; Xia, Amy; Yeh, Anderson; Qi, Xin

    2015-01-01

    Lung cancer continues to be one of the most prevalent and life-threatening cancers worldwide. In order to study the gene regulation pattern in lung cancer for new therapeutics discovery, gene expression profiling using human lung cancer tissues was conducted. The gene expression profiles were established using Affymetrix Human Exon 1.0 ST Array with RNA extracts from six clinical patients (five lung cancer samples and one normal control). The raw data were analyzed with Affymetrix Expression Console and Affymetrix Transcriptome Analysis Console 2.0. The regulation of several genes was further validated using real-time reverse transcription quantitative polymerase chain reaction (RT-qPCR). Here we provide detailed experimental methods and analysis for the microarray data, which have been deposited into Gene Expression Omnibus (GEO) under GSE63571. PMID:25664237

  11. Gene Discovery through Expressed Sequence Tag Sequencing in Trypanosoma cruzi

    PubMed Central

    Verdun, Ramiro E.; Di Paolo, Nelson; Urmenyi, Turan P.; Rondinelli, Edson; Frasch, Alberto C. C.; Sanchez, Daniel O.

    1998-01-01

    Analysis of expressed sequence tags (ESTs) constitutes a useful approach for gene identification that, in the case of human pathogens, might result in the identification of new targets for chemotherapy and vaccine development. As part of the Trypanosoma cruzi genome project, we have partially sequenced the 5′ ends of 1,949 clones to generate ESTs. The clones were randomly selected from a normalized CL Brener epimastigote cDNA library. A total of 14.6% of the clones were homologous to previously identified T. cruzi genes, while 18.4% had significant matches to genes from other organisms in the database. A total of 67% of the ESTs had no matches in the database, and thus, some of them might be T. cruzi-specific genes. Functional groups of those sequences with matches in the database were constructed according to their putative biological functions. The two largest categories were protein synthesis (23.3%) and cell surface molecules (10.8%). The information reported in this paper should be useful for researchers in the field to analyze genes and proteins of their own interest. PMID:9784549

  12. Sequencing of GJB2 in Cameroonians and Black South Africans and comparison to 1000 Genomes Project Data Support Need to Revise Strategy for Discovery of Nonsyndromic Deafness Genes in Africans.

    PubMed

    Bosch, Jason; Noubiap, Jean Jacques N; Dandara, Collet; Makubalo, Nomlindo; Wright, Galen; Entfellner, Jean-Baka Domelevo; Tiffin, Nicki; Wonkam, Ambroise

    2014-11-01

    Mutations in the GJB2 gene, encoding connexin 26, could account for 50% of congenital, nonsyndromic, recessive deafness cases in some Caucasian/Asian populations. There is a scarcity of published data in sub-Saharan Africans. We Sanger sequenced the coding region of the GJB2 gene in 205 Cameroonian and Xhosa South Africans with congenital, nonsyndromic deafness; and performed bioinformatic analysis of variations in the GJB2 gene, incorporating data from the 1000 Genomes Project. Amongst Cameroonian patients, 26.1% were familial. The majority of patients (70%) suffered from sensorineural hearing loss. Ten GJB2 genetic variants were detected by sequencing. A previously reported pathogenic mutation, g.3741_3743delTTC (p.F142del), and a putative pathogenic mutation, g.3816G>A (p.V167M), were identified in single heterozygous samples. Amongst the eight remaining variants, two novel variants, g.3318-41G>A and g.3332G>A, were reported. There were no statistically significant differences in allele frequencies between cases and controls. Principal Components Analyses differentiated between Africans, Asians, and Europeans, but only explained 40% of the variation. The present study is the first to compare African GJB2 sequences with the data from the 1000 Genomes Project and has revealed the low variation between population groups. This finding has emphasized the hypothesis that the prevalence of mutations in GJB2 in nonsyndromic deafness amongst European and Asian populations is due to founder effects arising after these individuals migrated out of Africa, and not to a putative "protective" variant in the genomic structure of GJB2 in Africans. Our results confirm that mutations in GJB2 are not associated with nonsyndromic deafness in Africans. PMID:25162826

  13. Gene Discovery in the Apicomplexa as Revealed by EST Sequencing and Assembly of a Comparative Gene Database

    PubMed Central

    Li, Li; Brunk, Brian P.; Kissinger, Jessica C.; Pape, Deana; Tang, Keliang; Cole, Robert H.; Martin, John; Wylie, Todd; Dante, Mike; Fogarty, Steven J.; Howe, Daniel K.; Liberator, Paul; Diaz, Carmen; Anderson, Jennifer; White, Michael; Jerome, Maria E.; Johnson, Emily A.; Radke, Jay A.; Stoeckert, Christian J.; Waterston, Robert H.; Clifton, Sandra W.; Roos, David S.; Sibley, L. David

    2003-01-01

    Large-scale EST sequencing projects for several important parasites within the phylum Apicomplexa were undertaken for the purpose of gene discovery. Included were several parasites of medical importance (Plasmodium falciparum, Toxoplasma gondii) and others of veterinary importance (Eimeria tenella, Sarcocystis neurona, and Neospora caninum). A total of 55,192 ESTs, deposited into dbEST/GenBank, were included in the analyses. The resulting sequences have been clustered into nonredundant gene assemblies and deposited into a relational database that supports a variety of sequence and text searches. This database has been used to compare the gene assemblies using BLAST similarity comparisons to the public protein databases to identify putative genes. Of these new entries, ∼15%–20% represent putative homologs with a conservative cutoff of p < 10⁻⁹, thus identifying many conserved genes that are likely to share common functions with other well-studied organisms. Gene assemblies were also used to identify strain polymorphisms, examine stage-specific expression, and identify gene families. An interesting class of genes that are confined to members of this phylum and not shared by plants, animals, or fungi, was identified. These genes likely mediate the novel biological features of members of the Apicomplexa and hence offer great potential for biological investigation and as possible therapeutic targets. [The sequence data from this study have been submitted to the dbEST division of GenBank for Toxoplasma gondii, Plasmodium falciparum, Sarcocystis neurona, Eimeria tenella, and Neospora caninum.] PMID:12618375

  14. The discovery of the microphthalmia locus and its gene, Mitf

    PubMed Central

    Arnheiter, Heinz

    2010-01-01

    Summary The history of the discovery of the microphthalmia locus and its gene, now called Mitf, is a testament to the triumph of serendipity. Although the first microphthalmia mutation was discovered among the descendants of a mouse that was irradiated for the purpose of mutagenesis, the mutation most likely was not radiation-induced but occurred spontaneously in one of the parents of a later breeding. Although Mitf might eventually have been identified by other molecular genetic techniques, it was first cloned from a chance transgene insertion at the microphthalmia locus. And although Mitf was found to encode a member of a well-known transcription factor family, its analysis might still be in its infancy had Mitf not turned out to be of crucial importance for the physiology and pathology of many distinct organs, including eye, ear, immune system, bone, and skin, and in particular for melanoma. In fact, nearly seven decades of Mitf research have led to many insights about development, function, degeneration, and malignancies of a number of specific cell types, and it is hoped that these insights will one day lead to therapies benefitting those afflicted with diseases originating in these cell types. PMID:20807369

  15. Chromosome substitution strains: gene discovery functional analysis and systems studies

    PubMed Central

    Nadeau, Joseph H.; Forejt, Jiri; Takada, Toyoyuki; Shiroishi, Toshihiko

    2014-01-01

    Laboratory mice are valuable in biomedical research in part because of the extraordinary diversity of genetic resources that are available for studies of complex genetic traits and as models for human biology and disease. Chromosome substitution strains (CSSs) are important in this resource portfolio because of their demonstrated use for gene discovery, genetic and epigenetic studies, functional characterizations, and systems analysis. CSSs are made by replacing a single chromosome in a host strain with the corresponding chromosome from a donor strain. A complete CSS panel involves a total of 22 engineered inbred strains, one for each of the 19 autosomes, one each for the X and Y chromosomes, and one for mitochondria. A genome survey simply involves comparing each phenotype for each of the CSSs with the phenotypes of the host strain. The CSS panels that are available for laboratory mice have been used to dissect a remarkable variety of phenotypes and to characterize an impressive array of disease models. These surveys have revealed considerable phenotypic diversity even among closely related progenitor strains, evidence for strong epistasis and for heritable epigenetic changes. Perhaps most importantly, and presumably because of their unique genetic constitution, the genetic variants underlying quantitative trait loci (QTLs) are readily identified and functionally characterized with CSSs and congenic strains derived from them. Together these studies show that CSSs are an important resource for laboratory mice. PMID:22961226

  16. Gene discovery through imaging genetics: identification of two novel genes associated with schizophrenia.

    PubMed

    Potkin, S G; Turner, J A; Fallon, J A; Lakatos, A; Keator, D B; Guffanti, G; Macciardi, F

    2009-04-01

    We have discovered two genes, RSRC1 and ARHGAP18, associated with schizophrenia, and in an independent study provided additional support for this association. We combined a genome-wide screening strategy with neuroimaging measures as the quantitative phenotype and identified the single nucleotide polymorphisms (SNPs) related to these genes as consistently associated with the phenotypic variation. To control for the risk of false positives, the empirical P-value for association significance was calculated using permutation testing. The quantitative phenotype was Blood-Oxygen-Level Dependent (BOLD) Contrast activation in the left dorsal lateral prefrontal cortex measured during a working memory task. The differential distribution of SNPs associated with these two genes in cases and controls was then corroborated in a larger, independent sample of patients with schizophrenia (n=82) and healthy controls (n=91), thus suggesting a putative etiological function for both genes in schizophrenia. Up until now these genes have not been linked to any neuropsychiatric illness, although both genes have a function in prenatal brain development. We introduce the use of functional magnetic resonance imaging activation as a quantitative phenotype in conjunction with genome-wide association as a gene discovery tool. PMID:19065146
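The permutation testing mentioned in this abstract can be illustrated with a minimal sketch. It assumes a single SNP coded as 0/1/2 minor-allele counts, a quantitative phenotype per subject, and absolute Pearson correlation as the test statistic; the function names, the statistic, and the add-one correction are illustrative choices, not the authors' pipeline.

```python
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector carries no association signal
    return sxy / (sxx * syy) ** 0.5


def empirical_p_value(genotypes, phenotype, n_permutations=2000, seed=0):
    """Estimate a P-value by shuffling phenotype labels across subjects.

    Hypothetical helper: counts how often a random relabelling gives a
    statistic at least as extreme as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(genotypes, phenotype))
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(genotypes, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)
```

Shuffling breaks any real genotype-phenotype link while preserving both marginal distributions, so the permuted statistics approximate the null distribution without a parametric assumption.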

  17. Discovery and New Frontiers Project Budget Analysis Tool

    NASA Technical Reports Server (NTRS)

    Newhouse, Marilyn E.

    2011-01-01

    The Discovery and New Frontiers (D&NF) programs are multi-project, uncoupled programs that currently comprise 13 missions in phases A through F. The ability to fly frequent science missions to explore the solar system is the primary measure of program success. The program office uses a Budget Analysis Tool to perform "what-if" analyses and compare mission scenarios to the current program budget, and rapidly forecast the programs' ability to meet their launch rate requirements. The tool allows the user to specify the total mission cost (fixed year), mission development and operations profile by phase (percent total mission cost and duration), launch vehicle, and launch date for multiple missions. The tool automatically applies inflation and rolls up the total program costs (in real year dollars) for comparison against available program budget. Thus, the tool allows the user to rapidly and easily explore a variety of launch rates and analyze the effect of changes in future mission or launch vehicle costs, the differing development profiles or operational durations of a future mission, or a replan of a current mission on the overall program budget. Because the tool also reports average monthly costs for the specified mission profile, the development or operations cost profile can easily be validated against program experience for similar missions. While specifically designed for predicting overall program budgets for programs that develop and operate multiple missions concurrently, the basic concept of the tool (rolling up multiple, independently budgeted lines) could easily be adapted to other applications.
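The roll-up described in this abstract (fixed-year mission cost, split by phase, inflated to real-year dollars and summed across missions) can be sketched as follows. This is a simplified illustration, not the NASA tool: the data layout, the single constant inflation rate, whole-year phase durations, and the use of a start year in place of a launch date are all assumptions.

```python
from collections import defaultdict


def roll_up_program_cost(missions, base_year, inflation_rate):
    """Return {year: real-year cost} summed over all missions.

    Each mission is a dict (hypothetical layout) with:
      total_cost  - total mission cost in fixed base-year dollars
      start_year  - calendar year in which the first phase begins
      phases      - list of (fraction_of_total_cost, duration_in_years)
    """
    totals = defaultdict(float)
    for mission in missions:
        year = mission["start_year"]
        for fraction, duration in mission["phases"]:
            # Spread the phase cost evenly over the years it spans.
            per_year = mission["total_cost"] * fraction / duration
            for _ in range(duration):
                # Convert fixed-year dollars to real-year dollars.
                factor = (1 + inflation_rate) ** (year - base_year)
                totals[year] += per_year * factor
                year += 1
    return dict(totals)


def years_over_budget(totals, budget_by_year):
    """List the years in which rolled-up cost exceeds the available budget."""
    return sorted(y for y, cost in totals.items()
                  if cost > budget_by_year.get(y, 0.0))
```

A "what-if" comparison then amounts to changing one mission's start year or phase profile, calling `roll_up_program_cost` again, and checking `years_over_budget` against the same budget line.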

  18. New Discoveries From The Archean Biosphere Drilling Project (ABDP)

    NASA Astrophysics Data System (ADS)

    Nedachi, M.

    2004-12-01

    The Archean Biosphere Drilling Project (ABDP), an international scientific drilling project involving scientists from the USA, Australia and Japan, was initiated in Pilbara Craton, Western Australia. The scientific objectives of the ABDP are the identification of microfossils and biomarkers, the clarification of the geochemical environment of the early Earth, and the understanding of the geophysical contribution to the co-evolution of life and environment. Through 2003 and 2004 activities, we have drilled 150–300 m deep holes to recover "fresh" (modern weathering-free) geologic formations that range from 3.5 to 2.7 Ga in age. The drilling targets were: (1) 3.46 Ga Towers Formation, (2) mid-Archean Mosquito Formation, (3) 2.77 Ga Mt Roe Basalt, (4) 2.76 Ga Tumbiana Formation, (5) 2.74 Ga Hardey Formation. The initial investigations on the ABDP drill cores by Japanese members have already produced many exciting and interesting data and observations. 3.46 Ga Marble Bar Jasper could provide clues to the argument about the early photosynthetic cyanobacteria that have produced free oxygen and have evolved the oxygen level on the earth. There have been many ideas about how the hematite in jasper was formed. Our most important discoveries are the confirmations, using FE-SEM, ESCA, Laser-Raman and cathodoluminescence, that hematite, magnetite and siderite precipitated separately as primary minerals, and that there is a remaining texture which resembles microfossils. Taking into account the carbon isotopic ratios of remains from −25 to −40 permil, these iron oxides might be biogenic. We need to identify the iron bacteria in detail to deduce the early earth's surface environment. In addition, the black shale of Apex Basalt overlying Marble Bar Jasper contains organic carbon from 0.7 to 5.2 percent, with a carbon isotopic ratio from −26 to −30 permil, suggesting that various microbes inhabited the early Archean ocean.
    2.77 Ga Mt Roe Basalt, which is composed of basaltic lavas interbedded with tuffs, clastic sediment and minor evaporites, well preserves the primary biogeochemical, geochemical and geophysical phenomena. The discovery of black shale with sulfide nodules is worthy of special attention. Our study suggests that the following succession of events occurred more than once: (1) eruption of amygdaloidal basaltic lava followed by eruption of tuff into shallower water, (2) deposition of sandstone and black shale, and (3) concurrent hydrothermal activity with reduced fluids that altered the tuff and the lowermost clastic sediments. The extremely light carbon isotopic ratios suggest the activities of methanogens in hydrothermal veinlets and methanotrophs in black shale. In addition, the wide range of sulfur isotopic ratios in black shale suggests activity of co-existing sulfate-reducing bacteria in the black shale. Occasional presence of sandstone, especially in the late stage of clastic sedimentation, suggests sedimentation near a coastal environment. Stromatolite-like microtexture in the sandstone suggests the existence of photosynthetic microbes, which is supported by heavy carbon isotopic ratios (up to −25 permil) and by the signals of hopanoid biomarkers. The three-dimensional geochemical data suggest the existence of a marine environment ranging from oxic at the shallow site to euxinic at the deeper site. Paleomagnetic analyses suggest the episodic initiation of the earth's dynamo at about 3.5 Ga and the increase of its momentum since at least 2.77 Ga. Taking into account the biogeochemical evidence confirmed from other ABDP cores, the increase of geomagnetic intensity might have accelerated the diversification of early life.

  19. Gene Prioritization for Imaging Genetics Studies Using Gene Ontology and a Stratified False Discovery Rate Approach

    PubMed Central

    Patel, Sejal; Park, Min Tae M.; Chakravarty, M. Mallar; Knight, Jo

    2016-01-01

    Imaging genetics is an emerging field in which the association between genes and neuroimaging-based quantitative phenotypes is used to explore the functional role of genes in neuroanatomy and neurophysiology in the context of healthy function and neuropsychiatric disorders. The main obstacle for researchers in the field is the high dimensionality of the data in both the imaging phenotypes and the genetic variants commonly typed. In this article, we develop a novel method that utilizes Gene Ontology, an online database, to select and prioritize certain genes, employing a stratified false discovery rate (sFDR) approach to investigate their associations with imaging phenotypes. sFDR has the potential to increase power in genome wide association studies (GWAS), and is quickly gaining traction as a method for multiple testing correction. Our novel approach addresses the pressing need in genetic research to move beyond candidate gene studies without being overburdened by a loss of power due to multiple testing. As an example of our methodology, we perform a GWAS of hippocampal volume using both the Enhancing NeuroImaging Genetics through Meta-Analysis (ENIGMA2) and the Alzheimer's Disease Neuroimaging Initiative datasets. The analysis of ENIGMA2 data yielded a set of SNPs with sFDR values between 10 and 20%. Our approach demonstrates a potential method to prioritize genes based on biological systems impaired in a disease. PMID:27092072
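The idea behind a stratified FDR can be sketched in a few lines. The version below assumes that each SNP carries a stratum label (for example "prioritized" for SNPs in genes selected through Gene Ontology and "other" for the rest) and that Benjamini-Hochberg adjusted values are computed separately within each stratum. The labels, function names, and the choice of Benjamini-Hochberg are assumptions for illustration and may differ from the procedure used in the article.

```python
def bh_q_values(p_values):
    """Benjamini-Hochberg adjusted p-values (step-up procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


def stratified_q_values(p_values, strata):
    """Apply the BH adjustment independently inside each stratum."""
    groups = {}
    for idx, label in enumerate(strata):
        groups.setdefault(label, []).append(idx)
    q = [None] * len(p_values)
    for indices in groups.values():
        adjusted = bh_q_values([p_values[i] for i in indices])
        for idx, value in zip(indices, adjusted):
            q[idx] = value
    return q


if __name__ == "__main__":
    p = [0.001, 0.5, 0.04, 0.9]
    labels = ["prioritized", "other", "prioritized", "other"]
    print(bh_q_values(p))                  # one pooled correction
    print(stratified_q_values(p, labels))  # correction within each stratum
```

When true signals are concentrated in the prioritized stratum, its tests are corrected against a smaller, signal-enriched set, which is where the potential gain in power comes from; a stratum with no signal gains nothing.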

  20. Discovery.

    ERIC Educational Resources Information Center

    Smithsonian Institution, Washington, DC. National Air And Space Museum.

    This material presents the historical perspectives of flight and student activities for grades K-3 prepared by the National Air and Space Museum (NASM) and National Aeronautics and Space Administration (NASA). Sections included are: (1) "Historical Perspective of Flight"; (2) "Discovery Vocabulary" (listing the terms found in the first section);…

  1. Nurturing Giftedness in All Children through Original Research. Project Discovery Final Report.

    ERIC Educational Resources Information Center

    Luvisi, Christopher L.

    This final report describes accomplishments of Project Discovery, a 3-year project in Kentucky to assist teachers in creating an innovative learning environment for gifted and talented primary-aged children. Major goals focused on and achieved by the project included: (1) increasing the percentage of disadvantaged students identified as gifted in…

  2. Discovery of mammalian genes that participate in virus infection

    PubMed Central

    Organ, Edward L; Sheng, Jinsong; Ruley, H Earl; Rubin, Donald H

    2004-01-01

    Background Viruses are obligate intracellular parasites that rely upon the host cell for different steps in their life cycles. The characterization of cellular genes required for virus infection and/or cell killing will be essential for understanding viral life cycles, and may provide cellular targets for new antiviral therapies. Results Candidate genes required for lytic reovirus infection were identified by tagged sequence mutagenesis, a process that permits rapid identification of genes disrupted by gene entrapment. One hundred fifty-one reovirus resistant clones were selected from cell libraries containing 2 × 10⁵ independently disrupted genes, of which 111 contained mutations in previously characterized genes and functionally anonymous transcription units. Collectively, the genes associated with reovirus resistance differed from genes targeted by random gene entrapment in that known mutational hot spots were underrepresented, and a number of mutations appeared to cluster around specific cellular processes, including: IGF-II expression/signalling, vesicular transport/cytoskeletal trafficking and apoptosis. Notably, several of the genes have been directly implicated in the replication of reovirus and other viruses at different steps in the viral lifecycle. Conclusions Tagged sequence mutagenesis provides a rapid, genome-wide strategy to identify candidate cellular genes required for virus infection. The candidate genes provide a starting point for mechanistic studies of cellular processes that participate in the virus lifecycle and may provide targets for novel anti-viral therapies. PMID:15522117

  3. Using the DFCI Gene Index Databases for Biological Discovery

    PubMed Central

    Antonescu, Corina; Antonescu, Valentin; Sultana, Razvan; Quackenbush, John

    2014-01-01

    The DFCI Gene Index Web pages provide access to analyses of ESTs and gene sequences for nearly 114 species, as well as a number of resources derived from these. Each species-specific database is presented using a common format with a home page. A variety of methods exist that allow users to search each species-specific database. Methods implemented currently include nucleotide or protein sequence queries using WU-BLAST, text-based searches using various sequence identifiers, searches by gene, tissue and library name, and searches using functional classes through Gene Ontology assignments. This protocol provides guidance for using the Gene Index Databases to extract information. PMID:20205187

  4. Gene discovery of crop disease in the postgenome era

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Plant resistance (R) genes provide effective protection against invading pathogens at the front line of defense. Advances in genomic technology have accelerated efforts to characterize a wide range of crop R genes from diverse and economically important crops, resulting in effective crop protection....

  5. Literature Based Discovery of Gene Clusters Using Phylogenetic Methods

    PubMed Central

    Sarkar, Indra Neil; Agrawal, Abha

    2006-01-01

    Biomedical literature can offer valuable information for organizing genes associated with the etiology and pathogenesis of disease. In this study, we demonstrate the utility of existing phylogenetic methods for organizing 375 genes associated with Breast Cancer using the MeSH annotations from over 35,000 Medline articles. Specifically, we compare the clustering (using the Colless Imbalance Index, Ic) of distance-based methods, which are used by popular phylogenetic clustering algorithms, and a characteristic-based method (Maximum Parsimony) that is commonly used for phylogenetic studies. Focusing on genes that cluster around BRCA1 and BRCA2, we examine the relevance of the clustered genes proposed by the different clustering methods based on the number of exclusive MeSH terms. Our results indicate that existing phylogenetic methods and associated metrics can be used for organizing genes according to annotated knowledge in biomedical literature. PMID:17238429
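The Colless Imbalance Index (Ic) named in this abstract sums, over every internal node of a rooted binary tree, the absolute difference between the number of leaves on its two sides, and is commonly normalized by (n-1)(n-2)/2 for n leaves so that 0 is fully balanced and 1 is a fully unbalanced "caterpillar" tree. A minimal sketch follows, assuming trees are written as nested 2-tuples with arbitrary leaf labels; the representation and the gene labels are illustrative, not taken from the study.

```python
def colless_index(tree):
    """Normalized Colless imbalance of a rooted binary tree.

    A leaf is any non-tuple value; an internal node is a (left, right) tuple.
    """
    def walk(node):
        # Returns (number of leaves, summed imbalance) for the subtree.
        if not isinstance(node, tuple):
            return 1, 0
        left, right = node
        n_left, imb_left = walk(left)
        n_right, imb_right = walk(right)
        return n_left + n_right, imb_left + imb_right + abs(n_left - n_right)

    n_leaves, raw = walk(tree)
    if n_leaves < 3:
        return 0.0  # normalization is undefined for fewer than three leaves
    return 2 * raw / ((n_leaves - 1) * (n_leaves - 2))


if __name__ == "__main__":
    balanced = (("g1", "g2"), ("g3", "g4"))
    caterpillar = ((("g1", "g2"), "g3"), "g4")
    print(colless_index(balanced), colless_index(caterpillar))
```

Comparing Ic across trees built from the same gene set gives a single number for how evenly each clustering method splits the genes.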

  6. Speeding disease gene discovery by sequence based candidate prioritization

    PubMed Central

    Adie, Euan A; Adams, Richard R; Evans, Kathryn L; Porteous, David J; Pickard, Ben S

    2005-01-01

    Background Regions of interest identified through genetic linkage studies regularly exceed 30 centimorgans in size and can contain hundreds of genes. Traditionally this number is reduced by matching functional annotation to knowledge of the disease or phenotype in question. However, here we show that disease genes share patterns of sequence-based features that can provide a good basis for automatic prioritization of candidates by machine learning. Results We examined a variety of sequence-based features and found that for many of them there are significant differences between the sets of genes known to be involved in human hereditary disease and those not known to be involved in disease. We have created an automatic classifier called PROSPECTR based on those features using the alternating decision tree algorithm which ranks genes in the order of likelihood of involvement in disease. On average, PROSPECTR enriches lists for disease genes two-fold 77% of the time, five-fold 37% of the time and twenty-fold 11% of the time. Conclusion PROSPECTR is a simple and effective way to identify genes involved in Mendelian and oligogenic disorders. It performs markedly better than the single existing sequence-based classifier on novel data. PROSPECTR could save investigators looking at large regions of interest time and effort by prioritizing positional candidate genes for mutation detection and case-control association studies. PMID:15766383

  7. Guidelines to Media Utilization: Lessons from Project Discovery. Research Report 64.

    ERIC Educational Resources Information Center

    Taylor, William D.; Dipaolo, Andy

    An assessment was made in 1978 of the long-term effects on school media utilization of Project Discovery (1964-1967), which was designed to test the effect of "maximum availability" of instructional materials on curriculum, on student attitudes, achievement, creativity, and motivation, and on teaching methods and techniques. The project also tested…

  8. GWAS as a Driver of Gene Discovery in Cardiometabolic Diseases.

    PubMed

    Atanasovska, Biljana; Kumar, Vinod; Fu, Jingyuan; Wijmenga, Cisca; Hofker, Marten H

    2015-12-01

    Cardiometabolic diseases represent a common complex disorder with a strong genetic component. Currently, genome-wide association studies (GWAS) have yielded some 755 single-nucleotide polymorphisms (SNPs) encompassing 366 independent loci that may help to decipher the molecular basis of cardiometabolic diseases. Going from a disease SNP to the underlying disease mechanisms is a huge challenge because the associated SNPs rarely disrupt protein function. Many disease SNPs are located in noncoding regions, and therefore attention is now focused on linking genetic SNP variation to effects on gene expression levels. By integrating genetic information with large-scale gene expression data, and with data from epigenetic roadmaps revealing gene regulatory regions, we expect to be able to identify candidate disease genes and the regulatory potential of disease SNPs. PMID:26596674

  9. Systems-wide chicken DNA microarrays, gene expression profiling, and discovery of functional genes.

    PubMed

    Cogburn, L A; Wang, X; Carre, W; Rejto, L; Porter, T E; Aggrey, S E; Simon, J

    2003-06-01

    The goal of our current consortium project is to launch a new era--functional genomics of poultry--by providing genomic resources [expressed sequence tags (EST) and DNA microarrays] and by examining global gene expression in target tissues of chickens. DNA microarray analysis has been a fruitful strategy for the identification of functional genes in several model organisms (e.g., human, rodents, and fruit fly). We have constructed and normalized five tissue-specific or multiple-tissue chicken cDNA libraries [liver, fat, breast, and leg muscle/epiphyseal growth plate, pituitary/hypothalamus/pineal, and reproductive tract (oviduct/ovary/testes)] for high-throughput DNA sequencing of EST. DNA sequence clustering was used to build contigs of overlapping sequence and to identify unique, non-redundant EST clones (unigenes), which permitted printing of systems-wide chicken DNA microarrays. One of the most promising genetic resources for gene exploration and functional gene mapping is provided by two sets of experimental lines of broiler-type chickens developed at INRA, France, by divergent selection for extremes in growth traits (fast-growing versus slow-growing; fatness versus leanness at a similar growth rate). We are using DNA microarrays for global gene expression profiling to identify candidate genes and to map growth, metabolic, and regulatory pathways that control important production traits. Candidate genes will be used for functional gene mapping and QTL analysis of F2 progeny from intercrosses made between divergent genetic lines (fat x lean lines; fast-growing x slow-growing lines). Using our first chicken liver microarray, we have already identified several interesting differentially expressed genes in commercial broilers and in divergently selected broiler lines. Many of these candidate genes are involved in the lipogenic pathway and are controlled in part by the thyrotropic axis. Thus, genome-wide transcriptional profiling is a powerful tool used to visualize the cascade of genetic circuits that govern complex biological responses. Global gene expression profiling and QTL scans should enable us to functionally map the genetic pathways that control growth, development, and metabolism of chickens. This emerging technology will have broad applications for poultry breeding programs (e.g., use of molecular markers) and for future production systems (e.g., the health and welfare of birds and the quality of poultry products). PMID:12817449

  10. Gene Discovery through Genomic Sequencing of Brucella abortus

    PubMed Central

    Sánchez, Daniel O.; Zandomeni, Ruben O.; Cravero, Silvio; Verdún, Ramiro E.; Pierrou, Ester; Faccio, Paula; Diaz, Gabriela; Lanzavecchia, Silvia; Agüero, Fernán; Frasch, Alberto C. C.; Andersson, Siv G. E.; Rossetti, Osvaldo L.; Grau, Oscar; Ugalde, Rodolfo A.

    2001-01-01

    Brucella abortus is the etiological agent of brucellosis, a disease that affects bovines and humans. We generated DNA random sequences from the genome of B. abortus strain 2308 in order to characterize molecular targets that might be useful for developing immunological or chemotherapeutic strategies against this pathogen. The partial sequencing of 1,899 clones allowed the identification of 1,199 genomic sequence surveys (GSSs) with high homology (BLAST expect value < 10^-5) to sequences deposited in the GenBank databases. Among them, 925 represent putative novel genes for the Brucella genus. Out of 925 nonredundant GSSs, 470 were classified in 15 categories based on cellular function. Seven hundred GSSs showed no significant database matches and remain available for further studies in order to identify their function. A high number of GSSs with homology to Agrobacterium tumefaciens and Rhizobium meliloti proteins were observed, thus confirming their close phylogenetic relationship. Among them, several GSSs showed high similarity with genes related to nodule nitrogen fixation, synthesis of nod factors, nodulation protein symbiotic plasmid, and nodule bacteroid differentiation. We have also identified several B. abortus homologs of virulence and pathogenesis genes from other pathogens, including a homolog to both the Shda gene from Salmonella enterica serovar Typhimurium and the AidA-1 gene from Escherichia coli. Other GSSs displayed significant homologies to genes encoding components of the type III and type IV secretion machineries, suggesting that Brucella might also have an active type III secretion machinery. PMID:11159979

  11. GENOME-ENABLED DISCOVERY OF CARBON SEQUESTRATION GENES IN POPLAR

    SciTech Connect

    DAVIS J M

    2007-10-11

    Plants utilize carbon by partitioning the reduced carbon obtained through photosynthesis into different compartments and into different chemistries within a cell and subsequently allocating such carbon to sink tissues throughout the plant. Since the phytohormones auxin and cytokinin are known to influence sink strength in tissues such as roots (Skoog & Miller 1957, Nordstrom et al. 2004), we hypothesized that altering the expression of genes that regulate auxin-mediated (e.g., AUX/IAA or ARF transcription factors) or cytokinin-mediated (e.g., RR transcription factors) control of root growth and development would impact carbon allocation and partitioning belowground (Fig. 1 - Renewal Proposal). Specifically, the ARF, AUX/IAA and RR transcription factor gene families mediate the effects of the growth regulators auxin and cytokinin on cell expansion, cell division and differentiation into root primordia. Invertases (IVR), whose transcript abundance is enhanced by both auxin and cytokinin, are critical components of carbon movement and therefore of carbon allocation. Thus, we initiated comparative genomic studies to identify the AUX/IAA, ARF, RR and IVR gene families in the Populus genome that could impact carbon allocation and partitioning. Bioinformatics searches using Arabidopsis gene sequences as queries identified regions with high degrees of sequence similarities in the Populus genome. These Populus sequences formed the basis of our transgenic experiments. Transgenic modification of gene expression involving members of these gene families was hypothesized to have profound effects on carbon allocation and partitioning.

  12. Gene discovery in the horned beetle Onthophagus taurus

    PubMed Central

    2010-01-01

    Background Horned beetles, in particular in the genus Onthophagus, are important models for studies on sexual selection, biological radiations, the origin of novel traits, developmental plasticity, biocontrol, conservation, and forensic biology. Despite their growing prominence as models for studying both basic and applied questions in biology, little genomic or transcriptomic data are available for this genus. We used massively parallel pyrosequencing (Roche 454-FLX platform) to produce a comprehensive EST dataset for the horned beetle Onthophagus taurus. To maximize sequence diversity, we pooled RNA extracted from a normalized library encompassing diverse developmental stages and both sexes. Results We used 454 pyrosequencing to sequence ESTs from all post-embryonic stages of O. taurus. Approximately 1.36 million reads assembled into 50,080 non-redundant sequences encompassing a total of 26.5 Mbp. The non-redundant sequences match over half of the genes in Tribolium castaneum, the most closely related species with a sequenced genome. Analyses of Gene Ontology annotations and biochemical pathways indicate that the O. taurus sequences reflect a wide and representative sampling of biological functions and biochemical processes. An analysis of sequence polymorphisms revealed that SNP frequency was negatively related to overall expression level and the number of tissue types in which a given gene is expressed. The most variable genes were enriched for a limited number of GO annotations whereas the least variable genes were enriched for a wide range of GO terms directly related to fitness. Conclusions This study provides the first large-scale EST database for horned beetles, a much-needed resource for advancing the study of these organisms. Furthermore, we identified instances of gene duplications and alternative splicing, useful for future study of gene regulation, and a large number of SNP markers that could be used in population-genetic studies of O. taurus and possibly other horned beetles. PMID:21156066

  13. Transient transformation meets gene function discovery: the strawberry fruit case

    PubMed Central

    Guidarelli, Michela; Baraldi, Elena

    2015-01-01

    Besides its well-known nutritional and health benefits, the strawberry (Fragaria X ananassa) crop draws increasing attention as a plant model system for the Rosaceae family, owing to its short generation time, its rapid in vitro regeneration, and the availability of the genome sequences of the F. X ananassa and F. vesca species. In recent years, the use of high-throughput sequencing technologies has provided large amounts of molecular information on the genes possibly related to several biological processes of this crop. Nevertheless, the function of most genes or gene products is still poorly understood and needs investigation. Transient transformation technology provides a powerful tool to study gene function in vivo, avoiding the drawbacks that typically affect stable transformation protocols, such as transformation efficiency, transformant selection, and regeneration. In this review we provide an overview of the use of transient expression in the investigation of the function of genes important for strawberry fruit development, defense and nutritional properties. The technical aspects related to an efficient use of this technique are described, and the possible impact and application in strawberry crop improvement are discussed. PMID:26124771

  14. Transposons for cancer gene discovery: Sleeping Beauty and beyond

    PubMed Central

    Collier, Lara S; Largaespada, David A

    2007-01-01

    The use of Sleeping Beauty transposons as somatic mutagens to discover cancer genes in hematopoietic tumors and sarcomas has been documented. Here, we discuss the future of Sleeping Beauty for cancer genetic studies and the potential use of additional transposable elements for somatic mutagenesis. PMID:18047692

  15. QUANTITATIVE TRAIT LOCUS ANALYSIS AS A GENE DISCOVERY TOOL

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Quantitative trait locus analysis has been a mainstay approach for obtaining a genetic description of complex agronomic traits for plants. What is sometimes overlooked is the role QTL analysis can play in identifying genes that underlie complex traits. In this chapter, I will describe the basic st...

  16. Gene discovery for facioscapulohumeral muscular dystrophy by machine learning techniques.

    PubMed

    González-Navarro, Félix F; Belanche-Muñoz, Lluís A; Gámez-Moreno, María G; Flores-Ríos, Brenda L; Ibarra-Esquer, Jorge E; López-Morteo, Gabriel A

    2016-04-28

    Facioscapulohumeral muscular dystrophy (FSHD) is a neuromuscular disorder that shows a preference for the facial, shoulder and upper arm muscles. FSHD affects about one in 20-400,000 people, and no effective therapeutic strategies are known to halt disease progression or reverse muscle weakness or atrophy. Many genes may be incorrectly regulated in affected muscle tissue, but the mechanisms responsible for the progressive muscle weakness remain largely unknown. Although machine learning (ML) has made significant inroads in biomedical disciplines such as cancer research, no reports have yet addressed FSHD analysis using ML techniques. This study explores a specific FSHD data set from an ML perspective. We report results showing a very promising small group of genes that clearly separates FSHD samples from healthy samples. In addition to numerical prediction figures, we show data visualizations and biological evidence illustrating the potential usefulness of these results. PMID:26960968

  17. Discovery of the lomaiviticin biosynthetic gene cluster in Salinispora pacifica

    PubMed Central

    Janso, Jeffrey E.; Haltli, Brad A.; Eustáquio, Alessandra S.; Kulowski, Kerry; Waldman, Abraham J.; Zha, Li; Nakamura, Hitomi; Bernan, Valerie S.; He, Haiyin; Carter, Guy T.; Koehn, Frank E.; Balskus, Emily P.

    2014-01-01

    The lomaiviticins are a family of cytotoxic marine natural products that have captured the attention of both synthetic and biological chemists due to their intricate molecular scaffolds and potent biological activities. Here we describe the identification of the gene cluster responsible for lomaiviticin biosynthesis in Salinispora pacifica strains DPJ-0016 and DPJ-0019 using a combination of molecular approaches and genome sequencing. The link between the lom gene cluster and lomaiviticin production was confirmed using bacterial genetics, and subsequent analysis and annotation of this cluster revealed the biosynthetic basis for the core polyketide scaffold. Additionally, we have used comparative genomics to identify candidate enzymes for several unusual tailoring events, including diazo formation and oxidative dimerization. These findings will allow further elucidation of the biosynthetic logic of lomaiviticin assembly and provide useful molecular tools for application in biocatalysis and synthetic biology. PMID:25045187

  18. Obstructive nephropathy: towards biomarker discovery and gene therapy.

    PubMed

    Chevalier, Robert L

    2006-03-01

    Obstructive nephropathy is a major cause of renal failure, particularly in infants and children. Cellular and molecular mechanisms responsible for the progression of tubular atrophy and interstitial fibrosis, the processes that lead to nephron loss, have been elucidated in the past 5 years. Following urinary tract obstruction and tubular dilatation, a cascade of events results in upregulation of the intrarenal renin-angiotensin system, tubular apoptosis and macrophage infiltration of the interstitium. This is followed by accumulation of interstitial fibroblasts through proliferation of resident fibroblasts and epithelial-mesenchymal transformation of renal tubular cells. Under the influence of cytokines, chemokines and other signaling molecules produced by tubular and interstitial cells, fibroblasts undergo transformation to myofibroblasts that induce expansion of the extracellular matrix. The cellular interactions that regulate development of interstitial inflammation, tubular apoptosis and interstitial fibrosis are complex. Changes in renal gene expression and protein production afford many potential biomarkers of disease progression and targets for therapeutic manipulation. These include signaling molecules and receptors involved in macrophage recruitment and proliferation, tubular death signals and survival factors, and modulators of epithelial-mesenchymal transformation. Targeted gene deletion and various forms of gene therapy have been used in experimental obstructive nephropathy, mostly rodent models of unilateral ureteral obstruction or cell culture techniques. Further refinement of these models is needed to develop a matrix of biomarkers with clinical predictive value, as well as molecular therapies that will prevent or reverse the renal structural and functional consequences of obstructive nephropathy. PMID:16932414

  19. Resequencing and comparative genomics of Stagonospora nodorum: Sectional gene absence and effector discovery

    Technology Transfer Automated Retrieval System (TEKTRAN)

    S. nodorum is an important wheat (Triticum aestivum) pathogen in many parts of the world causing major yield losses. It was the first species in the large fungal Dothideomycete class to be genome sequenced. The reference genome sequence (SN15) has been instrumental in the discovery of genes encoding...

  20. Whole-genome resequencing: changing the paradigms of SNP detection, molecular mapping and gene discovery

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The next generation sequencing (NGS) technologies have opened a wealth of opportunities for plant breeding and genomics research, and changed the paradigms of marker detection, genotyping, and gene discovery. Abundant genomic resources have been generated using a whole genome resequencing (WGR) str...

  1. Network-based gene prediction for Plasmodium falciparum malaria towards genetics-based drug discovery

    PubMed Central

    2015-01-01

    Background Malaria is the most deadly parasitic infectious disease. Existing drug treatments have limited efficacy in malaria elimination, and the complex pathogenesis of the disease is not fully understood. Detecting novel malaria-associated genes not only contributes to revealing the disease pathogenesis, but also facilitates discovering new targets for anti-malaria drugs. Methods In this study, we developed a network-based approach to predict malaria-associated genes. We constructed a cross-species network to integrate human-human, parasite-parasite and human-parasite protein interactions. Then we extended the random walk algorithm on this network, and used known malaria genes as the seeds to find novel candidate genes for malaria. Results We validated our algorithms using 77 known malaria genes: 14 human genes and 63 parasite genes were ranked on average within the top 2% and top 4% of the human and parasite genomes, respectively. We also evaluated our method for predicting novel malaria genes using a set of 27 genes with literature supporting evidence. Our approach ranked 12 genes within the top 1% and 24 genes within the top 5%. In addition, we demonstrated that top-ranked candidate genes were enriched for drug targets, and identified commonalities underlying top-ranked malaria genes through pathway analysis. In summary, the candidate malaria-associated genes predicted by our data-driven approach have the potential to guide genetics-based anti-malaria drug discovery. PMID:26099491
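
    The seed-based network ranking described in the abstract above can be sketched with a plain random walk with restart. This is a generic textbook variant, not the authors' extended algorithm; the restart probability, the node names, and the toy cross-species network are all hypothetical.

```python
def random_walk_with_restart(neighbors, seeds, restart=0.3, tol=1e-12, max_iter=10000):
    """Score every node by its steady-state visit probability.

    neighbors: dict mapping each node to a list of adjacent nodes (undirected).
    seeds: set of known disease genes where the walker restarts.
    """
    nodes = list(neighbors)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed node.
        nxt = {n: restart * p0[n] for n in nodes}
        # Otherwise it moves to a uniformly chosen neighbor.
        for n in nodes:
            if not neighbors[n]:
                continue
            share = (1 - restart) * p[n] / len(neighbors[n])
            for m in neighbors[n]:
                nxt[m] += share
        if sum(abs(nxt[n] - p[n]) for n in nodes) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical toy network: a chain linking human (h*) and parasite (p*) proteins.
toy_network = {
    "seed_gene": ["h1"],
    "h1": ["seed_gene", "p1"],
    "p1": ["h1", "p2"],
    "p2": ["p1", "h2"],
    "h2": ["p2"],
}
scores = random_walk_with_restart(toy_network, {"seed_gene"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

    Nodes closer to the seed receive higher scores, so sorting by score yields a candidate ranking of the kind the abstract evaluates against known malaria genes.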

  2. Tissue Compartment Analysis for Biomarker Discovery by Gene Expression Profiling

    PubMed Central

    Disset, Antoine; Cheval, Lydie; Soutourina, Olga; Duong Van Huyen, Jean-Paul; Li, Guorong; Genin, Christian; Tostain, Jacques; Loupy, Alexandre

    2009-01-01

    Background Although high throughput technologies for gene profiling are reliable tools, sample/tissue heterogeneity limits their outcomes when applied to identify molecular markers. Indeed, inter-sample differences in cell composition scatter the data, preventing detection of small but relevant changes in gene expression level. To date, attempts to circumvent this difficulty were based on isolation of the different cell structures constituting biological samples. As an alternative approach, we developed a tissue compartment analysis (TCA) method to assess the cell composition of tissue samples, and applied it to standardize data and to identify biomarkers. Methodology/Principal Findings TCA is based on the comparison of mRNA expression levels of specific markers of the different constitutive structures in pure isolated structures, on the one hand, and in the whole sample on the other. The TCA method was developed here with human kidney samples, as an example of a highly heterogeneous organ. It was validated by comparison of the data with those obtained by histo-morphometry. TCA demonstrated the extreme variety of composition of kidney samples, with abundance of specific structures varying from 5 to 95% of the whole sample. TCA made it possible to accurately standardize gene expression levels among >100 kidney biopsies, and to identify otherwise imperceptible molecular disease markers. Conclusions/Significance Because TCA does not require specific preparation of samples, it can be applied to all existing tissue or cDNA libraries or to published data sets, provided that specific operational compartment markers are available. In humans, where the small size of tissue samples collected in clinical practice accounts for high structural diversity, TCA is well suited for the identification of molecular markers of diseases, and the follow up of identified markers in single patients for diagnosis/prognosis and evaluation of therapy efficiency. In laboratory animals, TCA could usefully be applied to the central nervous system, where tissue heterogeneity is a limiting factor. PMID:19901995

  3. Scientific Discovery with the Blue Gene/L

    SciTech Connect

    Negele, John W.

    2011-12-09

    This project succeeded in developing key software optimization tools to bring fundamental QCD calculations of nucleon structure from the Terascale era through the Petascale era and prepare for the Exascale era. It also enabled fundamental QCD physics calculations and demonstrated the power of placing small versions of frontier emerging architectures at MIT to attract outstanding students to computational science. MIT also hosted a workshop on September 19, 2008 to brainstorm ways to promote computational science at top tier research universities and attract gifted students into the field, some of whom would provide the next generation of talent at our defense laboratories.

  4. Cohesin gene mutations in tumorigenesis: from discovery to clinical significance

    PubMed Central

    Solomon, David A.; Kim, Jung-Sik; Waldman, Todd

    2014-01-01

    Cohesin is a multi-protein complex composed of four core subunits (SMC1A, SMC3, RAD21, and either STAG1 or STAG2) that is responsible for the cohesion of sister chromatids following DNA replication until its cleavage during mitosis thereby enabling faithful segregation of sister chromatids into two daughter cells. Recent cancer genomics analyses have discovered a high frequency of somatic mutations in the genes encoding the core cohesin subunits as well as cohesin regulatory factors (e.g. NIPBL, PDS5B, ESPL1) in a select subset of human tumors including glioblastoma, Ewing sarcoma, urothelial carcinoma, acute myeloid leukemia, and acute megakaryoblastic leukemia. Herein we review these studies including discussion of the functional significance of cohesin inactivation in tumorigenesis and potential therapeutic mechanisms to selectively target cancers harboring cohesin mutations. [BMB Reports 2014; 47(6): 299-310] PMID:24856830

  5. The Salinas Airshower Learning And Discovery Project (SALAD)

    NASA Astrophysics Data System (ADS)

    Hernandez, Victor; Niduaza, Rommel; Ruiz Castruita, Daniel; Knox, Adrian; Ramos, Daniel; Fan, Sewan; Fatuzzo, Laura

    2015-04-01

    The SALAD project partners community college and high school STEM students to develop and investigate cosmic ray detector telescopes and the underlying physical concepts, using a new light sensor technology based on silicon photomultiplier (SiPM) detectors. Replacing the conventional photomultiplier with the SiPM offers notable advantages in cost and facilitates more in-depth, hands-on laboratory learning activities. The students in the SALAD project design, construct and extensively evaluate the SiPM detector modules. These SiPM modules can be completed in a short time using cost-effective components. We describe our research to implement SiPMs as readout light detectors for plastic scintillators in a cosmic ray detector telescope for use in high schools. In particular, we describe our work on the design, evaluation and assembly of (1) a fast preamplifier, (2) a simple coincidence circuit using fast comparators, to discriminate the SiPM noise signal pulses, and (3) a monovibrator circuit to shape the singles plus the AND logic pulses for subsequent processing. To store the singles and coincidence counts data, an Arduino micro-controller with program sketches can be implemented. Results and findings from our work will be described and presented. US Department of Education Title V Grant Award PO31S090007
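
    The coincidence logic described in the abstract above can be illustrated offline with a short sketch that counts pulses from two detectors arriving within a small time window. It is a software analogy, not the project's comparator circuit or Arduino code; the function name, the timestamps, and the 5 ns window are hypothetical.

```python
def count_coincidences(times_a, times_b, window):
    """Count pulses in detector A that have a pulse in detector B within +/- window.

    Both timestamp lists must be sorted in ascending order (same time unit).
    """
    count = 0
    j = 0
    for t in times_a:
        # Skip B pulses that are too early to match this or any later A pulse.
        while j < len(times_b) and times_b[j] < t - window:
            j += 1
        if j < len(times_b) and abs(times_b[j] - t) <= window:
            count += 1
    return count


# Hypothetical pulse arrival times in nanoseconds for two scintillator paddles.
detector_a = [10, 250, 400, 900]
detector_b = [12, 300, 401, 1500]
coincidences = count_coincidences(detector_a, detector_b, window=5)
```

    Uncorrelated noise pulses rarely overlap in time, so requiring a coincidence keeps mostly the events seen by both detectors, while the per-detector list lengths correspond to the singles counts.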

  6. The Alveolate Perkinsus marinus: Biological Insights from EST Gene Discovery

    PubMed Central

    2010-01-01

    Background Perkinsus marinus, a protozoan parasite of the eastern oyster Crassostrea virginica, has devastated natural and farmed oyster populations along the Atlantic and Gulf coasts of the United States. It is classified as a member of the Perkinsozoa, a recently established phylum considered close to the ancestor of ciliates, dinoflagellates, and apicomplexans, and a key taxon for understanding unique adaptations (e.g. parasitism) within the Alveolata. Despite intense parasite pressure, no disease-resistant oysters have been identified and no effective therapies have been developed to date. Results To gain insight into the biological basis of the parasite's virulence and pathogenesis mechanisms, and to identify genes encoding potential targets for intervention, we generated >31,000 5' expressed sequence tags (ESTs) derived from four trophozoite libraries generated from two P. marinus strains. Trimming and clustering of the sequence tags yielded 7,863 unique sequences, some of which carry a spliced leader. Similarity searches revealed that 55% of these had hits in protein sequence databases, of which 1,729 had their best hit with proteins from the chromalveolates (E-value ≤ 1e-5). Some sequences are similar to those proven to be targets for effective intervention in other protozoan parasites, and include not only proteases, antioxidant enzymes, and heat shock proteins, but also those associated with relict plastids, such as acetyl-CoA carboxylase and methylerythritol phosphate pathway components, and those involved in glycan assembly, protein folding/secretion, and parasite-host interactions. Conclusions Our transcriptome analysis of P. marinus, the first for any member of the Perkinsozoa, contributes new insight into its biology and taxonomic position. It provides a very informative, albeit preliminary, glimpse into the expression of genes encoding functionally relevant proteins as potential targets for chemotherapy, and evidence for the presence of a relict plastid. Further, although P. marinus sequences display significant similarity to those from both apicomplexans and dinoflagellates, the presence of trans-spliced transcripts confirms the previously established affinities with the latter. The EST analysis reported herein, together with the recently completed sequence of the P. marinus genome and the development of transfection methodology, should result in improved intervention strategies against dermo disease. PMID:20374649

  7. Peanut EST Project: Gene discovery and marker development

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Aflatoxin contamination caused by Aspergillus fungi is a great concern in peanut production worldwide. Pre-harvest Aspergillus infection and aflatoxin contamination are usually severe in peanuts that are grown under drought-stressed conditions. Genomic research can provide new tools to study plant-m...

  8. Comprehensive genetic variant discovery in the surfactant protein B gene.

    PubMed

    Hamvas, Aaron; Wegner, Daniel J; Carlson, Christopher S; Bergmann, Kelly R; Trusgnich, Michelle A; Fulton, Lucinda; Kasai, Yumi; An, Ping; Mardis, Elaine R; Wilson, Richard K; Cole, F Sessions

    2007-08-01

    Completely penetrant mutations in the surfactant protein B gene (SFTPB) and >75% reduction of SFTPB expression disrupt pulmonary surfactant function and cause neonatal respiratory distress syndrome. To inform studies of genetic regulation of SFTPB expression, we created a catalogue of SFTPB variants by comprehensive resequencing from an unselected, population-based cohort (n = 1,116). We found an excess of low-frequency variation [81 SNPs and five small insertion/deletions (in/dels)]. Despite its small genomic size (9.7 kb), SFTPB was characterized by weak linkage disequilibrium (LD) and high haplotype diversity. Using the HapMap Yoruban and European populations, we identified a recombination hot spot that spans SFTPB, was not detectable in our focused resequencing data, and accounts for weak LD. Using homology-based software tools, we discovered no definitively damaging exonic variants. We conclude that excess low-frequency variation, intragenic recombination and lack of common disruptive exonic variants favor complete resequencing as the optimal approach for genetic association studies to identify regulatory SFTPB variants that cause neonatal respiratory distress syndrome in genetically diverse populations. PMID:17597650

  9. Marfan Syndrome and Related Disorders: 25 Years of Gene Discovery.

    PubMed

    Verstraeten, Aline; Alaerts, Maaike; Van Laer, Lut; Loeys, Bart

    2016-06-01

    Marfan syndrome (MFS) is a rare, autosomal-dominant, multisystem disorder, presenting with skeletal, ocular, skin, and cardiovascular symptoms. Significant clinical overlap with other systemic connective tissue diseases, including Loeys-Dietz syndrome (LDS), Shprintzen-Goldberg syndrome (SGS), and the MASS phenotype, has been documented. In MFS and LDS, the cardiovascular manifestations account for the major cause of patient morbidity and mortality, rendering them the main target for therapeutic intervention. Over the past decades, gene identification studies confidently linked the aforementioned syndromes, as well as nonsyndromic aneurysmal disease, to genetic defects in proteins related to the transforming growth factor (TGF)-β pathway, greatly expanding our knowledge on the disease mechanisms and providing us with novel therapeutic targets. As a result, the focus of the developing pharmacological treatment strategies is shifting from hemodynamic stress management to TGF-β antagonism. In this review, we discuss the insights that have been gained in the molecular biology of MFS and related disorders over the past 25 years. PMID:26919284

  10. LSST’s Projected Near-Earth Asteroid Discovery Performance

    NASA Astrophysics Data System (ADS)

    Chesley, Steven R.; Vereš, Peter

    2015-11-01

    The Large Synoptic Survey Telescope (LSST) is an ambitious project that has the potential to make major advances in Near-Earth Asteroid search efforts. With construction already underway and major optical elements complete, first light is set for 2020, followed by two years of commissioning. Once regular survey operations begin in 2022, LSST will systematically survey the observable sky over a ten-year period from its site on Cerro Pachon, Chile. With an 8.4 m aperture (6.5 m effective), 9.6 square degree field of view, and a 3.2-Gigapixel camera, LSST represents the most capable asteroid survey instrument ever built. LSST will be able to cover over 6000 square degrees of sky per clear night with single visit exposures of 30 s, reaching a faint limit of 24.5 mag in the r band. However, the cadence of survey operations is a critical factor for the near-Earth asteroid search performance, and there are multiple science drivers with different cadence objectives that are competing to shape the final survey strategy. We examine the NEA search performance of various LSST search strategies, paying particular attention to the challenges of linking large numbers of asteroid detections in the presence of noise. Our approach is to derive lists of synthetic detections for a given instantiation of the LSST survey, based on an assumed model for the populations of solar system objects from the main asteroid belt inwards to the near-Earth population. These detection lists are combined with false detection lists that model both random noise and non-random artifacts resulting from image differencing algorithms. These large detection lists are fed to the Moving Object Processing System (MOPS), which attempts to link the synthetic detections correctly without becoming confused or overwhelmed by the false detections. The LSST baseline survey cadence relies primarily on single night pairs of detections, with roughly 30-60 min separating elements of the pair. The strategy of using pairs is an aggressive and potentially fragile approach, but theoretically represents the most productive NEA search with the minimum impact on other LSST science drivers.

  11. The role of phenotype in gene discovery in the whole genome sequencing era

    PubMed Central

    Almasy, Laura

    2012-01-01

    As whole genome sequence becomes a routine component of gene discovery studies in humans, we will have an exhaustive catalog of genetic variation and the challenge becomes understanding the phenotypic consequences of these variants. Statistical genetic methods and analytical approaches that are concerned with optimizing phenotypes for gene discovery for complex traits offer two general categories of advantages. They may increase power to localize genes of interest and also aid in interpreting associations between genetic variants and disease outcomes by suggesting potential mechanisms and pathways through which genes may affect outcomes. Such phenotype optimization approaches include use of allied phenotypes such as symptoms or ages of onset to reduce genetic heterogeneity within a set of cases, study of quantitative risk factors or endophenotypes, joint analyses of related phenotypes, and derivation of new phenotypes designed to extract independent measures underlying the correlations among a set of related phenotypes through approaches such as principal components. New opportunities are also presented by technological advances that permit efficient collection of hundreds or thousands of phenotypes on an individual, including phenotypes more proximal to the level of gene action such as levels of gene expression, microRNAs, or metabolic and proteomic profiles. PMID:22722752

  12. Global Landscape of a Co-Expressed Gene Network in Barley and its Application to Gene Discovery in Triticeae Crops

    PubMed Central

    Mochida, Keiichi; Uehara-Yamaguchi, Yukiko; Yoshida, Takuhiro; Sakurai, Tetsuya; Shinozaki, Kazuo

    2011-01-01

    Accumulated transcriptome data can be used to investigate regulatory networks of genes involved in various biological systems. Co-expression analysis data sets generated from comprehensively collected transcriptome data sets now represent efficient resources that are capable of facilitating the discovery of genes with closely correlated expression patterns. In order to construct a co-expression network for barley, we analyzed 45 publicly available experimental series, which are composed of 1,347 sets of GeneChip data for barley. On the basis of a gene-to-gene weighted correlation coefficient, we constructed a global barley co-expression network and classified it into clusters of subnetwork modules. The resulting clusters are candidates for functional regulatory modules in the barley transcriptome. To annotate each of the modules, we performed comparative annotation using genes in Arabidopsis and Brachypodium distachyon. On the basis of a comparative analysis between barley and two model species, we investigated functional properties from the representative distributions of the gene ontology (GO) terms. Modules putatively involved in drought stress response and cellulose biogenesis have been identified. These modules are discussed to demonstrate the effectiveness of the co-expression analysis. Furthermore, we applied the data set of co-expressed genes coupled with comparative analysis in attempts to discover potentially Triticeae-specific network modules. These results demonstrate that analysis of the co-expression network of the barley transcriptome together with comparative analysis should promote the process of gene discovery in barley. Furthermore, the insights obtained should be transferable to investigations of Triticeae plants. The associated data set generated in this analysis is publicly accessible at http://coexpression.psc.riken.jp/barley/. PMID:21441235

  13. Modern plant metabolomics: Advanced natural product gene discoveries, improved technologies, and future prospects

    DOE PAGESBeta

    Sumner, Lloyd W.; Lei, Zhentian; Nikolau, Basil J.; Saito, Kazuki

    2014-10-24

    Plant metabolomics has matured and modern plant metabolomics has accelerated gene discoveries and the elucidation of a variety of plant natural product biosynthetic pathways. This study highlights specific examples of the discovery and characterization of novel genes and enzymes associated with the biosynthesis of natural products such as flavonoids, glucosinolates, terpenoids, and alkaloids. Additional examples of the integration of metabolomics with genome-based functional characterizations of plant natural products that are important to modern pharmaceutical technology are also reviewed. This article also provides a substantial review of recent technical advances in mass spectrometry imaging, nuclear magnetic resonance imaging, integrated LC-MS-SPE-NMR for metabolite identifications, and x-ray crystallography of microgram quantities for structural determinations. The review closes with a discussion on the future prospects of metabolomics related to crop species and herbal medicine.

  14. Modern plant metabolomics: Advanced natural product gene discoveries, improved technologies, and future prospects

    SciTech Connect

    Sumner, Lloyd W.; Lei, Zhentian; Nikolau, Basil J.; Saito, Kazuki

    2014-10-24

    Plant metabolomics has matured and modern plant metabolomics has accelerated gene discoveries and the elucidation of a variety of plant natural product biosynthetic pathways. This study highlights specific examples of the discovery and characterization of novel genes and enzymes associated with the biosynthesis of natural products such as flavonoids, glucosinolates, terpenoids, and alkaloids. Additional examples of the integration of metabolomics with genome-based functional characterizations of plant natural products that are important to modern pharmaceutical technology are also reviewed. This article also provides a substantial review of recent technical advances in mass spectrometry imaging, nuclear magnetic resonance imaging, integrated LC-MS-SPE-NMR for metabolite identifications, and x-ray crystallography of microgram quantities for structural determinations. The review closes with a discussion on the future prospects of metabolomics related to crop species and herbal medicine.

  15. Modern plant metabolomics: advanced natural product gene discoveries, improved technologies, and future prospects.

    PubMed

    Sumner, Lloyd W; Lei, Zhentian; Nikolau, Basil J; Saito, Kazuki

    2015-02-01

    Plant metabolomics has matured and modern plant metabolomics has accelerated gene discoveries and the elucidation of a variety of plant natural product biosynthetic pathways. This review covers the approximate period of 2000 to 2014, and highlights specific examples of the discovery and characterization of novel genes and enzymes associated with the biosynthesis of natural products such as flavonoids, glucosinolates, terpenoids, and alkaloids. Additional examples of the integration of metabolomics with genome-based functional characterizations of plant natural products that are important to modern pharmaceutical technology are also reviewed. This article also provides a substantial review of recent technical advances in mass spectrometry imaging, nuclear magnetic resonance imaging, integrated LC-MS-SPE-NMR for metabolite identifications, and X-ray crystallography of microgram quantities for structural determinations. The review closes with a discussion on the future prospects of metabolomics related to crop species and herbal medicine. PMID:25342293

  16. Discovery of New Candidate Genes Related to Brain Development Using Protein Interaction Information

    PubMed Central

    Chen, Lei; Chu, Chen; Kong, Xiangyin; Huang, Tao; Cai, Yu-Dong

    2015-01-01

    Human brain development is a dramatic process composed of a series of complex and fine-tuned spatiotemporal gene expressions. A good comprehension of this process can assist us in developing the potential of our brain. However, we have only limited knowledge about the genes and gene functions that are involved in this biological process. Therefore, a substantial demand remains to discover new brain development-related genes and identify their biological functions. In this study, we aimed to discover new brain-development related genes by building a computational method. We referred to a series of computational methods used to discover new disease-related genes and developed a similar method. In this method, the shortest path algorithm was executed on a weighted graph that was constructed using protein-protein interactions. New candidate genes fell on at least one of the shortest paths connecting two known genes that are related to brain development. A randomization test was then adopted to filter positive discoveries. Of the final identified genes, several have been reported to be associated with brain development, indicating the effectiveness of the method, whereas several of the others may have potential roles in brain development. PMID:25635857
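    The shortest-path step described in this abstract can be illustrated with a minimal sketch. It assumes an undirected graph in which a lower edge weight means a stronger interaction; the gene names, weights, and helper functions below are hypothetical and are not taken from the paper, and the randomization filter is not shown.

```python
import heapq
from itertools import combinations


def build_graph(edges):
    """Build an undirected weighted adjacency map from (u, v, weight) tuples."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns one shortest path as a node list, or []."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, w in graph.get(node, {}).items():
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if target not in dist:
        return []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def candidate_genes(graph, known):
    """Collect interior nodes of shortest paths between every pair of known genes."""
    found = set()
    for a, b in combinations(sorted(known), 2):
        found.update(shortest_path(graph, a, b)[1:-1])
    return found - set(known)


# Hypothetical toy network: K1..K3 are "known" genes, A..C are unlabeled.
toy_edges = [
    ("K1", "A", 1.0), ("A", "K2", 1.0),
    ("K1", "B", 5.0), ("B", "K2", 5.0),
    ("K2", "C", 1.0), ("C", "K3", 1.0),
]
toy_candidates = candidate_genes(build_graph(toy_edges), {"K1", "K2", "K3"})
```

    In this toy network, B is never selected because the route through it is costlier than the route through A.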

  17. In silico prioritization based on coexpression can aid epileptic encephalopathy gene discovery

    PubMed Central

    Oliver, Karen L.; Lukic, Vesna; Freytag, Saskia; Scheffer, Ingrid E.; Berkovic, Samuel F.

    2016-01-01

    Objective: To evaluate the performance of an in silico prioritization approach that was applied to 179 epileptic encephalopathy candidate genes in 2013 and to expand the application of this approach to the whole genome based on expression data from the Allen Human Brain Atlas. Methods: PubMed searches determined which of the 179 epileptic encephalopathy candidate genes had been validated. For validated genes, it was noted whether they were 1 of the 19 of 179 candidates prioritized in 2013. The in silico prioritization approach was applied genome-wide; all genes were ranked according to their coexpression strength with a reference set (i.e., 51 established epileptic encephalopathy genes) in both adult and developing human brain expression data sets. Candidate genes ranked in the top 10% for both data sets were cross-referenced with genes previously implicated in the epileptic encephalopathies due to a de novo variant. Results: Five of 6 validated epileptic encephalopathy candidate genes were among the 19 prioritized in 2013 (odds ratio = 54, 95% confidence interval [7,∞], p = 4.5 × 10^−5, Fisher exact test); one gene was false negative. A total of 297 genes ranked in the top 10% for both the adult and developing brain data sets based on coexpression with the reference set. Of these, 9 had been previously implicated in the epileptic encephalopathies (FBXO41, PLXNA1, ACOT4, PAK6, GABBR2, YWHAG, NBEA, KNDC1, and SELRC1). Conclusions: We conclude that brain gene coexpression data can be used to assist epileptic encephalopathy gene discovery and propose 9 genes as strong epileptic encephalopathy candidates worthy of further investigation. PMID:27066588
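    The p-value reported in this abstract can be checked from the counts it gives (179 candidates, 19 prioritized, 6 validated, 5 of the validated among the prioritized). The sketch below assumes a one-sided Fisher exact test, computed as a hypergeometric upper tail; the function and variable names are illustrative only.

```python
from math import comb


def hypergeom_upper_tail(total, marked, drawn, observed):
    """P(X >= observed) when `drawn` items are sampled without replacement
    from `total` items, of which `marked` carry the label of interest."""
    top = min(marked, drawn)
    denom = comb(total, drawn)
    tail = sum(
        comb(marked, k) * comb(total - marked, drawn - k)
        for k in range(observed, top + 1)
    )
    return tail / denom


# Counts taken from the abstract.
p_one_sided = hypergeom_upper_tail(total=179, marked=19, drawn=6, observed=5)

# 2x2 table: prioritized/validated = 5, prioritized/not validated = 14,
# not prioritized/validated = 1, not prioritized/not validated = 159.
sample_odds_ratio = (5 * 159) / (14 * 1)
```

    Under these assumptions the tail probability comes out at about 4.5 × 10^−5, in line with the reported value. The simple cross-product odds ratio is about 57; the abstract's figure of 54 presumably reflects a different estimator, such as a conditional maximum-likelihood estimate.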

  18. Gene discovery and microarray analysis of cacao (Theobroma cacao L.) varieties.

    PubMed

    Jones, Paul G; Allaway, David; Gilmour, D Martin; Harris, Chris; Rankin, Debbie; Retzel, Ernest R; Jones, Chris A

    2002-12-01

    The cacao bean harvest from the relatively underdeveloped tropical tree cacao (Theobroma cacao L.) is subject to high losses in potential production due to pests and diseases. To discover and understand the stability of putative natural resistance mechanisms in this commodity crop, essential for chocolate production, we undertook a gene-discovery program and demonstrated its use in gene-expression arrays. Sequencing and assembling bean and leaf cDNA library inserts produced a unique contig set of 1,380 members. High-quality annotation of this gene set using Blast and MetaFam produced annotation for 75% of the contigs and allowed us to identify the types of genes expressed in cacao beans and leaves. Microarrays were constructed using amplified inserts of the uni-gene set and challenged with bean and leaf RNA from five cacao varieties. The microarray performed well across the five randomly chosen cacao genotypes and did not show a bias towards either leaf or bean tissues. This demonstrates that the gene sequences are useful for microarray analysis across cacao genotypes and tissue types. The array results, when compared with real-time PCR results for selected genes, showed a correlation with differential gene-expression patterns. We intend that the resultant DNA sequences and molecular microarray platform will help the cacao community to understand the basis, likely stability and pathotype resistance range of candidate cacao plants. PMID:12447539

  19. The National Laboratory Gene Library Project

    SciTech Connect

    Deaven, L.L.; Van Dilla, M.A.

    1988-01-01

    The two National Laboratories at Livermore and Los Alamos have played a prominent role in the development and application of flow cytometry and sorting to chromosome classification and purification. Both laboratories began to receive numerous requests for specific human chromosomal types purified by flow sorting for gene library construction, but these requests were difficult to satisfy due to time and personnel constraints. The Department of Energy, through its Office of Health and Environmental Research, has a long-standing interest in the human genome in general and in the mutagenic and carcinogenic effects of energy-related environmental pollutants in particular. Hence, it was decided in 1983 to use flow sorting to construct chromosome-specific gene libraries to be made available to the genetic research community. The National Laboratory Gene Library Project was envisioned as a practical way to deal with requests for sorted chromosomes, and also as a way to promote increased understanding of the human genome and the effects of mutagens and carcinogens on it. The strategy for the project was developed with the help of an advisory committee as well as suggestions and advice from many other geneticists. 4 refs., 2 tabs.

  20. Exome sequencing for gene discovery in lethal fetal disorders--harnessing the value of extreme phenotypes.

    PubMed

    Filges, Isabel; Friedman, Jan M

    2015-10-01

    Massively parallel sequencing has revolutionized our understanding of Mendelian disorders, and many novel genes have been discovered to cause disease phenotypes when mutant. At the same time, next-generation sequencing approaches have enabled non-invasive prenatal testing of free fetal DNA in maternal blood. However, little attention has been paid to using whole exome and genome sequencing strategies for gene identification in fetal disorders that are lethal in utero, because they can appear to be sporadic and Mendelian inheritance may be missed. We present challenges and advantages of applying next-generation sequencing approaches to gene discovery in fetal malformation phenotypes and review recent successful discovery approaches. We discuss the implication and significance of recessive inheritance and cross-species phenotyping in fetal lethal conditions. Whole exome sequencing can be used in individual families with undiagnosed lethal congenital anomaly syndromes to discover causal mutations, provided that prior to data analysis, the fetal phenotype can be correlated to a particular developmental pathway in embryogenesis. Cross-species phenotyping allows providing further evidence for causality of discovered variants in genes involved in those extremely rare phenotypes and will increase our knowledge about normal and abnormal human developmental processes. Ultimately, families will benefit from the option of early prenatal diagnosis. PMID:25046514

  1. Evolving connectionist systems for knowledge discovery from gene expression data of cancer tissue.

    PubMed

    Futschik, Matthias E; Reeve, Anthony; Kasabov, Nikola

    2003-06-01

    Microarray techniques have made it possible to observe the expression of thousands of genes simultaneously. They have recently been applied to study gene expression patterns in tissue samples. This may lead to highly desirable improvements in the diagnosis and treatment of human diseases. Statistical and machine learning methods have recently been used to classify cancer tissue based on gene expression data. Although some of these methods have achieved a high degree of accuracy, they generally lack transparency in their classification process. This, however, is crucial for the application in the medical field. In order to overcome this obstacle, we used knowledge-based neurocomputing (KBN), since KBN seeks to gain knowledge that is comprehensible to humans. In particular, we applied evolving fuzzy neural networks (EFuNNs) to classify cancer tissue, which is illustrated on the case studies of leukaemia and colon cancer. EFuNNs belong to the evolving connectionist system paradigm (ECOS) that has been recently introduced. They are well suited for adaptive learning and knowledge discovery. Fuzzy logic rules can be extracted from the trained networks and offer knowledge about the classification process in an easily accessible form. These rules point to genes that are strongly associated with specific types of cancer and may be used for the development of new tests and treatment discoveries. PMID:12893118

  2. Ontological Discovery Environment: a system for integrating gene-phenotype associations.

    PubMed

    Baker, Erich J; Jay, Jeremy J; Philip, Vivek M; Zhang, Yun; Li, Zuopan; Kirova, Roumyana; Langston, Michael A; Chesler, Elissa J

    2009-12-01

    The wealth of genomic technologies has enabled biologists to rapidly ascribe phenotypic characters to biological substrates. Central to effective biological investigation is the operational definition of the process under investigation. We propose an elucidation of categories of biological characters, including disease relevant traits, based on natural endogenous processes and experimentally observed biological networks, pathways and systems rather than on externally manifested constructs and current semantics such as disease names and processes. The Ontological Discovery Environment (ODE) is an Internet accessible resource for the storage, sharing, retrieval and analysis of phenotype-centered genomic data sets across species and experimental model systems. Any type of data set representing gene-phenotype relationships, such as quantitative trait loci (QTL) positional candidates, literature reviews, microarray experiments, ontological or even meta-data, may serve as inputs. To demonstrate a use case leveraging the homology capabilities of ODE and its ability to synthesize diverse data sets, we conducted an analysis of genomic studies related to alcoholism. The core of ODE's gene set similarity, distance and hierarchical analysis is the creation of a bipartite network of gene-phenotype relations, a unique discrete graph approach to analysis that enables set-set matching of non-referential data. Gene sets are annotated with several levels of metadata, including community ontologies, while gene set translations compare models across species. Computationally derived gene sets are integrated into hierarchical trees based on gene-derived phenotype interdependencies. Automated set identifications are augmented by statistical tools which enable users to interpret the confidence of modeled results. This approach allows data integration and hypothesis discovery across multiple experimental contexts, regardless of the face similarity and semantic annotation of the experimental systems or species domain. PMID:19733230

  3. Discovery of nucleotide polymorphisms in the Musa gene pool by Ecotilling.

    PubMed

    Till, Bradley J; Jankowicz-Cieslak, Joanna; Sági, László; Huynh, Owen A; Utsushi, Hiroe; Swennen, Rony; Terauchi, Ryohei; Mba, Chikelu

    2010-11-01

    Musa (banana and plantain) is an important genus for the global export market and in local markets where it provides staple food for approximately 400 million people. Hybridization and polyploidization of several (sub)species, combined with vegetative propagation and human selection, have produced a complex genetic history. We describe the application of the Ecotilling method for the discovery and characterization of nucleotide polymorphisms in diploid and polyploid accessions of Musa. We discovered over 800 novel alleles in 80 accessions. Sequencing and band evaluation show Ecotilling to be a robust and accurate platform for the discovery of polymorphisms in homologous and homeologous gene targets. In the process of validating the method, we identified two single nucleotide polymorphisms that may be deleterious for the function of a gene putatively important for phototropism. Evaluation of heterozygous polymorphism and haplotype blocks revealed a high level of nucleotide diversity in Musa accessions. We further applied a strategy for the simultaneous discovery of heterozygous and homozygous polymorphisms in diploid accessions to rapidly evaluate nucleotide diversity in accessions of the same genome type. This strategy can be used to develop hypotheses for inheritance patterns of nucleotide polymorphisms within and between genome types. We conclude that Ecotilling is suitable for diversity studies in Musa, and that it can be considered for functional genomics studies and as a tool in selecting germplasm for traditional and mutation breeding approaches. PMID:20589365

  4. Genes Frequently Coexpressed with Hoxc8 Provide Insight into the Discovery of Target Genes

    PubMed Central

    Kalyani, Ruthala; Lee, Ji-Yeon; Min, Hyehyun; Yoon, Heejei; Kim, Myoung Hee

    2016-01-01

    Identifying Hoxc8 target genes is at the crux of understanding the Hoxc8-mediated regulatory networks underlying its roles during development. However, identification of these genes remains difficult due to intrinsic factors of Hoxc8, such as low DNA binding specificity, context-dependent regulation, and unknown cofactors. Therefore, as an alternative, the present study attempted to test whether the roles of Hoxc8 could be inferred by simply analyzing genes frequently coexpressed with Hoxc8, and whether these genes include putative target genes. Using archived gene expression datasets in which Hoxc8 was differentially expressed, we identified a total of 567 genes that were positively coexpressed with Hoxc8 in at least four out of eight datasets. Among these, 23 genes were coexpressed in six datasets. Gene sets associated with extracellular matrix and cell adhesion were most significantly enriched, followed by gene sets for skeletal system development, morphogenesis, cell motility, and transcriptional regulation. In particular, transcriptional regulators, including paralogs of Hoxc8, known Hox co-factors, and transcriptional remodeling factors were enriched. We randomly selected Adam19, Ptpn13, Prkd1, Tgfbi, and Aldh1a3, and validated their coexpression in mouse embryonic tissues and cell lines following TGF-β2 treatment or ectopic Hoxc8 expression. Except for Aldh1a3, all genes showed concordant expression with that of Hoxc8, suggesting that the coexpressed genes might include direct or indirect target genes. Collectively, we suggest that the coexpressed genes provide a resource for constructing Hoxc8-mediated regulatory networks. PMID:27025388

  5. Genes Frequently Coexpressed with Hoxc8 Provide Insight into the Discovery of Target Genes.

    PubMed

    Kalyani, Ruthala; Lee, Ji-Yeon; Min, Hyehyun; Yoon, Heejei; Kim, Myoung Hee

    2016-05-31

    Identifying Hoxc8 target genes is at the crux of understanding the Hoxc8-mediated regulatory networks underlying its roles during development. However, identification of these genes remains difficult due to intrinsic factors of Hoxc8, such as low DNA binding specificity, context-dependent regulation, and unknown cofactors. Therefore, as an alternative, the present study attempted to test whether the roles of Hoxc8 could be inferred by simply analyzing genes frequently coexpressed with Hoxc8, and whether these genes include putative target genes. Using archived gene expression datasets in which Hoxc8 was differentially expressed, we identified a total of 567 genes that were positively coexpressed with Hoxc8 in at least four out of eight datasets. Among these, 23 genes were coexpressed in six datasets. Gene sets associated with extracellular matrix and cell adhesion were most significantly enriched, followed by gene sets for skeletal system development, morphogenesis, cell motility, and transcriptional regulation. In particular, transcriptional regulators, including paralogs of Hoxc8, known Hox co-factors, and transcriptional remodeling factors were enriched. We randomly selected Adam19, Ptpn13, Prkd1, Tgfbi, and Aldh1a3, and validated their coexpression in mouse embryonic tissues and cell lines following TGF-β2 treatment or ectopic Hoxc8 expression. Except for Aldh1a3, all genes showed concordant expression with that of Hoxc8, suggesting that the coexpressed genes might include direct or indirect target genes. Collectively, we suggest that the coexpressed genes provide a resource for constructing Hoxc8-mediated regulatory networks. PMID:27025388

  6. Evaluation of five ab initio gene prediction programs for the discovery of maize genes.

    PubMed

    Yao, Hong; Guo, Ling; Fu, Yan; Borsuk, Lisa A; Wen, Tsui-Jung; Skibbe, David S; Cui, Xiangqin; Scheffler, Brian E; Cao, Jun; Emrich, Scott J; Ashlock, Daniel A; Schnable, Patrick S

    2005-02-01

    Five ab initio programs (FGENESH, GeneMark.hmm, GENSCAN, GlimmerR and Grail) were evaluated for their accuracy in predicting maize genes. Two of these programs, GeneMark.hmm and GENSCAN had been trained for maize; FGENESH had been trained for monocots (including maize), and the others had been trained for rice or Arabidopsis. Initial evaluations were conducted using eight maize genes (gl8a, pdc2, pdc3, rf2c, rf2d, rf2e1, rth1, and rth3) of which the sequences were not released to the public prior to conducting this evaluation. The significant advantage of this data set for this evaluation is that these genes could not have been included in the training sets of the prediction programs. FGENESH yielded the most accurate and GeneMark.hmm the second most accurate predictions. The five programs were used in conjunction with RT-PCR to identify and establish the structures of two new genes in the a1-sh2 interval of the maize genome. FGENESH, GeneMark.hmm and GENSCAN were tested on a larger data set consisting of maize assembled genomic islands (MAGIs) that had been aligned to ESTs. FGENESH, GeneMark.hmm and GENSCAN correctly predicted gene models in 773, 625, and 371 MAGIs, respectively, out of the 1353 MAGIs that comprise data set 2. PMID:15830133

  7. Gene Discovery of Modular Diterpene Metabolism in Nonmodel Systems

    PubMed Central

    Zerbe, Philipp; Hamberger, Björn; Yuen, Macaire M.S.; Chiang, Angela; Sandhu, Harpreet K.; Madilao, Lina L.; Nguyen, Anh; Hamberger, Britta; Bach, Søren Spanner; Bohlmann, Jörg

    2013-01-01

    Plants produce over 10,000 different diterpenes of specialized (secondary) metabolism, and fewer diterpenes of general (primary) metabolism. Specialized diterpenes may have functions in ecological interactions of plants with other organisms and also benefit humanity as pharmaceuticals, fragrances, resins, and other industrial bioproducts. Examples of high-value diterpenes are taxol and forskolin pharmaceuticals or ambroxide fragrances. Yields and purity of diterpenes obtained from natural sources or by chemical synthesis are often insufficient for large-volume or high-end applications. Improvement of agricultural or biotechnological diterpene production requires knowledge of biosynthetic genes and enzymes. However, specialized diterpene pathways are extremely diverse across the plant kingdom, and most specialized diterpenes are taxonomically restricted to a few plant species, genera, or families. Consequently, there is no single reference system to guide gene discovery and rapid annotation of specialized diterpene pathways. Functional diversification of genes and plasticity of enzyme functions of these pathways further complicate correct annotation. To address this challenge, we used a set of 10 different plant species to develop a general strategy for diterpene gene discovery in nonmodel systems. The approach combines metabolite-guided transcriptome resources, custom diterpene synthase (diTPS) and cytochrome P450 reference gene databases, phylogenies, and, as shown for select diTPSs, single and coupled enzyme assays using microbial and plant expression systems. In the 10 species, we identified 46 new diTPS candidates and over 400 putatively terpenoid-related P450s in a resource of nearly 1 million predicted transcripts of diterpene-accumulating tissues. Phylogenetic patterns of lineage-specific blooms of genes guided functional characterization. PMID:23613273

  8. Discovery of novel genes derived from transposable elements using integrative genomic analysis.

    PubMed

    Hoen, Douglas R; Bureau, Thomas E

    2015-06-01

    Complex eukaryotes contain millions of transposable elements (TEs), comprising large fractions of their nuclear genomes. TEs consist of structural, regulatory, and coding sequences that are ordinarily associated with transposition, but that occasionally confer on the organism a selective advantage and may thereby become exapted. Exapted transposable element genes (ETEs) are known to play critical roles in diverse systems, from vertebrate adaptive immunity to plant development. Yet despite their evident importance, most ETEs have been identified fortuitously and few systematic searches have been conducted, suggesting that additional ETEs may await discovery. To explore this possibility, we develop a comprehensive systematic approach to searching for ETEs. We use TE-specific conserved domains to identify with high precision genes derived from TEs and screen them for signatures of exaptation based on their similarities to reference sets of known ETEs, conventional (non-TE) genes, and TE genes across diverse genetic attributes including repetitiveness, conservation of genomic location and sequence, and levels of expression and repressive small RNAs. Applying this approach in the model plant Arabidopsis thaliana, we discover a surprisingly large number of novel high confidence ETEs. Intriguingly, unlike known plant ETEs, several of the novel ETE families form tandemly arrayed gene clusters, whereas others are relatively young. Our results not only identify novel TE-derived genes that may have practical applications but also challenge the notion that TE exaptation is merely a relic of ancient life, instead suggesting that it may continue to fundamentally drive evolution. PMID:25713212

  9. Inherited retinal diseases in dogs: advances in gene/mutation discovery

    PubMed Central

    Miyadera, Keiko

    2015-01-01

    Inherited retinal diseases (RDs) are vision-threatening conditions affecting humans as well as many domestic animals. Through many years of clinical studies of the domestic dog population, a wide array of RDs has been phenotypically characterized. Extensive effort to map the causative gene and to identify the underlying mutation followed. Through candidate gene, linkage analysis, genome-wide association studies, and more recently, by means of next-generation sequencing, as many as 31 mutations in 24 genes have been identified as the underlying cause for canine RDs. Most of these genes have been associated with human RDs, providing opportunities to study their roles in the disease pathogenesis and in normal visual function. The canine model has also contributed to developing new treatments such as gene therapy, which has been clinically applied to human patients. Meanwhile, with increasing knowledge of the molecular architecture of RDs in different subpopulations of dogs, the conventional understanding of RDs as a simple monogenic disease is beginning to change. Emerging evidence of modifiers that alter the disease outcome is complicating the interpretation of DNA tests. In this review, advances in the gene/mutation discovery approaches and the emerging genetic complexity of canine RDs are discussed. PMID:26120276

  10. RNA-Seq Analysis and Gene Discovery of Andrias davidianus Using Illumina Short Read Sequencing

    PubMed Central

    Li, Fenggang; Wang, Lixin; Lan, Qingjing; Yang, Hui; Li, Yang; Liu, Xiaolin; Yang, Zhaoxia

    2015-01-01

    The Chinese giant salamander, Andrias davidianus, is an important species in the course of evolution; however, there is insufficient genomic data in public databases for understanding its immunologic mechanisms. High-throughput transcriptome sequencing is necessary to generate an enormous number of transcript sequences from A. davidianus for gene discovery. In this study, we generated more than 40 million reads from samples of spleen and skin tissue using the Illumina paired-end sequencing technology. De novo assembly yielded 87,297 transcripts with a mean length of 734 base pairs (bp). Based on sequence similarity searches against known proteins, 38,916 genes were identified. Gene enrichment analysis determined that 981 transcripts were assigned to the immune system. Tissue-specific expression analysis indicated that 443 transcripts were specifically expressed in the spleen and skin. Among these transcripts, 147 were found to be involved in immune responses and inflammatory reactions, such as fucolectin, β-defensins and lymphotoxin beta. Eight tissue-specific genes were selected for validation using real time reverse transcription quantitative PCR (qRT-PCR). The results showed that these genes were expressed at significantly higher levels in spleen and skin than in other tissues, suggesting that these genes have vital roles in the immune response. This work provides a comprehensive genomic sequence resource for A. davidianus and lays the foundation for future research on the immunologic and disease resistance mechanisms of A. davidianus and other amphibians. PMID:25874626

  11. In silico mining and PCR-based approaches to transcription factor discovery in non-model plants: gene discovery of the WRKY transcription factors in conifers.

    PubMed

    Liu, Jun-Jun; Xiang, Yu

    2011-01-01

    WRKY transcription factors are key regulators of numerous biological processes in plant growth and development, as well as plant responses to abiotic and biotic stresses. Research on biological functions of plant WRKY genes has focused in the past on model plant species or species with largely characterized transcriptomes. However, a variety of non-model plants, such as forest conifers, are essential as feed, biofuel, and wood or for sustainable ecosystems. Identification of WRKY genes in these non-model plants is equally important for understanding the evolutionary and function-adaptive processes of this transcription factor family. Because of limited genomic information, the rarity of regulatory gene mRNAs in transcriptomes, and the sequence divergence to model organism genes, identification of transcription factors in non-model plants using methods similar to those generally used for model plants is difficult. This chapter describes a gene family discovery strategy for identification of WRKY transcription factors in conifers by a combination of in silico-based prediction and PCR-based experimental approaches. Compared to traditional cDNA library screening or EST sequencing at transcriptome scales, this integrated gene discovery strategy provides fast, simple, reliable, and specific methods to unveil the WRKY gene family at both genome and transcriptome levels in non-model plants. PMID:21720945

  12. Genomics-driven discovery of the pneumocandin biosynthetic gene cluster in the fungus Glarea lozoyensis

    PubMed Central

    2013-01-01

    Background: The antifungal therapy caspofungin is a semi-synthetic derivative of pneumocandin B0, a lipohexapeptide produced by the fungus Glarea lozoyensis, and was the first member of the echinocandin class approved for human therapy. The nonribosomal peptide synthetase (NRPS)-polyketide synthase (PKS) gene cluster responsible for pneumocandin biosynthesis from G. lozoyensis has not been elucidated to date. In this study, we report the elucidation of the pneumocandin biosynthetic gene cluster by whole genome sequencing of the G. lozoyensis wild-type strain ATCC 20868. Results: The pneumocandin biosynthetic gene cluster contains a NRPS (GLNRPS4) and a PKS (GLPKS4) arranged in tandem, two cytochrome P450 monooxygenases, seven other modifying enzymes, and genes for L-homotyrosine biosynthesis, a component of the peptide core. Thus, the pneumocandin biosynthetic gene cluster is significantly more autonomous and organized than the recently characterized echinocandin B gene cluster. Disruption mutants of GLNRPS4 and GLPKS4 no longer produced the pneumocandins (A0 and B0), and the Δglnrps4 and Δglpks4 mutants lost antifungal activity against the human pathogenic fungus Candida albicans. In addition to pneumocandins, the G. lozoyensis genome encodes a rich repertoire of natural product-encoding genes including 24 PKSs, six NRPSs, five PKS-NRPS hybrids, two dimethylallyl tryptophan synthases, and 14 terpene synthases. Conclusions: Characterization of the gene cluster provides a blueprint for engineering new pneumocandin derivatives with improved pharmacological properties. Whole genome estimation of the secondary metabolite-encoding genes from G. lozoyensis provides yet another example of the huge potential for drug discovery from natural products from the fungal kingdom. PMID:23688303

  13. Network Based Consensus Gene Signatures for Biomarker Discovery in Breast Cancer

    PubMed Central

    Fröhlich, Holger

    2011-01-01

    Diagnostic and prognostic biomarkers for cancer based on gene expression profiles are viewed as a major step towards a better personalized medicine. Many studies using various computational approaches have been published in this direction during the last decade. However, when comparing different gene signatures for related clinical questions often only a small overlap is observed. This can have various reasons, such as technical differences of platforms, differences in biological samples or their treatment in lab, or statistical reasons because of the high dimensionality of the data combined with small sample size, leading to unstable selection of genes. In conclusion retrieved gene signatures are often hard to interpret from a biological point of view. We here demonstrate that it is possible to construct a consensus signature from a set of seemingly different gene signatures by mapping them on a protein interaction network. Common upstream proteins of close gene products, which we identified via our developed algorithm, show a very clear and significant functional interpretation in terms of overrepresented KEGG pathways, disease associated genes and known drug targets. Moreover, we show that such a consensus signature can serve as prior knowledge for predictive biomarker discovery in breast cancer. Evaluation on different datasets shows that signatures derived from the consensus signature reveal a much higher stability than signatures learned from all probesets on a microarray, while at the same time being at least as predictive. Furthermore, they are clearly interpretable in terms of enriched pathways, disease associated genes and known drug targets. In summary we thus believe that network based consensus signatures are not only a way to relate seemingly different gene signatures to each other in a functional manner, but also to establish prior knowledge for highly stable and interpretable predictive biomarkers. PMID:22046239

  14. Arctic Research Mapping Application (ARMAP) Showcases discovery level metadata for US Funded Research Projects

    NASA Astrophysics Data System (ADS)

    Gaylord, A. G.; Kassin, A.; Cody, R. P.; Manley, W. F.; Dover, M.; Score, R.; Garcia-Lavigne, D.; Tweedie, C. E.

    2013-12-01

    The Arctic Research Mapping Application (ARMAP) is a suite of online applications and data services that support Arctic science by providing project tracking information (who's doing what, when and where in the region) for United States Government funded projects. Development of an interagency standard for tracking discovery level metadata for projects has been achieved through collaboration with the Alaska Data Integration work group. The US National Science Foundation plus 17 other agencies and organizations have adopted the standard with several entities successfully implementing XML based REST webservices. With ARMAP's web mapping applications and data services (http://armap.org), users can search for research projects by location, year, funding program, keyword, investigator, and discipline, among other variables. Key information about each project is displayed within the application with links to web pages that provide additional information. The ARMAP 2D mapping application has been significantly enhanced to include support for multiple projections, improved base maps, additional reference data layers, and optimization for better performance. In 2013, ship tracks for US National Science Foundation supported vessel based surveys and health care facilities have been included in ARMAP. The additional functionality of this tool will increase awareness of projects funded by numerous entities in the Arctic, enhance coordination for logistics support, help identify geographic gaps in research efforts and potentially foster more collaboration amongst researchers working in the region. Additionally, ARMAP can be used to demonstrate the effects of the International Polar Year (IPY) on funding of different research disciplines by the U.S. Government.

  15. Arctic Research Mapping Application (ARMAP) Showcases discovery level metadata for US Funded Research Projects

    NASA Astrophysics Data System (ADS)

    Score, R.; Gaylord, A. G.; Kassin, A.; Cody, R. P.; Copenhaver, W.; Manley, W. F.; Dover, M.; Tweedie, C. E.

    2014-12-01

    The Arctic Research Mapping Application (ARMAP) is a suite of online applications and data services that support Arctic science by providing project tracking information (who's doing what, when and where in the region) for United States Government funded projects. Development of an interagency standard for tracking discovery level metadata for projects has been achieved through collaboration with the Alaska Data Integration work group. The US National Science Foundation plus 17 other agencies and organizations have adopted the standard with several entities successfully implementing XML based REST webservices. With ARMAP's web mapping applications and data services (http://armap.org), users can search for research projects by location, year, funding program, keyword, investigator, and discipline, among other variables. Key information about each project is displayed within the application with links to web pages that provide additional information. The ARMAP 2D mapping application has been significantly enhanced to include support for multiple projections, improved base maps, additional reference data layers, and optimization for better performance. In 2014, ship tracks for US National Science Foundation supported vessel based surveys have been expanded. These enhancements have been made to increase awareness of projects funded by numerous entities in the Arctic, enhance coordination for logistics support, help identify geographic gaps in research efforts and potentially foster more collaboration amongst researchers working in the region. Additionally, ARMAP can be used to demonstrate past, present, and future research efforts supported by the U.S. Government.

  16. Designing and Developing a NASA Research Projects Knowledge Base and Implementing Knowledge Management and Discovery Techniques

    NASA Astrophysics Data System (ADS)

    Dabiru, L.; O'Hara, C. G.; Shaw, D.; Katragadda, S.; Anderson, D.; Kim, S.; Shrestha, B.; Aanstoos, J.; Frisbie, T.; Policelli, F.; Keblawi, N.

    2006-12-01

    The Research Project Knowledge Base (RPKB) is currently being designed and will be implemented in a manner that is fully compatible and interoperable with enterprise architecture tools developed to support NASA's Applied Sciences Program. Through user needs assessment and collaboration with Stennis Space Center, Goddard Space Flight Center, and NASA's DEVELOP staff, insight into information needs for the RPKB was gathered from across NASA scientific communities of practice. To enable efficient, consistent, standard, structured, and managed data entry and research results compilation, a prototype RPKB has been designed and fully integrated with the existing NASA Earth Science Systems Components database. The RPKB will compile research project and keyword information of relevance to the six major science focus areas, 12 national applications, and the Global Change Master Directory (GCMD). The RPKB will include information about projects awarded from NASA research solicitations, project investigator information, research publications, NASA data products employed, and model or decision support tools used or developed as well as new data product information. The RPKB will be developed in a multi-tier architecture that will include a SQL Server relational database backend, middleware, and front end client interfaces for data entry. The purpose of this project is to intelligently harvest the results of research sponsored by the NASA Applied Sciences Program and related research program results. We present various approaches for a wide spectrum of knowledge discovery of research results, publications, projects, etc. from the NASA Systems Components database and global information systems and show how this is implemented in a SQL Server database. The application of knowledge discovery is useful for intelligent query answering and multiple-layered database construction. Using advanced EA tools such as the Earth Science Architecture Tool (ESAT), RPKB will enable NASA and partner agencies to efficiently identify the significant results for new experiment directions and principal investigators to formulate experiment directions for new proposals.

  17. SPARCoC: A New Framework for Molecular Pattern Discovery and Cancer Gene Identification

    PubMed Central

    Ma, Shiqian; Johnson, Daniel; Ashby, Cody; Xiong, Donghai; Cramer, Carole L.; Moore, Jason H.; Zhang, Shuzhong; Huang, Xiuzhen

    2015-01-01

    It is challenging to cluster cancer patients of a certain histopathological type into molecular subtypes of clinical importance and identify gene signatures directly relevant to the subtypes. Current clustering approaches have inherent limitations, which prevent them from gauging the subtle heterogeneity of the molecular subtypes. In this paper we present a new framework: SPARCoC (Sparse-CoClust), which is based on a novel Common-background and Sparse-foreground Decomposition (CSD) model and the Maximum Block Improvement (MBI) co-clustering technique. SPARCoC has clear advantages compared with widely-used alternative approaches: hierarchical clustering (Hclust) and nonnegative matrix factorization (NMF). We apply SPARCoC to the study of lung adenocarcinoma (ADCA), an extremely heterogeneous histological type, and a significant challenge for molecular subtyping. For testing and verification, we use high quality gene expression profiling data of lung ADCA patients, and identify prognostic gene signatures which could cluster patients into subgroups that are significantly different in their overall survival (with p-values < 0.05). Our results are only based on gene expression profiling data analysis, without incorporating any other feature selection or clinical information; we are able to replicate our findings with completely independent datasets. SPARCoC is broadly applicable to large-scale genomic data to empower pattern discovery and cancer gene identification. PMID:25768286

  18. PhenomeNET: a whole-phenome approach to disease gene discovery

    PubMed Central

    Hoehndorf, Robert; Schofield, Paul N.; Gkoutos, Georgios V.

    2011-01-01

    Phenotypes are investigated in model organisms to understand and reveal the molecular mechanisms underlying disease. Phenotype ontologies were developed to capture and compare phenotypes within the context of a single species. Recently, these ontologies were augmented with formal class definitions that may be utilized to integrate phenotypic data and enable the direct comparison of phenotypes between different species. We have developed a method to transform phenotype ontologies into a formal representation, combine phenotype ontologies with anatomy ontologies, and apply a measure of semantic similarity to construct the PhenomeNET cross-species phenotype network. We demonstrate that PhenomeNET can identify orthologous genes, genes involved in the same pathway and gene–disease associations through the comparison of mutant phenotypes. We provide evidence that the Adam19 and Fgf15 genes in mice are involved in the tetralogy of Fallot, and, using zebrafish phenotypes, propose the hypothesis that the mammalian homologs of Cx36.7 and Nkx2.5 lie in a pathway controlling cardiac morphogenesis and electrical conductivity which, when defective, cause the tetralogy of Fallot phenotype. Our method implements a whole-phenome approach toward disease gene discovery and can be applied to prioritize genes for rare and orphan diseases for which the molecular basis is unknown. PMID:21737429

  19. A Genomics Based Discovery of Secondary Metabolite Biosynthetic Gene Clusters in Aspergillus ustus

    PubMed Central

    Pi, Borui; Yu, Dongliang; Dai, Fangwei; Song, Xiaoming; Zhu, Congyi; Li, Hongye; Yu, Yunsong

    2015-01-01

    Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, and little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, including the sterigmatocystin biosynthesis pathway that is commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters that were possibly acquired by horizontal gene transfer were identified in Aspergillus for the first time, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but far fewer with other Aspergilli like A. niger and A. oryzae. These findings would help to understand the diversity and evolution of SM biosynthesis pathways in genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in the clinic. PMID:25706180

  20. iCOSSY: An Online Tool for Context-Specific Subnetwork Discovery from Gene Expression Data

    PubMed Central

    Saha, Ashis; Jeon, Minji; Tan, Aik Choon; Kang, Jaewoo

    2015-01-01

    Pathway analyses help reveal underlying molecular mechanisms of complex biological phenotypes. Biologists tend to perform multiple pathway analyses on the same dataset, as there is no single answer. It is often inefficient for them to implement and/or install all the algorithms by themselves. Online tools can help the community in this regard. Here we present an online gene expression analytical tool called iCOSSY which implements a novel pathway-based COntext-specific Subnetwork discoverY (COSSY) algorithm. iCOSSY also includes a few modifications of COSSY to increase its reliability and interpretability. Users can upload their gene expression datasets, and discover important subnetworks of closely interacting molecules to differentiate between two phenotypes (context). They can also interactively visualize the resulting subnetworks. iCOSSY is a web server that finds subnetworks that are differentially expressed in two phenotypes. Users can visualize the subnetworks to understand the biology of the difference. PMID:26147457

  1. Discovery of Functional Gene Variants Associated With Human Longevity: Opportunities and Challenges

    PubMed Central

    Tazearslan, Cagdas; Cho, Miook

    2012-01-01

    Age is a major risk factor for many human diseases. Extremely long-lived individuals, such as centenarians, have managed to ward off age-related diseases and serve as human models to search for the genetic factors that influence longevity. The discovery of evolutionarily conserved pathways with major impact on life span in animal models has provided tantalizing opportunities to test the relevance of these pathways for human longevity. Here we specifically focus on the insulin/insulin-like growth factor-1 signaling as a prime candidate pathway. Coupled with the rapid advances in ultra high-throughput sequencing technologies, it is now feasible to comprehensively analyze all possible sequence variants in candidate genes segregating with a longevity phenotype and to investigate the functional consequences of the associated variants. A better understanding of the functional genes that affect healthy longevity in humans may lead to a rational basis for intervention strategies that can delay or prevent age-related diseases. PMID:22156437

  2. The discovery of zinc fingers and their applications in gene regulation and genome manipulation.

    PubMed

    Klug, Aaron

    2010-01-01

    An account is given of the discovery of the classical Cys(2)His(2) zinc finger, arising from the interpretation of biochemical studies on the interaction of the Xenopus protein transcription factor IIIA with 5S RNA, and of structural studies on its structure and its interaction with DNA. The finger is a self-contained domain stabilized by a zinc ion ligated to a pair of cysteines and a pair of histidines, and by an inner hydrophobic core. This discovery showed not only a new protein fold but also a novel principle of DNA recognition. Whereas other DNA binding proteins generally make use of the two-fold symmetry of the double helix, zinc fingers can be linked linearly in tandem to recognize nucleic acid sequences of varying lengths. This modular design offers a large number of combinatorial possibilities for the specific recognition of DNA (or RNA). It is therefore not surprising that the zinc finger is found widespread in nature, including 3% of the genes of the human genome. The zinc finger design is ideally suited for engineering proteins to target specific genes. In the first example of their application in 1994, a three-finger protein was constructed to block the expression of an oncogene transformed into a mouse cell line. In addition, a reporter gene was activated by targeting an inserted zinc finger promoter. Thus, by fusing zinc finger peptides to repression or activation domains, genes can be selectively switched off or on. It was also suggested that by combining zinc fingers with other effector domains, e.g., from nucleases or integrases, to form chimeric proteins, genomes could be manipulated or modified. Several applications of such engineered zinc finger proteins are described here, including some of therapeutic importance. PMID:20192761

  3. Systems-based biological concordance and predictive reproducibility of gene set discovery methods in cardiovascular disease.

    PubMed

    Azuaje, Francisco; Zheng, Huiru; Camargo, Anyela; Wang, Haiying

    2011-08-01

    The discovery of novel disease biomarkers is a crucial challenge for translational bioinformatics. Demonstration of both their classification power and reproducibility across independent datasets are essential requirements to assess their potential clinical relevance. Small datasets and multiplicity of putative biomarker sets may explain lack of predictive reproducibility. Studies based on pathway-driven discovery approaches have suggested that, despite such discrepancies, the resulting putative biomarkers tend to be implicated in common biological processes. Investigations of this problem have been mainly focused on datasets derived from cancer research. We investigated the predictive and functional concordance of five methods for discovering putative biomarkers in four independently-generated datasets from the cardiovascular disease domain. A diversity of biosignatures was identified by the different methods. However, we found strong biological process concordance between them, especially in the case of methods based on gene set analysis. With a few exceptions, we observed lack of classification reproducibility using independent datasets. Partial overlaps between our putative sets of biomarkers and the primary studies exist. Despite the observed limitations, pathway-driven or gene set analysis can predict potentially novel biomarkers and can jointly point to biomedically-relevant underlying molecular mechanisms. PMID:21315182

  4. Gene analysis techniques and susceptibility gene discovery in non-BRCA1/BRCA2 familial breast cancer.

    PubMed

    Aloraifi, Fatima; Boland, Michael R; Green, Andrew J; Geraghty, James G

    2015-06-01

    Breast cancer is the leading cause of cancer deaths in females worldwide, occurring in both hereditary and sporadic forms. Women with inherited pathogenic mutations in the BRCA1 or BRCA2 genes have up to an 85% risk of developing breast cancer in their lifetimes. These patients are candidates for risk-reduction measures such as intensive radiological screening, prophylactic surgery or chemoprevention. However, only about 20% of familial breast cancer cases are attributed to mutations in BRCA1 and BRCA2, while a further 5-10% are attributed to mutations in other rare susceptibility genes such as TP53, STK11, PTEN, ATM and CHEK2. A multitude of genome wide association studies (GWAS) have been conducted confirming low-risk common variants associated with breast cancer in excess of 90 loci, which may contribute to a further 23% of the heritability. We currently find ourselves in "the next generation", with technologies offering deep sequencing at a fraction of the cost. Starting off primarily in a research setting, multi-gene panel testing is now utilized in the clinic to sequence multiple predisposing genes simultaneously. In this review, we focus on the hereditary breast cancer discoveries, techniques and the challenges we face in this complex disease, especially in the light of the vast amount of data we now have at hand. It has been 20 years since the first breast cancer susceptibility gene was discovered and there has been substantial progress in unraveling the genetic component of the disease. However, hereditary breast cancer remains a challenging topic subject to common debate. PMID:25936246

  5. Exploiting pre-rRNA processing in Diamond Blackfan anemia gene discovery and diagnosis.

    PubMed

    Farrar, Jason E; Quarello, Paola; Fisher, Ross; O'Brien, Kelly A; Aspesi, Anna; Parrella, Sara; Henson, Adrianna L; Seidel, Nancy E; Atsidaftos, Eva; Prakash, Supraja; Bari, Shahla; Garelli, Emanuela; Arceci, Robert J; Dianzani, Irma; Ramenghi, Ugo; Vlachos, Adrianna; Lipton, Jeffrey M; Bodine, David M; Ellis, Steven R

    2014-10-01

    Diamond Blackfan anemia (DBA), a syndrome primarily characterized by anemia and physical abnormalities, is one among a group of related inherited bone marrow failure syndromes (IBMFS) which share overlapping clinical features. Heterozygous mutations or single-copy deletions have been identified in 12 ribosomal protein genes in approximately 60% of DBA cases, with the genetic etiology unexplained in most remaining patients. Unlike many IBMFS, for which functional screening assays complement clinical and genetic findings, suspected DBA in the absence of typical alterations of the known genes must frequently be diagnosed after exclusion of other IBMFS. We report here a novel deletion in a child that presented such a diagnostic challenge and prompted development of a novel functional assay that can assist in the diagnosis of a significant fraction of patients with DBA. The ribosomal proteins affected in DBA are required for pre-rRNA processing, a process which can be interrogated to monitor steps in the maturation of 40S and 60S ribosomal subunits. In contrast to prior methods used to assess pre-rRNA processing, the assay reported here, based on capillary electrophoresis measurement of the maturation of rRNA in pre-60S ribosomal subunits, would be readily amenable to use in diagnostic laboratories. In addition to utility as a diagnostic tool, we applied this technique to gene discovery in DBA, resulting in the identification of RPL31 as a novel DBA gene. PMID:25042156

  6. Using Osteoclast Differentiation as a Model for Gene Discovery in an Undergraduate Cell Biology Laboratory

    ERIC Educational Resources Information Center

    Birnbaum, Mark J.; Picco, Jenna; Clements, Meghan; Witwicka, Hanna; Yang, Meiheng; Hoey, Margaret T.; Odgren, Paul R.

    2010-01-01

    A key goal of molecular/cell biology/biotechnology is to identify essential genes in virtually every physiological process to uncover basic mechanisms of cell function and to establish potential targets of drug therapy combating human disease. This article describes a semester-long, project-oriented molecular/cellular/biotechnology laboratory

  8. Discovery and validation of gene classifiers for endocrine-disrupting chemicals in zebrafish (Danio rerio)

    PubMed Central

    2012-01-01

    Background Development and application of transcriptomics-based gene classifiers for ecotoxicological applications lag far behind those of biomedical sciences. Many such classifiers discovered thus far lack vigorous statistical and experimental validations. A combination of genetic algorithm/support vector machines and genetic algorithm/K nearest neighbors was used in this study to search for classifiers of endocrine-disrupting chemicals (EDCs) in zebrafish. Searches were conducted on both tissue-specific and tissue-combined datasets, either across the entire transcriptome or within individual transcription factor (TF) networks previously linked to EDC effects. Candidate classifiers were evaluated by gene set enrichment analysis (GSEA) on both the original training data and a dedicated validation dataset. Results Multi-tissue dataset yielded no classifiers. Among the 19 chemical-tissue conditions evaluated, the transcriptome-wide searches yielded classifiers for six of them, each having approximately 20 to 30 gene features unique to a condition. Searches within individual TF networks produced classifiers for 15 chemical-tissue conditions, each containing 100 or fewer top-ranked gene features pooled from those of multiple TF networks and also unique to each condition. For the training dataset, 10 out of 11 classifiers successfully identified the gene expression profiles (GEPs) of their targeted chemical-tissue conditions by GSEA. For the validation dataset, classifiers for prochloraz-ovary and flutamide-ovary also correctly identified the GEPs of corresponding conditions while no classifier could predict the GEP from prochloraz-brain. Conclusions The discrepancies in the performance of these classifiers were attributed in part to varying data complexity among the conditions, as measured to some degree by Fisher’s discriminant ratio statistic. 
This variation in data complexity could likely be compensated by adjusting sample size for individual chemical-tissue conditions, thus suggesting a need for a preliminary survey of transcriptomic responses before launching a full scale classifier discovery effort. Classifier discovery based on individual TF networks could yield more mechanistically-oriented biomarkers. GSEA proved to be a flexible and effective tool for application of gene classifiers but a similar and more refined algorithm, connectivity mapping, should also be explored. The distribution characteristics of classifiers across tissues, chemicals, and TF networks suggested a differential biological impact among the EDCs on zebrafish transcriptome involving some basic cellular functions. PMID:22849515
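
    The abstract above uses Fisher's discriminant ratio as a rough measure of data complexity but does not give the formula. As a minimal sketch, assuming the common two-class, per-feature form f = (mean_a - mean_b)^2 / (var_a + var_b), the ratio could be computed as below. The function name and the toy expression matrix are hypothetical and are not taken from the study.

```python
import numpy as np

def fisher_discriminant_ratio(X, y):
    """Per-feature Fisher's discriminant ratio for a two-class dataset.

    Assumed form: f_j = (mean_a_j - mean_b_j)^2 / (var_a_j + var_b_j).
    Higher values mean the two classes are easier to separate on feature j.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    if classes.size != 2:
        raise ValueError("expected exactly two classes")
    a = X[y == classes[0]]
    b = X[y == classes[1]]
    numerator = (a.mean(axis=0) - b.mean(axis=0)) ** 2
    denominator = a.var(axis=0, ddof=1) + b.var(axis=0, ddof=1)
    return numerator / denominator

# Toy data (hypothetical): 6 samples x 2 gene features, 3 controls and 3 treated.
X = np.array([[1.0, 5.0], [2.0, 6.0], [3.0, 7.0],
              [11.0, 5.5], [12.0, 6.5], [13.0, 7.5]])
y = np.array([0, 0, 0, 1, 1, 1])

ratios = fisher_discriminant_ratio(X, y)
# A dataset-level complexity score is often taken as the maximum over features.
dataset_score = ratios.max()
print(ratios, dataset_score)
```

    In this toy example the first feature separates the two groups far better than the second, so a condition whose best feature still scores low would, on this measure, be a harder classification problem.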

  9. Discovery of BRM Targeted Therapies: Novel Reactivation of an Anti-cancer Gene.

    PubMed

    Gramling, Sarah; Reisman, David

    2011-01-01

    Drug discovery in the field of oncology has been advanced mainly through the targeting of receptor tyrosine kinases. Both antibodies and small molecule inhibitors have been found to have successful applications in blocking the proliferative functions of these cell surface receptors. Based on these early successes, additional kinases within the cytoplasm have been found to promote cancer and, as such, have been recognized as feasible targets for additional modes of therapies. Unlike these oncogene targets, most tumor suppressors are irreversibly altered during cancer progression and therefore are not feasible targets for therapy. However, a subset of these genes is reversibly epigenetically suppressed. One such gene is BRM, and when it is re-expressed in cancer cells, this gene halts their growth. Moreover, as the key catalytic subunit of the SWI/SNF complex, BRM is centrally important to a host of anticancer pathways and cellular mechanisms, and its status may serve as a biomarker. Restoring its expression will both reconnect a number of growth-controlling pathways and affect cellular adhesion, DNA repair, and immune functions. For these reasons, restoring BRM expression is not only feasible, but potentially a potent form of anticancer therapy. To identify BRM-restoring compounds, we developed a cell-based luciferase assay. In this review, we discuss some of the challenges we encountered, issues related to this type of drug discovery, and our future ambitions. We hope this review will provide insight to this type of endeavor and lead to more investigations pursuing this type of drug research. PMID:23565070

  10. SAGExplore: a web server for unambiguous tag mapping in serial analysis of gene expression oriented to gene discovery and annotation.

    PubMed

    Norambuena, Tomás; Malig, Rodrigo; Melo, Francisco

    2007-07-01

    We describe a web server for the accurate mapping of experimental tags in serial analysis of gene expression (SAGE). The core of the server relies on a database of genomic virtual tags built by a recently described method that attempts to reduce the amount of ambiguous assignments for those tags that are not unique in the genome. The method provides a complete annotation of potential virtual SAGE tags within a genome, along with an estimation of their confidence for experimental observation that ranks tags that present multiple matches in the genome. The output of the server consists of a table in HTML format that contains links to a graphic representation of the results and to some external servers and databases, facilitating the tasks of analysis of gene expression and gene discovery. Also, a table in tab delimited text format is produced, allowing the user to export the results into custom databases and software for further analysis. The current server version provides the most accurate and complete SAGE tag mapping source that is available for the yeast organism. In the near future, this server will also allow the accurate mapping of experimental SAGE-tags from other model organisms such as human, mouse, frog and fly. The server is freely available on the web at: http://dna.bio.puc.cl/SAGExplore.html. PMID:17626053
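
    The idea of a virtual tag database, and of tags that are not unique in the genome, can be sketched as follows. This is an illustration only, not the server's method: it assumes the NlaIII anchoring site CATG and 10-bp tags (neither is stated in the abstract), scans the forward strand only, and merely flags ambiguous tags rather than ranking them by confidence. All names and sequences are hypothetical.

```python
from collections import defaultdict

ANCHOR = "CATG"  # assumed anchoring-enzyme site (NlaIII); not stated in the abstract
TAG_LEN = 10     # assumed tag length

def virtual_tags(sequences):
    """Map each virtual tag to the list of (sequence name, tag start) where it occurs."""
    hits = defaultdict(list)
    for name, seq in sequences.items():
        seq = seq.upper()
        start = seq.find(ANCHOR)
        while start != -1:
            tag_start = start + len(ANCHOR)
            tag = seq[tag_start:tag_start + TAG_LEN]
            if len(tag) == TAG_LEN:  # skip tags truncated by the sequence end
                hits[tag].append((name, tag_start))
            start = seq.find(ANCHOR, start + 1)
    return dict(hits)

def split_by_ambiguity(hits):
    """Separate tags with a single genomic location from tags with several."""
    unique = {tag: locs[0] for tag, locs in hits.items() if len(locs) == 1}
    ambiguous = {tag: locs for tag, locs in hits.items() if len(locs) > 1}
    return unique, ambiguous

# Toy sequences (hypothetical), built so that one tag occurs twice.
genome = {
    "chrA": "TTCATG" + "A" * 10 + "CCCATG" + "G" * 10 + "TT",
    "chrB": "GGCATG" + "A" * 10 + "TT",
}

unique, ambiguous = split_by_ambiguity(virtual_tags(genome))
print(unique)
print(ambiguous)
```

    An experimental tag that matches an entry in `unique` can be assigned directly, whereas a match in `ambiguous` needs additional evidence, which is the case the server's confidence estimate is meant to address.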

  11. De novo transcriptome sequencing and discovery of genes related to copper tolerance in Paeonia ostii.

    PubMed

    Wang, Yanjie; Dong, Chunlan; Xue, Zeyun; Jin, Qijiang; Xu, Yingchun

    2016-01-15

    Paeonia ostii, an important ornamental and medicinal plant, grows normally on copper (Cu) mines with widespread Cu contamination of soils, and it has the ability to lower Cu contents in the Cu-contaminated soils. However, very little molecular information concerned with Cu resistance of P. ostii is available. In this study, high-throughput de novo transcriptome sequencing was carried out for P. ostii with and without Cu treatment using Illumina HiSeq 2000 platform. A total of 77,704 All-unigenes were obtained with a mean length of 710 bp. Of these unigenes, 47,461 were annotated with public databases based on sequence similarities. Comparative transcript profiling allowed the discovery of 4324 differentially expressed genes (DEGs), with 2207 up-regulated and 2117 down-regulated unigenes in Cu-treated library as compared to the control counterpart. Based on these DEGs, Gene Ontology (GO) enrichment analysis indicated Cu stress-relevant terms, such as 'membrane' and 'antioxidant activity'. Meanwhile, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis uncovered some important pathways, including 'biosynthesis of secondary metabolites' and 'metabolic pathways'. In addition, expression patterns of 12 selected DEGs derived from quantitative real-time polymerase chain reaction (qRT-PCR) were consistent with their transcript abundance changes obtained by transcriptomic analyses, suggesting that all the 12 genes were authentically involved in Cu tolerance in P. ostii. This is the first report to identify genes related to Cu stress responses in P. ostii, which could offer valuable information on the molecular mechanisms of Cu resistance, and provide a basis for further genomics research on this and related ornamental species for phytoremediation. PMID:26435192

  12. Resequencing and comparative genomics of Stagonospora nodorum: sectional gene absence and effector discovery.

    PubMed

    Syme, Robert Andrew; Hane, James K; Friesen, Timothy L; Oliver, Richard P

    2013-06-01

    Stagonospora nodorum is an important wheat (Triticum aestivum) pathogen in many parts of the world, causing major yield losses. It was the first species in the large fungal Dothideomycete class to be genome sequenced. The reference genome sequence (SN15) has been instrumental in the discovery of genes encoding necrotrophic effectors that induce disease symptoms in specific host genotypes. Here we present the genome sequence of two further S. nodorum strains (Sn4 and Sn79) that differ in their effector repertoire from the reference. Sn79 is avirulent on wheat and produces no apparent effectors when infiltrated onto many cultivars and mapping population parents. Sn4 is pathogenic on wheat and has virulences not found in SN15. The new strains, sequenced with short-read Illumina chemistry, are compared with SN15 by a combination of mapping and de novo assembly approaches. Each of the genomes contains a large number of strain-specific genes, many of which have no meaningful similarity to any known gene. Large contiguous sections of the reference genome are absent in the two newly sequenced strains. We refer to these differences as "sectional gene absences." The presence of genes in pathogenic strains and absence in Sn79 is added to computationally predicted properties of known proteins to produce a list of likely effector candidates. Transposon insertion was observed in the mitochondrial genomes of virulent strains where the avirulent strain retained the likely ancestral sequence. The study suggests that short-read enabled comparative genomics is an effective way to both identify new S. nodorum effector candidates and to illuminate evolutionary processes in this species. PMID:23589517

  13. Sleeping Beauty transposon insertional mutagenesis based mouse models for cancer gene discovery.

    PubMed

    Moriarity, Branden S; Largaespada, David A

    2015-02-01

    Large-scale genomic efforts to study human cancer, such as The Cancer Genome Atlas (TCGA), have identified numerous cancer drivers in a wide variety of tumor types. However, there are limitations to this approach: the mutations and expression or copy number changes that are identified are not always clearly functionally relevant, and only annotated genes and genetic elements are thoroughly queried. The use of complementary, nonbiased, functional approaches to identify drivers of cancer development and progression is ideal to maximize the rate at which cancer discoveries are achieved. One such approach that has been successful is the use of the Sleeping Beauty (SB) transposon-based mutagenesis system in mice. This system uses a conditionally expressed transposase and mutagenic transposon allele to target mutagenesis to somatic cells of a given tissue in mice to cause random mutations leading to tumor development. Analysis of tumors for transposon common insertion sites (CIS) identifies candidate cancer genes specific to that tumor type. While similar screens have been performed in mice with the PiggyBac (PB) transposon and viral approaches, we limit extensive discussion to SB. Here we discuss the basic structure of these screens, screens that have been performed, and methods used to identify CIS. PMID:26051241

  14. Characterization of ESTs from black locust for gene discovery and marker development.

    PubMed

    Wang, J X; Lu, C; Yuan, C Q; Cui, B B; Qiu, Q D; Sun, P; Hu, R Y; Wu, D C; Sun, Y H; Li, Y

    2015-01-01

    Black locust (Robinia pseudoacacia L.) is an ecologically and economically important species. However, it has relatively underdeveloped genomic resources, and this limits gene discovery and marker-assisted selective breeding. In the present study, we obtained large-scale transcriptome data using a next-generation sequencing platform to compensate for the lack of black locust genomic information. Increasing the amount of transcriptome data for black locust will provide a valuable resource for multi-gene phylogenetic analyses and will facilitate research on the mechanisms whereby conserved genes and functions are maintained in the face of species divergence. We sequenced the black locust transcriptome from a cDNA library of multiple tissues and individuals on an Illumina platform, and this produced 108,229,352 clean sequence reads. The high-quality overlapping expressed sequence tags (ESTs) were assembled into 36,533 unigenes, and 4781 simple sequence repeats were characterized. A large collection of high-quality ESTs was obtained, de novo assembled, and characterized. Our results markedly expand the previous transcript catalogues of black locust and can gradually be applied to black locust breeding programs. Furthermore, our data will facilitate future research on the comparative genomics of black locust and related species. PMID:26505419

  15. Immune gene discovery by expressed sequence tag analysis of spleen in the duck (Anas platyrhynchos).

    PubMed

    Xia, Jianguo; Radford, Cynthia; Guo, Xiaoxin; Magor, Katharine E

    2007-01-01

    To search for immune-relevant genes of the duck, we have conducted an expressed sequence tag (EST) project. Duck immune genes are relevant to disease resistance in agriculture and in intervention in avian influenza outbreaks through vaccination of ducks. We sequenced 3168 clones from a spleen cDNA library of a White Pekin duck, Anas platyrhynchos. We constructed an EST analysis pipeline enabling (1) quality checking and clustering, (2) identification based on BLAST results, (3) annotation using Gene Ontology, and (4) submission to the dbEST database. In total, 208 genes were relevant to the duck immune system, which we have divided into categories: (1) leukocyte receptors, (2) lectin-like immunoreceptors, (3) cytokines and chemokines, (4) transcription factors, (5) mediators of antigen processing or apoptosis, and (6) innate effectors. We compared 425 homologous sequences between the duck and the closest genetic model organism, chicken. Homologous genes had regions sharing 80 to 99 percent nucleotide identity; however, immune-relevant genes were less conserved than other genes. PMID:16919729

  16. Machine learning approaches for the discovery of gene-gene interactions in disease data.

    PubMed

    Upstill-Goddard, Rosanna; Eccles, Diana; Fliege, Joerg; Collins, Andrew

    2013-03-01

    Because of the complexity of gene-phenotype relationships, machine learning approaches have considerable appeal as a strategy for modelling interactions. A number of such methods have been developed and applied in recent years with some modest success. Progress is hampered by the challenges presented by the complexity of the disease genetic data, including phenotypic and genetic heterogeneity, polygenic forms of inheritance and variable penetrance, combined with the analytical and computational issues arising from the enormous number of potential interactions. We review here recent and current approaches, focusing, wherever possible, on applications to real data (particularly in the context of genome-wide association studies) and looking ahead to the further challenges posed by next generation sequencing data. PMID:22611119

  17. Heuristic Bayesian segmentation for discovery of coexpressed genes within genomic regions.

    PubMed

    Pehkonen, Petri; Wong, Garry; Törönen, Petri

    2010-01-01

    Segmentation aims to separate homogeneous areas from the sequential data, and plays a central role in data mining. It has applications ranging from finance to molecular biology, where bioinformatics tasks such as genome data analysis are active application fields. In this paper, we present a novel application of segmentation in locating genomic regions with coexpressed genes. We aim at automated discovery of such regions without requirement for user-given parameters. In order to perform the segmentation within a reasonable time, we use heuristics. Most of the heuristic segmentation algorithms require some decision on the number of segments. This is usually accomplished by using asymptotic model selection methods like the Bayesian information criterion. Such methods are based on some simplification, which can limit their usage. In this paper, we propose a Bayesian model selection to choose the most proper result from heuristic segmentation. Our Bayesian model presents a simple prior for the segmentation solutions with various segment numbers and a modified Dirichlet prior for modeling multinomial data. We show with various artificial data sets in our benchmark system that our model selection criterion has the best overall performance. The application of our method in yeast cell-cycle gene expression data reveals potential active and passive regions of the genome. PMID:20150667
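
    For orientation, the conventional baseline mentioned in the abstract, choosing the number of segments with the Bayesian information criterion, can be sketched as follows. This is not the authors' Bayesian model: it assumes a piecewise-constant Gaussian model rather than multinomial data, uses exact dynamic programming instead of a heuristic search, and counts 2k - 1 free parameters for k segments. Function names and the toy signal are hypothetical.

```python
import numpy as np

def best_segmentations(x, max_segments):
    """Return {k: (sse, change_points)} for the optimal k-segment piecewise-constant fit."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    c1 = np.concatenate(([0.0], np.cumsum(x)))
    c2 = np.concatenate(([0.0], np.cumsum(x * x)))

    def sse(i, j):
        # Squared error of x[i:j] around its own mean, from prefix sums.
        s = c1[j] - c1[i]
        return max((c2[j] - c2[i]) - s * s / (j - i), 0.0)

    cost = np.full((max_segments + 1, n + 1), np.inf)
    back = np.zeros((max_segments + 1, n + 1), dtype=int)
    cost[0, 0] = 0.0
    for k in range(1, max_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1, i] + sse(i, j)
                if c < cost[k, j]:
                    cost[k, j], back[k, j] = c, i

    results = {}
    for k in range(1, max_segments + 1):
        starts, j = [], n
        for kk in range(k, 0, -1):  # walk back through segment start indices
            j = int(back[kk, j])
            starts.append(j)
        results[k] = (float(cost[k, n]), sorted(starts)[1:])  # drop the leading 0
    return results

def bic(sse_value, n, k):
    """Gaussian BIC with k segment means and k - 1 change points (assumed parameter count)."""
    return n * np.log(max(sse_value, 1e-12) / n) + (2 * k - 1) * np.log(n)

# Toy signal (hypothetical): low expression for 5 positions, then high for 5.
signal = [0.1, -0.1, 0.0, 0.2, -0.2, 5.1, 4.9, 5.0, 5.2, 4.8]
fits = best_segmentations(signal, max_segments=4)
best_k = min(fits, key=lambda k: bic(fits[k][0], len(signal), k))
print(best_k, fits[best_k][1])
```

    The triple loop is cubic in the worst case, which is one reason genome-scale applications turn to heuristics; the abstract's point is that the model selection step applied afterwards can be replaced by a fully Bayesian criterion.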

  18. ESTs from a wild Arachis species for gene discovery and marker development

    PubMed Central

    Proite, Karina; Leal-Bertioli, Soraya CM; Bertioli, David J; Moretzsohn, Márcio C; da Silva, Felipe R; Martins, Natalia F; Guimarães, Patrícia M

    2007-01-01

    Background Due to its origin, peanut has a very narrow genetic background. Wild relatives can be a source of genetic variability for cultivated peanut. In this study, the transcriptome of the wild species Arachis stenosperma accession V10309 was analyzed. Results ESTs were produced from four cDNA libraries of RNAs extracted from leaves and roots of A. stenosperma. Randomly selected cDNA clones were sequenced to generate 8,785 ESTs, of which 6,264 (71.3%) had high quality, with 3,500 clusters: 963 contigs and 2537 singlets. Only 55.9% matched homologous sequences of known genes. ESTs were classified into 23 different categories according to putative protein functions. Numerous sequences related to disease resistance, drought tolerance and human health were identified. Two hundred and six microsatellites were found and markers have been developed for 188 of these. The microsatellite profile was analyzed and compared to other transcribed and genomic sequence data. Conclusion This is, to date, the first report on the analysis of transcriptome of a wild relative of peanut. The ESTs produced in this study are a valuable resource for gene discovery, the characterization of new wild alleles, and for marker development. The ESTs were released in the [GenBank:EH041934 to EH048197]. PMID:17302987

  19. Next-generation gene discovery for variants of large impact on lipid traits

    PubMed Central

    Rosenthal, Elisabeth; Blue, Elizabeth; Jarvik, Gail P.

    2015-01-01

    Purpose of review Detection of high impact variants on lipid traits is complicated by complex genetic architecture. Although genome-wide association studies (GWAS) successfully identified many novel genes associated with lipid traits, it was less successful in identifying variants with a large impact on the phenotype. This is not unexpected, as the more common variants detectable by GWAS typically have small effects. The availability of large familial datasets and sequence data has changed the paradigm for successful genomic discovery of the novel genes and pathogenic variants underlying lipid disorders. Recent findings Novel loci with large effects have been successfully mapped in families, and next-generation sequencing allowed for the identification of the underlying lipid associated variants of large effect size. The success of this strategy relies on the simplification of the underlying genetic variation by focusing on large single families segregating extreme lipid phenotypes. Summary Rare, high impact variants are expected to have large effects and be more relevant for medical and pharmaceutical applications. Family data have many advantages over population-based data because they allow for the efficient detection of high-impact variants with an exponentially smaller sample size and increased power for follow-up studies. PMID:25636063

  20. Estimation of false discovery rates in multiple testing: application to gene microarray data.

    PubMed

    Tsai, Chen-An; Hsueh, Huey-miin; Chen, James J

    2003-12-01

    Testing for significance with gene expression data from DNA microarray experiments involves simultaneous comparisons of hundreds or thousands of genes. If R denotes the number of rejections (declared significant genes) and V denotes the number of false rejections, then V/R, if R > 0, is the proportion of falsely rejected hypotheses. This paper proposes a model for the distribution of the number of rejections and the conditional distribution of V given R, V | R. Under the independence assumption, the distribution of R is a convolution of two binomials and V | R has a noncentral hypergeometric distribution. Under an equicorrelated model, the distributions are more complex and are also derived. Five false discovery rate probability error measures are considered: FDR = E(V/R), pFDR = E(V/R | R > 0) (positive FDR), cFDR = E(V/R | R = r) (conditional FDR), mFDR = E(V)/E(R) (marginal FDR), and eFDR = E(V)/r (empirical FDR). The pFDR, cFDR, and mFDR are shown to be equivalent under the Bayesian framework, in which the number of true null hypotheses is modeled as a random variable. We present a parametric and a bootstrap procedure to estimate the FDRs. Monte Carlo simulations were conducted to evaluate the performance of these two methods. The bootstrap procedure appears to perform reasonably well, even when the alternative hypotheses are correlated (rho = 0.25). An example from a toxicogenomic microarray experiment is presented for illustration. PMID:14969487
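
    Under the independence assumption described above, R is the sum of two binomial counts: false rejections among the true nulls and true rejections among the false nulls. A small Monte Carlo sketch can make three of the error measures concrete. This is not the paper's parametric or bootstrap estimator, and the numbers of hypotheses, the per-test level, and the power are illustrative assumptions.

```python
import numpy as np

def simulate_fdr(m0, m1, alpha, power, n_sim=100_000, seed=0):
    """Monte Carlo estimates of FDR, pFDR and mFDR for independent tests.

    m0    : number of true null hypotheses (assumed known here)
    m1    : number of false null hypotheses
    alpha : per-test rejection probability under the null
    power : per-test rejection probability under the alternative
    """
    rng = np.random.default_rng(seed)
    V = rng.binomial(m0, alpha, size=n_sim)  # false rejections
    S = rng.binomial(m1, power, size=n_sim)  # true rejections
    R = V + S                                # total rejections
    positive = R > 0
    ratio = np.zeros(n_sim)
    ratio[positive] = V[positive] / R[positive]  # V/R taken as 0 when R = 0
    return {
        "FDR": ratio.mean(),             # E(V/R), with V/R = 0 if R = 0
        "pFDR": ratio[positive].mean(),  # E(V/R | R > 0)
        "mFDR": V.mean() / R.mean(),     # E(V)/E(R)
    }

estimates = simulate_fdr(m0=900, m1=100, alpha=0.05, power=0.8)
print(estimates)
```

    With these illustrative settings E(V) = 45 and E(R) = 125, so the marginal FDR should come out near 0.36; because R is almost never zero here, FDR and pFDR are practically identical, and they diverge only when few rejections are expected.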

  1. Adeno-Associated Virus at 50: A Golden Anniversary of Discovery, Research, and Gene Therapy Success—A Personal Perspective

    PubMed Central

    Hastie, Eric

    2015-01-01

    Fifty years after the discovery of adeno-associated virus (AAV) and more than 30 years after the first gene transfer experiment was conducted, dozens of gene therapy clinical trials are in progress, one vector is approved for use in Europe, and breakthroughs in virus modification and disease modeling are paving the way for a revolution in the treatment of rare diseases, cancer, as well as HIV. This review will provide a historical perspective on the progression of AAV for gene therapy from discovery to the clinic, focusing on contributions from the Samulski lab regarding basic science and cloning of AAV, optimized large-scale production of vectors, preclinical large animal studies and safety data, vector modifications for improved efficacy, and successful clinical applications. PMID:25807962

  2. Gene invasion in distant eukaryotic lineages: discovery of mutually exclusive genetic elements reveals marine biodiversity.

    PubMed

    Monier, Adam; Sudek, Sebastian; Fast, Naomi M; Worden, Alexandra Z

    2013-09-01

    Inteins are rare, translated genetic parasites mainly found in bacteria and archaea, while spliceosomal introns are distinctly eukaryotic features abundant in most nuclear genomes. Using targeted metagenomics, we discovered an intein in an Atlantic population of the photosynthetic eukaryote, Bathycoccus, harbored by the essential spliceosomal protein PRP8 (processing factor 8 protein). Although previously thought exclusive to fungi, we also identified PRP8 inteins in parasitic (Capsaspora) and predatory (Salpingoeca) protists. Most new PRP8 inteins were at novel insertion sites that, surprisingly, were not in the most conserved regions of the gene. Evolutionarily, Dikarya fungal inteins at PRP8 insertion site a appeared more related to the Bathycoccus intein at a unique insertion site, than to other fungal and opisthokont inteins. Strikingly, independent analyses of Pacific and Atlantic samples revealed an intron at the same codon as the Bathycoccus PRP8 intein. The two elements are mutually exclusive and neither was found in cultured Bathycoccus or other picoprasinophyte genomes. Thus, wild Bathycoccus contain one of few non-fungal eukaryotic inteins known and a rare polymorphic intron. Our data indicate at least two Bathycoccus ecotypes exist, associated respectively with oceanic or mesotrophic environments. We hypothesize that intein propagation is facilitated by marine viruses; and, while intron gain is still poorly understood, presence of a spliceosomal intron where a locus lacks an intein raises the possibility of new, intein-primed mechanisms for intron gain. The discovery of nucleus-encoded inteins and associated sequence polymorphisms in uncultivated marine eukaryotes highlights their diversity and reveals potential sexual boundaries between populations indistinguishable by common marker genes. PMID:23635865

  3. Gene invasion in distant eukaryotic lineages: discovery of mutually exclusive genetic elements reveals marine biodiversity

    PubMed Central

    Monier, Adam; Sudek, Sebastian; Fast, Naomi M; Worden, Alexandra Z

    2013-01-01

    Inteins are rare, translated genetic parasites mainly found in bacteria and archaea, while spliceosomal introns are distinctly eukaryotic features abundant in most nuclear genomes. Using targeted metagenomics, we discovered an intein in an Atlantic population of the photosynthetic eukaryote, Bathycoccus, harbored by the essential spliceosomal protein PRP8 (processing factor 8 protein). Although previously thought exclusive to fungi, we also identified PRP8 inteins in parasitic (Capsaspora) and predatory (Salpingoeca) protists. Most new PRP8 inteins were at novel insertion sites that, surprisingly, were not in the most conserved regions of the gene. Evolutionarily, Dikarya fungal inteins at PRP8 insertion site a appeared more related to the Bathycoccus intein at a unique insertion site, than to other fungal and opisthokont inteins. Strikingly, independent analyses of Pacific and Atlantic samples revealed an intron at the same codon as the Bathycoccus PRP8 intein. The two elements are mutually exclusive and neither was found in cultured Bathycoccus or other picoprasinophyte genomes. Thus, wild Bathycoccus contain one of few non-fungal eukaryotic inteins known and a rare polymorphic intron. Our data indicate at least two Bathycoccus ecotypes exist, associated respectively with oceanic or mesotrophic environments. We hypothesize that intein propagation is facilitated by marine viruses; and, while intron gain is still poorly understood, presence of a spliceosomal intron where a locus lacks an intein raises the possibility of new, intein-primed mechanisms for intron gain. The discovery of nucleus-encoded inteins and associated sequence polymorphisms in uncultivated marine eukaryotes highlights their diversity and reveals potential sexual boundaries between populations indistinguishable by common marker genes. PMID:23635865

  4. Display technologies: application for the discovery of drug and gene delivery agents

    PubMed Central

    Sergeeva, Anna; Kolonin, Mikhail G.; Molldrem, Jeffrey J.; Pasqualini, Renata; Arap, Wadih

    2007-01-01

    Recognition of molecular diversity of cell surface proteomes in disease is essential for the development of targeted therapies. Progress in targeted therapeutics requires establishing effective approaches for high-throughput identification of agents specific for clinically relevant cell surface markers. Over the past decade, a number of platform strategies have been developed to screen polypeptide libraries for ligands targeting receptors selectively expressed in the context of various cell surface proteomes. Streamlined procedures for identification of ligand-receptor pairs that could serve as targets in disease diagnosis, profiling, imaging and therapy have relied on the display technologies, in which polypeptides with desired binding profiles can be serially selected, in a process called biopanning, based on their physical linkage with the encoding nucleic acid. These technologies include virus/phage display, cell display, ribosomal display, mRNA display and covalent DNA display (CDT), with phage display being by far the most utilized. The scope of this review is the recent advancements in the display technologies with a particular emphasis on molecular mapping of cell surface proteomes with peptide phage display. Prospective applications of targeted compounds derived from display libraries in the discovery of targeted drugs and gene therapy vectors are discussed. PMID:17123658

  5. Gene-based single nucleotide polymorphism discovery in bovine muscle using next-generation transcriptomic sequencing

    PubMed Central

    2013-01-01

    Background Genetic information based on molecular markers has increasingly been used in cattle breeding improvement programmes, as a means to improve conventional phenotypic selection. Advances in molecular genetics have led to the identification of several genetic markers associated with genes affecting economic traits. Until recently, the identification of the causative genetic variants involved in the phenotypes of interest has remained a difficult task. The advent of novel sequencing technologies now offers a new opportunity for the identification of such variants. Despite sequencing costs plummeting, sequencing whole genomes or large targeted regions is still too expensive for most laboratories. A transcriptomic-based sequencing approach offers a cheaper alternative to identify a large number of polymorphisms and possibly to discover causative variants. In the present study, we performed a gene-based single nucleotide polymorphism (SNP) discovery analysis in bovine Longissimus thoracis, using RNA-Seq. To our knowledge, this represents the first study done in bovine muscle. Results Messenger RNAs from Longissimus thoracis from three Limousin bull calves were subjected to high-throughput sequencing. Approximately 36–46 million paired-end reads were obtained per library. A total of 19,752 transcripts were identified and 34,376 different SNPs were detected. Fifty-five percent of the SNPs were found in coding regions and ~22% resulted in an amino acid change. Applying a very stringent SNP quality threshold, we detected 8,407 different high-confidence SNPs, 18% of which are non synonymous coding SNPs. To analyse the accuracy of RNA-Seq technology for SNP detection, 48 SNPs were selected for validation by genotyping. No discrepancies were observed when using the highest SNP probability threshold. 
To test the usefulness of the identified SNPs, the 48 selected SNPs were assessed by genotyping 93 bovine samples, representing mostly the nine major breeds used in France. Principal component analysis indicates a clear separation between the nine populations. Conclusions The RNA-Seq data and the collection of newly discovered coding SNPs improve the genomic resources available for cattle, especially for beef breeds. The large amount of variation present in genes expressed in Limousin Longissimus thoracis, especially the large number of non synonymous coding SNPs, may prove useful to study the mechanisms underlying the genetic variability of meat quality traits. PMID:23651547

  6. An Evaluation of Active Learning Causal Discovery Methods for Reverse-Engineering Local Causal Pathways of Gene Regulation

    PubMed Central

    Ma, Sisi; Kemmeren, Patrick; Aliferis, Constantin F.; Statnikov, Alexander

    2016-01-01

    Reverse-engineering of causal pathways that implicate diseases and vital cellular functions is a fundamental problem in biomedicine. Discovery of the local causal pathway of a target variable (that consists of its direct causes and direct effects) is essential for effective intervention and can facilitate accurate diagnosis and prognosis. Recent research has provided several active learning methods that can leverage passively observed high-throughput data to draft causal pathways and then refine the inferred relations with a limited number of experiments. The current study provides a comprehensive evaluation of the performance of active learning methods for local causal pathway discovery in real biological data. Specifically, 54 active learning methods/variants from 3 families of algorithms were applied for local causal pathways reconstruction of gene regulation for 5 transcription factors in S. cerevisiae. Four aspects of the methods’ performance were assessed, including adjacency discovery quality, edge orientation accuracy, complete pathway discovery quality, and experimental cost. The results of this study show that some methods provide significant performance benefits over others and therefore should be routinely used for local causal pathway discovery tasks. This study also demonstrates the feasibility of local causal pathway reconstruction in real biological systems with significant quality and low experimental cost. PMID:26939894

  7. An Evaluation of Active Learning Causal Discovery Methods for Reverse-Engineering Local Causal Pathways of Gene Regulation.

    PubMed

    Ma, Sisi; Kemmeren, Patrick; Aliferis, Constantin F; Statnikov, Alexander

    2016-01-01

    Reverse-engineering of causal pathways that implicate diseases and vital cellular functions is a fundamental problem in biomedicine. Discovery of the local causal pathway of a target variable (that consists of its direct causes and direct effects) is essential for effective intervention and can facilitate accurate diagnosis and prognosis. Recent research has provided several active learning methods that can leverage passively observed high-throughput data to draft causal pathways and then refine the inferred relations with a limited number of experiments. The current study provides a comprehensive evaluation of the performance of active learning methods for local causal pathway discovery in real biological data. Specifically, 54 active learning methods/variants from 3 families of algorithms were applied for local causal pathways reconstruction of gene regulation for 5 transcription factors in S. cerevisiae. Four aspects of the methods' performance were assessed, including adjacency discovery quality, edge orientation accuracy, complete pathway discovery quality, and experimental cost. The results of this study show that some methods provide significant performance benefits over others and therefore should be routinely used for local causal pathway discovery tasks. This study also demonstrates the feasibility of local causal pathway reconstruction in real biological systems with significant quality and low experimental cost. PMID:26939894

  8. De Novo Transcriptomic Analysis of Peripheral Blood Lymphocytes from the Chinese Goose: Gene Discovery and Immune System Pathway Description

    PubMed Central

    Tariq, Mansoor; Chen, Rong; Yuan, Hongyu; Liu, Yanjie; Wu, Yanan; Wang, Junya; Xia, Chun

    2015-01-01

    Background The Chinese goose is one of the most economically important poultry birds and is a natural reservoir for many avian viruses. However, the nature and regulation of the innate and adaptive immune systems of this waterfowl species are not completely understood due to limited information on the goose genome. Recently, transcriptome sequencing technology has been applied in genomic studies focused on novel gene discovery. Thus, this study described the transcriptome of the goose peripheral blood lymphocytes to identify immunity-relevant genes. Principal Findings The transcriptome of the goose peripheral blood lymphocytes was sequenced by Illumina-Solexa technology and assembled de novo. In total, 211,198 unigenes were assembled from the 69.36 million cleaned reads. The average length, N50 size and the maximum length of the assembled unigenes were 687 bp, 1,298 bp and 18,992 bp, respectively. A total of 36,854 unigenes showed similarity by BLAST search against the NCBI non-redundant (Nr) protein database. For functional classification, 163,161 unigenes were assigned to three Gene Ontology (GO) categories and 67 subcategories. A total of 15,334 unigenes were annotated into 25 eukaryotic orthologous group (KOG) categories. The Kyoto Encyclopedia of Genes and Genomes (KEGG) database annotated 39,585 unigenes into six biological functional groups and 308 pathways. Among the 2,757 unigenes that participated in the 15 immune system KEGG pathways, 125 of the most important immune-relevant genes were summarized and analyzed by STRING analysis to identify gene interactions and relationships. Moreover, 10 genes were confirmed by PCR and analyzed. Of these 125 unigenes, 109 unigenes, approximately 87%, were not previously identified in the goose. Conclusion This de novo transcriptome analysis could provide important Chinese goose sequence information and highlights the value of new gene discovery, pathways investigation and immune system gene identification, and comparison with other avian species as useful tools to understand the goose immune system. PMID:25816068
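
    The N50 size quoted in the record above is a standard assembly summary statistic: the length L such that sequences of length L or longer together contain at least half of all assembled bases. The following is a minimal sketch of that definition only, not the authors' pipeline; the function name `n50` and the toy lengths in the usage note are assumptions made for illustration.

```python
def n50(lengths):
    """Return the N50 of a non-empty collection of sequence lengths.

    Hypothetical helper for illustration; not taken from the cited study.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)   # longest sequences first
    half_total = sum(ordered) / 2             # half of all assembled bases
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half the
        # total is the N50 by definition.
        if running >= half_total:
            return length
```

    For example, the toy lengths 8, 8, 4, 3, 3, 2, 2, 2 sum to 30 + 2 = 32 bases; the two longest sequences already cover 16 of them, so the N50 is 8.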

  9. Improving data discovery and usability through commentary and user feedback: the CHARMe project

    NASA Astrophysics Data System (ADS)

    Alegre, R.; Blower, J. D.

    2014-12-01

    Earth science datasets are highly diverse. Users of these datasets are similarly varied, ranging from research scientists through industrial users to government decision- and policy-makers. It is very important for these users to understand the applicability of any dataset to their particular problem so that they can select the most appropriate data sources for their needs. Although data providers often provide rich supporting information in the form of metadata, typically this information does not include community usage information that can help other users judge fitness-for-purpose. The CHARMe project (http://www.charme.org.uk) is filling this gap by developing a system for sharing "commentary metadata". These are annotations that are generated and shared by the user community and include: 1) links between publications and datasets (the CHARMe system can record information about why a particular dataset was used, e.g. the paper may describe the dataset, it may use the dataset as a source, or it may be publishing results of a dataset assessment; these publications may appear in the peer-reviewed literature, or may be technical reports, websites or blog posts); 2) free-text comments supplied by the user; 3) provenance information, including links between datasets and descriptions of processing algorithms and sensors; 4) external events that may affect data quality (e.g. large volcanic eruptions or El Niño events), which we call "significant events"; and 5) data quality information, e.g. system maturity indices. Commentary information can be linked to anything that can be uniquely identified (e.g. a dataset with a DOI or a persistent web address). It is also possible to associate commentary with particular subsets of datasets, for example to highlight an issue that is confined to a particular geographic region. We will demonstrate tools that show these capabilities in action, showing how users can apply commentary information during data discovery, visualization and analysis.
The CHARMe project has implemented a set of open-source tools to create, store and explore commentary information, using open Web standards. In this presentation we will describe the application of the CHARMe system to the particular case of the climate data community; however the techniques and technologies are generic and can be applied in many fields.

  10. Genome-Scale Discovery of Cell Wall Biosynthesis Genes in Populus (JGI Seventh Annual User Meeting 2012: Genomics of Energy and Environment)

    SciTech Connect

    Muchero, Wellington

    2012-03-22

    Wellington Muchero from Oak Ridge National Laboratory gives a talk titled "Discovery of Cell Wall Biosynthesis Genes in Populus" at the JGI 7th Annual Users Meeting: Genomics of Energy & Environment Meeting on March 22, 2012 in Walnut Creek, California.

  11. Genome-Scale Discovery of Cell Wall Biosynthesis Genes in Populus (JGI Seventh Annual User Meeting 2012: Genomics of Energy and Environment)

    ScienceCinema

    Muchero, Wellington [Oak Ridge National Laboratory]

    2013-01-22

    Wellington Muchero from Oak Ridge National Laboratory gives a talk titled "Discovery of Cell Wall Biosynthesis Genes in Populus" at the JGI 7th Annual Users Meeting: Genomics of Energy & Environment Meeting on March 22, 2012 in Walnut Creek, California.

  12. IMG-ABC: A Knowledge Base To Fuel Discovery of Biosynthetic Gene Clusters and Novel Secondary Metabolites

    PubMed Central

    Hadjithomas, Michalis; Chen, I-Min Amy; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Szeto, Ernest; Huang, Jinghua; Reddy, T. B. K.; Cimermančič, Peter; Fischbach, Michael A.; Ivanova, Natalia N.; Markowitz, Victor M.

    2015-01-01

    ABSTRACT In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of “big” genomic data for discovering small molecules. IMG-ABC relies on IMG’s comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC’s focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. PMID:26173699

  13. Metabologenomics: Correlation of Microbial Gene Clusters with Metabolites Drives Discovery of a Nonribosomal Peptide with an Unusual Amino Acid Monomer

    PubMed Central

    2016-01-01

    For more than half a century the pharmaceutical industry has sifted through natural products produced by microbes, uncovering new scaffolds and fashioning them into a broad range of vital drugs. We sought a strategy to reinvigorate the discovery of natural products with distinctive structures using bacterial genome sequencing combined with metabolomics. By correlating genetic content from 178 actinomycete genomes with mass spectrometry-enabled analyses of their exported metabolomes, we paired new secondary metabolites with their biosynthetic gene clusters. We report the use of this new approach to isolate and characterize tambromycin, a new chlorinated natural product, composed of several nonstandard amino acid monomeric units, including a unique pyrrolidine-containing amino acid we name tambroline. Tambromycin shows antiproliferative activity against cancerous human B- and T-cell lines. The discovery of tambromycin via large-scale correlation of gene clusters with metabolites (a.k.a. metabologenomics) illuminates a path for structure-based discovery of natural products at a sharply increased rate.

  14. Metabologenomics: Correlation of Microbial Gene Clusters with Metabolites Drives Discovery of a Nonribosomal Peptide with an Unusual Amino Acid Monomer.

    PubMed

    Goering, Anthony W; McClure, Ryan A; Doroghazi, James R; Albright, Jessica C; Haverland, Nicole A; Zhang, Yongbo; Ju, Kou-San; Thomson, Regan J; Metcalf, William W; Kelleher, Neil L

    2016-02-24

    For more than half a century the pharmaceutical industry has sifted through natural products produced by microbes, uncovering new scaffolds and fashioning them into a broad range of vital drugs. We sought a strategy to reinvigorate the discovery of natural products with distinctive structures using bacterial genome sequencing combined with metabolomics. By correlating genetic content from 178 actinomycete genomes with mass spectrometry-enabled analyses of their exported metabolomes, we paired new secondary metabolites with their biosynthetic gene clusters. We report the use of this new approach to isolate and characterize tambromycin, a new chlorinated natural product, composed of several nonstandard amino acid monomeric units, including a unique pyrrolidine-containing amino acid we name tambroline. Tambromycin shows antiproliferative activity against cancerous human B- and T-cell lines. The discovery of tambromycin via large-scale correlation of gene clusters with metabolites (a.k.a. metabologenomics) illuminates a path for structure-based discovery of natural products at a sharply increased rate. PMID:27163034

  15. The CLN3 gene is a novel molecular target for cancer drug discovery.

    PubMed

    Rylova, Svetlana N; Amalfitano, Andrea; Persaud-Sawin, Dixie-Ann; Guo, Wei-Xing; Chang, Jerry; Jansen, Paul J; Proia, Alan D; Boustany, Rose-Mary

    2002-02-01

    Juvenile Batten disease is a neurodegenerative disease caused by accelerated apoptotic death of photoreceptors and neurons attributable to defects in the CLN3 gene. CLN3 is antiapoptotic when overexpressed in NT2 neuronal precursor cells. CLN3 negatively modulates endogenous ceramide levels in NT2 cells and acts upstream of ceramide generation. Because defects in regulation of apoptosis are involved in the development of cancer, we evaluated the expression of CLN3 on both mRNA and protein levels in a variety of cancer cell lines and solid colon cancer tissue. We also observed the effect of the blocking of CLN3 protein expression on cancer cell growth, survival, ceramide production, and apoptosis by using an adenovirus-bearing antisense CLN3 construct. We show that CLN3 mRNA and protein are overexpressed in glioblastoma (U-373G and T98g), neuroblastoma (IMR-32 and SK-N-MC), prostate (Du145, PC-3, and LNCaP), ovarian (SK-OV-3, SW626, and PA-1), breast (BT-20, BT-549, and BT-474), and colon (SW1116, SW480, and HCT 116) cancer cell lines but not in pancreatic (CAPAN and As-PC-1) or lung (A-549 and NCI-H520) cancer cell lines. CLN3 is also up-regulated in mouse melanoma and breast carcinoma cancer cell lines. We found CLN3 expression is 22-330% higher than in corresponding normal colon control tissue in 8 of 10 solid colon tumors. An adenovirus-expressing antisense CLN3 (Ad-AS-CLN3) blocks CLN3 protein expression in DU-145, BT-20, SW1116, and T98g cancer cell lines as seen by Western blot. Blocking of CLN3 expression using Ad-AS-CLN3 inhibits growth and viability of cancer cells. It also causes elevation in endogenous ceramide production through de novo ceramide synthesis and results in increased apoptosis as shown by propidium iodide and JC-1 staining. This suggests that Ad-AS-CLN3 may be an option for therapy in some cancers. More importantly these results suggest that CLN3 is a novel molecular target for cancer drug discovery. PMID:11830536

  16. IMG-ABC: An Atlas of Biosynthetic Gene Clusters to Fuel the Discovery of Novel Secondary Metabolites

    SciTech Connect

    Chen, I-Min; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Huang, Jinghua; Reddy, T. B.K.; Cimermancic, Peter; Fischbach, Michael; Ivanova, Natalia; Markowitz, Victor; Kyrpides, Nikos; Pati, Amrita

    2014-10-28

    In the discovery of secondary metabolites (SMs), large-scale analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of relevant computational resources. We present IMG-ABC (https://img.jgi.doe.gov/abc/) -- An Atlas of Biosynthetic gene Clusters within the Integrated Microbial Genomes (IMG) system. IMG-ABC is a rich repository of both validated and predicted biosynthetic clusters (BCs) in cultured isolates, single-cells and metagenomes linked with the SM chemicals they produce and enhanced with focused analysis tools within IMG. The underlying scalable framework enables traversal of phylogenetic dark matter and chemical structure space -- serving as a doorway to a new era in the discovery of novel molecules.

  17. Gene discovery using massively parallel pyrosequencing to develop ESTs for the flesh fly Sarcophaga crassipalpis

    PubMed Central

    Hahn, Daniel A; Ragland, Gregory J; Shoemaker, D DeWayne; Denlinger, David L

    2009-01-01

    Background Flesh flies in the genus Sarcophaga are important models for investigating endocrinology, diapause, cold hardiness, reproduction, and immunity. Despite the prominence of Sarcophaga flesh flies as models for insect physiology and biochemistry, and in forensic studies, little genomic or transcriptomic data are available for members of this genus. We used massively parallel pyrosequencing on the Roche 454-FLX platform to produce a substantial EST dataset for the flesh fly Sarcophaga crassipalpis. To maximize sequence diversity, we pooled RNA extracted from whole bodies of all life stages and normalized the cDNA pool after reverse transcription. Results We obtained 207,110 ESTs with an average read length of 241 bp. These reads assembled into 20,995 contigs and 31,056 singletons. Using BLAST searches of the NR and NT databases we were able to identify 11,757 unique gene elements (E<0.0001) representing approximately 9,000 independent transcripts. Comparison of the distribution of S. crassipalpis unigenes among GO Biological Process functional groups with that of the Drosophila melanogaster transcriptome suggests that our ESTs are broadly representative of the flesh fly transcriptome. Insertion and deletion errors in 454 sequencing present a serious hurdle to comparative transcriptome analysis. Aided by a new approach to correcting for these errors, we performed a comparative analysis of genetic divergence across GO categories among S. crassipalpis, D. melanogaster, and Anopheles gambiae. The results suggest that non-synonymous substitutions occur at similar rates across categories, although genes related to response to stimuli may evolve slightly faster. In addition, we identified over 500 potential microsatellite loci and more than 12,000 SNPs among our ESTs. Conclusion Our data provide the first large-scale EST project for flesh flies, a much-needed resource for exploring this model species. In addition, we identified a large number of potential microsatellite and SNP markers that could be used in population and systematic studies of S. crassipalpis and other flesh flies. PMID:19454017

  18. The Berkeley Drosophila Genome Project gene disruption project: Single P-element insertions mutating 25% of vital Drosophila genes.

    PubMed Central

    Spradling, A C; Stern, D; Beaton, A; Rhem, E J; Laverty, T; Mozden, N; Misra, S; Rubin, G M

    1999-01-01

    A fundamental goal of genetics and functional genomics is to identify and mutate every gene in model organisms such as Drosophila melanogaster. The Berkeley Drosophila Genome Project (BDGP) gene disruption project generates single P-element insertion strains that each mutate unique genomic open reading frames. Such strains strongly facilitate further genetic and molecular studies of the disrupted loci, but it has remained unclear if P elements can be used to mutate all Drosophila genes. We now report that the primary collection has grown to contain 1045 strains that disrupt more than 25% of the estimated 3600 Drosophila genes that are essential for adult viability. Of these P insertions, 67% have been verified by genetic tests to cause the associated recessive mutant phenotypes, and the validity of most of the remaining lines is predicted on statistical grounds. Sequences flanking >920 insertions have been determined to exactly position them in the genome and to identify 376 potentially affected transcripts from collections of EST sequences. Strains in the BDGP collection are available from the Bloomington Stock Center and have already assisted the research community in characterizing >250 Drosophila genes. The likely identity of 131 additional genes in the collection is reported here. Our results show that Drosophila genes have a wide range of sensitivity to inactivation by P elements, and provide a rationale for greatly expanding the BDGP primary collection based entirely on insertion site sequencing. We predict that this approach can bring >85% of all Drosophila open reading frames under experimental control. PMID:10471706

  19. Discovery of Possible Gene Relationships through the Application of Self-Organizing Maps to DNA Microarray Databases

    PubMed Central

    Chavez-Alvarez, Rocio; Chavoya, Arturo; Mendez-Vazquez, Andres

    2014-01-01

    DNA microarrays and cell cycle synchronization experiments have made possible the study of the mechanisms of cell cycle regulation of Saccharomyces cerevisiae by simultaneously monitoring the expression levels of thousands of genes at specific time points. On the other hand, pattern recognition techniques can contribute to the analysis of such massive measurements, providing a model of gene expression level evolution through the cell cycle process. In this paper, we propose the use of one such technique, an unsupervised artificial neural network called a Self-Organizing Map (SOM), which has been successfully applied to processes involving very noisy signals, classifying and organizing them, and assisting in the discovery of behavior patterns without requiring prior knowledge about the process under analysis. As a test bed for the use of SOMs in finding possible relationships among genes and their possible contribution in some biological processes, we selected 282 S. cerevisiae genes that have been shown through biological experiments to have an activity during the cell cycle. The expression level of these genes was analyzed in five of the most cited time series DNA microarray databases used in the study of the cell cycle of this organism. With the use of SOM, it was possible to find clusters of genes with similar behavior in the five databases along two cell cycles. This result suggested that some of these genes might be biologically related or might have a regulatory relationship, as was corroborated by comparing some of the clusters obtained with SOMs against a previously reported regulatory network that was generated using biological knowledge, such as protein-protein interactions, gene expression levels, metabolism dynamics, promoter binding, and modification, regulation and transport of proteins. The methodology described in this paper could be applied to the study of gene relationships of other biological processes in different organisms. PMID:24699245

  20. Discovery of possible gene relationships through the application of self-organizing maps to DNA microarray databases.

    PubMed

    Chavez-Alvarez, Rocio; Chavoya, Arturo; Mendez-Vazquez, Andres

    2014-01-01

    DNA microarrays and cell cycle synchronization experiments have made possible the study of the mechanisms of cell cycle regulation of Saccharomyces cerevisiae by simultaneously monitoring the expression levels of thousands of genes at specific time points. On the other hand, pattern recognition techniques can contribute to the analysis of such massive measurements, providing a model of gene expression level evolution through the cell cycle process. In this paper, we propose the use of one such technique, an unsupervised artificial neural network called a Self-Organizing Map (SOM), which has been successfully applied to processes involving very noisy signals, classifying and organizing them, and assisting in the discovery of behavior patterns without requiring prior knowledge about the process under analysis. As a test bed for the use of SOMs in finding possible relationships among genes and their possible contribution in some biological processes, we selected 282 S. cerevisiae genes that have been shown through biological experiments to have an activity during the cell cycle. The expression level of these genes was analyzed in five of the most cited time series DNA microarray databases used in the study of the cell cycle of this organism. With the use of SOM, it was possible to find clusters of genes with similar behavior in the five databases along two cell cycles. This result suggested that some of these genes might be biologically related or might have a regulatory relationship, as was corroborated by comparing some of the clusters obtained with SOMs against a previously reported regulatory network that was generated using biological knowledge, such as protein-protein interactions, gene expression levels, metabolism dynamics, promoter binding, and modification, regulation and transport of proteins. The methodology described in this paper could be applied to the study of gene relationships of other biological processes in different organisms. PMID:24699245

  1. Discovery of CTCF-Sensitive Cis-Spliced Fusion RNAs between Adjacent Genes in Human Prostate Cells

    PubMed Central

    Qin, Fujun; Song, Zhenguo; Babiceanu, Mihaela; Song, Yansu; Facemire, Loryn; Singh, Ritambhara; Adli, Mazhar; Li, Hui

    2015-01-01

    Genes or their encoded products are not expected to mingle with each other unless in some disease situations. In cancer, a frequent mechanism that can produce gene fusions is chromosomal rearrangement. However, recent discoveries of RNA trans-splicing and cis-splicing between adjacent genes (cis-SAGe) support other mechanisms for generating fusion RNAs. In our transcriptome analyses of 28 prostate normal and cancer samples, on average 30% of fusion RNAs are transcripts that contain exons belonging to same-strand neighboring genes. These fusion RNAs may be the products of cis-SAGe, which was previously thought to be rare. To validate this finding and to better understand the phenomenon, we used LNCaP, a prostate cell line, as a model, and identified 16 additional cis-SAGe events by silencing transcription factor CTCF and paired-end RNA sequencing. About half of the fusions are expressed at a significant level compared to their parental genes. Silencing one of the in-frame fusions resulted in reduced cell motility. Most out-of-frame fusions are likely to function as non-coding RNAs. The majority of the 16 fusions are also detected in other prostate cell lines, as well as in the 14 clinical prostate normal and cancer pairs. By studying the features associated with these fusions, we developed a set of rules: 1) the parental genes are same-strand-neighboring genes; 2) the distance between the genes is within 30 kb; 3) the 5′ genes are actively transcribing; and 4) the chimeras tend to have the second-to-last exon in the 5′ genes joined to the second exon in the 3′ genes. We then randomly selected 20 neighboring genes in the genome, and detected four fusion events using these rules in prostate cancer and non-cancerous cells. These results suggest that splicing between neighboring gene transcripts is a rather frequent phenomenon, and it is not a feature unique to cancer cells. PMID:25658338
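
    Rules 1 to 3 in the record above can be read as a simple filter over annotated genes. The sketch below is one hedged interpretation, not the authors' code: the `Gene` record, the `candidate_pairs` function, the boolean `expressed` flag standing in for "actively transcribing", and the choice to treat "neighboring" as adjacent in coordinate order are all assumptions. Rule 4 is described as a tendency and is not encoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical minimal gene annotation used only for this sketch."""
    name: str
    chrom: str
    strand: str       # "+" or "-"
    start: int        # lower genomic coordinate
    end: int          # higher genomic coordinate
    expressed: bool   # assumed stand-in for "actively transcribing"


def candidate_pairs(genes, max_gap=30_000):
    """Return (five_prime, three_prime) name pairs that satisfy rules 1-3."""
    pairs = []
    ordered = sorted(genes, key=lambda g: (g.chrom, g.start))
    for left, right in zip(ordered, ordered[1:]):
        # Rule 1: neighbors on the same chromosome and the same strand.
        if left.chrom != right.chrom or left.strand != right.strand:
            continue
        # Rule 2: intergenic distance within 30 kb.
        if right.start - left.end > max_gap:
            continue
        # On the minus strand the higher-coordinate gene is upstream (5').
        five, three = (left, right) if left.strand == "+" else (right, left)
        # Rule 3: the 5' gene must be transcribed.
        if five.expressed:
            pairs.append((five.name, three.name))
    return pairs
```

    Given plus-strand genes A and B 10 kb apart with A expressed, the function reports the pair (A, B); for a minus-strand pair the order is reversed, so that the upstream gene is always listed first.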

  2. The BDGP gene disruption project: Single transposon insertions associated with 40 percent of Drosophila genes

    SciTech Connect

    Bellen, Hugo J.; Levis, Robert W.; Liao, Guochun; He, Yuchun; Carlson, Joseph W.; Tsang, Garson; Evans-Holm, Martha; Hiesinger, P. Robin; Schulze, Karen L.; Rubin, Gerald M.; Hoskins, Roger A.; Spradling, Allan C.

    2004-01-13

    The Berkeley Drosophila Genome Project (BDGP) strives to disrupt each Drosophila gene by the insertion of a single transposable element. As part of this effort, transposons in more than 30,000 fly strains were localized and analyzed relative to predicted Drosophila gene structures. Approximately 6,300 lines that maximize genomic coverage were selected to be sent to the Bloomington Stock Center for public distribution, bringing the size of the BDGP gene disruption collection to 7,140 lines. It now includes individual lines predicted to disrupt 5,362 of the 13,666 currently annotated Drosophila genes (39 percent). Other lines contain an insertion at least 2 kb from others in the collection and likely mutate additional incompletely annotated or uncharacterized genes and chromosomal regulatory elements. The remaining strains contain insertions likely to disrupt alternative gene promoters or to allow gene mis-expression. The expanded BDGP gene disruption collection provides a public resource that will facilitate the application of Drosophila genetics to diverse biological problems. Finally, the project reveals new insight into how transposons interact with a eukaryotic genome and helps define optimal strategies for using insertional mutagenesis as a genomic tool.

  3. The BDGP gene disruption project: single transposon insertions associated with 40% of Drosophila genes.

    PubMed Central

    Bellen, Hugo J; Levis, Robert W; Liao, Guochun; He, Yuchun; Carlson, Joseph W; Tsang, Garson; Evans-Holm, Martha; Hiesinger, P Robin; Schulze, Karen L; Rubin, Gerald M; Hoskins, Roger A; Spradling, Allan C

    2004-01-01

    The Berkeley Drosophila Genome Project (BDGP) strives to disrupt each Drosophila gene by the insertion of a single transposable element. As part of this effort, transposons in >30,000 fly strains were localized and analyzed relative to predicted Drosophila gene structures. Approximately 6300 lines that maximize genomic coverage were selected to be sent to the Bloomington Stock Center for public distribution, bringing the size of the BDGP gene disruption collection to 7140 lines. It now includes individual lines predicted to disrupt 5362 of the 13,666 currently annotated Drosophila genes (39%). Other lines contain an insertion at least 2 kb from others in the collection and likely mutate additional incompletely annotated or uncharacterized genes and chromosomal regulatory elements. The remaining strains contain insertions likely to disrupt alternative gene promoters or to allow gene misexpression. The expanded BDGP gene disruption collection provides a public resource that will facilitate the application of Drosophila genetics to diverse biological problems. Finally, the project reveals new insight into how transposons interact with a eukaryotic genome and helps define optimal strategies for using insertional mutagenesis as a genomic tool. PMID:15238527

  4. Discovery of agents that eradicate leukemia stem cells using an in silico screen of public gene expression data

    PubMed Central

    Hassane, Duane C.; Guzman, Monica L.; Corbett, Cheryl; Li, Xiaojie; Abboud, Ramzi; Young, Fay; Liesveld, Jane L.; Carroll, Martin

    2008-01-01

    Increasing evidence indicates that malignant stem cells are important for the pathogenesis of acute myelogenous leukemia (AML) and represent a reservoir of cells that drive the development of AML and relapse. Therefore, new treatment regimens are necessary to prevent relapse and improve therapeutic outcomes. Previous studies have shown that the sesquiterpene lactone, parthenolide (PTL), ablates bulk, progenitor, and stem AML cells while causing no appreciable toxicity to normal hematopoietic cells. Thus, PTL must evoke cellular responses capable of mediating AML selective cell death. Given recent advances in chemical genomics such as gene expression-based high-throughput screening (GE-HTS) and the Connectivity Map, we hypothesized that the gene expression signature resulting from treatment of primary AML with PTL could be used to search for similar signatures in publicly available gene expression profiles deposited into the Gene Expression Omnibus (GEO). We therefore devised a broad in silico screen of the GEO database using the PTL gene expression signature as a template and discovered 2 new agents, celastrol and 4-hydroxy-2-nonenal, that effectively eradicate AML at the bulk, progenitor, and stem cell level. These findings suggest the use of multicenter collections of high-throughput data to facilitate discovery of leukemia drugs and drug targets. PMID:18305216

  5. A Hybrid Computational Method for the Discovery of Novel Reproduction-Related Genes

    PubMed Central

    Chen, Lei; Chu, Chen; Kong, Xiangyin; Huang, Guohua; Huang, Tao; Cai, Yu-Dong

    2015-01-01

    Uncovering the molecular mechanisms underlying reproduction is of great importance to infertility treatment and to the generation of healthy offspring. In this study, we discovered novel reproduction-related genes with a hybrid computational method, integrating three different types of method, which offered new clues for further reproduction research. This method was first executed on a weighted graph, constructed based on known protein-protein interactions, to search the shortest paths connecting any two known reproduction-related genes. Genes occurring in these paths were deemed to have a special relationship with reproduction. These newly discovered genes were filtered with a randomization test. Then, the remaining genes were further selected according to their associations with known reproduction-related genes measured by protein-protein interaction score and alignment score obtained by BLAST. The in-depth analysis of the high confidence novel reproduction genes revealed hidden mechanisms of reproduction and provided guidelines for further experimental validations. PMID:25768094
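
    The first step described above, searching shortest paths between known genes in a weighted interaction graph and collecting the genes that lie on them, can be sketched with Dijkstra's algorithm. This is an illustration under stated assumptions rather than the published method: the abstract does not say how edge weights are derived, so the sketch assumes non-negative costs where a lower cost means a stronger interaction, and the names `shortest_path` and `genes_on_paths` are hypothetical. The randomization test and the BLAST-based filtering are not shown.

```python
import heapq
from itertools import combinations


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; graph maps node -> {neighbor: non-negative cost}.

    Returns the list of nodes on one shortest path, or None if unreachable.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue          # stale heap entry
        done.add(node)
        if node == target:
            break             # target distance is final once popped
        for nbr, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def genes_on_paths(graph, known):
    """Collect non-seed genes lying on shortest paths between seed pairs."""
    found = set()
    for a, b in combinations(sorted(known), 2):
        path = shortest_path(graph, a, b)
        if path:
            found.update(path[1:-1])   # interior nodes only
    return found - set(known)
```

    In this sketch only one shortest path is kept per pair of seed genes; when several paths tie, the one returned depends on heap ordering.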

  6. Gene discovery in the tropical scallop Nodipecten nodosus: construction and sequencing of a normalized cDNA library.

    PubMed

    Americo, Juliana Alves; Dondero, Francesco; Moraes, Milton Ozório; Allodi, Silvana; de Freitas Rebelo, Mauro

    2013-10-01

    We report the construction and characterization of a normalized cDNA library from the digestive gland of the marine bivalve Nodipecten nodosus, a commercially valuable tropical scallop. A total of 288 clones were sequenced, and 250 unique sequences were obtained. The cDNA library showed a small sequence redundancy (2.3%) and high numbers of recombinant (99.9%) and independent clones (2.0 × 10^6 cfu), indicating that the cDNA library generated in this study is a profitable resource for efficient gene discovery for N. nodosus. EST functional annotation by Gene Ontology term assignment revealed the identification of sequences potentially involved in aquaculture- and ecotoxicology-relevant processes such as apoptosis, growth, lipid metabolism, reproduction, development, response to stress and immunity. PMID:23669241

  7. Discovery of a novel esterase subfamily sharing an identified arm sequence (ArmEst) by gene-specific metagenomic PCR.

    PubMed

    Zhang, Ao; Zhao, Rong; Jin, Peng; Ma, Lifang; Xiong, Xiaolong; Xie, Tian; Pei, Xiaolin; Yu, Li; Yin, Xiaopu; Wang, Qiuyan

    2013-11-01

    A gene-specific, metagenomic PCR method has led to the discovery of a novel esterase subfamily consisting of five homologous members. Sequence analysis of this esterase subfamily, named the ArmEst subfamily, revealed a unique conserved pattern with a significant variable interior sequence flanked by two symmetric and identical long arm sequences. The two homologous long arm sequences had 100 % sequence identity and symmetry at both ends between the five members of this esterase class, but only 17-58 % identity was shared for the internal sequence. The biochemical properties of two of the ArmEst esterases definitively demonstrated that they are true active esterases rather than pseudogenes. This is the first report presenting an esterase subfamily containing a unique arm sequence, indicating a rare homologous recombination occurring in the coding area of a functional gene to generate their functional diversity. PMID:23881330

  8. Gene discovery from Jatropha curcas by sequencing of ESTs from normalized and full-length enriched cDNA library from developing seeds

    PubMed Central

    2010-01-01

    Background: Jatropha curcas L. is promoted as an important non-edible biodiesel crop worldwide. Jatropha oil, which is a triacylglycerol, can be directly blended with petro-diesel or transesterified with methanol and used as biodiesel. Genetic improvement in jatropha is needed to increase the seed yield, oil content, drought and pest resistance, and to modify oil composition so that it becomes a technically and economically preferred source for biodiesel production. However, genetic improvement efforts in jatropha could not take advantage of genetic engineering methods due to lack of cloned genes from this species. To overcome this hurdle, the current gene discovery project was initiated with an objective of isolating as many functional genes as possible from J. curcas by large scale sequencing of expressed sequence tags (ESTs). Results: A normalized and full-length enriched cDNA library was constructed from developing seeds of J. curcas. The cDNA library contained about 1 × 10^6 clones and average insert size of the clones was 2.1 kb. In total, 12,084 ESTs were sequenced to an average high quality read length of 576 bp. Contig analysis revealed 2258 contigs and 4751 singletons. Contig size ranged from 2 to 23 ESTs and there were 7333 ESTs in the contigs. This resulted in 7009 unigenes which were annotated by BLASTX. It showed 3982 unigenes with significant similarity to known genes and 2836 unigenes with significant similarity to genes of unknown, hypothetical and putative proteins. The remaining 191 unigenes which did not show similarity with any genes in the public database may encode for unique genes. Functional classification revealed unigenes related to a broad range of cellular, molecular and biological functions. Among the 7009 unigenes, 6233 unigenes were identified to be potential full-length genes. Conclusions: The high quality normalized cDNA library was constructed from developing seeds of J. curcas for the first time and 7009 unigenes coding for diverse biological functions including oil biosynthesis were identified. These genes will serve as invaluable genetic resource for crop improvement in jatropha to make it an ideal and profitable crop for biodiesel production. PMID:20979643
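
    The assembly figures quoted in this abstract are internally consistent, which a few lines of arithmetic can confirm. The counts below are copied from the abstract; the "redundancy" measure (1 minus unigenes per EST) is an assumed definition used only for illustration and is not stated in the source.

```python
# Counts taken from the abstract above.
total_ests = 12084
contigs = 2258
singletons = 4751
ests_in_contigs = 7333

# Every EST is either assembled into a contig or left as a singleton.
assert ests_in_contigs + singletons == total_ests

# Unigenes are the contigs plus the singletons.
unigenes = contigs + singletons
assert unigenes == 7009

# BLASTX annotation classes partition the unigene set.
known, hypothetical, no_hit = 3982, 2836, 191
assert known + hypothetical + no_hit == unigenes

# Assumed illustrative measures (definitions are not from the source).
redundancy = 1 - unigenes / total_ests
full_length_fraction = 6233 / unigenes
print(f"redundancy: {redundancy:.1%}, potential full-length: {full_length_fraction:.1%}")
```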

  9. Co-clustering phenome–genome for phenotype classification and disease gene discovery

    PubMed Central

    Hwang, TaeHyun; Atluri, Gowtham; Xie, MaoQiang; Dey, Sanjoy; Hong, Changjin; Kumar, Vipin; Kuang, Rui

    2012-01-01

    Understanding the categorization of human diseases is critical for reliably identifying disease causal genes. Recently, genome-wide studies of abnormal chromosomal locations related to diseases have mapped >2000 phenotype–gene relations, which provide valuable information for classifying diseases and identifying candidate genes as drug targets. In this article, a regularized non-negative matrix tri-factorization (R-NMTF) algorithm is introduced to co-cluster phenotypes and genes, and simultaneously detect associations between the detected phenotype clusters and gene clusters. The R-NMTF algorithm factorizes the phenotype–gene association matrix under the prior knowledge from phenotype similarity network and protein–protein interaction network, supervised by the label information from known disease classes and biological pathways. In the experiments on disease phenotype–gene associations in OMIM and KEGG disease pathways, R-NMTF significantly improved the classification of disease phenotypes and disease pathway genes compared with support vector machines and Label Propagation in cross-validation on the annotated phenotypes and genes. The newly predicted phenotypes in each disease class are highly consistent with human phenotype ontology annotations. The roles of the new member genes in the disease pathways are examined and validated in the protein–protein interaction subnetworks. Extensive literature review also confirmed many new members of the disease classes and pathways as well as the predicted associations between disease phenotype classes and pathways. PMID:22735708

  10. Co-clustering phenome-genome for phenotype classification and disease gene discovery.

    PubMed

    Hwang, TaeHyun; Atluri, Gowtham; Xie, MaoQiang; Dey, Sanjoy; Hong, Changjin; Kumar, Vipin; Kuang, Rui

    2012-10-01

    Understanding the categorization of human diseases is critical for reliably identifying disease causal genes. Recently, genome-wide studies of abnormal chromosomal locations related to diseases have mapped >2000 phenotype-gene relations, which provide valuable information for classifying diseases and identifying candidate genes as drug targets. In this article, a regularized non-negative matrix tri-factorization (R-NMTF) algorithm is introduced to co-cluster phenotypes and genes, and simultaneously detect associations between the detected phenotype clusters and gene clusters. The R-NMTF algorithm factorizes the phenotype-gene association matrix under the prior knowledge from phenotype similarity network and protein-protein interaction network, supervised by the label information from known disease classes and biological pathways. In the experiments on disease phenotype-gene associations in OMIM and KEGG disease pathways, R-NMTF significantly improved the classification of disease phenotypes and disease pathway genes compared with support vector machines and Label Propagation in cross-validation on the annotated phenotypes and genes. The newly predicted phenotypes in each disease class are highly consistent with human phenotype ontology annotations. The roles of the new member genes in the disease pathways are examined and validated in the protein-protein interaction subnetworks. Extensive literature review also confirmed many new members of the disease classes and pathways as well as the predicted associations between disease phenotype classes and pathways. PMID:22735708

  11. Discovery of Core Biotic Stress Responsive Genes in Arabidopsis by Weighted Gene Co-Expression Network Analysis

    PubMed Central

    Amrine, Katherine C. H.; Blanco-Ulate, Barbara; Cantu, Dario

    2015-01-01

    Intricate signal networks and transcriptional regulators translate the recognition of pathogens into defense responses. In this study, we carried out a gene co-expression analysis of all currently publicly available microarray data, which were generated in experiments that studied the interaction of the model plant Arabidopsis thaliana with microbial pathogens. This work was conducted to identify (i) modules of functionally related co-expressed genes that are differentially expressed in response to multiple biotic stresses, and (ii) hub genes that may function as core regulators of disease responses. Using Weighted Gene Co-expression Network Analysis (WGCNA) we constructed an undirected network leveraging a rich curated expression dataset comprising 272 microarrays that involved microbial infections of Arabidopsis plants with a wide array of fungal and bacterial pathogens with biotrophic, hemibiotrophic, and necrotrophic lifestyles. WGCNA produced a network with scale-free and small-world properties composed of 205 distinct clusters of co-expressed genes. Modules of functionally related co-expressed genes that are differentially regulated in response to multiple pathogens were identified by integrating differential gene expression testing with functional enrichment analyses of gene ontology terms, known disease associated genes, transcriptional regulators, and cis-regulatory elements. The significance of functional enrichments was validated by comparisons with randomly generated networks. Network topology was then analyzed to identify intra- and inter-modular gene hubs. Based on high connectivity, and centrality in meta-modules that are clearly enriched in defense responses, we propose a list of 66 target genes for reverse genetic experiments to further dissect the Arabidopsis immune system. 
Our results show that statistical-based data trimming prior to network analysis allows the integration of expression datasets generated by different groups, under different experimental conditions and biological systems, into a functionally meaningful co-expression network. PMID:25730421
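
    As a rough illustration of the network construction step, the sketch below builds an unsigned weighted co-expression adjacency by raising absolute Pearson correlations to a soft-thresholding power, then ranks genes by connectivity to pick a hub. It is a simplified, pure-Python stand-in and not the WGCNA R package the authors used; the expression values, gene names, and the power beta = 6 are assumptions chosen for the example, and module detection is omitted.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(x_i, x_j)| ** beta."""
    genes = list(expr)
    adj = {g: {} for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            a = abs(pearson(expr[g], expr[h])) ** beta
            adj[g][h] = adj[h][g] = a
    return adj


def connectivity(adj):
    """Connectivity of each gene: sum of its adjacencies to all other genes."""
    return {g: sum(nbrs.values()) for g, nbrs in adj.items()}


# Hypothetical expression profiles across five arrays.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],
    "geneC": [5.0, 4.1, 2.9, 2.0, 1.1],
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],
}
adj = soft_threshold_adjacency(expr)
k = connectivity(adj)
hub = max(k, key=k.get)
print(hub, {g: round(v, 3) for g, v in k.items()})
```

    Because the network is unsigned, the anti-correlated geneC connects as strongly as the positively correlated genes, while the weakly correlated geneD is pushed toward zero by the power transform.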

  12. Discovery and saturation analysis of cancer genes across 21 tumor types

    PubMed Central

    Lawrence, Michael S.; Stojanov, Petar; Mermel, Craig H.; Garraway, Levi A.; Golub, Todd R.; Meyerson, Matthew; Gabriel, Stacey B.; Lander, Eric S.; Getz, Gad

    2014-01-01

    Summary: While a few cancer genes are mutated in a high proportion of tumors of a given type (>20%), most are mutated at intermediate frequencies (2–20%). To explore the feasibility of creating a comprehensive catalog of cancer genes, we analyzed somatic point mutations in exome sequence from 4,742 tumor-normal pairs across 21 cancer types. We found that large-scale genomic analysis can identify nearly all known cancer genes in these tumor types. Our analysis also identified 33 genes not previously known to be significantly mutated, including genes related to proliferation, apoptosis, genome stability, chromatin regulation, immune evasion, RNA processing and protein homeostasis. Down-sampling analysis indicates that larger sample sizes will reveal many more genes, mutated at clinically important frequencies. We estimate that near-saturation may be achieved with 600–5000 samples per tumor type, depending on background mutation rate. The results help guide the next stage of cancer genomics. PMID:24390350

  13. Discovery of estrogen-responsive genes using an improved method which combines subtractive hybridization and PCR.

    PubMed Central

    Liu, W; Su, W; Roberts, T M

    1998-01-01

    Here we describe a reliable method for isolating genes that are differentially expressed in two cell populations. The method is a combination of subtractive hybridization and PCR. Among many improvements to previously described methods is the incorporation of a new technology into the procedure which sterilizes (inactivates) PCR amplicons, and thereby overcomes the limitation of similar procedures. To test this improved method, we conducted a search for estrogen-responsive genes. Estrogen-regulated genes dominated the subtracted libraries after four rounds of subtractive hybridizations. Four estrogen-regulated genes were identified from the initial screening. PMID:9671829

  14. IMG-ABC. A knowledge base to fuel discovery of biosynthetic gene clusters and novel secondary metabolites

    DOE PAGESBeta

    Hadjithomas, Michalis; Chen, I-Min Amy; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Szeto, Ernest; Huang, Jinghua; Reddy, T. B. K.; Cimermančič, Peter; Fischbach, Michael A.; et al

    2015-07-14

    In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of “big” genomic data for discovering small molecules. IMG-ABC relies on IMG’s comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC’s focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. IMG-ABC is the largest publicly available database of predicted and experimental biosynthetic gene clusters and the secondary metabolites they produce. The system also includes powerful search and analysis tools that are integrated with IMG’s extensive genomic/metagenomic data and analysis tool kits. As new research on biosynthetic gene clusters and secondary metabolites is published and more genomes are sequenced, IMG-ABC will continue to expand, with the goal of becoming an essential component of any bioinformatic exploration of the secondary metabolism world.

  15. IMG-ABC. A knowledge base to fuel discovery of biosynthetic gene clusters and novel secondary metabolites

    SciTech Connect

    Hadjithomas, Michalis; Chen, I-Min Amy; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Szeto, Ernest; Huang, Jinghua; Reddy, T. B. K.; Cimermančič, Peter; Fischbach, Michael A.; Ivanova, Natalia N.; Markowitz, Victor M.; Kyrpides, Nikos C.; Pati, Amrita

    2015-07-14

    In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of “big” genomic data for discovering small molecules. IMG-ABC relies on IMG’s comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC’s focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. IMG-ABC is the largest publicly available database of predicted and experimental biosynthetic gene clusters and the secondary metabolites they produce. The system also includes powerful search and analysis tools that are integrated with IMG’s extensive genomic/metagenomic data and analysis tool kits. As new research on biosynthetic gene clusters and secondary metabolites is published and more genomes are sequenced, IMG-ABC will continue to expand, with the goal of becoming an essential component of any bioinformatic exploration of the secondary metabolism world.

  16. Discovery and Replication of Gene Influences on Brain Structure Using LASSO Regression.

    PubMed

    Kohannim, Omid; Hibar, Derrek P; Stein, Jason L; Jahanshad, Neda; Hua, Xue; Rajagopalan, Priya; Toga, Arthur W; Jack, Clifford R; Weiner, Michael W; de Zubicaray, Greig I; McMahon, Katie L; Hansell, Narelle K; Martin, Nicholas G; Wright, Margaret J; Thompson, Paul M

    2012-01-01

    We implemented least absolute shrinkage and selection operator (LASSO) regression to evaluate gene effects in genome-wide association studies (GWAS) of brain images, using an MRI-derived temporal lobe volume measure from 729 subjects scanned as part of the Alzheimer's Disease Neuroimaging Initiative (ADNI). Sparse groups of SNPs in individual genes were selected by LASSO, which identifies efficient sets of variants influencing the data. These SNPs were considered jointly when assessing their association with neuroimaging measures. We discovered 22 genes that passed genome-wide significance for influencing temporal lobe volume. This was a substantially greater number of significant genes compared to those found with standard, univariate GWAS. These top genes are all expressed in the brain and include genes previously related to brain function or neuropsychiatric disorders such as MACROD2, SORCS2, GRIN2B, MAGI2, NPAS3, CLSTN2, GABRG3, NRXN3, PRKAG2, GAS7, RBFOX1, ADARB2, CHD4, and CDH13. The top genes we identified with this method also displayed significant and widespread post hoc effects on voxelwise, tensor-based morphometry (TBM) maps of the temporal lobes. The most significantly associated gene was an autism susceptibility gene known as MACROD2. We were able to successfully replicate the effect of the MACROD2 gene in an independent cohort of 564 young, Australian healthy adult twins and siblings scanned with MRI (mean age: 23.8 ± 2.2 SD years). Our approach powerfully complements univariate techniques in detecting influences of genes on the living brain. PMID:22888310
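
    The property that makes LASSO useful here is that its L1 penalty drives the coefficients of weakly associated variants exactly to zero, leaving a sparse set of SNPs. The sketch below shows this with a small coordinate-descent solver for the objective (1/2n)·||y − Xb||² + λ·||b||₁. It is a toy under stated assumptions, not the authors' pipeline: the design matrix uses hypothetical centered, mutually orthogonal ±1 codes rather than real genotypes, the phenotype is synthetic, and λ = 0.5 is arbitrary.

```python
def soft_threshold(rho, lam):
    """Soft-thresholding operator S(rho, lam) used by the LASSO coordinate update."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            beta[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return beta


# Hypothetical data: 4 subjects, 3 centered and mutually orthogonal SNP codes.
X = [[1, 1, 1],
     [1, -1, -1],
     [-1, 1, -1],
     [-1, -1, 1]]
# Synthetic phenotype: strong effect of SNP 0, weak effect of SNP 1, none of SNP 2.
y = [2.0 * row[0] + 0.1 * row[1] for row in X]

beta = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print(beta, selected)
```

    With this orthogonal toy design each coefficient is simply the soft-thresholded per-SNP association, so only SNP 0 survives the penalty; with correlated real genotypes the coordinate updates interact and more iterations are needed.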

  17. The Utility of Next-Generation Sequencing in Gene Discovery for Mutation-Negative Patients with Rett Syndrome

    PubMed Central

    Gold, Wendy Anne; Christodoulou, John

    2015-01-01

    Rett syndrome (RTT) is a rare, severe disorder of neuronal plasticity that predominantly affects girls. Girls with RTT usually appear asymptomatic in the first 6–18 months of life, but gradually develop severe motor, cognitive, and behavioral abnormalities that persist for life. A predominance of neuronal and synaptic dysfunction, with altered excitatory–inhibitory neuronal synaptic transmission and synaptic plasticity, are overarching features of RTT in children and in mouse models. Over 90% of patients with classical RTT have mutations in the X-linked methyl-CpG-binding (MECP2) gene, while other genes, including cyclin-dependent kinase-like 5 (CDKL5), Forkhead box protein G1 (FOXG1), myocyte-specific enhancer factor 2C (MEF2C), and transcription factor 4 (TCF4), have been associated with phenotypes overlapping with RTT. However, there remain a proportion of patients who carry a clinical diagnosis of RTT, but who are mutation negative. In recent years, next-generation sequencing technologies have revolutionized approaches to genetic studies, making whole-exome and even whole-genome sequencing possible strategies for the detection of rare and de novo mutations, aiding the discovery of novel disease genes. Here, we review the recent progress that is emerging in identifying pathogenic variations, specifically from exome sequencing in RTT patients, and emphasize the need for the use of this technology to identify known and new disease genes in RTT patients. PMID:26236194

  18. SNP discovery and marker development for disease resistance candidate genes in common carp (Cyprinus carpio)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Single nucleotide polymorphisms (SNPs) in immune response genes have been reported as markers of susceptibility to infectious diseases in human and livestock. A disease caused by cyprinid herpes virus 3 (CyHV-3) is highly contagious and virulent in common carp. With the aim to investigate the gene...

  19. Systematic discovery of novel ciliary genes through functional genomics in the zebrafish.

    PubMed

    Choksi, Semil P; Babu, Deepak; Lau, Doreen; Yu, Xianwen; Roy, Sudipto

    2014-09-01

    Cilia are microtubule-based hair-like organelles that play many important roles in development and physiology, and are implicated in a rapidly expanding spectrum of human diseases, collectively termed ciliopathies. Primary ciliary dyskinesia (PCD), one of the most prevalent of ciliopathies, arises from abnormalities in the differentiation or motility of the motile cilia. Despite their biomedical importance, a methodical functional screen for ciliary genes has not been carried out in any vertebrate at the organismal level. We sought to systematically discover novel motile cilia genes by identifying the genes induced by Foxj1, a winged-helix transcription factor that has an evolutionarily conserved role as the master regulator of motile cilia biogenesis. Unexpectedly, we find that the majority of the Foxj1-induced genes have not been associated with cilia before. To characterize these novel putative ciliary genes, we subjected 50 randomly selected candidates to a systematic functional phenotypic screen in zebrafish embryos. Remarkably, we find that over 60% are required for ciliary differentiation or function, whereas 30% of the proteins encoded by these genes localize to motile cilia. We also show that these genes regulate the proper differentiation and beating of motile cilia. This collection of Foxj1-induced genes will be invaluable for furthering our understanding of ciliary biology, and in the identification of new mutations underlying ciliary disorders in humans. PMID:25139857

  20. Gene discovery for the bark beetle-vectored fungal tree pathogen Grosmannia clavigera

    PubMed Central

    2010-01-01

    Background: Grosmannia clavigera is a bark beetle-vectored fungal pathogen of pines that causes wood discoloration and may kill trees by disrupting nutrient and water transport. Trees respond to attacks from beetles and associated fungi by releasing terpenoid and phenolic defense compounds. It is unclear which genes are important for G. clavigera's ability to overcome antifungal pine terpenoids and phenolics. Results: We constructed seven cDNA libraries from eight G. clavigera isolates grown under various culture conditions, and Sanger sequenced the 5' and 3' ends of 25,000 cDNA clones, resulting in 44,288 high quality ESTs. The assembled dataset of unique transcripts (unigenes) consists of 6,265 contigs and 2,459 singletons that mapped to 6,467 locations on the G. clavigera reference genome, representing ~70% of the predicted G. clavigera genes. Although only 54% of the unigenes matched characterized proteins at the NCBI database, this dataset extensively covers major metabolic pathways, cellular processes, and genes necessary for response to environmental stimuli and genetic information processing. Furthermore, we identified genes expressed in spores prior to germination, and genes involved in response to treatment with lodgepole pine phloem extract (LPPE). Conclusions: We provide a comprehensively annotated EST dataset for G. clavigera that represents a rich resource for gene characterization in this and other ophiostomatoid fungi. Genes expressed in response to LPPE treatment are indicative of fungal oxidative stress response. We identified two clusters of potentially functionally related genes responsive to LPPE treatment. Furthermore, we report a simple method for identifying contig misassemblies in de novo assembled EST collections caused by gene overlap on the genome. PMID:20920358

  1. Discovery of single-gene inborn errors of immunity by next generation sequencing

    PubMed Central

    Conley, Mary Ellen; Casanova, Jean-Laurent

    2014-01-01

    Many patients with clinical and laboratory evidence of primary immunodeficiency do not have a gene specific diagnosis. The use of next generation sequencing, particularly whole exome sequencing, has given us an extraordinarily powerful tool to identify the disease-causing genes in some of these patients. At least 34 new gene defects have been identified in the last 4 years. These findings document the striking heterogeneity of the phenotype in patients with mutations in the same gene. In some cases this can be attributed to loss-of-function mutations in some patients, but gain-of-function mutations in others. In addition, the surprisingly high frequency of autosomal dominant immunodeficiencies with variable penetrance, and de novo mutations in disorders with a severe phenotype has been unmasked. PMID:24886697

  2. Discovery and validation of gene classifiers for endocrine-disrupting chemicals in zebrafish (Danio rerio)

    EPA Science Inventory

    Development and application of transcriptomics-based gene classifiers for ecotoxicological applications lag far behind those of human biomedical science. Many such classifiers discovered thus far lack vigorous statistical and experimental validations, with their stability and rel...

  3. Discovery of diversity in xylan biosynthetic genes by transcriptional profiling of a heteroxylan containing mucilaginous tissue.

    PubMed

    Jensen, Jacob K; Johnson, Nathan; Wilkerson, Curtis G

    2013-01-01

    The exact biochemical steps of xylan backbone synthesis remain elusive. In Arabidopsis, three non-redundant genes from two glycosyltransferase (GT) families, IRX9 and IRX14 from GT43 and IRX10 from GT47, are candidates for forming the xylan backbone. In other plants, evidence exists that different tissues express these three genes at widely different levels, which suggests that diversity in the makeup of the xylan synthase complex exists. Recently we have profiled the transcripts present in the developing mucilaginous tissue of psyllium (Plantago ovata Forsk). This tissue was found to have high expression levels of an IRX10 homolog, but very low levels of the two GT43 family members. This contrasts with recent wheat endosperm tissue profiling that found a relatively high abundance of the GT43 family members. We have performed an in-depth analysis of all GT genes expressed in four developmental stages of the psyllium mucilaginous layer and in a single stage of the psyllium stem using RNA-Seq. This analysis revealed several IRX10 homologs, an expansion in GT61 (homologs of At3g18170/At3g18180), and several GTs from other GT families that are highly abundant and specifically expressed in the mucilaginous tissue. Our current hypothesis is that the four IRX10 genes present in the mucilaginous tissues have evolved to function without the GT43 genes. These four genes represent some of the most divergent IRX10 genes identified to date. Conversely, those present in the psyllium stem are very similar to those in other eudicots. This suggests these genes are under selective pressure, likely due to the synthesis of the various xylan structures present in mucilage that has a different biochemical role than that present in secondary walls. The numerous GT61 family members also show a wide sequence diversity and may be responsible for the larger number of side chain structures present in the psyllium mucilage. PMID:23761806

  4. Discovery of diversity in xylan biosynthetic genes by transcriptional profiling of a heteroxylan containing mucilaginous tissue

    PubMed Central

    Jensen, Jacob K.; Johnson, Nathan; Wilkerson, Curtis G.

    2013-01-01

    The exact biochemical steps of xylan backbone synthesis remain elusive. In Arabidopsis, three non-redundant genes from two glycosyltransferase (GT) families, IRX9 and IRX14 from GT43 and IRX10 from GT47, are candidates for forming the xylan backbone. In other plants, evidence exists that different tissues express these three genes at widely different levels, which suggests that diversity in the makeup of the xylan synthase complex exists. Recently we have profiled the transcripts present in the developing mucilaginous tissue of psyllium (Plantago ovata Forsk). This tissue was found to have high expression levels of an IRX10 homolog, but very low levels of the two GT43 family members. This contrasts with recent wheat endosperm tissue profiling that found a relatively high abundance of the GT43 family members. We have performed an in-depth analysis of all GT genes expressed in four developmental stages of the psyllium mucilaginous layer and in a single stage of the psyllium stem using RNA-Seq. This analysis revealed several IRX10 homologs, an expansion in GT61 (homologs of At3g18170/At3g18180), and several GTs from other GT families that are highly abundant and specifically expressed in the mucilaginous tissue. Our current hypothesis is that the four IRX10 genes present in the mucilaginous tissues have evolved to function without the GT43 genes. These four genes represent some of the most divergent IRX10 genes identified to date. Conversely, those present in the psyllium stem are very similar to those in other eudicots. This suggests these genes are under selective pressure, likely due to the synthesis of the various xylan structures present in mucilage that has a different biochemical role than that present in secondary walls. The numerous GT61 family members also show a wide sequence diversity and may be responsible for the larger number of side chain structures present in the psyllium mucilage. PMID:23761806

  5. Discovery and saturation analysis of cancer genes across 21 tumour types.

    PubMed

    Lawrence, Michael S; Stojanov, Petar; Mermel, Craig H; Robinson, James T; Garraway, Levi A; Golub, Todd R; Meyerson, Matthew; Gabriel, Stacey B; Lander, Eric S; Getz, Gad

    2014-01-23

    Although a few cancer genes are mutated in a high proportion of tumours of a given type (>20%), most are mutated at intermediate frequencies (2-20%). To explore the feasibility of creating a comprehensive catalogue of cancer genes, we analysed somatic point mutations in exome sequences from 4,742 human cancers and their matched normal-tissue samples across 21 cancer types. We found that large-scale genomic analysis can identify nearly all known cancer genes in these tumour types. Our analysis also identified 33 genes that were not previously known to be significantly mutated in cancer, including genes related to proliferation, apoptosis, genome stability, chromatin regulation, immune evasion, RNA processing and protein homeostasis. Down-sampling analysis indicates that larger sample sizes will reveal many more genes mutated at clinically important frequencies. We estimate that near-saturation may be achieved with 600-5,000 samples per tumour type, depending on background mutation frequency. The results may help to guide the next stage of cancer genomics. PMID:24390350

  6. Discovery of functional non-coding conserved regions in the α-synuclein gene locus

    PubMed Central

    Sterling, Lori; Walter, Michael; Ting, Dennis; Schüle, Birgitt

    2014-01-01

    Several single nucleotide polymorphisms (SNPs) and the Rep-1 microsatellite marker of the α-synuclein (SNCA) gene have consistently been shown to be associated with Parkinson’s disease, but the functional relevance is unclear. Based on these findings we hypothesized that conserved cis-regulatory elements in the SNCA genomic region regulate expression of SNCA, and that SNPs in these regions could be functionally modulating the expression of SNCA, thus contributing to neuronal demise and predisposing to Parkinson’s disease. In a pair-wise comparison of a 206 kb genomic region encompassing the SNCA gene, we revealed 34 evolutionary conserved DNA sequences between human and mouse. All elements were cloned into reporter vectors and assessed for expression modulation in dual luciferase reporter assays. We found that 12 out of 34 elements exhibited either an enhancement or reduction of the expression of the reporter gene. Three elements upstream of the SNCA gene displayed an approximately 1.5 fold (p<0.009) increase in expression. Of the intronic regions, three showed a 1.5 fold increase and two others indicated a 2 and 2.5 fold increase in expression (p<0.002). Three elements downstream of the SNCA gene showed 1.5 fold and 2.5 fold increase (p<0.0009). One element downstream of SNCA had a reduced expression of the reporter gene of 0.35 fold (p<0.0009) of normal activity. Our results demonstrate that the SNCA gene contains cis-regulatory regions that might regulate the transcription and expression of SNCA. Further studies in disease-relevant tissue types will be important to understand the functional impact of regulatory regions and specific Parkinson’s disease-associated SNPs and its function in the disease process. PMID:25566351

  7. Discovery and characterization of nutritionally regulated genes associated with muscle growth in Atlantic salmon

    PubMed Central

    Johnston, Ian A.

    2010-01-01

    A genomics approach was used to identify nutritionally regulated genes involved in growth of fast skeletal muscle in Atlantic salmon (Salmo salar L.). Forward and reverse subtractive cDNA libraries were prepared comparing fish with zero growth rates to fish growing rapidly. We produced 7,420 ESTs and assembled them into nonredundant clusters prior to annotation. Contigs representing 40 potentially unrecognized nutritionally responsive candidate genes were identified. Twenty-three of the subtractive library candidates were also differentially regulated by nutritional state in an independent fasting-refeeding experiment and their expression placed in the context of 26 genes with established roles in muscle growth regulation. The expression of these genes was also determined during the maturation of a primary myocyte culture, identifying 13 candidates from the subtractive cDNA libraries with putative roles in the myogenic program. During early stages of refeeding DNAJA4, HSPA1B, HSP90A, and CHAC1 expression increased, indicating activation of unfolded protein response pathways. Four genes were considered inhibitory to myogenesis based on their in vivo and in vitro expression profiles (CEBPD, ASB2, HSP30, novel transcript GE623928). Other genes showed increased expression with feeding and highest in vitro expression during the proliferative phase of the culture (FOXD1, DRG1) or as cells differentiated (SMYD1, RTN1, MID1IP1, HSP90A, novel transcript GE617747). The genes identified were associated with chromatin modification (SMYD1, RTN1), microtubule stabilization (MID1IP1), cell cycle regulation (FOXD1, CEBPD, DRG1), and negative regulation of signaling (ASB2) and may play a role in the stimulation of myogenesis during the transition from a catabolic to anabolic state in skeletal muscle. PMID:20663983

  8. Target genes discovery through copy number alteration analysis in human hepatocellular carcinoma.

    PubMed

    Gu, De-Leung; Chen, Yen-Hsieh; Shih, Jou-Ho; Lin, Chi-Hung; Jou, Yuh-Shan; Chen, Chian-Feng

    2013-12-21

    High-throughput short-read sequencing of exomes and whole cancer genomes in multiple human hepatocellular carcinoma (HCC) cohorts confirmed previously identified frequently mutated somatic genes, such as TP53, CTNNB1 and AXIN1, and identified several novel genes with moderate mutation frequencies, including ARID1A, ARID2, MLL, MLL2, MLL3, MLL4, IRF2, ATM, CDKN2A, FGF19, PIK3CA, RPS6KA3, JAK1, KEAP1, NFE2L2, C16orf62, LEPR, RAC2, and IL6ST. Functional classification of these mutated genes suggested that alterations in pathways participating in chromatin remodeling, Wnt/β-catenin signaling, JAK/STAT signaling, and oxidative stress play critical roles in HCC tumorigenesis. Nevertheless, because there are few druggable genes used in HCC therapy, the identification of new therapeutic targets through integrated genomic approaches remains an important task. Because a large amount of HCC genomic data genotyped by high density single nucleotide polymorphism arrays is deposited in the public domain, copy number alteration (CNA) analyses of these arrays is a cost-effective way to reveal target genes through profiling of recurrent and overlapping amplicons, homozygous deletions and potentially unbalanced chromosomal translocations accumulated during HCC progression. Moreover, integration of CNAs with other high-throughput genomic data, such as aberrantly coding transcriptomes and non-coding gene expression in human HCC tissues and rodent HCC models, provides lines of evidence that can be used to facilitate the identification of novel HCC target genes with the potential of improving the survival of HCC patients. PMID:24379610

  9. De Novo Assembly of the Common Bean Transcriptome Using Short Reads for the Discovery of Drought-Responsive Genes

    PubMed Central

    Wu, Jing; Wang, Lanfen; Li, Long; Wang, Shumin

    2014-01-01

    The common bean (Phaseolus vulgaris L.) is one of the most important food legumes, far ahead of other legumes. The average grain yield of the common bean worldwide is much lower than its potential yield, primarily due to drought in the field. However, the gene network that mediates plant responses to drought stress remains largely unknown in this species. The major goal of our study is to identify a large number of genes involved in drought stress using RNA-seq. First, we assembled 270 million high-quality trimmed reads into a non-redundant set of 62,828 unigenes, representing approximately 49 Mb of unique transcriptome sequences. Of these unigenes, 26,501 (42.2%) had significant similarity with unigenes/predicted proteins from other legumes or sequenced plants. All unigenes were functionally annotated within the GO, COG and KEGG pathways. The strategy for de novo assembly of transcriptome data generated here will be useful in other legume plant transcriptome studies. Second, we identified 10,482 SSRs and 4,099 SNPs in transcripts. The large number of genetic markers provides a resource for gene discovery and development of functional molecular markers. Finally, we found differentially expressed genes (DEGs) between terminal drought and optimal irrigation treatments and between the two different genotypes Long 22-0579 (drought tolerant) and Naihua (drought sensitive). DEGs were confirmed by quantitative real-time PCR assays, which indicated that these genes are functionally associated with the drought-stress response. These resources will be helpful for basic and applied research for genome analysis and crop drought resistance improvement in the common bean. PMID:25275443

  10. De novo assembly of Auricularia polytricha transcriptome using Illumina sequencing for gene discovery and SSR marker identification.

    PubMed

    Zhou, Yan; Chen, Lianfu; Fan, Xiuzhi; Bian, Yinbing

    2014-01-01

    Auricularia polytricha (Mont.) Sacc., a type of edible black-brown mushroom with a gelatinous and modality-specific fruiting body, is in high demand in Asia due to its nutritional and medicinal properties. Illumina Solexa sequencing technology was used to generate very large transcript sequences from the mycelium and the mature fruiting body of A. polytricha for gene discovery and molecular marker development. De novo assembly generated 36,483 ESTs with an N50 length of 636 bp. A total of 28,108 ESTs demonstrated significant hits with known proteins in the nr database, and 94.03% of the annotated ESTs showed the greatest similarity to A. delicata, a related species of A. polytricha. Functional categorization of the Gene Ontology (GO), Clusters of Orthologous Groups (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) metabolic pathways revealed the conservation of genes involved in various biological processes in A. polytricha. Gene expression profile analysis indicated that a total of 2,057 ESTs were differentially expressed, including 1,020 ESTs that were up-regulated in the mycelium and 1,037 up-regulated in the fruiting body. Functional enrichment showed that the ESTs associated with biosynthesis, metabolism and assembly of proteins were more active in fruiting body development. The expression patterns of homologous transcription factors indicated that the molecular mechanisms of fruiting body formation and development were not exactly the same as for other agarics. Interestingly, an EST encoding tyrosinase was significantly up-regulated in the fruiting body, indicating that melanins accumulated during the processes of the formation of the black-brown color of the fruiting body in A. polytricha development. In addition, a total of 1,715 potential SSRs were detected in this transcriptome. The transcriptome analysis of A. polytricha provides valuable sequence resources and numerous molecular markers to facilitate further functional genomics studies and genetic research on this fungus. PMID:24626227
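
    The N50 length quoted in the record above is a summary statistic of assembly contiguity. As a minimal sketch of how such a value is usually derived (the record does not say which tool computed it, and the toy lengths below are illustrative, not data from the study): sort the sequence lengths from longest to shortest and report the length at which the running total first reaches half of all assembled bases.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp).

    N50 is the length L such that sequences of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly, not taken from the abstract.
toy_lengths = [200, 300, 400, 636, 900, 1200]
print(n50(toy_lengths))
```

    Because N50 weights long sequences more heavily than a plain mean or median does, it is normally reported alongside the number of assembled sequences, as the abstract does.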

  11. De Novo Assembly of Auricularia polytricha Transcriptome Using Illumina Sequencing for Gene Discovery and SSR Marker Identification

    PubMed Central

    Zhou, Yan; Chen, Lianfu; Fan, Xiuzhi; Bian, Yinbing

    2014-01-01

    Auricularia polytricha (Mont.) Sacc., a type of edible black-brown mushroom with a gelatinous and modality-specific fruiting body, is in high demand in Asia due to its nutritional and medicinal properties. Illumina Solexa sequencing technology was used to generate very large transcript sequences from the mycelium and the mature fruiting body of A. polytricha for gene discovery and molecular marker development. De novo assembly generated 36,483 ESTs with an N50 length of 636 bp. A total of 28,108 ESTs demonstrated significant hits with known proteins in the nr database, and 94.03% of the annotated ESTs showed the greatest similarity to A. delicata, a related species of A. polytricha. Functional categorization of the Gene Ontology (GO), Clusters of Orthologous Groups (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) metabolic pathways revealed the conservation of genes involved in various biological processes in A. polytricha. Gene expression profile analysis indicated that a total of 2,057 ESTs were differentially expressed, including 1,020 ESTs that were up-regulated in the mycelium and 1,037 up-regulated in the fruiting body. Functional enrichment showed that the ESTs associated with biosynthesis, metabolism and assembly of proteins were more active in fruiting body development. The expression patterns of homologous transcription factors indicated that the molecular mechanisms of fruiting body formation and development were not exactly the same as for other agarics. Interestingly, an EST encoding tyrosinase was significantly up-regulated in the fruiting body, indicating that melanins accumulated during the processes of the formation of the black-brown color of the fruiting body in A. polytricha development. In addition, a total of 1,715 potential SSRs were detected in this transcriptome. The transcriptome analysis of A. polytricha provides valuable sequence resources and numerous molecular markers to facilitate further functional genomics studies and genetic research on this fungus. PMID:24626227

  12. Discovery of Pituitary Adenylate Cyclase-Activating Polypeptide-Regulated Genes through Microarray Analyses in Cell Culture and In Vivo

    PubMed Central

    Eiden, Lee E.; Samal, Babru; Gerdin, Matthew J.; Mustafa, Tomris; Vaudry, David; Stroth, Nikolas

    2010-01-01

    Pituitary adenylate cyclase-activating polypeptide (PACAP) is an evolutionarily well conserved neuropeptide with multiple functions in the nervous, endocrine, and immune systems. PACAP provides neuroprotection from ischemia and toxin exposure, is anti-inflammatory in gastric inflammatory disease and sepsis, controls proliferative signaling pathways involved in neural cell transformation, and modulates glucohomeostasis. PACAP-based, disease-targeted therapeutics might thus be both effective and benign, enhancing homeostatic responses to behavioral, metabolic, oncogenic, and inflammatory stressors. PACAP signal transduction employs synergistic regulation of calcium and cyclic adenosine monophosphate (cAMP), and noncanonical activation of both calcium- and cAMP-dependent processes. Pharmacological activation of PACAP signaling should consequently have highly specific effects even in vivo. Here, a combined cellular biochemical, pharmacologic, transcriptomic, and bioinformatic approach to understanding PACAP signal transduction by identifying PACAP target genes with oligonucleotide- and cDNA-based microarray is described. Calcium- and cAMP-dependent PACAP signaling pathways for regulation of genes encoding proteins required for neuritogenesis, changes in cell morphology, and cell survival have been traced in PC12 cells. Pharmacological experiments have linked gene expression to cell physiological responses in this system, in which gene silencing can also be employed to confirm the functional significance of induction of specific transcripts. Differential transcriptional responses to metabolic, ischemic, and other stressors in wild type compared to PACAP-deficient mice establish in principle which PACAP-responsive transcripts in culture are PACAP-dependent in vivo. Bioinformatic approaches aid in creating a pipeline for identifying neuropeptide-regulated genes, validating their cellular functions, and defining their expression in the context of neuropeptide signaling physiology, required for discovery of new targets for drug action. PMID:19076358

  13. DisGeNET: a discovery platform for the dynamical exploration of human diseases and their genes.

    PubMed

    Piñero, Janet; Queralt-Rosinach, Núria; Bravo, Àlex; Deu-Pons, Jordi; Bauer-Mehren, Anna; Baron, Martin; Sanz, Ferran; Furlong, Laura I

    2015-01-01

    DisGeNET is a comprehensive discovery platform designed to address a variety of questions concerning the genetic underpinning of human diseases. DisGeNET contains over 380,000 associations between >16,000 genes and 13,000 diseases, which makes it one of the largest repositories currently available of its kind. DisGeNET integrates expert-curated databases with text-mined data, covers information on Mendelian and complex diseases, and includes data from animal disease models. It features a score based on the supporting evidence to prioritize gene-disease associations. It is an open access resource available through a web interface, a Cytoscape plugin and as a Semantic Web resource. The web interface supports user-friendly data exploration and navigation. DisGeNET data can also be analysed via the DisGeNET Cytoscape plugin, and enriched with the annotations of other plugins of this popular network analysis software suite. Finally, the information contained in DisGeNET can be expanded and complemented using Semantic Web technologies and linked to a variety of resources already present in the Linked Data cloud. Hence, DisGeNET offers one of the most comprehensive collections of human gene-disease associations and a valuable set of tools for investigating the molecular mechanisms underlying diseases of genetic origin, designed to fulfill the needs of different user profiles, including bioinformaticians, biologists and health-care practitioners. Database URL: http://www.disgenet.org/ PMID:25877637

  14. DisGeNET: a discovery platform for the dynamical exploration of human diseases and their genes

    PubMed Central

    Piñero, Janet; Queralt-Rosinach, Núria; Bravo, Àlex; Deu-Pons, Jordi; Bauer-Mehren, Anna; Baron, Martin; Sanz, Ferran; Furlong, Laura I.

    2015-01-01

    DisGeNET is a comprehensive discovery platform designed to address a variety of questions concerning the genetic underpinning of human diseases. DisGeNET contains over 380 000 associations between >16 000 genes and 13 000 diseases, which makes it one of the largest repositories currently available of its kind. DisGeNET integrates expert-curated databases with text-mined data, covers information on Mendelian and complex diseases, and includes data from animal disease models. It features a score based on the supporting evidence to prioritize gene-disease associations. It is an open access resource available through a web interface, a Cytoscape plugin and as a Semantic Web resource. The web interface supports user-friendly data exploration and navigation. DisGeNET data can also be analysed via the DisGeNET Cytoscape plugin, and enriched with the annotations of other plugins of this popular network analysis software suite. Finally, the information contained in DisGeNET can be expanded and complemented using Semantic Web technologies and linked to a variety of resources already present in the Linked Data cloud. Hence, DisGeNET offers one of the most comprehensive collections of human gene-disease associations and a valuable set of tools for investigating the molecular mechanisms underlying diseases of genetic origin, designed to fulfill the needs of different user profiles, including bioinformaticians, biologists and health-care practitioners. Database URL: http://www.disgenet.org/ PMID:25877637

  15. Essential Gene Discovery in the Basidiomycete Cryptococcus neoformans for Antifungal Drug Target Prioritization

    PubMed Central

    Ianiri, Giuseppe

    2015-01-01

    Fungal diseases represent a major burden to health care globally. As with other pathogenic microbes, there is a limited number of agents suitable for use in treating fungal diseases, and resistance to these agents can develop rapidly. Cryptococcus neoformans is a basidiomycete fungus that causes cryptococcosis worldwide in both immunocompromised and healthy individuals. As a basidiomycete, it diverged from other common pathogenic or model ascomycete fungi more than 500 million years ago. Here, we report C. neoformans genes that are essential for viability as identified through forward and reverse genetic approaches, using an engineered diploid strain and genetic segregation after meiosis. The forward genetic approach generated random insertional mutants in the diploid strain, the induction of meiosis and sporulation, and selection for haploid cells with counterselection of the insertion event. More than 2,500 mutants were analyzed, and transfer DNA (T-DNA) insertions in several genes required for viability were identified. The genes include those encoding the thioredoxin reductase (Trr1), a ribosome assembly factor (Rsa4), an mRNA-capping component (Cet1), and others. For targeted gene replacement, the C. neoformans homologs of 35 genes required for viability in ascomycete fungi were disrupted, meiosis and sporulation were induced, and haploid progeny were evaluated for their ability to grow on selective media. Twenty-one (60%) were found to be required for viability in C. neoformans. These genes are involved in mitochondrial translation, ergosterol biosynthesis, and RNA-related functions. The heterozygous diploid mutants were evaluated for haploinsufficiency on a number of perturbing agents and drugs, revealing phenotypes due to the loss of one copy of an essential gene in C. neoformans. This study expands the knowledge of the essential genes in fungi using a basidiomycete as a model organism. Genes that have no mammalian homologs and are essential in both Cryptococcus and ascomycete human pathogens would be ideal for the development of antifungal drugs with broad-spectrum activity. PMID:25827419

  16. Discovery of Antibiotics-derived Polymers for Gene Delivery using Combinatorial Synthesis and Cheminformatics Modeling

    PubMed Central

    Potta, Thrimoorthy; Zhen, Zhuo; Grandhi, Taraka Sai Pavan; Christensen, Matthew D.; Ramos, James; Breneman, Curt M.; Rege, Kaushal

    2014-01-01

    We describe the combinatorial synthesis and cheminformatics modeling of aminoglycoside antibiotics-derived polymers for transgene delivery and expression. Fifty-six polymers were synthesized by polymerizing aminoglycosides with diglycidyl ether cross-linkers. Parallel screening identified several lead polymers that produced high transgene expression levels in cells. The role of polymer physicochemical properties in determining efficacy of transgene expression was investigated using Quantitative Structure-Activity Relationship (QSAR) cheminformatics models based on Support Vector Regression (SVR) and ‘building block’ polymer structures. The QSAR model exhibited high predictive ability, and investigation of descriptors in the model, using molecular visualization and correlation plots, indicated that physicochemical attributes related to both aminoglycosides and diglycidyl ethers facilitated transgene expression. This work synergistically combines combinatorial synthesis and parallel screening with cheminformatics-based QSAR models for discovery and physicochemical elucidation of effective antibiotics-derived polymers for transgene delivery in medicine and biotechnology. PMID:24331709

  17. Use of model organism and disease databases to support matchmaking for human disease gene discovery.

    PubMed

    Mungall, Christopher J; Washington, Nicole L; Nguyen-Xuan, Jeremy; Condit, Christopher; Smedley, Damian; Köhler, Sebastian; Groza, Tudor; Shefchek, Kent; Hochheiser, Harry; Robinson, Peter N; Lewis, Suzanna E; Haendel, Melissa A

    2015-10-01

    The Matchmaker Exchange application programming interface (API) allows searching a patient's genotypic or phenotypic profiles across clinical sites, for the purposes of cohort discovery and variant disease causal validation. This API can be used not only to search for matching patients, but also to match against public disease and model organism data. This public disease data enable matching known diseases and variant-phenotype associations using phenotype semantic similarity algorithms developed by the Monarch Initiative. The model data can provide additional evidence to aid diagnosis, suggest relevant models for disease mechanism and treatment exploration, and identify collaborators across the translational divide. The Monarch Initiative provides an implementation of this API for searching multiple integrated sources of data that contextualize the knowledge about any given patient or patient family into the greater biomedical knowledge landscape. While this corpus of data can aid diagnosis, it is also the beginning of research to improve understanding of rare human diseases. PMID:26269093

  18. Natural and man-made V-gene repertoires for antibody discovery.

    PubMed

    Finlay, William J J; Almagro, Juan C

    2012-01-01

    Antibodies are the fastest-growing segment of the biologics market. The success of antibody-based drugs resides in their exquisite specificity, high potency, stability, solubility, safety, and relatively inexpensive manufacturing process in comparison with other biologics. We outline here the structural studies and fundamental principles that define how antibodies interact with diverse targets. We also describe the antibody repertoires and affinity maturation mechanisms of humans, mice, and chickens, plus the use of novel single-domain antibodies in camelids and sharks. These species all utilize diverse evolutionary solutions to generate specific and high affinity antibodies and illustrate the plasticity of natural antibody repertoires. In addition, we discuss the multiple variations of man-made antibody repertoires designed and validated in the last two decades, which have served as tools to explore how the size, diversity, and composition of a repertoire impact the antibody discovery process. PMID:23162556

  19. Gene Annotation and Drug Target Discovery in Candida albicans with a Tagged Transposon Mutant Collection

    PubMed Central

    Oh, Julia; Fung, Eula; Schlecht, Ulrich; Davis, Ronald W.; Giaever, Guri; St. Onge, Robert P.; Deutschbauer, Adam; Nislow, Corey

    2010-01-01

    Candida albicans is the most common human fungal pathogen, causing infections that can be lethal in immunocompromised patients. Although Saccharomyces cerevisiae has been used as a model for C. albicans, it lacks C. albicans' diverse morphogenic forms and is primarily non-pathogenic. Comprehensive genetic analyses that have been instrumental for determining gene function in S. cerevisiae are hampered in C. albicans, due in part to limited resources to systematically assay phenotypes of loss-of-function alleles. Here, we constructed and screened a library of 3633 tagged heterozygous transposon disruption mutants, using them in a competitive growth assay to examine nutrient- and drug-dependent haploinsufficiency. We identified 269 genes that were haploinsufficient in four growth conditions, the majority of which were condition-specific. These screens identified two new genes necessary for filamentous growth as well as ten genes that function in essential processes. We also screened 57 chemically diverse compounds that more potently inhibited growth of C. albicans versus S. cerevisiae. For four of these compounds, we examined the genetic basis of this differential inhibition. Notably, Sec7p was identified as the target of brefeldin A in C. albicans screens, while S. cerevisiae screens with this compound failed to identify this target. We also uncovered a new C. albicans-specific target, Tfp1p, for the synthetic compound 0136-0228. These results highlight the value of haploinsufficiency screens directly in this pathogen for gene annotation and drug target identification. PMID:20949076

  20. Genomic approaches for the discovery of genes mutated in inherited retinal degeneration.

    PubMed

    Siemiatkowska, Anna M; Collin, Rob W J; den Hollander, Anneke I; Cremers, Frans P M

    2014-08-01

    In view of their high degree of genetic heterogeneity, inherited retinal diseases (IRDs) pose a significant challenge for identifying novel genetic causes. Thus far, more than 200 genes have been found to be mutated in IRDs, which together contain causal variants in >80% of the cases. Accurate genetic diagnostics is particularly important for isolated cases, in which X-linked and de novo autosomal dominant variants are not uncommon. In addition, new gene- or mutation-specific therapies are emerging, underlining the importance of identifying causative mutations in each individual. Sanger sequencing of selected genes followed by cost-effective targeted next-generation sequencing (NGS) can identify defects in known IRD-associated genes in the majority of the cases. Exome NGS in combination with genetic linkage or homozygosity mapping studies can aid the identification of the remaining causal genes. As these are thought to be mutated in <1% of the cases, validation through functional modeling in, for example, zebrafish and/or replication through the genotyping of large patient cohorts is required. In the near future, whole genome NGS in combination with transcriptome NGS may reveal mutations that are currently hidden in the noncoding regions of the human genome. PMID:24939053

  1. Genome-wide discovery of transcriptional modules from DNA sequence and gene expression.

    PubMed

    Segal, E; Yelensky, R; Koller, D

    2003-01-01

    In this paper, we describe an approach for understanding transcriptional regulation from both gene expression and promoter sequence data. We aim to identify transcriptional modules--sets of genes that are co-regulated in a set of experiments, through a common motif profile. Using the EM algorithm, our approach refines both the module assignment and the motif profile so as to best explain the expression data as a function of transcriptional motifs. It also dynamically adds and deletes motifs, as required to provide a genome-wide explanation of the expression data. We evaluate the method on two Saccharomyces cerevisiae gene expression data sets, showing that our approach is better than a standard one at recovering known motifs and at generating biologically coherent modules. We also combine our results with binding localization data to obtain regulatory relationships with known transcription factors, and show that many of the inferred relationships have support in the literature. PMID:12855470

  2. RNA-Seq Based De Novo Transcriptome Assembly and Gene Discovery of Cistanche deserticola Fleshy Stem

    PubMed Central

    Yao, Fuwen; Li, Cuiping; Tang, Qingli; Sun, Min; Sun, Gaoyuan; Hu, Songnian; Yu, Jun; Song, Shuhui

    2015-01-01

    Background: Cistanche deserticola is a completely non-photosynthetic parasitic plant with great medicinal value, distributed mainly in the deserts of Northwest China. Its dried fleshy stem is a crucial tonic in traditional Chinese medicine, used mainly to improve male sexual function and strengthen immunity, but few mechanistic studies have been conducted, partly due to the lack of genomic and transcriptomic resources. Results: In this study, we performed deep transcriptome sequencing of the fleshy stem of C. deserticola, and about 80 million reads were generated using Illumina paired-end sequencing on the HiSeq2000 platform. Using the Trinity assembler, we obtained 95,787 transcript sequences with lengths ranging from 200 bp to 15,698 bp, an average length of 950 bases and an N50 length of 1,519 bases. A total of 63,957 transcripts were identified as actively expressed (FPKM ≥ 0.5), of which 30,098 were annotated with gene descriptions or gene ontology terms by sequence similarity analyses against several public databases (Uniprot, NR and Nt at NCBI, and KEGG). Furthermore, we identified key enzyme genes involved in biosynthesis of lignin and phenylethanoid glycosides (PhGs), which are known to be the primary active ingredients. Four genes encoding phenylalanine ammonia-lyase (PAL), the first key enzyme in lignin and PhG biosynthesis, were identified based on sequence comparison and phylogenetic analysis. Two biosynthesis pathways of PhGs were also proposed for the first time. Conclusions: In all, we completed a global analysis of the C. deserticola fleshy stem transcriptome using RNA-seq technology. A collection of enzyme genes related to biosynthesis of lignin and phenylethanoid glycosides was identified from the assembled and annotated transcripts, and the gene family of PAL was also predicted. The sequence data from this study will provide a valuable resource for future research on phenylethanoid glycoside biosynthesis and for functional genomic studies in this important medicinal plant. PMID:25938435
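
    The expression cutoff in the record above (FPKM ≥ 0.5) can be illustrated with a short sketch. It assumes the conventional definition of FPKM, fragments per kilobase of transcript per million mapped fragments; the record does not state which software produced its values, and the transcript names, counts and library size below are hypothetical.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


def expressed_transcripts(counts, total_mapped_fragments, threshold=0.5):
    """Return names of transcripts whose FPKM meets the threshold.

    counts maps a transcript name to (fragment_count, length_bp).
    """
    return [
        name
        for name, (fragments, length_bp) in counts.items()
        if fpkm(fragments, length_bp, total_mapped_fragments) >= threshold
    ]


# Hypothetical toy values, not taken from the abstract.
toy_counts = {"transcript_a": (100, 1000), "transcript_b": (1, 5000)}
print(expressed_transcripts(toy_counts, total_mapped_fragments=40_000_000))
```

    Normalizing by both transcript length and library size is what allows a single threshold to be applied across transcripts of very different lengths.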

  3. Plant gravitropic signal transduction: A network analysis leads to gene discovery

    NASA Astrophysics Data System (ADS)

    Wyatt, Sarah

    Gravity plays a fundamental role in plant growth and development. Although a significant body of research has helped define the events of gravity perception, the role of the plant growth regulator auxin, and the mechanisms resulting in the gravity response, the events of signal transduction (those that link the biophysical action of perception to a biochemical signal that results in auxin redistribution, and those that regulate the gravitropic effects on plant growth) remain, for the most part, a “black box.” Using a cold effect, dubbed the gravity persistent signal (GPS) response, we developed a mutant screen to specifically identify components of the signal transduction pathway. Cloning of the GPS genes has identified new proteins involved in gravitropic signaling. We have further exploited the GPS response using a multi-faceted approach, including gene expression microarrays, proteomics analysis, bioinformatics analysis and continued mutant analysis, to identify additional genes and physiological and biochemical processes. Gene expression data provided the foundation of a regulatory network for gravitropic signaling. Based on these gene expression data and related data sets/information from the literature/repositories, we constructed a gravitropic signaling network for Arabidopsis inflorescence stems. To generate the network, both a dynamic Bayesian network approach and a time-lagged correlation coefficient approach were used. The dynamic Bayesian network added existing information on protein-protein interactions, while the time-lagged correlation coefficient allowed incorporation of temporal regulation and thus could incorporate the time-course metric from the data set. Thus the methods complemented each other and provided us with a more comprehensive evaluation of connections. Each method generated a list of possible interactions associated with a statistical significance value. The two networks were then overlaid to generate a more rigorous, intersected network with shared genes and interactions. This network is flexible and can be updated with new data from the original research. The network allows identification of hubs/additional components and processes that are involved in gravitropic signal transduction to provide further hypotheses for testing. In essence, genes identified through experimental methods can be located and interactions that might connect them identified. Genes along these connections can then be tested, much like stopping at towns along a driving route from one city to another.
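
    The time-lagged correlation coefficient mentioned in the record above can be sketched as follows. This is an illustrative assumption rather than the authors' implementation: it shifts one expression time course against another, computes a Pearson correlation at each lag, and keeps the lag with the strongest absolute correlation. The function names and the toy time courses are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / (sd_x * sd_y)


def best_lagged_correlation(regulator, target, max_lag):
    """Return (lag, r) maximizing |r| between regulator[t] and target[t + lag]."""
    best = (0, pearson(regulator, target))
    for lag in range(1, max_lag + 1):
        if len(regulator) - lag < 3:
            break  # too few overlapping time points to be meaningful
        r = pearson(regulator[:-lag], target[lag:])
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


# Hypothetical time courses: the target follows the regulator one step later.
regulator = [0, 1, 0, 2, 0, 3, 0, 1]
target = [5, 0, 1, 0, 2, 0, 3, 0]
print(best_lagged_correlation(regulator, target, max_lag=2))
```

    In a network setting, each gene pair whose best lagged correlation passes a significance filter would contribute a directed candidate edge, which could then be intersected with edges from a second method as the record describes.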

  4. Discovery of candidate genes for muscle traits based on GWAS supported by eQTL-analysis.

    PubMed

    Ponsuksili, Siriluck; Murani, Eduard; Trakooljul, Nares; Schwerin, Manfred; Wimmers, Klaus

    2014-01-01

    Biochemical and biophysical processes that take place in muscle under relaxed and stressed conditions depend on the abundance and activity of gene products of metabolic and structural pathways. In livestock at post-mortem, these muscle properties determine aspects of meat quality and are measurable. The conversion of muscle to meat mimics pathological processes associated with muscle ischemia, injury or damage in humans and it is an economic factor in pork production. Linkage, association, and expression analyses independently contributed to the identification of trait-associated molecular pathways and genes. We aim at providing multiple lines of evidence for the role of specific genes in meat quality by integrating a genome-wide association study (GWAS) for meat quality traits and the detection of eQTL based on trait-correlated expressed genes and trait-associated markers. The GWAS revealed 51 and 200 SNPs significantly associated with meat quality in a crossbred Pietrain×(German Landrace×Large White) (Pi×(GL×LW)) and a purebred German Landrace (GL) population, respectively. Most significant SNPs in Pi×(GL×LW) were located on chromosomes (SSC) 4 and 6. The data of 47,836 eQTLs at a significance level of p<10^-5 were used to scale down the number of candidate genes located in these regions. These SNPs on SSC4 showed association with expression levels of ZNF704, IMPA1, and OXSR1; SSC6 SNPs were associated with expression of SIGLEC10 and PIH1D1. Most significant SNPs in GL were located on SSC6 and associated with expression levels of PIH1D1, SIGLEC10, TBCB, LOC100518735, KIF1B, LOC100514845, and two unknown genes. The abundance of transcripts of these genes in muscle, in turn, is significantly correlated with meat quality traits. We identified several genes with evidence for their candidacy for meat quality arising from the integrative approach of a genome-wide association study and eQTL analysis. PMID:24643240
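
    The narrowing step in this abstract, using eQTLs to reduce the candidate genes around trait-associated SNPs, amounts to intersecting two result tables. The sketch below shows only that idea. The function name, the record layout, and every SNP and gene identifier are hypothetical; only the 1e-5 threshold comes from the abstract.

```python
def candidate_genes(gwas_hits, eqtls, p_threshold=1e-5):
    """Map each trait-associated SNP to the genes whose expression it is linked to.

    gwas_hits: iterable of SNP identifiers significant in the GWAS.
    eqtls: iterable of (snp_id, gene_id, p_value) records from the eQTL scan.
    """
    significant = set(gwas_hits)
    candidates = {}
    for snp, gene, p_value in eqtls:
        # Keep a gene only if the SNP is a GWAS hit AND the eQTL passes the cut-off.
        if snp in significant and p_value < p_threshold:
            candidates.setdefault(snp, set()).add(gene)
    return candidates


# Invented identifiers and p-values, for illustration only.
gwas = ["snp1", "snp2"]
eqtl_records = [
    ("snp1", "geneA", 1e-7),  # GWAS hit with a significant eQTL: kept
    ("snp1", "geneB", 1e-3),  # eQTL too weak: dropped
    ("snp3", "geneC", 1e-9),  # strong eQTL, but not a GWAS hit: dropped
    ("snp2", "geneA", 1e-6),  # kept
]
result = candidate_genes(gwas, eqtl_records)
```

    A gene reached from several independent GWAS SNPs, as geneA is here, would be a natural candidate to prioritise for follow-up.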

  5. Integrating microRNA target predictions for the discovery of gene regulatory networks: a semi-supervised ensemble learning approach

    PubMed Central

    2014-01-01

    Background MicroRNAs (miRNAs) are small non-coding RNAs which play a key role in the post-transcriptional regulation of many genes. Elucidating miRNA-regulated gene networks is crucial for the understanding of mechanisms and functions of miRNAs in many biological processes, such as cell proliferation, development, differentiation and cell homeostasis, as well as in many types of human tumors. To this aim, we have recently presented the biclustering method HOCCLUS2, for the discovery of miRNA regulatory networks. Experiments on predicted interactions revealed that the statistical and biological consistency of the obtained networks is negatively affected by the poor reliability of the output of miRNA target prediction algorithms. Recently, some learning approaches have been proposed to learn to combine the outputs of distinct prediction algorithms and improve their accuracy. However, the application of classical supervised learning algorithms presents two challenges: i) the presence of only positive examples in datasets of experimentally verified interactions and ii) unbalanced number of labeled and unlabeled examples. Results We present a learning algorithm that learns to combine the score returned by several prediction algorithms, by exploiting information conveyed by (only positively labeled/) validated and unlabeled examples of interactions. To face the two related challenges, we resort to a semi-supervised ensemble learning setting. Results obtained using miRTarBase as the set of labeled (positive) interactions and mirDIP as the set of unlabeled interactions show a significant improvement, over competitive approaches, in the quality of the predictions. This solution also improves the effectiveness of HOCCLUS2 in discovering biologically realistic miRNA:mRNA regulatory networks from large-scale prediction data. 
Using the miR-17-92 gene cluster family as a reference system and comparing results with previous experiments, we find a large increase in the number of significantly enriched biclusters in pathways, consistent with miR-17-92 functions. Conclusion The proposed approach proves to be fundamental for the computational discovery of miRNA regulatory networks from large-scale predictions. This paves the way to the systematic application of HOCCLUS2 for a comprehensive reconstruction of all the possible multiple interactions established by miRNAs in regulating the expression of gene networks, which would be otherwise impossible to reconstruct by considering only experimentally validated interactions. PMID:24564296

  6. Phenotype discovery by gene expression profiling: mapping of biological processes linked to BMP-2-mediated osteoblast differentiation.

    PubMed

    Balint, Eva; Lapointe, David; Drissi, Hicham; van der Meijden, Caroline; Young, Daniel W; van Wijnen, Andre J; Stein, Janet L; Stein, Gary S; Lian, Jane B

    2003-05-15

    Understanding physiological control of osteoblast differentiation necessitates characterization of the regulatory signals that initiate the events directing a cell to lineage commitment and establishing competency for bone formation. The bone morphogenetic protein, BMP-2, a member of the TGFbeta superfamily, induces osteoblast differentiation and functions through the Smad signal transduction pathway during in vivo bone formation. However, the molecular targets of BMP-mediated gene transcription during the process of osteoblast differentiation have not been comprehensively identified. In the present study, BMP-2 responsive factors involved in the early stages of commitment and differentiation to the osteoblast phenotype were analyzed by microarray gene expression profiling in samples ranging from 1 to 24 h following BMP-2 dependent differentiation of C2C12 premyoblasts into the osteogenic lineage. A total of 1,800 genes were responsive to BMP-2 and expression was modulated from 3- to 14-fold for less than 100 genes during the time course. Approximately 50% of these 100 genes are either up- or downregulated. Major events associated with phenotypic changes towards the osteogenic lineage were identified from hierarchical and functional clustering analyses. BMP-2 immediately responsive genes (1-4 h), which exhibited either transient or sustained expression, reflect activation and repression of non-osseous BMP-2 developmental systems. This initial response was followed by waves of expression of nuclear proteins and developmental regulatory factors including inhibitors of DNA binding, Runx2, C/EBP, Zn finger binding proteins, forkhead, and numerous homeobox proteins (e.g., CDP/cut, paired, distaless, Hox) which are expressed at characterized stages during osteoblast differentiation. A sequential profile of genes mediating changes in cell morphology, cell growth, and basement membrane formation is observed as a secondary transient early response (2-8 h). 
Commitment to the osteogenic phenotype is recognized by 8 h, reflected by downregulation of most myogenic-related genes and induction of a spectrum of signaling proteins and enzymes facilitating synthesis and assembly of an extracellular skeletal environment. These genes included collagens Type I and VI and the small leucine rich repeat family of proteoglycans (e.g., decorin, biglycan, osteomodulin, fibromodulin, and osteoadherin/osteoglycin) that reached peak expression at 24 h. With extracellular matrix development, the bone phenotype was further established from 16 to 24 h by induction of genes for cell adhesion and communication and enzymes that organize the bone ECM. Our microarray analysis resulted in the discovery of a class of genes, initially described in relation to differentiation of astrocytes and oligodendrocytes that are functionally coupled to signals for cellular extensions. They include nexin, neuropilin, latexin, neuroglian, neuron specific gene 1, and Ulip; suggesting novel roles for these genes in the bone microenvironment. This global analysis identified a multistage molecular and cellular cascade that supports BMP-2-mediated osteoblast differentiation. PMID:12704803

  7. USING NATURAL VARIATION FOR GENE DISCOVERY TO IMPROVE SEED IRON NUTRITIONAL VALUE

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We and others are interested in developing crops biofortified with iron to improve their nutritional value for human consumption. One of the crucial tasks, therefore, is to identify relevant genes that can be targeted for transgenic or conventional breeding approaches to improve the Fe concentratio...

  8. Biomarker discovery and gene expression responses in Lycopersicon esculentum root exposed to lead.

    PubMed

    Hou, Jing; Bai, Lili; Xie, Yujia; Liu, Xinhui; Cui, Baoshan

    2015-12-15

    Gene expression analysis has shown particular promise for the identification of molecular biomarkers that can be used for further evaluation of potential toxicity of chemicals present in agricultural soil. In this study, we focused on the development of molecular markers to detect Pb toxicity in agricultural soil. Using the results obtained from microarray analysis, twelve Pb-responsive genes were selected and tested in different Pb concentrations to examine their concentration-response characteristics using real-time quantitative polymerase chain reaction (RT-qPCR). All the Pb treatments set in our study could generally induce the differential expression of the 12 genes, while the lowest observable adverse effect concentrations (LOAEC) of Pb for seed germination, root elongation, biomass and structural modification were 1,297, 177, 177, and 1,297 mg Pb/kg soil, respectively, suggesting that the transcriptional approach was more sensitive than the traditional end points of death, growth, and morphology for the evaluation of Pb toxicity. The relative expression of glycoalkaloid metabolism 1 (P=-0.790), ethylene-responsive transcription factor ERF017 (P=-0.686) and CASP-like protein 4C2 (P=-0.652) demonstrates a dose-dependent response with Pb content in roots, implying that the three genes can be used as sensitive bioindicators of Pb stress in Lycopersicon esculentum. PMID:26252993

  9. Discovery of Chemosensory Genes in the Oriental Fruit Fly, Bactrocera dorsalis.

    PubMed

    Wu, Zhongzhen; Zhang, He; Wang, Zhengbing; Bin, Shuying; He, Hualiang; Lin, Jintian

    2015-01-01

    The oriental fruit fly, Bactrocera dorsalis, is a devastating fruit fly pest in tropical and sub-tropical countries. Like other insects, this fly uses its chemosensory system to efficiently interact with its environment. However, our understanding of the molecular components comprising B. dorsalis chemosensory system is limited. Using next generation sequencing technologies, we sequenced the transcriptome of four B. dorsalis developmental stages: egg, larva, pupa and adult chemosensory tissues. A total of 31 candidate odorant binding proteins (OBPs), 4 candidate chemosensory proteins (CSPs), 23 candidate odorant receptors (ORs), 11 candidate ionotropic receptors (IRs), 6 candidate gustatory receptors (GRs) and 3 candidate sensory neuron membrane proteins (SNMPs) were identified. The tissue distributions of the OBP and CSP transcripts were determined by RT-PCR and a subset of nine genes were further characterized. The predicted proteins from these genes shared high sequence similarity to Drosophila melanogaster pheromone binding protein related proteins (PBPRPs). Interestingly, one OBP (BdorOBP19c) was exclusively expressed in the sex pheromone glands of mature females. RT-PCR was also used to compare the expression of the candidate genes in the antennae of male and female B. dorsalis adults. These antennae-enriched OBPs, CSPs, ORs, IRs and SNMPs could play a role in the detection of pheromones and general odorants and thus could be useful target genes for the integrated pest management of B. dorsalis and other agricultural pests. PMID:26070069

  10. The discovery of integrated gene networks for autism and related disorders

    PubMed Central

    Hormozdiari, Fereydoun; Penn, Osnat; Borenstein, Elhanan; Eichler, Evan E.

    2015-01-01

    Despite considerable genetic heterogeneity underlying neurodevelopmental diseases, there is compelling evidence that many disease genes will map to a much smaller number of biological subnetworks. We developed a computational method, termed MAGI (merging affected genes into integrated networks), that simultaneously integrates protein–protein interactions and RNA-seq expression profiles during brain development to discover “modules” enriched for de novo mutations in probands. We applied this method to recent exome sequencing of 1116 patients with autism and intellectual disability, discovering two distinct modules that differ in their properties and associated phenotypes. The first module consists of 80 genes associated with Wnt, Notch, SWI/SNF, and NCOR complexes and shows the highest expression early during embryonic development (8–16 post-conception weeks [pcw]). The second module consists of 24 genes associated with synaptic function, including long-term potentiation and calcium signaling with higher levels of postnatal expression. Patients with de novo mutations in these modules are more significantly intellectually impaired and carry more severe missense mutations when compared to probands with de novo mutations outside of these modules. We used our approach to define subsets of the network associated with higher functioning autism as well as greater severity with respect to IQ. Finally, we applied MAGI independently to epilepsy and schizophrenia exome sequencing cohorts and found significant overlap as well as expansion of these modules, suggesting a core set of integrated neurodevelopmental networks common to seemingly diverse human diseases. PMID:25378250

  11. Discovery of Chemosensory Genes in the Oriental Fruit Fly, Bactrocera dorsalis

    PubMed Central

    Wu, Zhongzhen; Zhang, He; Wang, Zhengbing; Bin, Shuying; He, Hualiang; Lin, Jintian

    2015-01-01

    The oriental fruit fly, Bactrocera dorsalis, is a devastating fruit fly pest in tropical and sub-tropical countries. Like other insects, this fly uses its chemosensory system to efficiently interact with its environment. However, our understanding of the molecular components comprising B. dorsalis chemosensory system is limited. Using next generation sequencing technologies, we sequenced the transcriptome of four B. dorsalis developmental stages: egg, larva, pupa and adult chemosensory tissues. A total of 31 candidate odorant binding proteins (OBPs), 4 candidate chemosensory proteins (CSPs), 23 candidate odorant receptors (ORs), 11 candidate ionotropic receptors (IRs), 6 candidate gustatory receptors (GRs) and 3 candidate sensory neuron membrane proteins (SNMPs) were identified. The tissue distributions of the OBP and CSP transcripts were determined by RT-PCR and a subset of nine genes were further characterized. The predicted proteins from these genes shared high sequence similarity to Drosophila melanogaster pheromone binding protein related proteins (PBPRPs). Interestingly, one OBP (BdorOBP19c) was exclusively expressed in the sex pheromone glands of mature females. RT-PCR was also used to compare the expression of the candidate genes in the antennae of male and female B. dorsalis adults. These antennae-enriched OBPs, CSPs, ORs, IRs and SNMPs could play a role in the detection of pheromones and general odorants and thus could be useful target genes for the integrated pest management of B. dorsalis and other agricultural pests. PMID:26070069

  12. A Soybean Transcript Map: Discovery and Mapping of Single Nucleotide Polymorphisms in Soybean Genes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Single nucleotide polymorphisms (SNPs) were discovered via the re-sequencing of sequence tagged sites (STS) developed from EST sequence. From an initial set of 9,459 primers sets designed to a diverse set of genes, 4,240 STS were amplified and sequenced in each of six diverse soybean genotypes. In...

  13. The discovery of integrated gene networks for autism and related disorders.

    PubMed

    Hormozdiari, Fereydoun; Penn, Osnat; Borenstein, Elhanan; Eichler, Evan E

    2015-01-01

    Despite considerable genetic heterogeneity underlying neurodevelopmental diseases, there is compelling evidence that many disease genes will map to a much smaller number of biological subnetworks. We developed a computational method, termed MAGI (merging affected genes into integrated networks), that simultaneously integrates protein-protein interactions and RNA-seq expression profiles during brain development to discover "modules" enriched for de novo mutations in probands. We applied this method to recent exome sequencing of 1116 patients with autism and intellectual disability, discovering two distinct modules that differ in their properties and associated phenotypes. The first module consists of 80 genes associated with Wnt, Notch, SWI/SNF, and NCOR complexes and shows the highest expression early during embryonic development (8-16 post-conception weeks [pcw]). The second module consists of 24 genes associated with synaptic function, including long-term potentiation and calcium signaling with higher levels of postnatal expression. Patients with de novo mutations in these modules are more significantly intellectually impaired and carry more severe missense mutations when compared to probands with de novo mutations outside of these modules. We used our approach to define subsets of the network associated with higher functioning autism as well as greater severity with respect to IQ. Finally, we applied MAGI independently to epilepsy and schizophrenia exome sequencing cohorts and found significant overlap as well as expansion of these modules, suggesting a core set of integrated neurodevelopmental networks common to seemingly diverse human diseases. PMID:25378250

  14. Transcriptome Analysis of Catharanthus roseus for Gene Discovery and Expression Profiling

    PubMed Central

    Sharma, Raghvendra; Sinha, Alok K.; Jain, Mukesh

    2014-01-01

    The medicinal plant, Catharanthus roseus, accumulates wide range of terpenoid indole alkaloids, which are well documented therapeutic agents. In this study, deep transcriptome sequencing of C. roseus was carried out to identify the pathways and enzymes (genes) involved in biosynthesis of these compounds. About 343 million reads were generated from different tissues (leaf, flower and root) of C. roseus using Illumina platform. Optimization of de novo assembly involving a two-step process resulted in a total of 59,220 unique transcripts with an average length of 1284 bp. Comprehensive functional annotation and gene ontology (GO) analysis revealed the representation of many genes involved in different biological processes and molecular functions. In total, 65% of C. roseus transcripts showed homology with sequences available in various public repositories, while remaining 35% unigenes may be considered as C. roseus specific. In silico analysis revealed presence of 11,620 genic simple sequence repeats (excluding mono-nucleotide repeats) and 1820 transcription factor encoding genes in C. roseus transcriptome. Expression analysis showed roots and leaves to be actively participating in bisindole alkaloid production with clear indication that enzymes involved in pathway of vindoline and vinblastine biosynthesis are restricted to aerial tissues. Such large-scale transcriptome study provides a rich source for understanding plant-specialized metabolism, and is expected to promote research towards production of plant-derived pharmaceuticals. PMID:25072156

  15. Transcriptome Analysis and Discovery of Genes Relevant to Development in Bradysia odoriphaga at Three Developmental Stages

    PubMed Central

    Wang, Wenbo; Chen, Hao; Zhou, Xianhong; Zhuang, Qianying; Yu, Yi; Li, Rumei

    2016-01-01

    Bradysia odoriphaga (Diptera: Sciaridae) is the most important pest of Chinese chive (Allium tuberosum) in Asia; however, the molecular genetics are poorly understood. To explore the molecular biological mechanism of development, Illumina sequencing and de novo assembly were performed in the third-instar, fourth-instar, and pupal B. odoriphaga. The study resulted in 16.2 Gb of clean data and 47,578 unigenes (≥125 bp) contained in 7,632,430 contigs, 46.21% of which were annotated from non-redundant protein (NR), Gene Ontology (GO), Clusters of Orthologous Groups (COG), Eukaryotic Orthologous Groups (KOG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. It was found that 19.67% of unigenes mainly matched homologous species, including Aedes aegypti, Culex quinquefasciatus, Ceratitis capitata, and Anopheles gambiae. According to differentially expressed gene (DEG) analysis, 143, 490, and 309 DEGs were annotated as involved in the developmental process in the GO database in the comparisons of third-instar and fourth-instar larvae, third-instar larvae and pupae, and fourth-instar larvae and pupae, respectively. Twenty-five genes were closely related to these processes, including developmental process, reproduction process, and reproductive organs development and programmed cell death (PCD). The information of unigenes assembled in B. odoriphaga through transcriptome and DEG analyses could provide a detailed genetic basis and regulated information for elaborating the developmental mechanism from the larval, pre-pupal to pupal stages of B. odoriphaga. PMID:26891450

  16. Discovery of new soybean and soybean rust genes using next generation sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Soybean is one of the top five agricultural products in the United States and is highly susceptible to soybean rust (SR), an exotic obligate fungus that arrived in the USA in 2004. We used mRNA-Seq by Illumina/Solexa to analyze gene expression patterns of the host and pathogen at different time poin...

  17. Gene Discovery in the Threatened Elkhorn Coral: 454 Sequencing of the Acropora palmata Transcriptome

    PubMed Central

    Polato, Nicholas R.; Vera, J. Cristobal; Baums, Iliana B.

    2011-01-01

    Background Cnidarians, including corals and anemones, offer unique insights into metazoan evolution because they harbor genetic similarities with vertebrates beyond that found in model invertebrates and retain genes known only from non-metazoans. Cataloging genes expressed in Acropora palmata, a foundation-species of reefs in the Caribbean and western Atlantic, will advance our understanding of the genetic basis of ecologically important traits in corals and comes at a time when sequencing efforts in other cnidarians allow for multi-species comparisons. Results A cDNA library from a sample enriched for symbiont free larval tissue was sequenced on the 454 GS-FLX platform. Over 960,000 reads were obtained and assembled into 42,630 contigs. Annotation data was acquired for 57% of the assembled sequences. Analysis of the assembled sequences indicated that 83–100% of all A. palmata transcripts were tagged, and provided a rough estimate of the total number of genes expressed in our samples (∼18,000–20,000). The coral annotation data contained many of the same molecular components as in the Bilateria, particularly in pathways associated with oxidative stress and DNA damage repair, and provided evidence that homologs of p53, a key player in DNA repair pathways, have experienced selection along the branch separating Cnidaria and Bilateria. Transcriptome wide screens of paralog groups and transition/transversion ratios highlighted genes including: green fluorescent proteins, carbonic anhydrase, and oxidative stress proteins; and functional groups involved in protein and nucleic acid metabolism, and the formation of structural molecules. These results provide a starting point for study of adaptive evolution in corals. Conclusions Currently available transcriptome data now make comparative studies of the mechanisms underlying coral's evolutionary success possible. 
Here we identified candidate genes that enable corals to maintain genomic integrity despite considerable exposure to genotoxic stress over long life spans, and showed conservation of important physiological pathways between corals and bilaterians. PMID:22216101

  18. A comprehensive resource of drought- and salinity- responsive ESTs for gene discovery and marker development in chickpea (Cicer arietinum L.)

    PubMed Central

    2009-01-01

    Background Chickpea (Cicer arietinum L.), an important grain legume crop of the world is seriously challenged by terminal drought and salinity stresses. However, very limited number of molecular markers and candidate genes are available for undertaking molecular breeding in chickpea to tackle these stresses. This study reports generation and analysis of comprehensive resource of drought- and salinity-responsive expressed sequence tags (ESTs) and gene-based markers. Results A total of 20,162 (18,435 high quality) drought- and salinity- responsive ESTs were generated from ten different root tissue cDNA libraries of chickpea. Sequence editing, clustering and assembly analysis resulted in 6,404 unigenes (1,590 contigs and 4,814 singletons). Functional annotation of unigenes based on BLASTX analysis showed that 46.3% (2,965) had significant similarity (≤1E-05) to sequences in the non-redundant UniProt database. BLASTN analysis of unique sequences with ESTs of four legume species (Medicago, Lotus, soybean and groundnut) and three model plant species (rice, Arabidopsis and poplar) provided insights on conserved genes across legumes as well as novel transcripts for chickpea. Of 2,965 (46.3%) significant unigenes, only 2,071 (32.3%) unigenes could be functionally categorised according to Gene Ontology (GO) descriptions. A total of 2,029 sequences containing 3,728 simple sequence repeats (SSRs) were identified and 177 new EST-SSR markers were developed. Experimental validation of a set of 77 SSR markers on 24 genotypes revealed 230 alleles with an average of 4.6 alleles per marker and average polymorphism information content (PIC) value of 0.43. Besides SSR markers, 21,405 high confidence single nucleotide polymorphisms (SNPs) in 742 contigs (with ≥ 5 ESTs) were also identified. Recognition sites for restriction enzymes were identified for 7,884 SNPs in 240 contigs. 
Hierarchical clustering of 105 selected contigs provided clues about stress- responsive candidate genes and their expression profile showed predominance in specific stress-challenged libraries. Conclusion Generated set of chickpea ESTs serves as a resource of high quality transcripts for gene discovery and development of functional markers associated with abiotic stress tolerance that will be helpful to facilitate chickpea breeding. Mapping of gene-based markers in chickpea will also add more anchoring points to align genomes of chickpea and other legume species. PMID:19912666
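
    The polymorphism information content (PIC) reported for the SSR markers in this abstract is conventionally computed from allele frequencies as PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. The sketch below assumes that standard formula; the abstract does not state which formula was used, and the function name and example frequencies are illustrative.

```python
from itertools import combinations


def pic(allele_frequencies):
    """Polymorphism information content for one marker.

    Accepts allele frequencies or raw allele counts; values are normalised
    so that they sum to 1 before the formula is applied.
    """
    total = sum(allele_frequencies)
    if total <= 0:
        raise ValueError("allele frequencies must sum to a positive value")
    p = [f / total for f in allele_frequencies]
    homozygosity = sum(q * q for q in p)
    # Correction for matings in which both parents carry the same heterozygous genotype.
    correction = sum(2 * a * a * b * b for a, b in combinations(p, 2))
    return 1 - homozygosity - correction


# Two equally frequent alleles: 1 - 0.5 - 0.125 = 0.375
two_allele_pic = pic([0.5, 0.5])
```

    A marker with a single allele scores 0, and the value rises as alleles become more numerous and more evenly distributed.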

  19. Mapping our genes: The genome projects: How big, how fast

    SciTech Connect

    none,

    1988-04-01

    For the past 2 years, scientific and technical journals in biology and medicine have extensively covered a debate about whether and how to determine the function and order of human genes on human chromosomes and when to determine the sequence of molecular building blocks that comprise DNA in those chromosomes. In 1987, these issues rose to become part of the public agenda. The debate involves science, technology, and politics. Congress is responsible for "writing the rules" of what various federal agencies do and for funding their work. This report surveys the points made so far in the debate, focusing on those that most directly influence the policy options facing the US Congress. Congressional interest focused on how to assess the rationales for conducting human genome projects, how to fund human genome projects (at what level and through which mechanisms), how to coordinate the scientific and technical programs of the several federal agencies and private interests already supporting various genome projects, and how to strike a balance regarding the impact of genome projects on international scientific cooperation and international economic competition in biotechnology. OTA prepared this report with the assistance of several hundred experts throughout the world. 342 refs., 26 figs., 11 tabs.

  20. Mapping Our Genes: The Genome Projects: How Big, How Fast

    DOE R&D Accomplishments Database

    1988-04-01

    For the past 2 years, scientific and technical journals in biology and medicine have extensively covered a debate about whether and how to determine the function and order of human genes on human chromosomes and when to determine the sequence of molecular building blocks that comprise DNA in those chromosomes. In 1987, these issues rose to become part of the public agenda. The debate involves science, technology, and politics. Congress is responsible for "writing the rules" of what various federal agencies do and for funding their work. This report surveys the points made so far in the debate, focusing on those that most directly influence the policy options facing the US Congress. Congressional interest focused on how to assess the rationales for conducting human genome projects, how to fund human genome projects (at what level and through which mechanisms), how to coordinate the scientific and technical programs of the several federal agencies and private interests already supporting various genome projects, and how to strike a balance regarding the impact of genome projects on international scientific cooperation and international economic competition in biotechnology. The Office of Technology Assessment (OTA) prepared this report with the assistance of several hundred experts throughout the world.

  1. Scientific Discovery through Advanced Computing (SciDAC-3) Partnership Project Annual Report

    SciTech Connect

    Hoffman, Forest M.; Bochev, Pavel B.; Cameron-Smith, Philip J.; Easter, Richard C; Elliott, Scott M.; Ghan, Steven J.; Liu, Xiaohong; Lowrie, Robert B.; Lucas, Donald D.; Ma, Po-lun; Sacks, William J.; Shrivastava, Manish; Singh, Balwinder; Tautges, Timothy J.; Taylor, Mark A.; Vertenstein, Mariana; Worley, Patrick H.

    2014-01-15

    The Applying Computationally Efficient Schemes for BioGeochemical Cycles (ACES4BGC) Project is advancing the predictive capabilities of Earth System Models (ESMs) by reducing two of the largest sources of uncertainty, aerosols and biospheric feedbacks, with a highly efficient computational approach. In particular, this project is implementing and optimizing new computationally efficient tracer advection algorithms for large numbers of tracer species; adding important biogeochemical interactions between the atmosphere, land, and ocean models; and applying uncertainty quantification (UQ) techniques to constrain process parameters and evaluate uncertainties in feedbacks between biogeochemical cycles and the climate system.

  2. Gene discovery through transcriptome sequencing for the invasive mussel Limnoperna fortunei.

    PubMed

    Uliano-Silva, Marcela; Americo, Juliana Alves; Brindeiro, Rodrigo; Dondero, Francesco; Prosdocimi, Francisco; Rebelo, Mauro de Freitas

    2014-01-01

    The success of the Asian bivalve Limnoperna fortunei as an invader in South America is related to its high acclimation capability. It can inhabit waters with a wide range of temperatures and salinity and handle long-term periods of air exposure. We describe the transcriptome of L. fortunei aiming to give a first insight into the phenotypic plasticity that allows non-native taxa to become established and widespread. We sequenced 95,219 reads from five main tissues of the mussel L. fortunei using Roche's 454 and assembled them to form a set of 84,063 unigenes (contigs and singletons) representing partial or complete gene sequences. We annotated 24,816 unigenes using a BLAST sequence similarity search against a NCBI nr database. Unigenes were divided into 20 eggNOG functional categories and 292 KEGG metabolic pathways. From the total unigenes, 1,351 represented putative full-length genes of which 73.2% were functionally annotated. We described the first partial and complete gene sequences in order to start understanding bivalve invasiveness. An expansion of the hsp70 gene family, seen also in other bivalves, is present in L. fortunei and could be involved in its adaptation to extreme environments, e.g. during intertidal periods. The presence of toll-like receptors gives a first insight into an immune system that could be more complex than previously assumed and may be involved in the prevention of disease and extinction when population densities are high. Finally, the apparent lack of special adaptations to extremely low O2 levels is a target worth pursuing for the development of a molecular control approach. PMID:25047650

  3. Gene Discovery through Transcriptome Sequencing for the Invasive Mussel Limnoperna fortunei

    PubMed Central

    Uliano-Silva, Marcela; Americo, Juliana Alves; Brindeiro, Rodrigo; Dondero, Francesco; Prosdocimi, Francisco; de Freitas Rebelo, Mauro

    2014-01-01

    The success of the Asian bivalve Limnoperna fortunei as an invader in South America is related to its high acclimation capability. It can inhabit waters with a wide range of temperatures and salinity and handle long-term periods of air exposure. We describe the transcriptome of L. fortunei aiming to give a first insight into the phenotypic plasticity that allows non-native taxa to become established and widespread. We sequenced 95,219 reads from five main tissues of the mussel L. fortunei using Roche’s 454 and assembled them to form a set of 84,063 unigenes (contigs and singletons) representing partial or complete gene sequences. We annotated 24,816 unigenes using a BLAST sequence similarity search against a NCBI nr database. Unigenes were divided into 20 eggNOG functional categories and 292 KEGG metabolic pathways. From the total unigenes, 1,351 represented putative full-length genes of which 73.2% were functionally annotated. We described the first partial and complete gene sequences in order to start understanding bivalve invasiveness. An expansion of the hsp70 gene family, seen also in other bivalves, is present in L. fortunei and could be involved in its adaptation to extreme environments, e.g. during intertidal periods. The presence of toll-like receptors gives a first insight into an immune system that could be more complex than previously assumed and may be involved in the prevention of disease and extinction when population densities are high. Finally, the apparent lack of special adaptations to extremely low O2 levels is a target worth pursuing for the development of a molecular control approach. PMID:25047650

  4. RNA-seq analysis for secondary metabolite pathway gene discovery in Polygonum minus

    PubMed Central

    Loke, Kok-Keong; Rahnamaie-Tajadod, Reyhaneh; Yeoh, Chean-Chean; Goh, Hoe-Han; Mohamed-Hussein, Zeti-Azura; Mohd Noor, Normah; Zainal, Zamri; Ismail, Ismanizan

    2015-01-01

    Polygonum minus plant is rich in secondary metabolites, especially terpenoids and flavonoids. Present study generates transcriptome resource for P. minus to decipher its secondary metabolite biosynthesis pathways. Raw reads and the transcriptome assembly project have been deposited at GenBank under the accessions SRX313492 (root) and SRX669305 (leaf) respectively. PMID:26981350

  5. Gene discovery in Carcinus maenas and Homarus americanus via expressed sequence tags.

    PubMed

    Towle, David W; Smith, Christine M

    2006-12-01

    Expressed sequence tags (ESTs) were produced for normalized cDNA libraries prepared from several tissues of 2 marine crustaceans, the green shore crab Carcinus maenas and the American lobster Homarus americanus. Tissues represented in the Carcinus library were anterior and posterior gills, hypodermis, heart, hepatopancreas, antennal gland, brain, testis, and skeletal muscle, obtained from animals acclimated to 35 and 10‰ salinity. Tissues represented in the Homarus library were gill, epipodite, branchiostegite, heart, ovary, testis, antennal gland, skeletal muscle, hepatopancreas, and brain, obtained from intermolt and post-molt animals. Directional libraries from oligo-dT-primed cDNA were constructed in the pCMVsport6.1 vector and normalized by self-subtraction at 2 different Cot values. Randomly picked clones were single-pass sequenced from the 5' end. Raw sequence data were trimmed and prepared for submission to dbEST using phred, cross-match, and blastx protocols embedded within trace2dbest software. A total of 4604 Homarus and 12 401 Carcinus ESTs produced 540 and 2651 clusters, respectively, as determined by TIGR Gene Index Clustering software. Gene Ontology analysis with reference to a Drosophila melanogaster database using GOblet revealed 187 positive hits (35%) with the Homarus clusters and 1037 positive hits (39%) with the Carcinus clusters. Combining the number of assembled sequences with the number of singlets obtained after cluster analysis suggested that transcripts representing as much as 25% of the total number of genes in Carcinus have been isolated. PMID:21672795

  6. Gene expression profiling for discovery of novel targets in human traumatic brain injury.

    PubMed

    Barr, Taura L; Alexander, Sheila; Conley, Yvette

    2011-04-01

    Several clinical trials have failed to demonstrate a significant effect on outcome following human traumatic brain injury (TBI) despite promising results obtained in preclinical animal studies. These failures may be due in part to a misinterpretation of the findings obtained in preclinical animal models of TBI, a misunderstanding of the complexity of the human response to TBI, limited knowledge about the biological pathways that interact to contribute to good and bad outcomes after brain injury, and the effects of genomic variability and environment on individual recovery. Recent publications suggest that data obtained from gene expression profiling studies of complex neurological diseases such as stroke, multiple sclerosis (MS), Alzheimer's and Parkinson's may contribute to a more informed understanding of what affects outcome following TBI. These data may help to bridge the gap between successful preclinical studies and negative clinical trials in humans to reveal novel targets for therapy. Gene expression profiling has the capability to identify biomarkers associated with response to TBI, elucidate complex genetic interactions that may play a role in outcome following TBI, and reveal biological pathways related to brain health. This review highlights the current state of the literature on gene expression profiling for neurological disease and discusses its ability to aid in unraveling the variable human response to TBI and the potential for it to offer treatment strategies in an area where we currently have limited therapeutic options primarily based on supportive care. PMID:21112922

  7. Project ARCHIMEDES: Applications, Reasoning and Concepts for High School Instructors: Making Educational Discoveries and Expanding Skills.

    ERIC Educational Resources Information Center

    Lea, Suzanne M.

    Project ARCHIMEDES was designed in cooperation with local teachers to enhance concept understanding of teachers of physics and physical sciences, to increase use of electronics and computers in the classroom, and to introduce research on students' misconceptions in physics, teaching methods for identifying and remediating misconceptions, and ways…

  8. 78 FR 69363 - Lake Tahoe Basin Management Unit, California, Heavenly Mountain Resort Epic Discovery Project

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-11-19

    ... certified in 2007. The project is located at Heavenly within El Dorado and Alpine Counties, CA and Douglas... contained in the proposal that lie outside of the Lake Tahoe Region (Carson River watershed), and are... Alpine Coaster), not currently accepted for review and decision by the Forest Service. The...

  9. BIOINFORMATIC RESOURCES FOR SNP AND INDEL DISCOVERY IN THE MAIZE MAPPING PROJECT

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The objective of our single nucleotide polymorphism (SNP) project is to anchor maize unigenes to the intermated B73 x Mo17 (IBM) genetic map. Starting with maize unigene sequences, primers were designed with Primer3 to give polymerase chain reaction (PCR) products of ~300 bases. The PCR products ...

  10. Ac/Ds-transposon activation tagging in poplar: a powerful tool for gene discovery

    PubMed Central

    2012-01-01

    Background Rapid improvements in the development of new sequencing technologies have led to the availability of genome sequences of more than 300 organisms today. Thanks to bioinformatic analyses, prediction of gene models and protein-coding transcripts has become feasible. Various reverse and forward genetics strategies have been followed to determine the functions of these gene models and regulatory sequences. Using T-DNA or transposons as tags, significant progress has been made by using "Knock-in" approaches ("gain-of-function" or "activation tagging") in different plant species but not in perennial plant species, e.g. long-lived trees. Here, large scale gene tagging resources are still lacking. Results We describe the first application of an inducible transposon-based activation tagging system for a perennial plant species, for example a poplar hybrid (P. tremula L. × P. tremuloides Michx.). Four activation-tagged populations comprising a total of 12,083 individuals derived from 23 independent "Activation Tagging Ds" (ATDs) transgenic lines were produced and phenotyped. To date, 29 putative variants have been isolated and new ATDs genomic positions were successfully determined for 24 of those. Sequences obtained were blasted against the publicly available genome sequence of P. trichocarpa v2.0 (Phytozome v7.0; http://www.phytozome.net/poplar) revealing possible transcripts for 17 variants. In a second approach, 300 randomly selected individuals without any obvious phenotypic alterations were screened for ATDs excision. For one third of those, transposition of ATDs was confirmed, and in about 5% of these cases genes were tagged. Conclusions The novel strategy of first genotyping and then phenotyping a tagging population as proposed here is, in particular, applicable for long-lived, difficult to transform plant species. We could demonstrate the power of the ATDs transposon approach and the simplicity to induce ATDs transposition in vitro. Since a transposon is able to pass chromosomal boundaries, only very few primary transposon-carrying transgenic lines are required for the establishment of large transposon tagging populations. In contrast to T-DNA-based activation tagging, which is plagued by a lack of transformation efficiency and its time consuming nature, this, for the first time, makes it feasible to one day tag (similarly to Arabidopsis) every gene within a perennial plant genome. PMID:22309468

  11. Transcriptomics Analysis of Crassostrea hongkongensis for the Discovery of Reproduction-Related Genes

    PubMed Central

    Tong, Ying; Zhang, Yang; Huang, Jiaomei; Xiao, Shu; Zhang, Yuehuan; Li, Jun; Chen, Jinhui; Yu, Ziniu

    2015-01-01

    Background The reproductive mechanisms of mollusk species have been interesting targets in biological research because of the diverse reproductive strategies observed in this phylum. These species have also been studied for the development of fishery technologies in molluscan aquaculture. Although the molecular mechanisms underlying the reproductive process have been well studied in animal models, the relevant information from mollusks remains limited, particularly in species of great commercial interest. Crassostrea hongkongensis is the dominant oyster species that is distributed along the coast of the South China Sea and little genomic information on this species is available. Currently, high-throughput sequencing techniques have been widely used for investigating the basis of physiological processes and facilitating the establishment of adequate genetic selection programs. Results The C. hongkongensis transcriptome included a total of 1,595,855 reads, which were generated by 454 sequencing and were assembled into 41,472 contigs using de novo methods. Contigs were clustered into 33,920 isotigs and further grouped into 22,829 isogroups. Approximately 77.6% of the isogroups were successfully annotated by the Nr database. More than 1,910 genes were identified as being related to reproduction. Some key genes involved in germline development, sex determination and differentiation were identified for the first time in C. hongkongensis (nanos, piwi, ATRX, FoxL2, β-catenin, etc.). Gene expression analysis indicated that vasa, nanos, piwi, ATRX, FoxL2, β-catenin and SRD5A1 were highly or specifically expressed in C. hongkongensis gonads. Additionally, 94,056 single nucleotide polymorphisms (SNPs) and 1,699 simple sequence repeats (SSRs) were compiled. Conclusions Our study significantly increased C. hongkongensis genomic information based on transcriptomics analysis. The group of reproduction-related genes identified in the present study constitutes a new tool for research on bivalve reproduction processes. The large group of molecular markers discovered in this study will be useful for population screening and marker assisted selection programs in C. hongkongensis aquaculture. PMID:26258576

  12. Cultivation of Hard-To-Culture Subsurface Mercury-Resistant Bacteria and Discovery of New merA Gene Sequences▿

    PubMed Central

    Rasmussen, L. D.; Zawadsky, C.; Binnerup, S. J.; Øregaard, G.; Sørensen, S. J.; Kroer, N.

    2008-01-01

    Mercury-resistant bacteria may be important players in mercury biogeochemistry. To assess the potential for mercury reduction by two subsurface microbial communities, resistant subpopulations and their merA genes were characterized by a combined molecular and cultivation-dependent approach. The cultivation method simulated natural conditions by using polycarbonate membranes as a growth support and a nonsterile soil slurry as a culture medium. Resistant bacteria were pregrown to microcolony-forming units (mCFU) before being plated on standard medium. Compared to direct plating, culturability was increased up to 2,800 times and numbers of mCFU were similar to the total number of mercury-resistant bacteria in the soils. Denaturing gradient gel electrophoresis analysis of DNA extracted from membranes suggested stimulation of growth of hard-to-culture bacteria during the preincubation. A total of 25 different 16S rRNA gene sequences were observed, including Alpha-, Beta-, and Gammaproteobacteria; Actinobacteria; Firmicutes; and Bacteroidetes. The diversity of isolates obtained by direct plating included eight different 16S rRNA gene sequences (Alpha- and Betaproteobacteria and Actinobacteria). Partial sequencing of merA of selected isolates led to the discovery of new merA sequences. With phylum-specific merA primers, PCR products were obtained for Alpha- and Betaproteobacteria and Actinobacteria but not for Bacteroidetes and Firmicutes. The similarity to known sequences ranged between 89 and 95%. One of the sequences did not result in a match in the BLAST search. The results illustrate the power of integrating advanced cultivation methodology with molecular techniques for the characterization of the diversity of mercury-resistant populations and assessing the potential for mercury reduction in contaminated environments. PMID:18441111

  13. Gene Discovery and Molecular Marker Development, Based on High-Throughput Transcript Sequencing of Paspalum dilatatum Poir

    PubMed Central

    Giordano, Andrea; Cogan, Noel O. I.; Kaur, Sukhjiwan; Drayton, Michelle; Mouradov, Aidyn; Panter, Stephen; Schrauf, Gustavo E.; Mason, John G.; Spangenberg, German C.

    2014-01-01

    Background Paspalum dilatatum Poir. (common name dallisgrass) is a native grass species of South America, with special relevance to dairy and red meat production. P. dilatatum exhibits higher forage quality than other C4 forage grasses and is tolerant to frost and water stress. This species is predominantly cultivated in an apomictic monoculture, with an inherent high risk that biotic and abiotic stresses could potentially devastate productivity. Therefore, advanced breeding strategies that characterise and use available genetic diversity, or assess germplasm collections effectively are required to deliver advanced cultivars for production systems. However, there are limited genomic resources available for this forage grass species. Results Transcriptome sequencing using second-generation sequencing platforms has been employed using pooled RNA from different tissues (stems, roots, leaves and inflorescences) at the final reproductive stage of P. dilatatum cultivar Primo. A total of 324,695 sequence reads were obtained, corresponding to c. 102 Mbp. The sequences were assembled, generating 20,169 contigs of a combined length of 9,336,138 nucleotides. The contigs were BLAST analysed against the fully sequenced grass species of Oryza sativa subsp. japonica, Brachypodium distachyon, the closely related Sorghum bicolor and foxtail millet (Setaria italica) genomes as well as against the UniRef 90 protein database allowing a comprehensive gene ontology analysis to be performed. The contigs generated from the transcript sequencing were also analysed for the presence of simple sequence repeats (SSRs). A total of 2,339 SSR motifs were identified within 1,989 contigs and corresponding primer pairs were designed. Empirical validation of a cohort of 96 SSRs was performed, with 34% being polymorphic between sexual and apomictic biotypes. Conclusions The development of genetic and genomic resources for P. dilatatum will contribute to gene discovery and expression studies. Association of gene function with agronomic traits will significantly enable molecular breeding and advance germplasm enhancement. PMID:24520314
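    The SSR screening step mentioned in the record above can be illustrated with a minimal sketch. The abstract does not name the tool or thresholds that were used, so the function name `find_ssrs`, the motif sizes (2 to 6 bases), the minimum of 5 repeats, and the toy sequence are all assumptions chosen for illustration only.

```python
import re

def is_compound(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. AGAG = AG x 2)."""
    size = len(motif)
    for k in range(1, size):
        if size % k == 0 and motif == motif[:k] * (size // k):
            return True
    return False

def find_ssrs(seq, min_repeats=5, motif_sizes=(2, 3, 4, 5, 6)):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Thresholds are illustrative assumptions, not the study's settings.
    """
    seq = seq.upper()
    hits = []
    for size in motif_sizes:
        # A motif of `size` bases followed by at least (min_repeats - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_compound(motif):
                continue  # already reported under its shorter unit, or a homopolymer
            hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)

# Toy contig (hypothetical): an (AG)6 repeat and an (ATC)5 repeat.
toy_contig = "TTT" + "AG" * 6 + "CCGT" + "ATC" * 5 + "GG"
print(find_ssrs(toy_contig))
```

    Skipping compound motifs keeps a long dinucleotide run from being counted again as a tetra- or hexanucleotide repeat; dedicated SSR tools apply further filters that this sketch omits.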

  14. Gene discovery in the Antarctic fur seal (Arctocephalus gazella) skin transcriptome.

    PubMed

    Hoffman, Joseph I

    2011-07-01

    Next-generation sequencing provides a powerful new approach for developing functional genomic tools for nonmodel species, helping to narrow the gap between studies of model organisms and those of natural populations. Consequently, massively parallel 454 sequencing was used to characterize a normalized cDNA library derived from skin biopsy samples of twelve Antarctic fur seal (Arctocephalus gazella) individuals. Over 412 Mb of sequence data were generated, comprising 1.4 million reads of average length 286 bp. De novo assembly using Newbler 2.3 yielded 156 contigs plus 22,869 isotigs, which in turn clustered into 18,576 isogroups. Almost half of the assembled transcript sequences showed significant similarity to the nr database, revealing a functionally diverse array of genes. Moreover, 97.9% of these mapped to the dog (Canis lupus familiaris) genome, with a strong positive relationship between the number of sequences locating to a given chromosome and the length of that chromosome in the dog indicating a broad genomic distribution. Average depth of coverage was also almost 20-fold, sufficient to detect several thousand putative microsatellite loci and single nucleotide polymorphisms. This study constitutes an important step towards developing genomic resources with which to address consequential questions in pinniped ecology and evolution. It also supports an earlier but smaller study showing that skin tissue can be a rich source of expressed genes, with important implications for studying the genomics not only of marine mammals, but also more generally of species that cannot be destructively sampled. PMID:21466659

  15. High-Throughput, Motility-Based Sorter for Microswimmers and Gene Discovery Platform

    NASA Astrophysics Data System (ADS)

    Yuan, Jinzhou; Raizen, David; Bau, Haim

    2015-11-01

    Animal motility varies with genotype, disease progression, aging, and environmental conditions. In many studies, it is desirable to carry out high throughput motility-based sorting to isolate rare animals for, among other things, forward genetic screens to identify genetic pathways that regulate phenotypes of interest. Many commonly used screening processes are labor-intensive, lack sensitivity, and require extensive investigator training. Here, we describe a sensitive, high throughput, automated, motility-based method for sorting nematodes. Our method was implemented in a simple microfluidic device capable of sorting many thousands of animals per hour per module, and is amenable to parallelism. The device successfully enriched for known C. elegans motility mutants. Furthermore, using this device, we isolated low-abundance mutants capable of suppressing the somnogenic effects of the flp-13 gene, which regulates sleep-like quiescence in C. elegans. Subsequent genomic sequencing led to the identification of a flp-13-suppressor gene. This research was supported, in part, by NIH NIA Grant 5R03AG042690-02.

  16. Registered report: Discovery and preclinical validation of drug indications using compendia of public gene expression data.

    PubMed

    Kandela, Irawati; Zervantonakis, Ioannis

    2015-01-01

    ... Science Translational Medicine in 2011 (Sirota et al., 2011). The key experiments being replicated include Figure 4C and D and Supplemental Figure 1. In these figures, Sirota and colleagues tested a proof of concept experiment validating their prediction that cimetidine, a histamine-2 (H2) receptor antagonist commonly used to treat peptic ulcers (Kubecova et al., 2011), would be effective against lung adenocarcinoma (Figure 4C and D). As a control they also tested the effects of cimetidine against renal carcinoma, for which it was not predicted to be efficacious (Supplemental Figure 1). The Reproducibility Project: Cancer Biology is a collaboration between the ... eLife. PMID:25939392

  17. Large-scale gene discovery in the pea aphid Acyrthosiphon pisum (Hemiptera)

    PubMed Central

    Sabater-Muñoz, Beatriz; Legeai, Fabrice; Rispe, Claude; Bonhomme, Joël; Dearden, Peter; Dossat, Carole; Duclert, Aymeric; Gauthier, Jean-Pierre; Ducray, Danièle Giblot; Hunter, Wayne; Dang, Phat; Kambhampati, Srini; Martinez-Torres, David; Cortes, Teresa; Moya, Andrès; Nakabachi, Atsushi; Philippe, Cathy; Prunier-Leterme, Nathalie; Rahbé, Yvan; Simon, Jean-Christophe; Stern, David L; Wincker, Patrick; Tagu, Denis

    2006-01-01

    Aphids are the leading pests in agricultural crops. A large-scale sequencing of 40,904 ESTs from the pea aphid Acyrthosiphon pisum was carried out to define a catalog of 12,082 unique transcripts. A strong AT bias was found, indicating a compositional shift between Drosophila melanogaster and A. pisum. An in silico profiling analysis characterized 135 transcripts specific to pea-aphid tissues (relating to bacteriocytes and parthenogenetic embryos). This project is the first to address the genetics of the Hemiptera and of a hemimetabolous insect. PMID:16542494

  18. The MY NASA DATA Project: Tools and a Collaboration Space for Knowledge Discovery

    NASA Astrophysics Data System (ADS)

    Chambers, L. H.; Alston, E. J.; Diones, D. D.; Moore, S. W.; Oots, P. C.; Phelps, C. S.

    2006-05-01

    The Atmospheric Science Data Center (ASDC) at NASA Langley Research Center is charged with serving a wide user community that is interested in its large data holdings in the areas of Aerosols, Clouds, Radiation Budget, and Tropospheric Chemistry. Most of the data holdings, however, are in large files with specialized data formats. The MY NASA DATA (mynasadata.larc.nasa.gov) project began in 2004, as part of the NASA Research, Education, and Applications Solutions Network (REASoN), in order to open this important resource to a broader community including K-12 education and citizen scientists. MY NASA DATA (short for Mentoring and inquirY using NASA Data on Atmospheric and earth science for Teachers and Amateurs) consists of a web space that collects tools, lesson plans, and specially developed documentation to help the target audience more easily use the vast collection of NASA data about the Earth System. The core piece of the MY NASA DATA project is the creation of microsets (both static and custom) that make data easily accessible. The installation of a Live Access Server (LAS) greatly enhanced the ability for teachers, students, and citizen scientists to create and explore custom microsets of Earth System Science data. The LAS, which is an open source software tool using emerging data standards, also allows the MY NASA DATA team to make available data on other aspects of the Earth System from collaborating data centers. We are currently working with the Physical Oceanography DAAC at the Jet Propulsion Laboratory to bring in several parameters describing the ocean. In addition, MY NASA DATA serves as a central space for the K-12 community to share resources. The site already includes a dozen User-contributed lesson plans. This year we will be focusing on the Citizen Science portion of the site, and will be welcoming user-contributed project ideas, as well as reports of completed projects. An e-mentor network has also been created to involve a wider community in answering questions on scientific and pedagogical aspects of data use. The MY NASA DATA website, and an initial collection of lesson plans, have passed the NASA Earth Science Education peer review process, and thus are also being cataloged in the Digital Library for Earth System Education (DLESE).

  19. Discovery and characterization of a novel CCND1/MRCK gene fusion in mantle cell lymphoma.

    PubMed

    Masamha, Chioniso Patience; Albrecht, Todd R; Wagner, Eric J

    2016-01-01

    The t(11;14) translocation resulting in constitutive cyclin D1 expression is an early event in mantle cell lymphoma (MCL) transformation. Patients with a highly proliferative phenotype produce cyclin D1 transcripts with truncated 3'UTRs that evade miRNA regulation. Here, we report the recurrence of a novel gene fusion in MCL cell lines and MCL patient isolates that consists of the full protein coding region of cyclin D1 (CCND1) and a 3'UTR consisting of sequences from both the CCND1 3'UTR and myotonic dystrophy kinase-related Cdc42-binding kinase's (MRCK) intron one. The resulting CCND1/MRCK mRNA is resistant to CCND1-targeted miRNA regulation, and targeting the MRCK region of the chimeric 3'UTR with siRNA results in decreased CCND1 levels. PMID:27025456

  20. Ellis Englesberg and the Discovery of Positive Control in Gene Regulation

    PubMed Central

    Hahn, Steven

    2014-01-01

    Based on his work with the Escherichia coli l-arabinose operon, Ellis Englesberg proposed in 1965 that the regulatory gene araC was an “activator gene” required for positive control of the ara operon. This challenged the widely held belief in a universal mechanism of negative regulation proposed earlier by Jacob and Monod. For years, Englesberg’s model was met with deep skepticism. Despite much frustration with complex ad hoc explanations used to challenge his model, Englesberg persisted until the evidence for positive control in ara and other systems became overwhelming. Englesberg’s pioneering work enriched the original operon model and had a lasting impact in opening new and exciting ways of thinking about transcriptional regulation. PMID:25316786

  1. Discovery and characterization of the first genuine avian leptin gene in the rock dove (Columba livia).

    PubMed

    Friedman-Einat, Miriam; Cogburn, Larry A; Yosefi, Sara; Hen, Gideon; Shinder, Dmitry; Shirak, Andrey; Seroussi, Eyal

    2014-09-01

    Leptin, the key regulator of mammalian energy balance, has been at the center of a great controversy in avian biology for the last 15 years since initial reports of a putative leptin gene (LEP) in chickens. Here, we characterize a novel LEP in rock dove (Columba livia) with low similarity of the predicted protein sequence (30% identity, 47% similarity) to the human ortholog. Searching the Sequence-Read-Archive database revealed leptin transcripts, in the dove's liver, with 2 noncoding exons preceding 2 coding exons. This unusual 4-exon structure was validated by sequencing of a GC-rich product (76% GC, 721 bp) amplified from liver RNA by RT-PCR. Sequence alignment of the dove leptin with orthologous leptins indicated that it consists of a leader peptide (21 amino acids; aa) followed by the mature protein (160 aa), which has a putative structure typical of 4-helical-bundle cytokines except that it is 12 aa longer than human leptin. Extra residues (10 aa) were located within the loop between 2 5'-helices, interrupting the amino acid motif that is conserved in tetrapods and considered essential for activation of leptin receptor (LEPR) but not for receptor binding per se. Quantitative RT-PCR of 11 tissues showed highest (P < .05) expression of LEP in the dove's liver, whereas the dove LEPR peaked (P < .01) in the pituitary. Both genes were prominently expressed in the gonads and at lower levels in tissues involved in mammalian leptin signaling (adipose; hypothalamus). A bioassay based on activation of the chicken LEPR in vitro showed leptin activity in the dove's circulation, suggesting that dove LEP encodes an active protein, despite the interrupted loop motif. Providing tools to study energy-balance control at an evolutionary perspective, our original demonstration of leptin signaling in dove predicts a more ancient role of leptin in growth and reproduction in birds, rather than appetite control. PMID:24758303

  2. Microfluidic droplet-based PCR instrumentation for high-throughput gene expression profiling and biomarker discovery

    PubMed Central

    Hayes, Christopher J.; Dalton, Tara M.

    2015-01-01

    PCR is a common and often indispensable technique used in medical and biological research labs for a variety of applications. Real-time quantitative PCR (RT-qPCR) has become a definitive technique for quantitating differences in gene expression levels between samples. Yet, in spite of this importance, reliable methods to quantitate nucleic acid amounts in a higher throughput remain elusive. In the following paper, a unique design to quantify gene expression levels at the nanoscale in a continuous flow system is presented. Fully automated, high-throughput, low volume amplification of deoxynucleotides (DNA) in a droplet based microfluidic system is described. Unlike some conventional qPCR instrumentation that use integrated fluidic circuits or plate arrays, the instrument performs qPCR in a continuous, micro-droplet flowing process with droplet generation, distinctive reagent mixing, thermal cycling and optical detection platforms all combined on one complete instrument. Detailed experimental profiling of reactions of less than 300 nl total volume is achieved using the platform demonstrating the dynamic range to be 4 order logs and consistent instrument sensitivity. Furthermore, reduced pipetting steps by as much as 90% and a unique degree of hands-free automation makes the analytical possibilities for this instrumentation far reaching. In conclusion, a discussion of the first demonstrations of this approach to perform novel, continuous high-throughput biological screens is presented. The results generated from the instrument, when compared with commercial instrumentation, demonstrate the instrument reliability and robustness to carry out further studies of clinical significance with added throughput and economic benefits. PMID:27077035

  3. Microfluidic droplet-based PCR instrumentation for high-throughput gene expression profiling and biomarker discovery.

    PubMed

    Hayes, Christopher J; Dalton, Tara M

    2015-06-01

    PCR is a common and often indispensable technique used in medical and biological research labs for a variety of applications. Real-time quantitative PCR (RT-qPCR) has become a definitive technique for quantitating differences in gene expression levels between samples. Yet, in spite of this importance, reliable methods to quantitate nucleic acid amounts in a higher throughput remain elusive. In the following paper, a unique design to quantify gene expression levels at the nanoscale in a continuous flow system is presented. Fully automated, high-throughput, low volume amplification of deoxynucleotides (DNA) in a droplet based microfluidic system is described. Unlike some conventional qPCR instrumentation that use integrated fluidic circuits or plate arrays, the instrument performs qPCR in a continuous, micro-droplet flowing process with droplet generation, distinctive reagent mixing, thermal cycling and optical detection platforms all combined on one complete instrument. Detailed experimental profiling of reactions of less than 300 nl total volume is achieved using the platform demonstrating the dynamic range to be 4 order logs and consistent instrument sensitivity. Furthermore, reduced pipetting steps by as much as 90% and a unique degree of hands-free automation makes the analytical possibilities for this instrumentation far reaching. In conclusion, a discussion of the first demonstrations of this approach to perform novel, continuous high-throughput biological screens is presented. The results generated from the instrument, when compared with commercial instrumentation, demonstrate the instrument reliability and robustness to carry out further studies of clinical significance with added throughput and economic benefits. PMID:27077035

  4. MixMir: microRNA motif discovery from gene expression data using mixed linear models.

    PubMed

    Diao, Liyang; Marcais, Antoine; Norton, Scott; Chen, Kevin C

    2014-01-01

    microRNAs (miRNAs) are a class of ∼22nt non-coding RNAs that potentially regulate over 60% of human protein-coding genes. miRNA activity is highly specific, differing between cell types, developmental stages and environmental conditions, so the identification of active miRNAs in a given sample is of great interest. Here we present a novel computational approach for analyzing both mRNA sequence and gene expression data, called MixMir. Our method corrects for 3' UTR background sequence similarity between transcripts, which is known to correlate with mRNA transcript abundance. We demonstrate that after accounting for kmer sequence similarities in 3' UTRs, a statistical linear model based on motif presence/absence can effectively discover active miRNAs in a sample. MixMir utilizes fast software implementations for solving mixed linear models, which are widely used in genome-wide association studies (GWASs). Essentially we use 3' UTR sequence similarity in place of population cryptic relatedness in the GWAS problem. Compared to similar methods such as miReduce, Sylamer and cWords, we found that MixMir performed better at discovering true miRNA motifs in three mouse Dicer-knockout experiments from different tissues, two of which were collected by our group. We confirmed these results on protein and mRNA expression data obtained from miRNA transfection experiments in human cell lines. MixMir can be freely downloaded from https://github.com/ldiao/MixMir. PMID:25081207

  5. [De novo transcriptomic analysis of Chlorella sorokiniana: Pathway description and gene discovery for lipid production].

    PubMed

    Li, Lin; Wang, Qinhong; Yang, Hailin; Wang, Wu

    2014-09-01

    [OBJECTIVE] The paucity of genomic information limits the metabolic engineering of the non-model microalga Chlorella sorokiniana. Our study aimed to elucidate the fatty acid, triacylglycerol and starch biosynthetic pathways in C. sorokiniana based on de novo transcriptomic analysis. [METHODS] We cultured C. sorokiniana at different nitrogen concentrations (KNO3: 8 g/L and 2 g/L), then sequenced the transcriptome using the Illumina HiSeq 2000 platform. We used Trinity to assemble the reads de novo into transcripts, aligned all transcripts against the Nr, UniProtKB/Swiss-Prot and COG databases using the BLASTx algorithm to annotate and classify their functions, and assigned transcripts to metabolic pathways by aligning them against the KEGG database. We then used RSEM to calculate FPKM values, and used these for a preliminary analysis of differential gene expression in the related pathways. [RESULTS] Over 49M high-quality raw reads with a length of 100 bp were produced. We used Trinity to assemble these reads into 49885 transcripts with an N50 of 1941 bp, ranging from 300 bp to 14100 bp. 26479 transcripts were annotated through BLASTx similarity search, 2357 transcripts were assigned an EC number, and 207 metabolic pathways were assigned in total. Based on these analyses, we reconstructed the fatty acid, triacylglycerol and starch biosynthetic pathways in C. sorokiniana. We also made a preliminary identification of differential gene expression in these pathways. [CONCLUSION] Using RNA-seq technology, we reconstructed the metabolic pathways involved in fatty acid, triacylglycerol and starch biosynthesis in the non-model microalga C. sorokiniana without genomic data; these pathways are consistent with those in the model microalga Chlamydomonas reinhardtii. We also compared gene expression levels under different conditions. This information is very useful for the metabolic engineering of C. sorokiniana and other microalgae to enhance the production of lipids. PMID:25522590
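
    The abstract above reports an N50 of 1941 bp without defining the statistic. As a minimal sketch (not the authors' pipeline; the function name `n50` and the toy lengths are illustrative assumptions), N50 under its usual definition is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases:

```python
def n50(lengths):
    """Return the N50 of a collection of contig or transcript lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    summed length of all sequences.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy example (hypothetical lengths, not data from the study).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_lengths))
```

    Because N50 weights long sequences heavily, it is best read alongside the transcript count and length range that the abstract also reports.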

  6. Next-generation pyrosequencing of gonad transcriptomes in the polyploid lake sturgeon (Acipenser fulvescens): the relative merits of normalization and rarefaction in gene discovery

    PubMed Central

    Hale, Matthew C; McCormick, Cory R; Jackson, James R; DeWoody, J Andrew

    2009-01-01

    Background: Next-generation sequencing technologies have been applied most often to model organisms or species closely related to a model. However, these methods have the potential to be valuable in many wild organisms, including those of conservation concern. We used Roche 454 pyrosequencing to characterize gene expression in polyploid lake sturgeon (Acipenser fulvescens) gonads. Results: Titration runs on a Roche 454 GS-FLX produced more than 47,000 sequencing reads. These reads represented 20,741 unique sequences that passed quality control (mean length = 186 bp). These were assembled into 1,831 contigs (mean contig depth = 4.1 sequences). Over 4,000 sequencing reads (~19%) were assigned gene ontologies, mostly to protein, RNA, and ion binding. A total of 877 candidate SNPs were identified from > 50 different genes. We employed an analytical approach from theoretical ecology (rarefaction) to evaluate depth of sequencing coverage relative to gene discovery. We also considered the relative merits of normalized versus native cDNA libraries when using next-generation sequencing platforms. Not surprisingly, fewer genes from the normalized libraries were rRNA subunits. Rarefaction suggests that normalization has little influence on the efficiency of gene discovery, at least when working with thousands of reads from a single tissue type. Conclusion: Our data indicate that titration runs on 454 sequencers can characterize thousands of expressed sequence tags which can be used to identify SNPs, gene ontologies, and levels of gene expression in species of conservation concern. We anticipate that rarefaction will be useful in evaluations of gene discovery and that next-generation sequencing technologies hold great potential for the study of other non-model organisms. PMID:19402907
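
    The rarefaction approach mentioned above can be illustrated with the classical individual-based formula from ecology, which gives the expected number of distinct genes seen in a random subsample of n reads drawn without replacement. This is a sketch under stated assumptions: the function name `rarefied_richness` and the toy read counts are hypothetical, and the abstract does not say which rarefaction variant or software the authors used.

```python
from math import comb


def rarefied_richness(counts, n):
    """Expected number of distinct genes in a subsample of n reads.

    counts: reads observed per gene (one positive integer per gene).
    n: subsample size, between 0 and the total number of reads.
    Each gene contributes the probability that at least one of its
    reads is drawn when n reads are sampled without replacement.
    """
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must lie between 0 and the total read count")
    all_draws = comb(total, n)
    return sum(1 - comb(total - c, n) / all_draws for c in counts)


# Toy library (hypothetical): a few abundant genes and several rare ones.
toy_counts = [50, 30, 10, 5, 2, 1, 1, 1]
for depth in (10, 25, 50, 100):
    print(depth, round(rarefied_richness(toy_counts, depth), 2))
```

    Plotting the expected gene count against subsample depth gives a rarefaction curve; a curve that is still rising at full depth suggests that further sequencing would keep discovering genes.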

  7. The Extragalactic Distance Scale Key Project VIII. The Discovery of Cepheids and a New Distance to NGC 3621 Using the Hubble Space Telescope

    NASA Technical Reports Server (NTRS)

    Rawson, D. M.; Mould, J. R.; Macri, L. M.; Huchra, J. P.; Kennicutt, R. C.; Harding, P.; Freedman, W. L.; Hill, R. J.; Phelps, R. L.; Madore, B. F.; Silbermann, N. A.; Graham, J. A.; Ferrarese, L.; Ford, H. C.; Illingworth, G. D.; Hoessel, J. G.; Han, M.; Hughes, S. M.; Saha, A.; Stetson, P. B.

    1996-01-01

    We report on the discovery of Cepheids in the field spiral galaxy NGC 3621, based on observations made with the Wide Field and Planetary Camera 2 on board the Hubble Space Telescope (HST). NGC 3621 is one of 18 galaxies observed as part of the HST Key Project on the Extragalactic Distance Scale, which aims to measure the Hubble Constant to 10 percent accuracy.

  8. RNA-seq analysis for plant carnivory gene discovery in Nepenthes × ventrata.

    PubMed

    Wan Zakaria, Wan Nor Adibah; Loke, Kok-Keong; Goh, Hoe-Han; Mohd Noor, Normah

    2016-03-01

    Carnivorous plants have the ability to capture and digest insects for nutrients, which allows them to survive in land deprived of nitrogenous nutrients. Nepenthes spp. are carnivorous plants that uniquely produce a pitcher from the tip of an elongated leaf. This study provides the first transcriptome resource from the pitcher of a Nepenthes ventricosa × Nepenthes alata hybrid, Nepenthes × ventrata, to understand the carnivory mechanism in Nepenthes spp., as well as in other carnivorous species. Raw reads and the transcriptome assembly project have been deposited in the SRA database with the accession numbers SRX1389337 (day 0 control), SRX1389392 (day 3 longevity), and SRX1389395 (day 3 chitin-treated). PMID:26981352

  9. RNA-seq analysis for plant carnivory gene discovery in Nepenthes × ventrata

    PubMed Central

    Wan Zakaria, Wan Nor Adibah; Loke, Kok-Keong; Goh, Hoe-Han; Mohd Noor, Normah

    2015-01-01

    Carnivorous plants have the ability to capture and digest insects for nutrients, which allows them to survive in land deprived of nitrogenous nutrients. Nepenthes spp. are carnivorous plants that uniquely produce a pitcher from the tip of an elongated leaf. This study provides the first transcriptome resource from the pitcher of a Nepenthes ventricosa × Nepenthes alata hybrid, Nepenthes × ventrata, to understand the carnivory mechanism in Nepenthes spp., as well as in other carnivorous species. Raw reads and the transcriptome assembly project have been deposited in the SRA database with the accession numbers SRX1389337 (day 0 control), SRX1389392 (day 3 longevity), and SRX1389395 (day 3 chitin-treated). PMID:26981352

  10. The discovery of zinc fingers and their development for practical applications in gene regulation and genome manipulation.

    PubMed

    Klug, Aaron

    2010-02-01

    A long-standing goal of molecular biologists has been to construct DNA-binding proteins for the control of gene expression. The classical Cys2His2 (C2H2) zinc finger design is ideally suited for such purposes. Discriminating between closely related DNA sequences both in vitro and in vivo, this naturally occurring design was adopted for engineering zinc finger proteins (ZFPs) to target genes specifically. Zinc fingers were discovered in 1985, arising from the interpretation of our biochemical studies on the interaction of the Xenopus protein transcription factor IIIA (TFIIIA) with 5S RNA. Subsequent structural studies revealed its three-dimensional structure and its interaction with DNA. Each finger constitutes a self-contained domain stabilized by a zinc (Zn) ion ligated to a pair of cysteines and a pair of histidines and also by an inner structural hydrophobic core. This discovery showed not only a new protein fold but also a novel principle of DNA recognition. Whereas other DNA-binding proteins generally make use of the 2-fold symmetry of the double helix, functioning as homo- or heterodimers, zinc fingers can be linked linearly in tandem to recognize nucleic acid sequences of varying lengths. This modular design offers a large number of combinatorial possibilities for the specific recognition of DNA (or RNA). It is therefore not surprising that the zinc finger is found widespread in nature, including 3% of the genes of the human genome. The zinc finger design can be used to construct DNA-binding proteins for specific intervention in gene expression. By fusing selected zinc finger peptides to repression or activation domains, genes can be selectively switched off or on by targeting the peptide to the desired gene target. It was also suggested that by combining an appropriate zinc finger peptide with other effector or functional domains, e.g. from nucleases or integrases to form chimaeric proteins, genomes could be modified or manipulated. 
    The first example of the power of the method was published in 1994 when a three-finger protein was constructed to block the expression of a human oncogene transformed into a mouse cell line. The same paper also described how a reporter gene was activated by targeting an inserted 9-base pair (bp) sequence, which acts as the promoter. Thus, by fusing zinc finger peptides to repression or activation domains, genes can be selectively switched off or on. It was also suggested that, by combining zinc fingers with other effector or functional domains, e.g. from nucleases or integrases, to form chimaeric proteins, genomes could be manipulated or modified. Several applications of such engineered ZFPs are described here, including some of therapeutic importance, and also their adaptation for breeding improved crop plants. PMID:20478078

  11. Discovery of PPi-type Phosphoenolpyruvate Carboxykinase Genes in Eukaryotes and Bacteria.

    PubMed

    Chiba, Yoko; Kamikawa, Ryoma; Nakada-Tsukui, Kumiko; Saito-Nakano, Yumiko; Nozaki, Tomoyoshi

    2015-09-25

    Phosphoenolpyruvate carboxykinase (PEPCK) is one of the pivotal enzymes that regulates the carbon flow of the central metabolism by fixing CO2 to phosphoenolpyruvate (PEP) to produce oxaloacetate or vice versa. Whereas ATP- and GTP-type PEPCKs have been well studied, and their protein identities are established, inorganic pyrophosphate (PPi)-type PEPCK (PPi-PEPCK) is poorly characterized. Despite extensive enzymological studies, its protein identity and encoding gene remain unknown. In this study, PPi-PEPCK has been identified for the first time from a eukaryotic human parasite, Entamoeba histolytica, by conventional purification and mass spectrometric identification of the native enzyme, followed by demonstration of its enzymatic activity. A homolog of the amebic PPi-PEPCK from an anaerobic bacterium Propionibacterium freudenreichii subsp. shermanii also exhibited PPi-PEPCK activity. The primary structure of PPi-PEPCK has no similarity to the functional homologs ATP/GTP-PEPCKs and PEP carboxylase, strongly suggesting that PPi-PEPCK arose independently from the other functional homologues and very likely has unique catalytic sites. PPi-PEPCK homologs were found in a variety of bacteria and some eukaryotes but not in archaea. The molecular identification of this long forgotten enzyme shows us the diversity and functional redundancy of enzymes involved in the central metabolism and can help us to understand the central metabolism more deeply. PMID:26269598

  12. The long (and winding) road to gene discovery for canine hip dysplasia

    PubMed Central

    Zhu, Lan; Zhang, Zhiwu; Friedenberg, Steven; Jung, Seung-Woo; Phavaphutanon, Janjira; Vernier-Singer, Margaret; Corey, Elizabeth; Mateescu, Raluca; Dykes, Nathan; Sandler, Jody; Acland, Gregory; Lust, George; Todhunter, Rory

    2009-01-01

    Hip dysplasia is a common inherited trait of dogs that results in secondary osteoarthritis. In this article the methods used to uncover the mutations contributing to this condition are reviewed, beginning with hip phenotyping. Coarse, genome-wide, microsatellite-based screens of pedigrees of greyhounds and dysplastic Labrador retrievers were used to identify linked quantitative trait loci (QTL). Fine-mapping across two chromosomes (CFA11 and 29) was employed using single nucleotide polymorphism (SNP) genotyping. Power analyses and preferential selection of dogs for ongoing SNP-based genotyping is described with the aim of refining the QTL intervals to 1–2 megabases on these and several additional chromosomes prior to candidate gene screening. The review considers how a mutation or a genetic marker such as a SNP or haplotype of SNPs might be combined with pedigree and phenotype information to create a ‘breeding value’ that could improve the accuracy of predicting a dog’s hip conformation. PMID:19297220

  13. Gene mutation discovery research of non-smoking lung cancer patients due to indoor radon exposure.

    PubMed

    Choi, Jung Ran; Park, Seong Yong; Noh, O Kyu; Koh, Young Wha; Kang, Dae Ryong

    2016-01-01

    Although the incidence and mortality for most cancers such as lung and colon are decreasing in several countries, they are increasing in several developed countries because of unhealthy western lifestyles including smoking, physical inactivity and consumption of calorie-dense food. The incidences for lung and colon cancers in a few of these countries have already exceeded those in the United States and other western countries. Among them, lung cancer is the main cause of cancer death worldwide. The cumulative survival rate at five years ranges between 13% and 21% in several countries. Although smoking is the most important risk factor for lung cancer, the increased incidence of lung cancer in never smokers (LCINS) makes it necessary to improve knowledge concerning other risk factors. Environmental factors and genetic susceptibility are also thought to contribute to lung cancer risk. Patients with lung adenocarcinoma who have never smoked frequently carry mutations within the tyrosine kinase domain of the epidermal growth factor receptor (EGFR) gene. Also, K-ras mutations are more common in individuals with a history of smoking and are related to resistance to EGFR-tyrosine kinase inhibitors. Recently, radon (Rn), a natural noble gas, has been recognized as the second most common cause of lung cancer. In this review, we aim to determine whether residential radon is associated with an increased risk of developing lung cancer and whether that risk is modulated by several genetic polymorphisms. PMID:26985396

  14. Where gene discovery turns into systems biology: genome-scale RNAi screens in Drosophila

    PubMed Central

    Neumüller, Ralph A.; Perrimon, Norbert

    2013-01-01

    Systems biology aims to describe the complex interplays between cellular building blocks which, in their concurrence, give rise to the emergent properties observed in cellular behaviors and responses. This approach tries to determine the molecular players and the architectural principles of their interactions within the genetic networks that control certain biological processes. Large-scale loss-of-function screens, applicable in various different model systems, have begun to systematically interrogate entire genomes to identify the genes that contribute to a certain cellular response. In particular, RNA interference (RNAi)-based high-throughput screens have been instrumental in determining the composition of regulatory systems and paired with integrative data analyses have begun to delineate the genetic networks that control cell biological and developmental processes. Through the creation of tools for both, in vitro and in vivo genome-wide RNAi screens, Drosophila melanogaster has emerged as one of the key model organisms in systems biology research and over the last years has massively contributed to and hence shaped this discipline. PMID:21197652

  15. Human Genome Project discoveries: Dialectics and rhetoric in the science of genetics

    NASA Astrophysics Data System (ADS)

    Robidoux, Charlotte A.

    The Human Genome Project (HGP), a $437 million effort that began in 1990 to chart the chemical sequence of our three billion base pairs of DNA, was completed in 2003, marking the 50th anniversary that proved the definitive structure of the molecule. This study considered how dialectical and rhetorical arguments functioned in the science, political, and public forums over a 20-year period, from 1980 to 2000, to advance human genome research and to establish the official project. I argue that Aristotle's continuum of knowledge--which ranges from the probable on one end to certified or demonstrated knowledge on the other--provides useful distinctions for analyzing scientific reasoning. While contemporary scientific research seeks to discover certified knowledge, investigators generally employ the hypothetico-deductive or scientific method, which often yields probable rather than certain findings, making these dialectical in nature. Analysis of the discourse describing human genome research revealed the use of numerous rhetorical figures and topics. Persuasive and probable reasoning were necessary for scientists to characterize unknown genetic phenomena, to secure interest in and funding for large-scale human genome research, to solve scientific problems, to issue probable findings, to convince colleagues and government officials that the findings were sound and to disseminate information to the public. Both government and private venture scientists drew on these tools of reasoning to promote their methods of mapping and sequencing the genome. The debate over how to carry out sequencing was rooted in conflicting values. Scientists representing the academic tradition valued a more conservative method that would establish high quality results, and those supporting private industry valued an unconventional approach that would yield products and profits more quickly. Values in turn influenced political and public forum arguments. 
    Agency representatives and investors sided with the approach that reflected values they supported. Fascinated with this controversy and the convincing comparisons, the media often endorsed Celera's work for its efficiency. The analysis of discourse from the science, political, and public forums revealed that value systems influenced the accuracy and quality of the arguments more than the type or number of figures used to describe the research to various audiences.

  16. Two Pulsar Discoveries from the Einstein@Home Distributed Computing Project

    NASA Astrophysics Data System (ADS)

    Deneva, Julia S.; Knispel, B.; Allen, B.; Cordes, J.; Bogdanov, S.; Brazier, A.; Bhat, R.; Camilo, F.; Chatterjee, S.; Crawford, F.; Desvignes, G.; Freire, P.; Hessels, J.; Jenet, F.; Kaspi, V.; Kramer, M.; Lazarus, P.; Lorimer, D.; van Leeuwen, J.; Lyne, A.; McLaughlin, M.; Nice, D.; Ransom, S.; Siemens, X.; Stairs, I.; Stappers, B.; Stovall, K.

    2011-01-01

    We present two pulsars discovered by Einstein@Home, a distributed computing project that runs on volunteers' computers and searches for gravitational waves in LIGO data and binary pulsars in Arecibo PALFA data. J2007+27 is an isolated pulsar with a period of 24.49 ms. Its unusually large duty cycle and the presence of emission almost throughout the rotation period suggests that its magnetic and spin axes are nearly aligned. Limits on the period derivative, magnetic field, and age indicate that this is the fastest-spinning disrupted recycled pulsar known to date. J1952+26 has a period of 20.73 ms and is in a 7-hour binary. Assuming a pulsar mass of 1.4 Msun, the system's mass function indicates that the minimum companion mass is 0.95 Msun. The companion is likely a neutron star or a massive white dwarf, which makes the system an excellent candidate for Shapiro delay measurement and therefore an accurate estimate of the pulsar and companion masses.
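
    The step from the mass function to a minimum companion mass can be made concrete with the standard binary mass function, f = (m_c sin i)^3 / (m_p + m_c)^2, where the minimum companion mass corresponds to an edge-on orbit (sin i = 1). The sketch below is illustrative only: the function names are hypothetical, and the mass function value is back-computed from the abstract's 1.4 Msun and 0.95 Msun figures rather than taken from the paper, which would derive it from the measured orbital period and projected semi-major axis.

```python
def mass_function(m_pulsar, m_companion, sin_i=1.0):
    """Binary mass function in solar masses: (m_c sin i)^3 / (m_p + m_c)^2."""
    return (m_companion * sin_i) ** 3 / (m_pulsar + m_companion) ** 2


def min_companion_mass(f, m_pulsar, tol=1e-10):
    """Solve mass_function(m_pulsar, m_c, sin_i=1) = f for m_c by bisection.

    The mass function increases monotonically with companion mass,
    so a bracketing search followed by bisection is sufficient.
    """
    lo, hi = 0.0, 1.0
    while mass_function(m_pulsar, hi) < f:  # widen the bracket if needed
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mass_function(m_pulsar, mid) < f:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Mass function implied by the abstract's numbers (an assumption, roughly 0.155 Msun).
f_implied = mass_function(1.4, 0.95)
print(round(f_implied, 3))
print(round(min_companion_mass(f_implied, 1.4), 3))
```

    For any inclination below 90 degrees, the same measured mass function requires a heavier companion, which is why the edge-on solution is a lower bound.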

  17. De Novo Regulatory Motif Discovery Identifies Significant Motifs in Promoters of Five Classes of Plant Dehydrin Genes

    PubMed Central

    Zolotarov, Yevgen; Strömvik, Martina

    2015-01-01

    Plants accumulate dehydrins in response to osmotic stresses. Dehydrins are divided into five different classes, which are thought to be regulated in different manners. To better understand differences in transcriptional regulation of the five dehydrin classes, de novo motif discovery was performed on 350 dehydrin promoter sequences from a total of 51 plant genomes. Overrepresented motifs were identified in the promoters of five dehydrin classes. The Kn dehydrin promoters contain motifs linked with meristem specific expression, as well as motifs linked with cold/dehydration and abscisic acid response. KS dehydrin promoters contain a motif with a GATA core. SKn and YnSKn dehydrin promoters contain motifs that match elements connected with cold/dehydration, abscisic acid and light response. YnKn dehydrin promoters contain motifs that match abscisic acid and light response elements, but not cold/dehydration response elements. Conserved promoter motifs are present in the dehydrin classes and across different plant lineages, indicating that dehydrin gene regulation is likely also conserved. PMID:26114291

  18. The Fragile X Mental Retardation Syndrome 20 Years After the FMR1 Gene Discovery: an Expanding Universe of Knowledge

    PubMed Central

    Rousseau, François; Labelle, Yves; Bussières, Johanne; Lindsay, Carmen

    2011-01-01

    The fragile X mental retardation (FXMR) syndrome is one of the most frequent causes of mental retardation. Affected individuals display a wide range of additional characteristic features including behavioural and physical phenotypes, and the extent to which individuals are affected is highly variable. For these reasons, elucidation of the pathophysiology of this disease has been an important challenge to the scientific community. 1991 marks the year of the discovery of both the FMR1 gene mutations involved in this disease, and of their dynamic nature. Although a mouse model for the disease has been available for 16 years and extensive research has been performed on the FMR1 protein (FMRP), we still understand little about how the disease develops, and no treatment has yet been shown to be effective. In this review, we summarise current knowledge on FXMR with an emphasis on the technical challenges of molecular diagnostics, on its prevalence and dynamics among populations, and on the potential of screening for FMR1 mutations. PMID:21912443

  19. Transcriptome analysis of the grass carp (Ctenopharyngodon idella) using 454 pyrosequencing methodology for gene and marker discovery.

    PubMed

    Yu, L Y; Bai, J J; Fan, J J; Ma, D M; Quan, Y C; Jiang, P

    2015-01-01

    Total RNA isolated from the brain, muscle, liver, gonad, and intestinal tissues of grass carp was pooled to construct cDNA libraries. Using 454 pyrosequencing, a total of 738,604 high-quality reads were generated from the normalized cDNAs of the pooled individuals. Clustering and assembly of these reads produced a set of 37,086 all-unigene sequences after BLAST. Of these, 24,010 (64.74%) were annotated in the National Center for Biotechnology Information database, and 3715 simple sequence repeats and 2008 single nucleotide polymorphisms were identified in this EST dataset as potential molecular markers. This study provides new data for functional genomic and biological research on grass carp. The markers identified in this study will enrich the currently used molecular markers and facilitate marker-assisted selection in grass carp-breeding programs. These results also demonstrate that transcriptomic analysis based on 454 sequencing is a powerful tool for gene discovery and molecular marker development in non-model species. PMID:26782578

  20. DISCOVERY OF A LOW-MASS COMPANION TO A METAL-RICH F STAR WITH THE MARVELS PILOT PROJECT

    SciTech Connect

    Fleming, Scott W.; Ge Jian; Mahadevan, Suvrath; Lee, Brian; Cuong Nguyen, Duy; Morehead, Robert C.; Wan Xiaoke; Zhao Bo; Liu Jian; Guo Pengcheng; Kane, Stephen R.; Eastman, Jason D.; Siverd, Robert J.; Scott Gaudi, B.; Niedzielski, Andrzej; Sivarani, Thirupathi; Stassun, Keivan G.; Gary, Bruce; Wolszczan, Alex; Barnes, Rory

    2010-08-01

    We report the discovery of a low-mass companion orbiting the metal-rich, main sequence F star TYC 2949-00557-1 during the Multi-object APO Radial Velocity Exoplanet Large-area Survey (MARVELS) pilot project. The host star has an effective temperature T_eff = 6135 ± 40 K, log g = 4.4 ± 0.1, and [Fe/H] = 0.32 ± 0.01, indicating a mass of M = 1.25 ± 0.09 M_sun and R = 1.15 ± 0.15 R_sun. The companion has an orbital period of 5.69449 ± 0.00023 days and straddles the hydrogen burning limit with a minimum mass of 64 M_J, and thus may be an example of the rare class of brown dwarfs orbiting at distances comparable to those of 'Hot Jupiters'. We present relative photometry that demonstrates that the host star is photometrically stable at the few millimagnitude level on time scales of hours to years, and rules out transits for a companion of radius ≳0.8 R_J at the 95% confidence level. Tidal analysis of the system suggests that the star and companion are likely in a double synchronous state where both rotational and orbital synchronization have been achieved. This is the first low-mass companion detected with a multi-object, dispersed, fixed-delay interferometer.

  1. Analysis of Gene Expression Profiles in Leaf Tissues of Cultivated Peanuts and Development of EST-SSR Markers and Gene Discovery

    PubMed Central

    Guo, Baozhu; Chen, Xiaoping; Hong, Yanbin; Liang, Xuanqiang; Dang, Phat; Brenneman, Tim; Holbrook, Corley; Culbreath, Albert

    2009-01-01

    Peanut is vulnerable to a range of foliar diseases such as spotted wilt caused by Tomato spotted wilt virus (TSWV), early (Cercospora arachidicola) and late (Cercosporidium personatum) leaf spots, southern stem rot (Sclerotium rolfsii), and sclerotinia blight (Sclerotinia minor). In this study, we report the generation of 17,376 peanut expressed sequence tags (ESTs) from leaf tissues of a peanut cultivar (Tifrunner, resistant to TSWV and leaf spots) and a breeding line (GT-C20, susceptible to TSWV and leaf spots). After trimming vector and discarding low quality sequences, a total of 14,432 high-quality ESTs were selected for further analysis and deposition to GenBank. Sequence clustering resulted in 6,888 unique ESTs composed of 1,703 tentative consensus (TCs) sequences and 5185 singletons. A large number of ESTs (5717) representing genes of unknown functions were also identified. Among the unique sequences, there were 856 EST-SSRs identified. A total of 290 new EST-based SSR markers were developed and examined for amplification and polymorphism in cultivated peanut and wild species. Resequencing information of selected amplified alleles revealed that allelic diversity could be attributed mainly to differences in repeat type and length in the SSR regions. In addition, a few additional INDEL mutations and substitutions were observed in the regions flanking the microsatellite regions. In addition, some defense-related transcripts were also identified, such as putative oxalate oxidase (EU024476) and NBS-LRR domains. EST data in this study have provided a new source of information for gene discovery and development of SSR markers in cultivated peanut. A total of 16931 ESTs have been deposited to the NCBI GenBank database with accession numbers ES751523 to ES768453. PMID:19584933

  2. Analysis of Gene Expression Profiles in Leaf Tissues of Cultivated Peanuts and Development of EST-SSR Markers and Gene Discovery.

    PubMed

    Guo, Baozhu; Chen, Xiaoping; Hong, Yanbin; Liang, Xuanqiang; Dang, Phat; Brenneman, Tim; Holbrook, Corley; Culbreath, Albert

    2009-01-01

    Peanut is vulnerable to a range of foliar diseases such as spotted wilt caused by Tomato spotted wilt virus (TSWV), early (Cercospora arachidicola) and late (Cercosporidium personatum) leaf spots, southern stem rot (Sclerotium rolfsii), and sclerotinia blight (Sclerotinia minor). In this study, we report the generation of 17,376 peanut expressed sequence tags (ESTs) from leaf tissues of a peanut cultivar (Tifrunner, resistant to TSWV and leaf spots) and a breeding line (GT-C20, susceptible to TSWV and leaf spots). After trimming vector and discarding low quality sequences, a total of 14,432 high-quality ESTs were selected for further analysis and deposition to GenBank. Sequence clustering resulted in 6,888 unique ESTs composed of 1,703 tentative consensus (TCs) sequences and 5185 singletons. A large number of ESTs (5717) representing genes of unknown functions were also identified. Among the unique sequences, there were 856 EST-SSRs identified. A total of 290 new EST-based SSR markers were developed and examined for amplification and polymorphism in cultivated peanut and wild species. Resequencing information of selected amplified alleles revealed that allelic diversity could be attributed mainly to differences in repeat type and length in the SSR regions. In addition, a few additional INDEL mutations and substitutions were observed in the regions flanking the microsatellite regions. In addition, some defense-related transcripts were also identified, such as putative oxalate oxidase (EU024476) and NBS-LRR domains. EST data in this study have provided a new source of information for gene discovery and development of SSR markers in cultivated peanut. A total of 16931 ESTs have been deposited to the NCBI GenBank database with accession numbers ES751523 to ES768453. PMID:19584933

  3. Integration of Cot Analysis, DNA Cloning, and High-Throughput Sequencing Facilitates Genome Characterization and Gene Discovery

    PubMed Central

    Peterson, Daniel G.; Schulze, Stefan R.; Sciara, Erica B.; Lee, Scott A.; Bowers, John E.; Nagel, Alexander; Jiang, Ning; Tibbitts, Deanne C.; Wessler, Susan R.; Paterson, Andrew H.

    2002-01-01

    Cot-based sequence discovery represents a powerful means by which both low-copy and repetitive sequences can be selectively and efficiently fractionated, cloned, and characterized. Based upon the results of a Cot analysis, hydroxyapatite chromatography was used to fractionate sorghum (Sorghum bicolor) genomic DNA into highly repetitive (HR), moderately repetitive (MR), and single/low-copy (SL) sequence components that were consequently cloned to produce HRCot, MRCot, and SLCot genomic libraries. Filter hybridization (blotting) and sequence analysis both show that the HRCot library is enriched in sequences traditionally found in high-copy number (e.g., retroelements, rDNA, centromeric repeats), the SLCot library is enriched in low-copy sequences (e.g., genes and “nonrepetitive ESTs”), and the MRCot library contains sequences of moderate redundancy. The Cot analysis suggests that the sorghum genome is approximately 700 Mb (in agreement with previous estimates) and that HR, MR, and SL components comprise 15%, 41%, and 24% of sorghum DNA, respectively. Unlike previously described techniques to sequence the low-copy components of genomes, sequencing of Cot components is independent of expression and methylation patterns that vary widely among DNA elements, developmental stages, and taxa. High-throughput sequencing of Cot clones may be a means of “capturing” the sequence complexity of eukaryotic genomes at unprecedented efficiency. [Online supplementary material is available at www.genome.org. The sequence data described in this paper have been submitted to the GenBank under accession nos. AZ921847-AZ923007. Reagents, samples, and unpublished information freely provided by H. Ma and J. Messing.] PMID:11997346

  4. A Seriation Approach for Visualization-Driven Discovery of Co-Expression Patterns in Serial Analysis of Gene Expression (SAGE) Data

    PubMed Central

    Morozova, Olena; Morozov, Vyacheslav; Hoffman, Brad G.; Helgason, Cheryl D.; Marra, Marco A.

    2008-01-01

    Background: Serial Analysis of Gene Expression (SAGE) is a DNA sequencing-based method for large-scale gene expression profiling that provides an alternative to microarray analysis. Most analyses of SAGE data aimed at identifying co-expressed genes have been accomplished using various versions of clustering approaches that often result in a number of false positives. Principal Findings: Here we explore the use of seriation, a statistical approach for ordering sets of objects based on their similarity, for large-scale expression pattern discovery in SAGE data. For this specific task we implement a seriation heuristic we term ‘progressive construction of contigs’ that constructs local chains of related elements by sequentially rearranging margins of the correlation matrix. We apply the heuristic to the analysis of simulated and experimental SAGE data and compare our results to those obtained with a clustering algorithm developed specifically for SAGE data. We show using simulations that the performance of seriation compares favorably to that of the clustering algorithm on noisy SAGE data. Conclusions: We explore the use of a seriation approach for visualization-based pattern discovery in SAGE data. Using both simulations and experimental data, we demonstrate that seriation is able to identify groups of co-expressed genes more accurately than a clustering algorithm developed specifically for SAGE data. Our results suggest that seriation is a useful method for the analysis of gene expression data whose applicability should be further pursued. PMID:18787709
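
    The abstract does not spell out the 'progressive construction of contigs' heuristic, so the sketch below is not the authors' method. It only illustrates what seriation means in general: ordering items so that similar ones end up adjacent. It uses a simple greedy chain-growing rule (an assumption) on a made-up similarity matrix.

```python
def greedy_seriation(sim):
    """Order items so that similar ones are adjacent, by greedily growing one chain.

    sim is a symmetric similarity matrix (list of lists) with at least two items.
    """
    n = len(sim)
    # Seed the chain with the most similar pair.
    i, j = max(((a, b) for a in range(n) for b in range(a + 1, n)),
               key=lambda p: sim[p[0]][p[1]])
    chain = [i, j]
    remaining = set(range(n)) - {i, j}
    while remaining:
        left, right = chain[0], chain[-1]
        # Best unplaced item for each end of the chain.
        best_left = max(remaining, key=lambda k: sim[left][k])
        best_right = max(remaining, key=lambda k: sim[right][k])
        if sim[left][best_left] > sim[right][best_right]:
            chain.insert(0, best_left)
            remaining.remove(best_left)
        else:
            chain.append(best_right)
            remaining.remove(best_right)
    return chain

# Made-up correlations: items 0 and 2 are co-expressed, as are items 1 and 3.
sim = [
    [1.0, 0.1, 0.9, 0.2],
    [0.1, 1.0, 0.3, 0.8],
    [0.9, 0.3, 1.0, 0.1],
    [0.2, 0.8, 0.1, 1.0],
]
order = greedy_seriation(sim)
print(order)  # [0, 2, 1, 3]
```

    Reordering the rows and columns of the correlation matrix by such an ordering is what makes blocks of co-expressed genes visible in a heat map.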

  5. A Systems-Genetics Approach and Data Mining Tool to Assist in the Discovery of Genes Underlying Complex Traits in Oryza sativa

    PubMed Central

    Ficklin, Stephen P.; Feltus, Frank Alex

    2013-01-01

    Many traits of biological and agronomic significance in plants are controlled in a complex manner where multiple genes and environmental signals affect the expression of the phenotype. In Oryza sativa (rice), thousands of quantitative genetic signals have been mapped to the rice genome. In parallel, thousands of gene expression profiles have been generated across many experimental conditions. Through the discovery of networks with real gene co-expression relationships, it is possible to identify co-localized genetic and gene expression signals that implicate complex genotype-phenotype relationships. In this work, we used a knowledge-independent, systems genetics approach to discover a high-quality set of co-expression networks, termed Gene Interaction Layers (GILs). Twenty-two GILs were constructed from 1,306 Affymetrix microarray rice expression profiles that were pre-clustered to allow for improved capture of gene co-expression relationships. Functional genomic and genetic data, including over 8,000 QTLs and 766 phenotype-tagged SNPs (p-value ≤ 0.001) from genome-wide association studies, both covering over 230 different rice traits, were integrated with the GILs. An online systems genetics data-mining resource, the GeneNet Engine, was constructed to enable dynamic discovery of gene sets (i.e. network modules) that overlap with genetic traits. GeneNet Engine does not provide the exact set of genes underlying a given complex trait, but through the evidence of gene-marker correspondence, co-expression, and functional enrichment, site visitors can identify genes with potential shared causality for a trait which could then be used for experimental validation. A set of 2 million SNPs was incorporated into the database and serves as a potential set of testable biomarkers for genes in modules that overlap with genetic traits. Herein, we describe two modules found using GeneNet Engine, one with significant overlap with the trait amylose content and another with significant overlap with blast disease resistance. PMID:23874666

  6. Genomic Resources for Gene Discovery, Functional Genome Annotation, and Evolutionary Studies of Maize and Its Close Relatives

    PubMed Central

    Wang, Chao; Shi, Xue; Liu, Lin; Li, Haiyan; Ammiraju, Jetty S.S.; Kudrna, David A.; Xiong, Wentao; Wang, Hao; Dai, Zhaozhao; Zheng, Yonglian; Lai, Jinsheng; Jin, Weiwei; Messing, Joachim; Bennetzen, Jeffrey L; Wing, Rod A.; Luo, Meizhong

    2013-01-01

    Maize is one of the most important food crops and a key model for genetics and developmental biology. A genetically anchored and high-quality draft genome sequence of maize inbred B73 has been obtained to serve as a reference sequence. To facilitate evolutionary studies in maize and its close relatives, much like the Oryza Map Alignment Project (OMAP) (www.OMAP.org) bacterial artificial chromosome (BAC) resource did for the rice community, we constructed BAC libraries for maize inbred lines Zheng58, Chang7-2, and Mo17 and maize wild relatives Zea mays ssp. parviglumis and Tripsacum dactyloides. Furthermore, to extend functional genomic studies to maize and sorghum, we also constructed binary BAC (BIBAC) libraries for the maize inbred B73 and the sorghum landrace Nengsi-1. The BAC/BIBAC vectors facilitate transfer of large intact DNA inserts from BAC clones to the BIBAC vector and functional complementation of large DNA fragments. These seven Zea Map Alignment Project (ZMAP) BAC/BIBAC libraries have average insert sizes ranging from 92 to 148 kb, organellar DNA from 0.17 to 2.3%, empty vector rates between 0.35 and 5.56%, and genome equivalents of 4.7- to 8.4-fold. The usefulness of the Parviglumis and Tripsacum BAC libraries was demonstrated by mapping clones to the reference genome. Novel genes and alleles present in these ZMAP libraries can now be used for functional complementation studies and positional or homology-based cloning of genes for translational genomics. PMID:24037269

  7. A critical assessment of Agrobacterium tumefaciens-mediated transformation as a tool for pathogenicity gene discovery in the phytopathogenic fungus Leptosphaeria maculans.

    PubMed

    Blaise, Françoise; Rémy, Estelle; Meyer, Michel; Zhou, Ligang; Narcy, Jean-Paul; Roux, Jacqueline; Balesdent, Marie-Hélène; Rouxel, Thierry

    2007-02-01

    We evaluated the usefulness and robustness of Agrobacterium tumefaciens-mediated transformation (ATMT) as a high-throughput transformation tool for pathogenicity gene discovery in the filamentous phytopathogen Leptosphaeria maculans. Thermal asymmetric interlaced polymerase chain reaction allowed us to amplify the left border (LB) flanking sequence in 135 of 400 transformants analysed, and indicated a high level of preservation of the T-DNA LB. In addition, T-DNA preferentially integrated as a single copy in gene-rich regions of the fungal genome, with a probable bias towards intergenic and/or regulatory regions. A total of 53 transformants out of 1388 (3.8%) showed reproducible pathogenicity defects when inoculated on cotyledons of Brassica napus, with diverse altered phenotypes. Co-segregation of the altered phenotype with the T-DNA integration was observed for 6 of 12 transformants crossed. If extrapolated to the whole collection, this indicates that 1.9% of the collection actually corresponds to tagged pathogenicity mutants. The preferential insertion into gene-rich regions along with the high ratio of tagged mutants renders ATMT a tool of choice for large-scale gene discovery in L. maculans. PMID:16979359
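
    The extrapolation in the abstract above can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in the abstract; the variable names are illustrative.

```python
defective = 53       # transformants with reproducible pathogenicity defects
screened = 1388      # transformants screened on B. napus cotyledons
cosegregating = 6    # crossed transformants whose phenotype co-segregated with the T-DNA
crossed = 12         # transformants crossed

defect_rate = defective / screened         # fraction of the collection with a defect
tagged_fraction = cosegregating / crossed  # fraction of defects that are actually T-DNA tagged
tagged_rate = defect_rate * tagged_fraction

print(f"{defect_rate:.1%} with defects, {tagged_rate:.1%} tagged")
# 3.8% with defects, 1.9% tagged
```

    The 1.9% figure therefore rests on the assumption that the 12 crossed transformants are representative of all 53 altered ones.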

  8. Interestingness measures and strategies for mining multi-ontology multi-level association rules from gene ontology annotations for the discovery of new GO relationships.

    PubMed

    Manda, Prashanti; McCarthy, Fiona; Bridges, Susan M

    2013-10-01

    The Gene Ontology (GO), a set of three sub-ontologies, is one of the most popular bio-ontologies used for describing gene product characteristics. GO annotation data containing terms from multiple sub-ontologies and at different levels in the ontologies is an important source of implicit relationships between terms from the three sub-ontologies. Data mining techniques such as association rule mining that are tailored to mine from multiple ontologies at multiple levels of abstraction are required for effective knowledge discovery from GO annotation data. We present a data mining approach, Multi-ontology data mining at All Levels (MOAL), that uses the structure and relationships of the GO to mine multi-ontology multi-level association rules. We introduce two interestingness measures, Multi-ontology Support (MOSupport) and Multi-ontology Confidence (MOConfidence), customized to evaluate multi-ontology multi-level association rules. We also describe a variety of post-processing strategies for pruning uninteresting rules. We use publicly available GO annotation data to demonstrate our methods with respect to two applications: (1) the discovery of co-annotation suggestions and (2) the discovery of new cross-ontology relationships. PMID:23850840
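
    The abstract does not define MOSupport or MOConfidence, so they are not reproduced here. For orientation, the sketch below computes the ordinary support and confidence of a cross-ontology rule on toy annotation data. The gene names and term labels are made up, and the prefixes MF, BP, and CC simply mark the three GO sub-ontologies.

```python
# Toy annotation table: gene -> set of annotated terms (all hypothetical).
annotations = {
    "geneA": {"MF:kinase activity", "BP:phosphorylation", "CC:cytoplasm"},
    "geneB": {"MF:kinase activity", "BP:phosphorylation"},
    "geneC": {"MF:kinase activity", "CC:nucleus"},
    "geneD": {"BP:phosphorylation", "CC:cytoplasm"},
}

def support(itemset, data):
    """Fraction of genes annotated with every term in itemset."""
    itemset = set(itemset)
    return sum(itemset <= terms for terms in data.values()) / len(data)

def confidence(antecedent, consequent, data):
    """Support of the full rule divided by support of its antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, data) / support(antecedent, data)

rule_support = support({"MF:kinase activity", "BP:phosphorylation"}, annotations)
rule_confidence = confidence({"MF:kinase activity"}, {"BP:phosphorylation"}, annotations)
print(rule_support, round(rule_confidence, 3))  # 0.5 0.667
```

    A multi-level measure would additionally have to account for the GO hierarchy, since a gene annotated with a specific term is implicitly annotated with that term's ancestors.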

  9. A novel approach to the discovery of survival biomarkers in glioblastoma using a joint analysis of DNA methylation and gene expression

    PubMed Central

    Smith, Ashley A; Huang, Yen-Tsung; Eliot, Melissa; Houseman, E Andres; Marsit, Carmen J; Wiencke, John K; Kelsey, Karl T

    2014-01-01

    Glioblastoma multiforme (GBM) is the most aggressive of all brain tumors, with a median survival of less than 1.5 years. Recently, epigenetic alterations were found to play key roles in both glioma genesis and clinical outcome, demonstrating the need to integrate genetic and epigenetic data in predictive models. To enhance current models through discovery of novel predictive biomarkers, we employed a genome-wide, agnostic strategy to specifically capture both methylation-directed changes in gene expression and alternative associations of DNA methylation with disease survival in glioma. Human GBM-associated DNA methylation, gene expression, IDH1 mutation status, and survival data were obtained from The Cancer Genome Atlas. DNA methylation loci and expression probes were paired by gene, and their subsequent association with survival was determined by applying an accelerated failure time model to previously published alternative and expression-based association equations. Significant associations were seen in 27 unique methylation/expression pairs with expression-based, alternative, and combinatorial associations observed (10, 13, and 4 pairs, respectively). The majority of the predictive DNA methylation loci were located within CpG islands, and all but three of the locus pairs were negatively correlated with survival. This finding suggests that for most loci, methylation/expression pairs are inversely related, consistent with methylation-associated gene regulatory action. Our results indicate that changes in DNA methylation are associated with altered survival outcome through both coordinated changes in gene expression and alternative mechanisms. Furthermore, our approach offers an alternative method of biomarker discovery using a priori gene pairing and precise targeting to identify novel sites for locus-specific therapeutic intervention. PMID:24670968

  10. The Discovery Method in Training.

    ERIC Educational Resources Information Center

    Belbin, R. M.

    In the form of a discussion between faceless people, this booklet concerns discovery learning and its advantages. Subjects covered in the discussions are: Introducing the Discovery Method; An Experiment with British Railways; The OECD Research Projects in U.S.A., Austria, and Sweden; How the Discovery Method Differs from Other Methods; Discovery…

  11. Discovery of a 29-Gene Panel in Peripheral Blood Mononuclear Cells for the Detection of Colorectal Cancer and Adenomas Using High Throughput Real-Time PCR

    PubMed Central

    Ciarloni, Laura; Hosseinian, Sahar; Monnier-Benoit, Sylvain; Imaizumi, Natsuko; Dorta, Gian; Ruegg, Curzio

    2015-01-01

    Colorectal cancer (CRC) is the second leading cause of cancer-related death in developed countries. Early detection of CRC leads to decreased CRC mortality. A blood-based CRC screening test is highly desirable due to limited invasiveness and high acceptance rate among patients compared to currently used fecal occult blood testing and colonoscopy. Here we describe the discovery and validation of a 29-gene panel in peripheral blood mononuclear cells (PBMC) for the detection of CRC and adenomatous polyps (AP). Blood samples were prospectively collected from a multicenter, case-control clinical study. First, we profiled 93 samples with 667 candidate and 3 reference genes by high throughput real-time PCR (OpenArray system). After analysis, 160 genes were retained and tested again on 51 additional samples. Low expressed and unstable genes were discarded resulting in a final dataset of 144 samples profiled with 140 genes. To define which genes, alone or in combinations had the highest potential to discriminate AP and/or CRC from controls, data were analyzed by a combination of univariate and multivariate methods. A list of 29 potentially discriminant genes was compiled and evaluated for its predictive accuracy by penalized logistic regression and bootstrap. This method discriminated AP >1cm and CRC from controls with a sensitivity of 59% and 75%, respectively, with 91% specificity. The behavior of the 29-gene panel was validated with a LightCycler 480 real-time PCR platform, commonly adopted by clinical laboratories. In this work we identified a 29-gene panel expressed in PBMC that can be used for developing a novel minimally-invasive test for accurate detection of AP and CRC using a standard real-time PCR platform. PMID:25876024

  12. How the serotonin story is being rewritten by new gene-based discoveries principally related to SLC6A4, the serotonin transporter gene, which functions to influence all cellular serotonin systems.

    PubMed

    Murphy, Dennis L; Fox, Meredith A; Timpano, Kiara R; Moya, Pablo R; Ren-Patterson, Renee; Andrews, Anne M; Holmes, Andrew; Lesch, Klaus-Peter; Wendland, Jens R

    2008-11-01

    Discovered and crystallized over sixty years ago, serotonin's important functions in the brain and body were identified over the ensuing years by neurochemical, physiological and pharmacological investigations. This 2008 M. Rapport Memorial Serotonin Review focuses on some of the most recent discoveries involving serotonin that are based on genetic methodologies. These include examples of the consequences that result from direct serotonergic gene manipulation (gene deletion or overexpression) in mice and other species; an evaluation of some phenotypes related to functional human serotonergic gene variants, particularly in SLC6A4, the serotonin transporter gene; and finally, a consideration of the pharmacogenomics of serotonergic drugs with respect to both their therapeutic actions and side effects. The serotonin transporter (SERT) has been the most comprehensively studied of the serotonin system molecular components, and will be the primary focus of this review. We provide in-depth examples of gene-based discoveries primarily related to SLC6A4 that have clarified serotonin's many important homeostatic functions in humans, non-human primates, mice and other species. PMID:18824000

  13. How the Serotonin Story is Being Rewritten By New Gene-Based Discoveries Principally Related to SLC6A4, the Serotonin Transporter Gene, Which Functions To Influence All Cellular Serotonin Systems

    PubMed Central

    Murphy, Dennis L.; Fox, Meredith A.; Timpano, Kiara R.; Moya, Pablo; Ren-Patterson, Renee; Andrews, Anne M.; Holmes, Andrew; Lesch, Klaus-Peter; Wendland, Jens R.

    2009-01-01

    Discovered and crystallized over sixty years ago, serotonin's important functions in the brain and body were identified over the ensuing years by neurochemical, physiological and pharmacological investigations. This 2008 M. Rapport Memorial Serotonin Review focuses on some of the most recent discoveries in serotonin that are based on genetic methodologies. These include examples of the consequences that result from direct serotonergic gene manipulation (gene deletion or overexpression) in mice and other species; an evaluation of some phenotypes related to functional human serotonergic gene variants, particularly in SLC6A4, the serotonin transporter gene; and finally, a consideration of the pharmacogenomics of serotonergic drugs with respect to both their therapeutic actions and side effects. The serotonin transporter (SERT) has been the most comprehensively studied of the serotonin system molecular components, and will be the primary focus of this review. We provide in-depth examples of gene-based discoveries primarily related to SLC6A4 that have clarified serotonin's many important homeostatic functions in humans, non-human primates, mice and other species. PMID:18824000

  14. An integrative data analysis platform for gene set analysis and knowledge discovery in a data warehouse framework.

    PubMed

    Chen, Yi-An; Tripathi, Lokesh P; Mizuguchi, Kenji

    2016-01-01

    Data analysis is one of the most critical and challenging steps in drug discovery and disease biology. A user-friendly resource to visualize and analyse high-throughput data provides a powerful medium for both experimental and computational biologists to understand vastly different biological data types and obtain a concise, simplified and meaningful output for better knowledge discovery. We have previously developed TargetMine, an integrated data warehouse optimized for target prioritization. Here we describe how upgraded and newly modelled data types in TargetMine can now survey the wider biological and chemical data space, relevant to drug discovery and development. To enhance the scope of TargetMine from target prioritization to broad-based knowledge discovery, we have also developed a new auxiliary toolkit to assist with data analysis and visualization in TargetMine. This toolkit features interactive data analysis tools to query and analyse the biological data compiled within the TargetMine data warehouse. The enhanced system enables users to discover new hypotheses interactively by performing complicated searches with no programming and obtaining the results in an easy to comprehend output format. Database URL: http://targetmine.mizuguchilab.org. PMID:26989145

  15. An integrative data analysis platform for gene set analysis and knowledge discovery in a data warehouse framework

    PubMed Central

    Chen, Yi-An; Tripathi, Lokesh P.; Mizuguchi, Kenji

    2016-01-01

    Data analysis is one of the most critical and challenging steps in drug discovery and disease biology. A user-friendly resource to visualize and analyse high-throughput data provides a powerful medium for both experimental and computational biologists to understand vastly different biological data types and obtain a concise, simplified and meaningful output for better knowledge discovery. We have previously developed TargetMine, an integrated data warehouse optimized for target prioritization. Here we describe how upgraded and newly modelled data types in TargetMine can now survey the wider biological and chemical data space, relevant to drug discovery and development. To enhance the scope of TargetMine from target prioritization to broad-based knowledge discovery, we have also developed a new auxiliary toolkit to assist with data analysis and visualization in TargetMine. This toolkit features interactive data analysis tools to query and analyse the biological data compiled within the TargetMine data warehouse. The enhanced system enables users to discover new hypotheses interactively by performing complicated searches with no programming and obtaining the results in an easy to comprehend output format. Database URL: http://targetmine.mizuguchilab.org PMID:26989145

  16. The Genetics of Obsessive-Compulsive Disorder and Tourette Syndrome: An Epidemiological and Pathway-Based Approach for Gene Discovery

    ERIC Educational Resources Information Center

    Grados, Marco A.

    2010-01-01

    Objective: To provide a contemporary perspective on genetic discovery methods applied to obsessive-compulsive disorder (OCD) and Tourette syndrome (TS). Method: A review of research trends in genetics research in OCD and TS is conducted, with emphasis on novel approaches. Results: Genome-wide association studies (GWAS) are now in progress in OCD…

  17. The Genetics of Obsessive-Compulsive Disorder and Tourette Syndrome: An Epidemiological and Pathway-Based Approach for Gene Discovery

    ERIC Educational Resources Information Center

    Grados, Marco A.

    2010-01-01

    Objective: To provide a contemporary perspective on genetic discovery methods applied to obsessive-compulsive disorder (OCD) and Tourette syndrome (TS). Method: A review of research trends in genetics research in OCD and TS is conducted, with emphasis on novel approaches. Results: Genome-wide association studies (GWAS) are now in progress in OCD…

  18. Genomic pathways to antifungal discovery.

    PubMed

    Monk, Brian C; Cannon, Richard D

    2002-12-01

    The limitations of the therapeutic antifungals are becoming increasingly apparent in the clinic due to their modest efficacy against life-threatening systemic fungal infections. These antifungals belong to only a few structural classes that affect a small range of targets. Some are quite toxic in humans, while the use of others, particularly the azole drugs, has encouraged the emergence of resistant clinical isolates and the selection of innately resistant fungal pathogens. Only a few new drugs based on novel targets are in clinical development, and these may be insufficient to overcome the changing tide of fungal disease. In parallel with the successful completion of the Saccharomyces cerevisiae and human genome sequencing projects, an increasing number of genome sequencing projects are being initiated and completed for significant fungal pathogens. The growing repository of genomic information, which is complemented by decades of genetic and biochemical study, is now available for genome-wide analysis of gene function and for incisive inter-genomic comparison, with the S. cerevisiae and human genomes providing key points of reference. Functional genomic and comparative genomic techniques, many of which were developed with S. cerevisiae, are being applied to fungal pathogens with the aim of obtaining an integrated view of fungal biology and extracting targets suitable for drug discovery. This review describes some of these techniques, their limitations and their increasing contribution to the antifungal discovery process through effective gene annotation, target identification and prioritization, and in the optimization of antifungal leads. PMID:12570738

  19. In-depth cDNA library sequencing provides quantitative gene expression profiling in cancer biomarker discovery.

    PubMed

    Yang, Wanling; Ying, Dingge; Lau, Yu-Lung

    2009-06-01

    Quantitative gene expression analysis plays an important role in identifying differentially expressed genes in various pathological states, gene expression regulation and co-regulation, shedding light on gene functions. Although microarray is widely used as a powerful tool in this regard, it is suboptimal quantitatively and unable to detect unknown gene variants. Here we demonstrate effective detection of differential expression and co-regulation of certain genes by expressed sequence tag analysis using a selected subset of cDNA libraries. We discuss the issues of sequencing depth and library preparation, and propose that increased sequencing depth and improved preparation procedures may allow detection of many expression features for less abundant gene variants. With the reduction of sequencing cost and the emergence of new-generation sequencing technology, in-depth sequencing of cDNA pools or libraries may represent a better and more powerful tool in gene expression profiling and cancer biomarker detection. We also propose using sequence-specific subtraction to remove hundreds of the most abundant housekeeping genes to increase sequencing depth without affecting the relative expression ratio of other genes, as transcripts from as few as 300 of the most abundantly expressed genes constitute about 20% of the total transcriptome. In-depth sequencing also offers the unique advantage of detecting unknown forms of transcripts, such as alternative splicing variants, fusion genes, and regulatory RNAs, as well as detecting mutations and polymorphisms that may play important roles in disease pathogenesis. PMID:19591787
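
    The claim that subtraction raises depth without changing relative expression ratios can be illustrated with a small calculation. The sketch below assumes an idealized subtraction that removes the targeted transcripts completely and nothing else. The 20% share comes from the abstract, while the gene labels and the remaining fractions are made up.

```python
def after_subtraction(fractions, removed):
    """Renormalize transcript fractions after removing the given genes entirely."""
    kept = {g: f for g, f in fractions.items() if g not in removed}
    total = sum(kept.values())
    return {g: f / total for g, f in kept.items()}

# Fractions of the transcript pool (hypothetical apart from the 20% share).
pool = {
    "abundant_housekeeping": 0.20,  # the ~300 most abundant genes, per the abstract
    "gene_x": 0.002,
    "gene_y": 0.001,
    "everything_else": 0.797,
}

after = after_subtraction(pool, {"abundant_housekeeping"})
depth_gain = after["gene_x"] / pool["gene_x"]    # about 1.25: each remaining gene gets 25% more reads
ratio_before = pool["gene_x"] / pool["gene_y"]   # about 2
ratio_after = after["gene_x"] / after["gene_y"]  # still about 2
print(round(depth_gain, 3), round(ratio_before, 3), round(ratio_after, 3))
```

    At a fixed number of reads, removing 20% of the pool is equivalent to sequencing the remaining transcripts 1/(1 - 0.2) = 1.25 times deeper.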

  20. Discovery and evaluation of candidate sex-determining genes and xenobiotics in the gonads of lake sturgeon (Acipenser fulvescens).

    PubMed

    Hale, Matthew C; Jackson, James R; Dewoody, J Andrew

    2010-07-01

    Modern pyrosequencing has the potential to uncover many interesting aspects of genome evolution, even in lineages where genomic resources are scarce. In particular, 454 pyrosequencing of nonmodel species has been used to characterize expressed sequence tags, xenobiotics, gene ontologies, and relative levels of gene expression. Herein, we use pyrosequencing to study the evolution of genes expressed in the gonads of a polyploid fish, the lake sturgeon (Acipenser fulvescens). Using 454 pyrosequencing of transcribed genes, we produced more than 125 MB of sequence data from 473,577 high-quality sequencing reads. Sequences that passed stringent quality control thresholds were assembled into 12,791 male contigs and 32,629 female contigs. Average depth of coverage was 4.2x for the male assembly and 5.5x for the female assembly. Analytical rarefaction indicates that our assemblies include most of the genes expressed in lake sturgeon gonads. Over 86,700 sequencing reads were assigned gene ontologies, many to general housekeeping genes like protein, RNA, and ion binding genes. We searched specifically for sex determining genes and documented significant sex differences in the expression of two genes involved in animal sex determination, DMRT1 and TRA-1. DMRT1 is the master sex determining gene in birds and in medaka (Oryzias latipes) whereas TRA-1 helps direct sexual differentiation in nematodes. We also searched the lake sturgeon assembly for evidence of xenobiotic organisms that may exist as endosymbionts. Our results suggest that exogenous parasites (trematodes) and pathogens (protozoans) apparently have infected lake sturgeon gonads, and the trematodes have horizontally transferred some genes to the lake sturgeon genome. PMID:20386959

  1. Gene disruptions using P transposable elements: An integral component of the Drosophila genome project

    SciTech Connect

    Spradling, A.C.; Stern, D.M.; Kiss, I.

    1995-11-21

    Biologists require genetic as well as molecular tools to decipher genomic information and ultimately to understand gene function. The Berkeley Drosophila Genome Project is addressing these needs with a massive gene disruption project that uses individual, genetically engineered P transposable elements to target open reading frames throughout the Drosophila genome. DNA flanking the insertions is sequenced, thereby placing an extensive series of genetic markers on the physical genomic map and associating insertions with specific open reading frames and genes. Insertions from the collection now lie within or near most Drosophila genes, greatly reducing the time required to identify new mutations and analyze gene functions. Information revealed from these studies about P element site specificity is being used to target the remaining open reading frames. 38 refs., 5 figs., 1 tab.

  2. Coupled Transcriptome and Proteome Analysis of Human Lymphotropic Tumor Viruses: Insights on the Detection and Discovery of Viral Genes

    SciTech Connect

    Dresang, Lindsay R.; Teuton, Jeremy R.; Feng, Huichen; Jacobs, Jon M.; Camp, David G.; Purvine, Samuel O.; Gritsenko, Marina A.; Li, Zhihua; Smith, Richard D.; Sugden, Bill; Moore, Patrick S.; Chang, Yuan

    2011-12-20

    Kaposi's sarcoma-associated herpesvirus (KSHV) and Epstein-Barr virus (EBV) are related human tumor viruses that cause primary effusion lymphomas (PEL) and Burkitt's lymphomas (BL), respectively. Viral genes expressed in naturally-infected cancer cells contribute to disease pathogenesis; knowing which viral genes are expressed is critical in understanding how these viruses cause cancer. To evaluate the expression of viral genes, we used high-resolution separation and mass spectrometry coupled with custom tiling arrays to align the viral proteomes and transcriptomes of three PEL and two BL cell lines under latent and lytic culture conditions. Results: The majority of viral genes were efficiently detected at the transcript and/or protein level on manipulating the viral life cycle. Overall the correlation of expressed viral proteins and transcripts was highly complementary in both validating and providing orthogonal data with latent/lytic viral gene expression. Our approach also identified novel viral genes in both KSHV and EBV, and extends viral genome annotation. Several previously uncharacterized genes were validated at both transcript and protein levels. Conclusions: This systems biology approach coupling proteome and transcriptome measurements provides a comprehensive view of viral gene expression that could not have been attained using each methodology independently. Detection of viral proteins in combination with viral transcripts is a potentially powerful method for establishing virus-disease relationships.

  3. Immune gene discovery by expressed sequence tag (EST) analysis of hemocytes in the ridgetail white prawn Exopalaemon carinicauda

    PubMed Central

    Duan, Yafei; Liu, Ping; Li, Jitao; Li, Jian; Chen, Ping

    2013-01-01

    The ridgetail white prawn Exopalaemon carinicauda is one of the most important commercial species in eastern China. However, little information on immune genes in E. carinicauda has been reported. To identify distinctive genes associated with immunity, an expressed sequence tag (EST) library was constructed from hemocytes of E. carinicauda. A total of 3411 clones were sequenced, yielding 2853 ESTs with an average sequence length of 436 bp. The cluster and assembly analysis yielded 1053 unique sequences including 329 contigs and 724 singletons. Blast analysis identified 593 (56.3%) of the unique sequences as orthologs of genes from other organisms (E-value < 1e-5). Based on the COG and Gene Ontology (GO), 593 unique sequences were classified. Through comparison with previous studies, 153 genes assembled from 367 ESTs have been identified as possibly involved in defense or immune functions. These genes are grouped into seven categories according to their putative functions in the shrimp immune system: antimicrobial peptides, prophenoloxidase activating system, antioxidant defense systems, chaperone proteins, clottable proteins, pattern recognition receptors and other immune-related genes. According to EST abundance, the major immune-related genes were thioredoxin (141, 4.94% of all ESTs) and calmodulin (14, 0.49% of all ESTs). The EST sequences of E. carinicauda hemocytes provide important information on the immune system and lay the groundwork for development of molecular markers related to disease resistance in prawn species. PMID:23092732

  4. Analyzing Interaction of μ-, δ- and κ-opioid Receptor Gene Variants on Alcohol or Drug Dependence Using a Pattern Discovery-based Method

    PubMed Central

    Li, Zhong; Zhang, Huiping

    2013-01-01

    Background Polymorphisms in the μ-, δ- and κ-opioid receptor genes (OPRM1, OPRD1 and OPRK1) have been reported to be associated with substance (alcohol or drug) dependence. The influence of an individual gene on a disease trait should be more evident when analyzed in the context of gene-gene interactions. Thus, we assessed the joint effect of variants in these three opioid receptor genes on alcohol, cocaine, or opioid dependence. Methods Genotype data for 13 OPRM1 Single Nucleotide Polymorphisms (SNPs), 11 OPRD1 SNPs and seven OPRK1 SNPs were obtained from 382 European Americans (EAs) affected with substance dependence [among them, 318 with Alcohol Dependence (AD), 171 with Cocaine Dependence (CD), and 91 with Opioid Dependence (OD)] and 338 EA control subjects. We assessed the joint effect of OPRM1, OPRD1 and OPRK1 variants on AD, CD, or OD using a pattern discovery-based association test. Specific marker patterns (consisting of alleles of OPRM1, OPRD1 and OPRK1) that were significantly more frequent in AD, CD, or OD cases than in controls were identified. Results 12 significant patterns in the AD dataset, four significant patterns in the CD dataset, and 18 significant patterns in the OD dataset were identified. Moreover, the significance of most marker patterns was due primarily to OPRM1 variants and, to a lesser degree, OPRD1 variants. Conclusion Our findings suggest that variation in the above three opioid receptor genes can jointly influence the vulnerability of individuals to alcohol or drug dependence. Evidence provided by this study also supports previous biological findings that the interaction of the three opioid receptors can modulate the action of opioid and non-opioid drugs and alcohol. PMID:24533225

  5. Mapping our genes: Federal Genome projects: How vast. How Fast

    SciTech Connect

    Not Available

    1987-06-26

    This document provides the transcript of the discussion by panel members of legal and ethical issues confronting the biotechnology community. The meeting was convened at the request of the Congressional Office of Technology Assessment to help it prepare an assessment of the Human Genome Project. The issues discussed include patent rights, intellectual property rights, and the positions of the US National Institutes of Health, US Department of Energy, and US National Science Foundation with regard to these rights.

  6. Pigmentation in sand pear (Pyrus pyrifolia) fruit: biochemical characterization, gene discovery and expression analysis with exocarp pigmentation mutant.

    PubMed

    Wang, Yue-zhi; Zhang, Shujun; Dai, Mei-song; Shi, Ze-bin

    2014-05-01

    Exocarp color of sand pear is an important trait for fruit production and has long been of interest. Our previous study explored the differentially expressed genes between two genotypes contrasting in exocarp color, which indicated differences in suberin, cutin, wax and lignin biosynthesis between the russet and green exocarp. In this study, we carried out microscopic observation and Fourier transform infrared spectroscopy analysis to detect the differences in tissue structure and biochemical composition between the russet and green exocarp of sand pear. The green exocarp was covered with epidermis and cuticle, which were replaced by a cork layer on the surface of the russet exocarp, and the chemical composition of the russet exocarp was characterized by lignin, cellulose and hemicellulose. We explored differential gene expression between the russet exocarp of 'Niitaka' and its green exocarp mutant cv. 'Suisho' using Illumina RNA-sequencing. A total of 559 unigenes showed differential expression between the two types of exocarp, and 123 of them were common to the previous study. Quantitative real-time PCR analysis supported the differential expression identified by RNA-seq between the two types of exocarp and revealed preferential expression of these genes in the exocarp rather than in the mesocarp and fruit core. Gene ontology enrichment analysis revealed divergent expression of lipid metabolic process genes, transport genes, stress responsive genes and other biological process genes in the two types of exocarp. Expression changes in lignin metabolism-related genes were consistent with the different pigmentation of russet and green exocarp. Increased transcripts of putative genes involved in suberin, cutin and wax biosynthesis in 'Suisho' exocarp could facilitate deposition of the chemicals and play a role in the mutant trait responsible for the green exocarp. In addition, the divergent expression of ATP-binding cassette transporters involved in the trans-membrane transport of lignin, cutin, and suberin precursors suggests that the transport process could also affect the composition of exocarp and play a role in the regulation of exocarp pigmentation. Results from this study provide a base for the analysis of the molecular mechanism underlying sand pear russet/green exocarp mutation, and present a comprehensive list of candidate genes that could be used to further investigate the trait mutation at the molecular level. PMID:24445590

  7. De novo assembly and discovery of genes that are involved in drought tolerance in Tibetan Sophora moorcroftiana.

    PubMed

    Li, Huie; Yao, Weijie; Fu, Yaru; Li, Shaoke; Guo, Qiqiang

    2015-01-01

    Sophora moorcroftiana, a Leguminosae shrub species that is restricted to the arid and semi-arid regions of the Qinghai-Tibet Plateau, is an ecologically important foundation species and exhibits substantial drought tolerance in the Plateau. There are no functional genomics resources in public databases for understanding the molecular mechanism underlying the drought tolerance of S. moorcroftiana. Therefore, we performed a large-scale transcriptome sequencing of this species under drought stress using the Illumina sequencing technology. A total of 62,348,602 clean reads were obtained. The assembly of the clean reads resulted in 146,943 transcripts, including 66,026 unigenes. In the assembled sequences, 1534 transcription factors were identified and classified into 23 different common families, and 9040 SSR loci, from di- to hexa-nucleotides, whose repeat number is greater than five, were presented. In addition, we performed a gene expression profiling analysis upon dehydration treatment. The results indicated significant differences in the gene expression profiles among the control, mild stress and severe stress. In total, 4687, 5648 and 5735 genes were identified from the comparison of mild versus control, severe versus control and severe versus mild stress, respectively. Based on the differentially expressed genes, a Gene Ontology annotation analysis indicated many dehydration-relevant categories, including 'response to water stimulus' and 'response to water deprivation'. Meanwhile, the Kyoto Encyclopedia of Genes and Genomes pathway analysis uncovered some important pathways, such as 'metabolic pathways' and 'plant hormone signal transduction'. In addition, the expression patterns of 25 putative genes that are involved in drought tolerance resulting from quantitative real-time PCR were consistent with their transcript abundance changes as identified by RNA-seq. The globally sequenced genes covered a considerable proportion of the S. moorcroftiana transcriptome, and the expression results may be useful to further extend the knowledge on the drought tolerance of this plant species that survives under Plateau conditions. PMID:25559297

  8. De Novo Assembly, Gene Annotation, and Marker Discovery in Stored-Product Pest Liposcelis entomophila (Enderlein) Using Transcriptome Sequences

    PubMed Central

    Wei, Dan-Dan; Chen, Er-Hu; Ding, Tian-Bo; Chen, Shi-Chun; Dou, Wei; Wang, Jin-Jun

    2013-01-01

    Background As a major stored-product pest insect, Liposcelis entomophila has developed high levels of resistance to various insecticides in grain storage systems. However, the molecular mechanisms underlying resistance and environmental stress have not been characterized. To date, there is a lack of genomic information for this species. Therefore, studies aimed at profiling the L. entomophila transcriptome would provide a better understanding of the biological functions at the molecular level. Methodology/Principal Findings We applied Illumina sequencing technology to sequence the transcriptome of L. entomophila. A total of 54,406,328 clean reads were obtained and de novo assembled into 54,220 unigenes, with an average length of 571 bp. Through a similarity search, 33,404 (61.61%) unigenes were matched to known proteins in the NCBI non-redundant (Nr) protein database. These unigenes were further functionally annotated with gene ontology (GO), cluster of orthologous groups of proteins (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. A large number of genes potentially involved in insecticide resistance were manually curated, including 68 putative cytochrome P450 genes, 37 putative glutathione S-transferase (GST) genes, 19 putative carboxyl/cholinesterase (CCE) genes, and 126 other transcripts containing target site sequences or encoding detoxification genes representing eight types of resistance enzymes. Furthermore, to gain insight into the molecular basis of the L. entomophila response to thermal stresses, 25 heat shock protein (Hsp) genes were identified. In addition, 1,100 SSRs and 57,757 SNPs were detected and 231 pairs of SSR primers were designed for investigating genetic diversity in the future. Conclusions/Significance We developed a comprehensive transcriptomic database for L. entomophila. These sequences and putative molecular markers would further promote our understanding of the molecular mechanisms underlying insecticide resistance or environmental stress, and will facilitate studies on population genetics for psocids, as well as providing useful information for functional genomic research in the future. PMID:24244605

  9. Discovery of Genes Related to Insecticide Resistance in Bactrocera dorsalis by Functional Genomic Analysis of a De Novo Assembled Transcriptome

    PubMed Central

    Hsu, Ju-Chun; Wu, Wen-Jer; Feng, Hai-Tung; Haymer, David S.; Chen, Chien-Yu

    2012-01-01

    Insecticide resistance has recently become a critical concern for control of many insect pest species. Genome sequencing and global quantification of gene expression through analysis of the transcriptome can provide useful information relevant to this challenging problem. The oriental fruit fly, Bactrocera dorsalis, is one of the world's most destructive agricultural pests, and recently it has been used as a target for studies of genetic mechanisms related to insecticide resistance. However, prior to this study, the molecular data available for this species was largely limited to genes identified through homology. To provide a broader pool of gene sequences of potential interest with regard to insecticide resistance, this study uses whole transcriptome analysis developed through de novo assembly of short reads generated by next-generation sequencing (NGS). The transcriptome of B. dorsalis was initially constructed using Illumina's Solexa sequencing technology. Qualified reads were assembled into contigs and potential splicing variants (isotigs). A total of 29,067 isotigs have putative homologues in the non-redundant (nr) protein database from NCBI, and 11,073 of these correspond to distinct D. melanogaster proteins in the RefSeq database. Approximately 5,546 isotigs contain coding sequences that are at least 80% complete and appear to represent B. dorsalis genes. We observed a strong correlation between the completeness of the assembled sequences and the expression intensity of the transcripts. The assembled sequences were also used to identify large numbers of genes potentially belonging to families related to insecticide resistance. A total of 90 P450-, 42 GST- and 37 COE-related genes, representing three major enzyme families involved in insecticide metabolism and resistance, were identified. In addition, 36 isotigs were discovered to contain target site sequences related to four classes of resistance genes. Identified sequence motifs were also analyzed to characterize putative polypeptide translational products and associate them with specific genes and protein functions. PMID:22879883

  10. Sleeping Beauty Transposon Mutagenesis as a Tool for Gene Discovery in the NOD Mouse Model of Type 1 Diabetes.

    PubMed

    Elso, Colleen M; Chu, Edward P F; Alsayb, May A; Mackin, Leanne; Ivory, Sean T; Ashton, Michelle P; Bröer, Stefan; Silveira, Pablo A; Brodnicki, Thomas C

    2015-12-01

    A number of different strategies have been used to identify genes for which genetic variation contributes to type 1 diabetes (T1D) pathogenesis. Genetic studies in humans have identified >40 loci that affect the risk for developing T1D, but the underlying causative alleles are often difficult to pinpoint or have subtle biological effects. A complementary strategy to identifying "natural" alleles in the human population is to engineer "artificial" alleles within inbred mouse strains and determine their effect on T1D incidence. We describe the use of the Sleeping Beauty (SB) transposon mutagenesis system in the nonobese diabetic (NOD) mouse strain, which harbors a genetic background predisposed to developing T1D. Mutagenesis in this system is random, but a green fluorescent protein (GFP)-polyA gene trap within the SB transposon enables early detection of mice harboring transposon-disrupted genes. The SB transposon also acts as a molecular tag to, without additional breeding, efficiently identify mutated genes and prioritize mutant mice for further characterization. We show here that the SB transposon is functional in NOD mice and can produce a null allele in a novel candidate gene that increases diabetes incidence. We propose that SB transposon mutagenesis could be used as a complementary strategy to traditional methods to help identify genes that, when disrupted, affect T1D pathogenesis. PMID:26438296

  11. Sleeping Beauty Transposon Mutagenesis as a Tool for Gene Discovery in the NOD Mouse Model of Type 1 Diabetes

    PubMed Central

    Elso, Colleen M.; Chu, Edward P. F.; Alsayb, May A.; Mackin, Leanne; Ivory, Sean T.; Ashton, Michelle P.; Bröer, Stefan; Silveira, Pablo A.; Brodnicki, Thomas C.

    2015-01-01

    A number of different strategies have been used to identify genes for which genetic variation contributes to type 1 diabetes (T1D) pathogenesis. Genetic studies in humans have identified >40 loci that affect the risk for developing T1D, but the underlying causative alleles are often difficult to pinpoint or have subtle biological effects. A complementary strategy to identifying “natural” alleles in the human population is to engineer “artificial” alleles within inbred mouse strains and determine their effect on T1D incidence. We describe the use of the Sleeping Beauty (SB) transposon mutagenesis system in the nonobese diabetic (NOD) mouse strain, which harbors a genetic background predisposed to developing T1D. Mutagenesis in this system is random, but a green fluorescent protein (GFP)-polyA gene trap within the SB transposon enables early detection of mice harboring transposon-disrupted genes. The SB transposon also acts as a molecular tag to, without additional breeding, efficiently identify mutated genes and prioritize mutant mice for further characterization. We show here that the SB transposon is functional in NOD mice and can produce a null allele in a novel candidate gene that increases diabetes incidence. We propose that SB transposon mutagenesis could be used as a complementary strategy to traditional methods to help identify genes that, when disrupted, affect T1D pathogenesis. PMID:26438296

  12. Accelerating novel candidate gene discovery in neurogenetic disorders via whole-exome sequencing of prescreened multiplex consanguineous families.

    PubMed

    Alazami, Anas M; Patel, Nisha; Shamseldin, Hanan E; Anazi, Shamsa; Al-Dosari, Mohammed S; Alzahrani, Fatema; Hijazi, Hadia; Alshammari, Muneera; Aldahmesh, Mohammed A; Salih, Mustafa A; Faqeih, Eissa; Alhashem, Amal; Bashiri, Fahad A; Al-Owain, Mohammed; Kentab, Amal Y; Sogaty, Sameera; Al Tala, Saeed; Temsah, Mohamad-Hani; Tulbah, Maha; Aljelaify, Rasha F; Alshahwan, Saad A; Seidahmed, Mohammed Zain; Alhadid, Adnan A; Aldhalaan, Hesham; AlQallaf, Fatema; Kurdi, Wesam; Alfadhel, Majid; Babay, Zainab; Alsogheer, Mohammad; Kaya, Namik; Al-Hassnan, Zuhair N; Abdel-Salam, Ghada M H; Al-Sannaa, Nouriya; Al Mutairi, Fuad; El Khashab, Heba Y; Bohlega, Saeed; Jia, Xiaofei; Nguyen, Henry C; Hammami, Rakad; Adly, Nouran; Mohamed, Jawahir Y; Abdulwahab, Firdous; Ibrahim, Niema; Naim, Ewa A; Al-Younes, Banan; Meyer, Brian F; Hashem, Mais; Shaheen, Ranad; Xiong, Yong; Abouelhoda, Mohamed; Aldeeri, Abdulrahman A; Monies, Dorota M; Alkuraya, Fowzan S

    2015-01-13

    Our knowledge of disease genes in neurological disorders is incomplete. With the aim of closing this gap, we performed whole-exome sequencing on 143 multiplex consanguineous families in whom known disease genes had been excluded by autozygosity mapping and candidate gene analysis. This prescreening step led to the identification of 69 recessive genes not previously associated with disease, of which 33 are here described (SPDL1, TUBA3E, INO80, NID1, TSEN15, DMBX1, CLHC1, C12orf4, WDR93, ST7, MATN4, SEC24D, PCDHB4, PTPN23, TAF6, TBCK, FAM177A1, KIAA1109, MTSS1L, XIRP1, KCTD3, CHAF1B, ARV1, ISCA2, PTRH2, GEMIN4, MYOCD, PDPR, DPH1, NUP107, TMEM92, EPB41L4A, and FAM120AOS). We also encountered instances in which the phenotype departed significantly from the established clinical presentation of a known disease gene. Overall, a likely causal mutation was identified in >73% of our cases. This study contributes to the global effort toward a full compendium of disease genes affecting brain function. PMID:25558065

  13. Discovery of Genes Related to Witches Broom Disease in Paulownia tomentosa × Paulownia fortunei by a De Novo Assembled Transcriptome

    PubMed Central

    Liu, Rongning; Dong, Yanpeng; Fan, Guoqiang; Zhao, Zhenli; Deng, Minjie; Cao, Xibing; Niu, Suyan

    2013-01-01

    In spite of its economic importance, very little molecular genetics and genomic research has been targeted at Paulownia spp. The lack of genetic information on this plant is a major obstacle to studying the mechanisms of its ability to resist Paulownia Witches’ Broom (PaWB) disease. Analysis of the Paulownia transcriptome and its expression profile data is essential to extending the genetic resources on this species, and will thus greatly improve our studies on Paulownia. In the current study, we performed the de novo assembly of a transcriptome on P. tomentosa × P. fortunei using short-read sequencing technology (Illumina). A total of 203,664 unigenes with a mean length of 1,328 bp were obtained. Of these unigenes, 32,976 (30% of all unigenes) containing complete structures were chosen. Eukaryotic clusters of orthologous groups, gene orthology, and Kyoto Encyclopedia of Genes and Genomes annotations were performed on these unigenes. Genes related to PaWB disease resistance were analyzed in detail. To our knowledge, this is the first study to elucidate the genetic makeup of Paulownia. This transcriptome provides a quick way to understand Paulownia, increases the number of gene sequences available for further functional genomics studies and provides clues to the identification of potential PaWB disease resistance genes. This study has provided a comprehensive insight into gene expression profiles at different states, which facilitates the study of each gene’s roles in the developmental process and in PaWB disease resistance. PMID:24278262

  14. Transcriptome-based discovery of pathways and genes related to resistance against Fusarium head blight in wheat landrace Wangshuibai

    PubMed Central

    2013-01-01

    Background Fusarium head blight (FHB), caused mainly by Fusarium graminearum (Fg) Schwabe (teleomorph: Gibberella zeae Schwble), causes serious damage to wheat production. The Chinese wheat landrace Wangshuibai is one of the most important resistance sources in the world. Knowledge of the mechanism underlying its resistance to FHB is still limited. Results To get an overview of the transcriptome characteristics of Wangshuibai during infection by Fg, high-throughput RNA sequencing based on next generation sequencing (NGS) technology (Illumina) was performed. In total, 165,499 unigenes were generated and assigned to known protein databases including the NCBI non-redundant protein database (nr) (82,721, 50.0%), Gene Ontology (GO) (38,184, 23.1%), Swiss-Prot (50,702, 30.6%), Clusters of orthologous groups (COG) (51,566, 31.2%) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) (30,657, 18.5%), as determined by Blastx search. With another NGS-based platform, a digital gene expression (DGE) system, gene expression in Wangshuibai and its FHB-susceptible mutant NAUH117 was profiled and compared at two infection stages, 24 and 48 hours after inoculation with Fg, with the aim of identifying genes involved in FHB resistance. Conclusion Pathogen-related proteins such as PR5, PR14 and ABC transporter, and the JA signaling pathway, were crucial for FHB resistance, especially that mediated by Fhb1. The ET pathway and ROS/NO pathway were not activated in Wangshuibai and may not be pivotal in defense against FHB. Consistent with the fact that NAUH117 carries a chromosome fragment deletion that led to its increased FHB susceptibility, twenty out of eighty-nine genes in Wangshuibai showed changed expression patterns upon infection by Fg. The up-regulation of eight of them was confirmed by qRT-PCR, revealing that they may be candidate genes for Fhb1 and need further functional analysis to confirm their roles in FHB resistance. PMID:23514540
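
    The annotation percentages quoted in the record above can be recomputed from the raw counts. This is a minimal sketch rather than anything from the study itself: the counts come from the abstract, and the dictionary layout and names are assumptions made for illustration.

```python
# Unigene counts per database, copied from the abstract above.
# The names `total_unigenes`, `annotated` and `rates` are illustrative only.
total_unigenes = 165_499
annotated = {
    "nr": 82_721,
    "GO": 38_184,
    "Swiss-Prot": 50_702,
    "COG": 51_566,
    "KEGG": 30_657,
}

# Percentage of all unigenes assigned to each database, to one decimal place.
rates = {db: round(100.0 * n / total_unigenes, 1) for db, n in annotated.items()}

for db, rate in rates.items():
    print(f"{db}: {rate}%")
```

    Each rate is computed against the full set of 165,499 unigenes, so the categories overlap and are not expected to sum to 100%.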

  15. Discovery and characterization of miRNA genes in atlantic salmon (Salmo salar) by use of a deep sequencing approach

    PubMed Central

    2013-01-01

    Background MicroRNAs (miRNAs) are an abundant class of endogenous small RNA molecules that downregulate gene expression at the posttranscriptional level. They play important roles in multiple biological processes by regulating genes that control developmental timing, growth, stem cell division and apoptosis by binding to the mRNA of target genes. Despite the position Atlantic salmon (Salmo salar) has as an economically important domesticated animal, there has been little research on miRNAs in this species. Knowledge about miRNAs and their target genes may be used to control health and to improve performance of economically important traits. However, before their biological function can be unravelled they must be identified and annotated. The aims of this study were to identify and characterize miRNA genes in Atlantic salmon by deep sequencing analysis of small RNA libraries from nine different tissues. Results A total of 180 distinct mature miRNAs belonging to 106 families of evolutionary conserved miRNAs, and 13 distinct novel mature miRNAs were discovered and characterized. The mature miRNAs corresponded to 521 putative precursor sequences located at unique genome locations. About 40% of these precursors were part of gene clusters, and the majority of the Salmo salar gene clusters discovered were conserved across species. Comparison of expression levels in samples from different tissues applying DESeq indicated that there were tissue specific expression differences in three conserved and one novel miRNA. Ssa-miR 736 was detected in heart tissue only, while two other clustered miRNAs (ssa-miR 212 and 132) seem to be at a higher expression level in brain tissue. These observations correlate well with their expected functions as regulators of signal pathways in cardiac and neuronal cells, respectively. Ssa-miR 8163 is one of the novel miRNAs discovered and its function remains unknown. However, differential expression analysis using DESeq suggests that this miRNA is enriched in liver tissue and the precursor was mapped to intron 7 of the transferrin gene. Conclusions The identification and annotation of evolutionary conserved and novel Salmo salar miRNAs as well as the characterization of miRNA gene clusters provide biological knowledge that will greatly facilitate further functional studies on miRNAs in this species. PMID:23865519

  16. Anti-cancer Parasporin Toxins are Associated with Different Environments: Discovery of Two Novel Parasporin 5-like Genes.

    PubMed

    Ammons, David R; Short, John D; Bailey, Jeffery; Hinojosa, Gabriela; Tavarez, Lourdes; Salazar, Martha; Rampersad, Joanne N

    2016-02-01

    Cry toxins are primarily a family of insecticidal toxins produced by the bacterium Bacillus thuringiensis (Bt). However, some Cry toxins, called parasporins (PSs), are non-insecticidal and have been shown to differentially kill human cancer cells. Based on amino acid homology, there are currently six different classes of parasporins (PS1-6). It is not known what role parasporins play in nature, nor if certain PSs are associated with Bt found in particular environments. Herein, we present ten parasporin-containing isolates of Bt from the Caribbean island of Trinidad. Genes coding for PS1 and PS6 were found in isolates associated mainly with artificial aquatic environments (e.g., barrels with rain water), while Bt possessing two novel PS5-like genes (ps5-1 and ps5-2) were isolated from manure collected directly from the rectum of cattle. The amino acid sequences inferred from the two PS5-like genes were 51 % homologous to each other, while being only 41 or 45 % similar to PS5Aa1/Cry64Aa, the only reported member of the parasporin 5 class. The low level of amino acid homology between the two PS5-like genes and PS5Aa1 indicates that the two PS5-like genes may represent a new class of parasporins, or greatly expand the level of diversity within the current parasporin 5 class. PMID:26563301

  17. Discovery of Novel Leaf Rust Responsive microRNAs in Wheat and Prediction of Their Target Genes

    PubMed Central

    Kumar, Dhananjay; Singh, Dharmendra; Kanodia, Pulkit; Prabhu, Kumble Vinod; Kumar, Manish; Mukhopadhyay, Kunal

    2014-01-01

    MicroRNAs are endogenous small noncoding RNAs which play critical roles in gene regulation. Few wheat (Triticum aestivum L.) miRNA sequences are available in the miRBase repertoire and knowledge of their biological functions related to biotic stress is limited. We identified 52 miRNAs, belonging to 19 families, from next-generation transcriptome sequence data based on homology search. One wheat-specific novel miRNA was identified but could not be assigned to any known miRNA family. Twenty-two differentially expressed miRNAs were found between susceptible and resistant wheat near-isogenic lines (NILs) inoculated with the leaf rust pathogen Puccinia triticina and compared with mock-inoculated controls. Most miRNAs were more upregulated in the susceptible NIL than in the resistant NIL. We identified 1306 potential target genes for these 52 miRNAs with vital roles in response to stimuli, signaling, and diverse metabolic and cellular processes. Gene ontology analysis showed 66, 20, and 35 target genes to be categorized into biological process, molecular function, and cellular component, respectively. A miRNA-mediated regulatory network revealed relationships among the components of the targetome. The present study provides insight into potential miRNAs with probable roles in leaf rust pathogenesis and their target genes in wheat, which establishes a foundation for future studies. PMID:25180085

  18. Discovery of gene-gene interactions across multiple independent data sets of late onset Alzheimer disease from the Alzheimer Disease Genetics Consortium.

    PubMed

    Hohman, Timothy J; Bush, William S; Jiang, Lan; Brown-Gentry, Kristin D; Torstenson, Eric S; Dudek, Scott M; Mukherjee, Shubhabrata; Naj, Adam; Kunkle, Brian W; Ritchie, Marylyn D; Martin, Eden R; Schellenberg, Gerard D; Mayeux, Richard; Farrer, Lindsay A; Pericak-Vance, Margaret A; Haines, Jonathan L; Thornton-Wells, Tricia A

    2016-02-01

    Late-onset Alzheimer disease (AD) has a complex genetic etiology, involving locus heterogeneity, polygenic inheritance, and gene-gene interactions; however, the investigation of interactions in recent genome-wide association studies has been limited. We used a biological knowledge-driven approach to evaluate gene-gene interactions for consistency across 13 data sets from the Alzheimer Disease Genetics Consortium. Fifteen single nucleotide polymorphism (SNP)-SNP pairs within 3 gene-gene combinations were identified: SIRT1 × ABCB1, PSAP × PEBP4, and GRIN2B × ADRA1A. In addition, we extend a previously identified interaction from an endophenotype analysis between RYR3 × CACNA1C. Finally, post hoc gene expression analyses of the implicated SNPs further implicate SIRT1 and ABCB1, and implicate CDH23 which was most recently identified as an AD risk locus in an epigenetic analysis of AD. The observed interactions in this article highlight ways in which genotypic variation related to disease may depend on the genetic context in which it occurs. Further, our results highlight the utility of evaluating genetic interactions to explain additional variance in AD risk and identify novel molecular mechanisms of AD pathogenesis. PMID:26827652

  19. Discovery of a Linear Peptide for Improving Tumor Targeting of Gene Products and Treatment of Distal Tumors by IL-12 Gene Therapy

    PubMed Central

    Cutrera, Jeffry; Dibra, Denada; Xia, Xueqing; Hasan, Azeem; Reed, Scott; Li, Shulin

    2011-01-01

    Like many effective therapeutics, interleukin-12 (IL-12) therapy often causes side effects. Tumor targeted delivery may improve the efficacy and decrease the toxicity of systemic IL-12 treatments. In this study, a novel targeting approach was investigated. A secreted alkaline phosphatase (SEAP) reporter gene-based screening process was used to identify a mini-peptide which can be produced in vivo to target gene products to tumors. The coding region for the best peptide was inserted into an IL-12 gene to determine the antitumor efficacy. Affinity chromatography, mass spectrometry analysis, and binding studies were used to identify a receptor for this peptide. We discovered that the linear peptide VNTANST increased the tumor accumulation of the reporter gene products in five independent tumor models including one human xenogeneic model. The product from VNTANST-IL-12 fusion gene therapy increased accumulation of IL-12 in the tumor environment, and in three tumor models, VNTANST-IL-12 gene therapy inhibited distal tumor growth. In a spontaneous lung metastasis model, inhibition of metastatic tumor growth was improved compared to wild-type IL-12 gene therapy, and in a squamous cell carcinoma model, toxic liver lesions were reduced. The receptor for VNTANST was identified as vimentin. These results show the promise of using VNTANST to improve IL-12 treatments. PMID:21386825

  20. Toxins and drug discovery.

    PubMed

    Harvey, Alan L

    2014-12-15

    Components from venoms have stimulated many drug discovery projects, with some notable successes. These are briefly reviewed, from captopril to ziconotide. However, there have been many more disappointments on the road from toxin discovery to approval of a new medicine. Drug discovery and development is an inherently risky business, and the main causes of failure during development programmes are outlined in order to highlight steps that might be taken to increase the chances of success with toxin-based drug discovery. These include having a clear focus on unmet therapeutic needs, concentrating on targets that are well-validated in terms of their relevance to the disease in question, making use of phenotypic screening rather than molecular-based assays, and working with development partners with the resources required for the long and expensive development process. PMID:25448391

  1. Comparative transcriptome analysis of testes and ovaries for the discovery of novel genes from Amur sturgeon (Acipenser schrenckii).

    PubMed

    Jin, S B; Zhang, Y; Dong, X L; Xi, Q K; Song, D; Fu, H T; Sun, D J

    2015-01-01

    Sturgeons (Acipenser schrenckii) are of high evolutionary, economic, and conservation value, and caviar is one of the most valuable animal food products in the world. The Illumina HiSeq2000 sequencing platform was used to construct testicular and ovarian transcriptomes to identify genes involved in reproduction and sex determination in A. schrenckii. A total of 122,381 and 114,527 unigenes were obtained in the testicular and ovarian transcriptomes, respectively, with average lengths of 748 and 697 bp. A total of 46,179 genes were matched to the non-redundant (nr) database. GO (31,266), KEGG (39,712), and COG analyses (20,126) were performed to identify potential genes and their functions. Twenty-six gene families involved in reproduction and sex determination were identified from the A. schrenckii testicular and ovarian transcriptomes based on functional annotation of non-redundant transcripts and comparisons with the published literature. Furthermore, 1309 unigenes showed significant differences between the testes and ovaries, including 782 genes that were up-regulated in the testes and 527 that were up-regulated in the ovaries. Eleven genes were involved in reproduction and sex determination mechanisms. Furthermore, 19,065 simple sequence repeats (SSRs) were identified in the expressed sequence tag dataset, and 190,863 and 193,258 single nucleotide polymorphisms (SNPs) were obtained from the testicular and ovarian transcriptomic databases, respectively. This study provides new sequence information about A. schrenckii, which will provide a basis for the further study of reproduction and sex determination mechanisms in Acipenser species. The potential SSR and SNP markers isolated from the transcriptome may shed light on the evolution and molecular ecology of Acipenser species. PMID:26782541

  2. Chemical Discovery

    ERIC Educational Resources Information Center

    Brown, Herbert C.

    1974-01-01

    The role of discovery in the advance of the science of chemistry and the factors that are currently operating to handicap that function are considered. Examples are drawn from the author's work with boranes. The thesis that exploratory research and discovery should be encouraged is stressed. (DT)

  3. Manipulating gene expression in projection-specific neuronal populations using combinatorial viral approaches

    PubMed Central

    Gore, Bryan B.; Soden, Marta E.; Zweifel, Larry S.

    2013-01-01

    The mammalian brain contains tremendous structural and genetic complexity that is vital for its function. The elucidation of gene expression profiles in the brain, coupled with the development of large-scale connectivity maps and emerging viral vector-based approaches for target-selective gene manipulation, now allow for detailed dissection of gene-circuit interfaces. This protocol details how to perform combinatorial viral injections to manipulate gene expression in subsets of neurons interconnecting two brain regions. This method utilizes stereotaxic injection of a retrograde transducing CAV2-Cre virus into one brain region, combined with injection of a locally transducing Cre-dependent AAV virus into another brain region. This technique is widely applicable to the genetic dissection of neural circuitry, as it enables selective expression of candidate genes, dominant-negatives, fluorescent reporters, or genetic tools within heterogeneous populations of neurons based upon their projection targets. PMID:25429312

  4. Discovery of single nucleotide polymorphisms in candidate genes associated with fertility and production traits in Holstein cattle

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Identification of single nucleotide polymorphisms (SNPs) for specific genes involved in reproduction might improve reliability of genomic estimates for these low-heritability traits. Semen from 550 Holstein bulls of high (>= 1.7; n = 288) or low (<= -2; n = 262) daughter pregnancy rate (DPR) was geno...

  5. SNP discovery and development of genetic markers for mapping immune response genes in common carp (Cyprinus carpio)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Single nucleotide polymorphisms (SNPs) in immune response genes have been reported as markers for susceptibility to infectious diseases in humans and livestock. A disease caused by cyprinid herpesvirus 3 (CyHV-3) is highly contagious and virulent in common carp (Cyprinus carpio). With the aim to de...

  6. Transcriptome Analysis of the White Body of the Squid Euprymna tasmanica with Emphasis on Immune and Hematopoietic Gene Discovery

    PubMed Central

    Salazar, Karla A.; Joffe, Nina R.; Dinguirard, Nathalie; Houde, Peter; Castillo, Maria G.

    2015-01-01

    In the mutualistic relationship between the squid Euprymna tasmanica and the bioluminescent bacterium Vibrio fischeri, several host factors, including immune-related proteins, are known to interact and respond specifically and exclusively to the presence of the symbiont. In squid and octopus, the white body is considered to be an immune organ mainly due to the fact that blood cells, or hemocytes, are known to be present in high numbers and in different developmental stages. Hence, the white body has been described as the site of hematopoiesis in cephalopods. However, to our knowledge, there are no studies showing any molecular evidence of such functions. In this study, we performed a transcriptomic analysis of white body tissue of the Southern dumpling squid, E. tasmanica. Our primary goal was to gain insights into the functions of this tissue and to test for the presence of gene transcripts associated with hematopoietic and immune processes. Several hematopoiesis genes including CPSF1, GATA 2, TFIID, and FGFR2 were found to be expressed in the white body. In addition, transcripts associated with immune-related signal transduction pathways, such as the toll-like receptor/NF-κB and MAPK pathways, were also found, as well as other immune genes previously identified in E. tasmanica’s sister species, E. scolopes. This study is the first to analyze an immune organ within cephalopods, and to provide gene expression data supporting the white body as a hematopoietic tissue. PMID:25775132

  7. Discovery of single nucleotide polymorphisms in candidate genes associated with fertility and production traits in Holstein cattle

    PubMed Central

    2013-01-01

    Background Identification of single nucleotide polymorphisms (SNPs) for specific genes involved in reproduction might improve reliability of genomic estimates for these low-heritability traits. Semen from 550 Holstein bulls of high (≥ 1.7; n = 288) or low (≤ −2; n = 262) daughter pregnancy rate (DPR) was genotyped for 434 candidate SNPs using the Sequenom MassARRAY® system. Three types of SNPs were evaluated: SNPs previously reported to be associated with reproductive traits or physically close to genetic markers for reproduction, SNPs in genes that are well known to be involved in reproductive processes, and SNPs in genes that are differentially expressed between physiological conditions in a variety of tissues associated with reproductive function. Eleven reproduction and production traits were analyzed. Results A total of 40 SNPs were associated (P < 0.05) with DPR. Among these were genes involved in the endocrine system, cell signaling, immune function and inhibition of apoptosis. A total of 10 genes were regulated by estradiol. In addition, 22 SNPs were associated with heifer conception rate, 33 with cow conception rate, 36 with productive life, 34 with net merit, 23 with milk yield, 19 with fat yield, 13 with fat percent, 19 with protein yield, 22 with protein percent, and 13 with somatic cell score. The allele substitution effects for SNPs associated with heifer conception rate, cow conception rate, productive life and net merit were in the same direction as for DPR. Allele substitution effects for several SNPs associated with production traits were in the opposite direction to those for DPR. Nonetheless, there were 29 SNPs associated with DPR that were not negatively associated with production traits. Conclusion SNPs in a total of 40 genes associated with DPR were identified as well as SNPs for other traits. It might be feasible to include these SNPs in genomic tests of reproduction and other traits. The genes associated with DPR are likely to be important for understanding the physiology of reproduction. Given the large number of SNPs associated with DPR that were not negatively associated with production traits, it should be possible to select for DPR without compromising production. PMID:23759029

  8. A Novel Biclustering Algorithm for the Discovery of Meaningful Biological Correlations between microRNAs and their Target Genes

    PubMed Central

    2013-01-01

    Background microRNAs (miRNAs) are a class of small non-coding RNAs which have been recognized as ubiquitous post-transcriptional regulators. The analysis of interactions between different miRNAs and their target genes is necessary for the understanding of miRNAs' role in the control of cell life and death. In this paper we propose a novel data mining algorithm, called HOCCLUS2, specifically designed to bicluster miRNAs and target messenger RNAs (mRNAs) on the basis of their experimentally-verified and/or predicted interactions. Indeed, existing biclustering approaches, typically used to analyze gene expression data, fail when applied to miRNA:mRNA interactions since they usually do not extract possibly overlapping biclusters (miRNAs and their target genes may have multiple roles), extract a huge number of biclusters (difficult to browse and rank on the basis of their importance) and work on similarities of feature values (do not limit the analysis to reliable interactions). Results To overcome these limitations, HOCCLUS2 i) extracts possibly overlapping biclusters, to catch multiple roles of both miRNAs and their target genes; ii) extracts hierarchically organized biclusters, to facilitate bicluster browsing and to distinguish between universe and pathway-specific miRNAs; iii) extracts highly cohesive biclusters, to consider only reliable interactions; iv) ranks biclusters according to the functional similarities, computed on the basis of Gene Ontology, to facilitate bicluster analysis. Conclusions Our results show that HOCCLUS2 is a valid tool to support biologists in the identification of context-specific miRNA regulatory modules and in the detection of possibly unknown miRNA target genes. Indeed, results show that HOCCLUS2 is able to extract cohesiveness-preserving biclusters, when compared with competitive approaches, and statistically confirm (at a confidence level of 99%) that mRNAs which belong to the same biclusters are, on average, more functionally similar than mRNAs which belong to different biclusters. Finally, the hierarchy of biclusters provides useful insights to understand the intrinsic hierarchical organization of miRNAs and their potential multiple interactions on target genes. PMID:23815553

  9. Generation of expressed sequence tags under cadmium stress for gene discovery and development of molecular markers in chickpea.

    PubMed

    Gaur, Rashmi; Bhatia, Sabhyata; Gupta, Meetu

    2014-07-01

    Chickpea, a member of the Fabaceae family, is the world's third most important legume crop but suffers severe yield losses due to various biotic and abiotic stresses. Development of modern genomic tools such as molecular markers and identification of resistance genes associated with these stresses facilitate improvement in chickpea breeding towards abiotic stress tolerance. In this study, 1597 high-quality expressed sequence tags (ESTs) were generated from a cDNA library of variety Pusa 1105 root tissue after cadmium (Cd) treatment. Assembly of ESTs resulted in a total of 914 unigenes, of which putative homology was obtained for 38.8% after BLASTX search. In terms of species distribution, the majority of sequences showed similarity to Medicago truncatula, followed by Glycine max, Vitis vinifera, Populus trichocarpa and Pisum sativum sequences. Functional annotation was assigned using Blast2GO, and the Gene Ontology (GO) terms were categorized into biological process, molecular function and cellular component. Approximately 10.83% of unigenes were assigned at least one GO term. Moreover, in the distribution of transcripts into various biological pathways, 20 of the annotated transcripts were assigned to ten pathways in the KEGG database. A majority of the genes were found to be involved in sulphur and nitrogen metabolism. In the quantitative real-time PCR analysis, five of the transcription factors and three of the transporter genes were found to be highly expressed after Cd treatment. Besides, the utility of ESTs was demonstrated by exploiting them for the development of 83 genic molecular markers, including EST-simple sequence repeats and intron targeted polymorphism markers, that would assist in tagging genes related to metal stress. PMID:24414095

  10. De Novo Transcriptome Analysis of an Aerial Microalga Trentepohlia jolithus: Pathway Description and Gene Discovery for Carbon Fixation and Carotenoid Biosynthesis

    PubMed Central

    Li, Qianqian; Liu, Jianguo; Zhang, Litao; Liu, Qian

    2014-01-01

    Background Algae in the order Trentepohliales have a broad geographic distribution and are generally characterized by the presence of abundant β-carotene. The many monographs published to date have mainly focused on their morphology, taxonomy, phylogeny, distribution and reproduction; molecular studies of this order are still rare. High-throughput RNA sequencing (RNA-Seq) technology provides a powerful and efficient method for transcript analysis and gene discovery in Trentepohlia jolithus. Methods/Principal Findings Illumina HiSeq 2000 sequencing generated 55,007,830 Illumina PE raw reads, which were assembled into 41,328 assembled unigenes. Based on NR annotation, 53.28% of the unigenes (22,018) could be assigned to gene ontology classes with 54 subcategories and 161,451 functional terms. A total of 26,217 (63.44%) assembled unigenes were mapped to 128 KEGG pathways. Furthermore, a set of 5,798 SSRs in 5,206 unigenes and 131,478 putative SNPs were identified. Moreover, the fact that all of the C4 photosynthesis genes exist in T. jolithus suggests a complex carbon acquisition and fixation system. Similarities and differences between T. jolithus and other algae in carotenoid biosynthesis are also described in depth. Conclusions/Significance This is the first broad transcriptome survey for T. jolithus, increasing the amount of molecular data available for the class Ulvophyceae. As well as providing resources for functional genomics studies, the functional genes and putative pathways identified here will contribute to a better understanding of carbon fixation and fatty acid and carotenoid biosynthesis in T. jolithus. PMID:25254555
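
    The annotation percentages quoted in this abstract follow directly from the reported counts. A minimal Python sketch of that arithmetic is given here; the helper name `percent` is hypothetical and is not part of the cited study.

```python
def percent(part, total):
    """Return part/total as a percentage rounded to two decimals."""
    return round(100 * part / total, 2)

# Counts as reported in the abstract above.
TOTAL_UNIGENES = 41_328
GO_ASSIGNED = 22_018      # unigenes assigned to gene ontology classes
KEGG_MAPPED = 26_217      # unigenes mapped to KEGG pathways

print(percent(GO_ASSIGNED, TOTAL_UNIGENES))   # GO-annotated share
print(percent(KEGG_MAPPED, TOTAL_UNIGENES))   # KEGG-mapped share
```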

  11. Design of the DISCovery project: tailored work-oriented interventions to improve employee health, well-being, and performance-related outcomes in hospital care

    PubMed Central

    2013-01-01

    Background It is well known that health care workers in today’s general hospitals have to deal with high levels of job demands, which could have negative effects on their health, well-being, and job performance. A way to reduce job-related stress reactions and to optimize positive work-related outcomes is to raise the level of specific job resources and opportunities to recover from work. However, the question remains how to translate the optimization of the balance between job demands, job resources, and recovery opportunities into effective workplace interventions. The aim of the DISCovery project is to develop and implement tailored work-oriented interventions to improve health, well-being, and performance of health care personnel. Methods/Design A quasi-experimental field study with a non-equivalent control group pretest-posttest design will be conducted in a top general hospital. Four existing organizational departments will provide both an intervention and a comparison group. Two types of research methods are used: (1) a longitudinal web-based survey study, and (2) a longitudinal daily diary study. After baseline measures of both methods, existing and yet to be developed interventions will be implemented within the experimental groups. Follow-up measurements will be taken one and two years after the baseline measures to analyze short-term and long-term effects of the interventions. Additionally, a process evaluation and a cost-effectiveness analysis will be carried out. Discussion The DISCovery project fulfills a strong need for theory-driven and scientifically well-performed research on job stress and performance interventions. It will provide insight into (1) how a balance between job demands, job resources, and recovery from work can be optimized, (2) the short-term and long-term effects of tailored work-oriented interventions, and (3) indicators for successful or unsuccessful implementation of interventions. PMID:23421647

  12. Comprehensive analysis of differential genes and miRNA profiles for discovery of topping-responsive genes in flue-cured tobacco roots.

    PubMed

    Qi, Yuancheng; Guo, Hongxiang; Li, Ke; Liu, Weiqun

    2012-03-01

    Decapitation/topping is an important cultivation measure for flue-cured tobacco, and diverse biological processes change in response to topping, such as hormonal balance, root development, source-sink relationship, nicotine synthesis capacity and stress tolerance. The purpose of this study was to clarify the molecular mechanism involved in the response of flue-cured tobacco to topping. The differentially expressed genes and microRNAs (miRNAs) before and after topping were screened with a combination of suppression subtractive hybridization (SSH) and miRNA deep sequencing. In all, 560 differentially expressed clones were sequenced by SSH, and then 129 high-quality expressed sequence tags were acquired. These expressed sequence tags were mainly involved in secondary metabolism (13.5%), hormone metabolism (4%), signaling/transcription (17.5%), stress/defense (20%), protein metabolism (13%), carbon metabolism (7%), other metabolism (12%) and unknown function (13%). The results contribute new data to the list of possible candidate genes involved in the response of flue-cured tobacco to topping. NAC transcription factor, a differential gene identified by SSH, has been shown to have a role in the regulation of nicotine biosynthesis. High-throughput sequencing of two small RNA libraries in combination with SSH screening revealed 15 differential miRNAs whose target genes were identical to some differential genes identified in SSH, suggesting that miRNAs play a critical role in post-transcriptional gene regulation in the response of flue-cured tobacco to decapitation. Based on the role of these miRNAs and differential genes identified from SSH in response to topping, an miRNA-mediated model for flue-cured tobacco in response to topping is proposed. PMID:22251798

  13. Discovery of new risk loci for IgA nephropathy implicates genes involved in immunity against intestinal pathogens.

    PubMed

    Kiryluk, Krzysztof; Li, Yifu; Scolari, Francesco; Sanna-Cherchi, Simone; Choi, Murim; Verbitsky, Miguel; Fasel, David; Lata, Sneh; Prakash, Sindhuri; Shapiro, Samantha; Fischman, Clara; Snyder, Holly J; Appel, Gerald; Izzi, Claudia; Viola, Battista Fabio; Dallera, Nadia; Del Vecchio, Lucia; Barlassina, Cristina; Salvi, Erika; Bertinetto, Francesca Eleonora; Amoroso, Antonio; Savoldi, Silvana; Rocchietti, Marcella; Amore, Alessandro; Peruzzi, Licia; Coppo, Rosanna; Salvadori, Maurizio; Ravani, Pietro; Magistroni, Riccardo; Ghiggeri, Gian Marco; Caridi, Gianluca; Bodria, Monica; Lugani, Francesca; Allegri, Landino; Delsante, Marco; Maiorana, Mariarosa; Magnano, Andrea; Frasca, Giovanni; Boer, Emanuela; Boscutti, Giuliano; Ponticelli, Claudio; Mignani, Renzo; Marcantoni, Carmelita; Di Landro, Domenico; Santoro, Domenico; Pani, Antonello; Polci, Rosaria; Feriozzi, Sandro; Chicca, Silvana; Galliani, Marco; Gigante, Maddalena; Gesualdo, Loreto; Zamboli, Pasquale; Battaglia, Giovanni Giorgio; Garozzo, Maurizio; Maixnerová, Dita; Tesar, Vladimir; Eitner, Frank; Rauen, Thomas; Floege, Jürgen; Kovacs, Tibor; Nagy, Judit; Mucha, Krzysztof; Pączek, Leszek; Zaniew, Marcin; Mizerska-Wasiak, Małgorzata; Roszkowska-Blaim, Maria; Pawlaczyk, Krzysztof; Gale, Daniel; Barratt, Jonathan; Thibaudin, Lise; Berthoux, Francois; Canaud, Guillaume; Boland, Anne; Metzger, Marie; Panzer, Ulf; Suzuki, Hitoshi; Goto, Shin; Narita, Ichiei; Caliskan, Yasar; Xie, Jingyuan; Hou, Ping; Chen, Nan; Zhang, Hong; Wyatt, Robert J; Novak, Jan; Julian, Bruce A; Feehally, John; Stengel, Benedicte; Cusi, Daniele; Lifton, Richard P; Gharavi, Ali G

    2014-11-01

    We performed a genome-wide association study (GWAS) of IgA nephropathy (IgAN), the most common form of glomerulonephritis, with discovery and follow-up in 20,612 individuals of European and East Asian ancestry. We identified six new genome-wide significant associations, four in ITGAM-ITGAX, VAV3 and CARD9 and two new independent signals at HLA-DQB1 and DEFA. We replicated the nine previously reported signals, including known SNPs in the HLA-DQB1 and DEFA loci. The cumulative burden of risk alleles is strongly associated with age at disease onset. Most loci are either directly associated with risk of inflammatory bowel disease (IBD) or maintenance of the intestinal epithelial barrier and response to mucosal pathogens. The geospatial distribution of risk alleles is highly suggestive of multi-locus adaptation, and genetic risk correlates strongly with variation in local pathogens, particularly helminth diversity, suggesting a possible role for host-intestinal pathogen interactions in shaping the genetic landscape of IgAN. PMID:25305756

  14. Discovery of new risk loci for IgA nephropathy implicates genes involved in immunity against intestinal pathogens

    PubMed Central

    Kiryluk, Krzysztof; Li, Yifu; Scolari, Francesco; Sanna-Cherchi, Simone; Choi, Murim; Verbitsky, Miguel; Fasel, David; Lata, Sneh; Prakash, Sindhuri; Shapiro, Samantha; Fischman, Clara; Snyder, Holly J.; Appel, Gerald; Izzi, Claudia; Viola, Battista Fabio; Dallera, Nadia; Vecchio, Lucia Del; Barlassina, Cristina; Salvi, Erika; Bertinetto, Francesca Eleonora; Amoroso, Antonio; Savoldi, Silvana; Rocchietti, Marcella; Amore, Alessandro; Peruzzi, Licia; Coppo, Rosanna; Salvadori, Maurizio; Ravani, Pietro; Magistroni, Riccardo; Ghiggeri, Gian Marco; Caridi, Gianluca; Bodria, Monica; Lugani, Francesca; Allegri, Landino; Delsante, Marco; Maiorana, Mariarosa; Magnano, Andrea; Frasca, Giovanni; Boer, Emanuela; Boscutti, Giuliano; Ponticelli, Claudio; Mignani, Renzo; Marcantoni, Carmelita; Di Landro, Domenico; Santoro, Domenico; Pani, Antonello; Polci, Rosaria; Feriozzi, Sandro; Chicca, Silvana; Galliani, Marco; Gigante, Maddalena; Gesualdo, Loreto; Zamboli, Pasquale; Maixnerová, Dita; Tesar, Vladimir; Eitner, Frank; Rauen, Thomas; Floege, Jürgen; Kovacs, Tibor; Nagy, Judit; Mucha, Krzysztof; Pączek, Leszek; Zaniew, Marcin; Mizerska-Wasiak, Małgorzata; Roszkowska-Blaim, Maria; Pawlaczyk, Krzysztof; Gale, Daniel; Barratt, Jonathan; Thibaudin, Lise; Berthoux, Francois; Canaud, Guillaume; Boland, Anne; Metzger, Marie; Panzer, Ulf; Suzuki, Hitoshi; Goto, Shin; Narita, Ichiei; Caliskan, Yasar; Xie, Jingyuan; Hou, Ping; Chen, Nan; Zhang, Hong; Wyatt, Robert J.; Novak, Jan; Julian, Bruce A.; Feehally, John; Stengel, Benedicte; Cusi, Daniele; Lifton, Richard P.; Gharavi, Ali G.

    2014-01-01

    We performed a genome-wide association study (GWAS) of IgA nephropathy (IgAN), the most common form of glomerulonephritis, with discovery and follow-up in 20,612 individuals of European and East Asian ancestry. We identified six novel genome-wide significant associations, four in ITGAM-ITGAX, VAV3 and CARD9 and two new independent signals at HLA-DQB1 and DEFA. We replicated the nine previously reported signals, including known SNPs in the HLA-DQB1 and DEFA loci. The cumulative burden of risk alleles is strongly associated with age at disease onset. Most loci are either directly associated with risk of inflammatory bowel disease (IBD) or maintenance of the intestinal epithelial barrier and response to mucosal pathogens. The geo-spatial distribution of risk alleles is highly suggestive of multi-locus adaptation and the genetic risk correlates strongly with variation in local pathogens, particularly helminth diversity, suggesting a possible role for host-intestinal pathogen interactions in shaping the genetic landscape of IgAN. PMID:25305756

  15. Discovery of consensus gene signature and intermodular connectivity defining self-renewal of human embryonic stem cells

    PubMed Central

    Kim, Jeffrey J.; Khalid, Omar; Namazi, AmirHosien; Tu, Thanh G.; Elie, Omid; Lee, Connie; Kim, Yong

    2014-01-01

    Molecular markers defining self-renewing pluripotent embryonic stem cells (ESCs) have been identified by relative comparisons between undifferentiated and differentiated cells. Most analyses have been done under a specific differentiation condition that may present significantly different molecular changes from others. Therefore, it is currently unclear if there are true consensus markers defining undifferentiated hESCs. To identify a set of key genes consistently altered during differentiation of hESCs regardless of differentiation conditions, we performed microarray analysis on undifferentiated hESCs (H1 and H9) and differentiated EBs and validated our results using publicly available expression array data sets. We constructed consensus modules by Weighted Gene Co-expression Network Analysis (WGCNA) and discovered novel markers that are consistently present in undifferentiated hESCs under various differentiation conditions. We have validated top markers (downregulated: LCK, KLKB1 and SLC7A3; upregulated: RhoJ, Zeb2 and Adam12) upon differentiation. Functional validation analysis of LCK in self-renewal of hESCs by using an LCK inhibitor or gene silencing with siLCK resulted in a loss of undifferentiated characteristics (morphological change, reduced alkaline phosphatase activity and reduced pluripotency gene expression), demonstrating a potential functional role of LCK in self-renewal of hESCs. We have assigned hESC markers to interactive networks in the genome, identifying possible interacting partners and showing how new markers relate to each other. Furthermore, comparison of these data sets with available datasets from iPSCs revealed that the levels of these newly identified markers were correlated with the establishment of iPSCs, which may imply a potential role of these markers in the gain of cellular potency. PMID:24519983

  16. De Novo Transcriptomic Analysis of an Oleaginous Microalga: Pathway Description and Gene Discovery for Production of Next-Generation Biofuels

    PubMed Central

    Wan, LingLin; Han, Juan; Sang, Min; Li, AiFen; Wu, Hong; Yin, ShunJi; Zhang, ChengWu

    2012-01-01

    Background Eustigmatos cf. polyphem is a yellow-green unicellular soil microalga belonging to the eustigmatophytes, with high biomass and considerable production of triacylglycerols (TAGs) for biofuels, and is thus referred to as an oleaginous microalga. The paucity of microalgal genome sequences, however, limits development of gene-based biofuel feedstock optimization studies. Here we describe the sequencing and de novo transcriptome assembly for a non-model microalga species, E. cf. polyphem, and identify pathways and genes of importance related to biofuel production. Results We performed the de novo assembly of the E. cf. polyphem transcriptome using Illumina paired-end sequencing technology. In a single run, we produced 29,199,432 sequencing reads corresponding to 2.33 Gb total nucleotides. These reads were assembled into 75,632 unigenes with a mean size of 503 bp and an N50 of 663 bp, ranging from 100 bp to >3,000 bp. Assembled unigenes were subjected to BLAST similarity searches and annotated with Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology identifiers. These analyses identified the majority of carbohydrate, fatty acid, TAG and carotenoid biosynthesis and catabolism pathways in E. cf. polyphem. Conclusions Our data provide a reconstruction of the metabolic pathways involved in the biosynthesis and catabolism of carbohydrates, fatty acids, TAGs and carotenoids in E. cf. polyphem and provide a foundation for the molecular genetics and functional genomics required to direct metabolic engineering efforts that seek to enhance the quantity and character of microalgae-based biofuel feedstock. PMID:22536352
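
    The N50 statistic quoted in this abstract is conventionally defined as the length L such that sequences of length L or longer contain at least half of the total assembled bases. The Python sketch below illustrates that standard definition only; the function name `n50` and the toy lengths are assumptions for illustration and are not taken from the cited study, whose assembly software is not stated in the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the
    length at which the running total first reaches half of all bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")

# Toy example (hypothetical lengths, not data from the study):
# total = 1500 bases, so the 500 bp and 400 bp sequences together
# (900 bases) are the first to cover at least 750 bases.
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))
```

    Because N50 weights long sequences more heavily than the mean does, an assembly with many short unigenes can show an N50 noticeably above its mean length, as in the figures reported above (mean 503 bp, N50 663 bp).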

  17. Functional gene-based discovery of phenazines from the actinobacteria associated with marine sponges in the South China Sea.

    PubMed

    Karuppiah, Valliappan; Li, Yingxin; Sun, Wei; Feng, Guofang; Li, Zhiyong

    2015-07-01

    Phenazines represent a large group of nitrogen-containing heterocyclic compounds produced by a diverse group of bacteria including actinobacteria. In this study, a total of 197 actinobacterial strains were isolated from seven different marine sponge species in the South China Sea using five different culture media. Eighty-seven morphologically different actinobacterial strains were selected and grouped into 13 genera, including Actinoalloteichus, Kocuria, Micrococcus, Micromonospora, Mycobacterium, Nocardiopsis, Prauserella, Rhodococcus, Saccharopolyspora, Salinispora, Serinicoccus, and Streptomyces, by phylogenetic analysis of the 16S rRNA gene. Based on the screening of phzE genes, ten strains, including five Streptomyces, two Nocardiopsis, one Salinispora, one Micrococcus, and one Serinicoccus, were found to have potential for phenazine production. The phzE gene was highly expressed in Nocardiopsis sp. 13-33-15, 13-12-13, and Serinicoccus sp. 13-12-4 on the fifth day of fermentation. Finally, 1,6-dihydroxy phenazine (1) from Nocardiopsis sp. 13-33-15 and 13-12-13, and 1,6-dimethoxy phenazine (2) from Nocardiopsis sp. 13-33-15 were isolated and identified successfully based on ESI-MS and NMR analysis. Compounds 1 and 2 showed antibacterial activity against Bacillus mycoides SJ14, Staphylococcus aureus SJ51, Escherichia coli SJ42, and Micrococcus luteus SJ47. This study suggests that the integrated approach of gene screening and chemical analysis is an effective strategy to find the target compounds and lays the basis for the production of phenazine from sponge-associated actinobacteria. PMID:25820602

  18. Transcriptome Analysis of the Portunus trituberculatus: De Novo Assembly, Growth-Related Gene Identification and Marker Discovery

    PubMed Central

    Lv, Jianjian; Liu, Ping; Gao, Baoquan; Wang, Yu; Wang, Zheng; Chen, Ping; Li, Jian

    2014-01-01

    Background The swimming crab, Portunus trituberculatus, is an important farmed species in China and has attracted extensive study, which requires increasing genomic background knowledge. To date, a whole-genome sequence is unavailable and transcriptomic information is also scarce for this species. In the present study, we performed de novo transcriptome sequencing to produce a comprehensive transcript dataset for major tissues of Portunus trituberculatus by the Illumina paired-end sequencing technology. Results Total RNA was isolated from eyestalk, gill, heart, hepatopancreas and muscle. Equal quantities of RNA from each tissue were pooled to construct a cDNA library. Using the Illumina paired-end sequencing technology, we generated a total of 120,137 transcripts with an average length of 1037 bp. Further assembly analysis showed that all contigs contributed to 87,100 unigenes; of these, 16,029 unigenes (18.40% of the total) could be matched in the GenBank non-redundant database. Potential genes and their functions were predicted by GO, KEGG pathway mapping and COG analysis. Based on our sequence analysis and published literature, many putative genes with fundamental roles in growth and muscle development, including actin, myosin, tropomyosin, troponin and other potentially important candidate genes, were identified for the first time in this species. Furthermore, 22,673 SSRs and 66,191 high-confidence SNPs were identified in this EST dataset. Conclusion The transcriptome provides an invaluable new functional genomics resource for future biological research in Portunus trituberculatus. The data will also inform future functional studies to manipulate or select for genes influencing growth, which should find practical applications in aquaculture breeding programs. The molecular markers identified in this study will provide a material basis for future genetic linkage and quantitative trait loci analyses, and will be essential for accelerating aquaculture breeding programs with this species. PMID:24722690

  19. Gene expression analysis and SNP/InDel discovery to investigate yield heterosis of two rubber tree F1 hybrids

    PubMed Central

    Li, Dejun; Zeng, Rizhong; Li, Yan; Zhao, Manman; Chao, Jinquan; Li, Yu; Wang, Kai; Zhu, Lihuang; Tian, Wei-Min; Liang, Chengzhi

    2016-01-01

    As an important industrial material, natural rubber is mainly harvested from the rubber tree. Rubber tree breeding is inefficient, expensive and time-consuming, whereas marker-assisted selection is a feasible method for early selection of high-yield hybrids. We thus sequenced and analyzed the transcriptomes of two parent rubber trees (RRIM 600 and PR 107) and their most productive hybrids (RY 7-33-97 and RY 7-20-59) to understand their gene expression patterns and genetic variations including single nucleotide polymorphisms (SNPs) and small insertions/deletions (InDels). We discovered >31,000 genetic variations in 112,702 assembled unigenes. Our results showed that the higher yield in F1 hybrids was positively associated with their higher genome heterozygosity, which was further confirmed by genotyping 10 SNPs in 20 other varieties. We also showed that RY 7-33-97 and RY 7-20-59 were genetically closer to RRIM 600 and PR 107, respectively, in agreement with both their phenotypic similarities and gene expression profiles. After identifying ethylene- and jasmonic acid–responsive genes at the transcription level, we compared and analyzed the genetic variations underlying rubber biosynthesis and the jasmonic acid and ethylene pathways in detail. Our results suggest that genome-wide genetic variations play a substantive role in maintaining rubber tree heterosis. PMID:27108962

  20. Gene expression analysis and SNP/InDel discovery to investigate yield heterosis of two rubber tree F1 hybrids.

    PubMed

    Li, Dejun; Zeng, Rizhong; Li, Yan; Zhao, Manman; Chao, Jinquan; Li, Yu; Wang, Kai; Zhu, Lihuang; Tian, Wei-Min; Liang, Chengzhi

    2016-01-01

    As an important industrial material, natural rubber is mainly harvested from the rubber tree. Rubber tree breeding is inefficient, expensive and time-consuming, whereas marker-assisted selection is a feasible method for early selection of high-yield hybrids. We thus sequenced and analyzed the transcriptomes of two parent rubber trees (RRIM 600 and PR 107) and their most productive hybrids (RY 7-33-97 and RY 7-20-59) to understand their gene expression patterns and genetic variations including single nucleotide polymorphisms (SNPs) and small insertions/deletions (InDels). We discovered >31,000 genetic variations in 112,702 assembled unigenes. Our results showed that the higher yield in F1 hybrids was positively associated with their higher genome heterozygosity, which was further confirmed by genotyping 10 SNPs in 20 other varieties. We also showed that RY 7-33-97 and RY 7-20-59 were genetically closer to RRIM 600 and PR 107, respectively, in agreement with both their phenotypic similarities and gene expression profiles. After identifying ethylene- and jasmonic acid-responsive genes at the transcription level, we compared and analyzed the genetic variations underlying rubber biosynthesis and the jasmonic acid and ethylene pathways in detail. Our results suggest that genome-wide genetic variations play a substantive role in maintaining rubber tree heterosis. PMID:27108962

  1. Tissue-specific laser microdissection of the Brassica napus funiculus improves gene discovery and spatial identification of biological processes.

    PubMed

    Chan, Ainsley C; Khan, Deirdre; Girard, Ian J; Becker, Michael G; Millar, Jenna L; Sytnik, David; Belmonte, Mark F

    2016-05-01

    The three primary tissue systems of the funiculus each undergo unique developmental programs to support the growth and development of the filial seed. To understand the underlying transcriptional mechanisms that orchestrate development of the funiculus at the globular embryonic stage of seed development, we used laser microdissection coupled with RNA-sequencing to produce a high-resolution dataset of the mRNAs present in the epidermis, cortex, and vasculature of the Brassica napus (canola) funiculus. We identified 7761 additional genes in these tissues compared with the whole funiculus organ alone using this technology. Differential expression and enrichment analyses were used to identify several biological processes associated with each tissue system. Our data show that cell wall modification and lipid metabolism are prominent in the epidermis, cell growth and modification occur in the cortex, and vascular tissue proliferation and differentiation occur in the central vascular strand. We provide further evidence that each of the three tissue systems of the globular stage funiculus is involved in specific biological processes that all co-ordinate to support seed development. The identification of genes and gene regulators responsible for tissue-specific developmental processes of the canola funiculus now serves as a valuable resource for seed improvement research. PMID:27194740

  2. Discovery of bacterial polyhydroxyalkanoate synthase (PhaC)-encoding genes from seasonal Baltic Sea ice and cold estuarine waters.

    PubMed

    Pärnänen, Katariina; Karkman, Antti; Virta, Marko; Eronen-Rasimus, Eeva; Kaartokallio, Hermanni

    2015-01-01

    Polyhydroxyalkanoates (PHAs) are macromolecules produced by bacteria as means for storing carbon and energy in intracellular granules. PHAs have physical properties similar to those of plastics and have become of interest to industry as materials for environmentally friendly bioplastic production. There is an ongoing search for new PHA-producing bacterial strains and PHA-synthesizing enzymes tolerating extreme conditions to find ways of producing PHAs at cold temperatures and high solute concentrations. Moreover, the study of PHA producers in the sea-ice biome can aid in understanding the microbial ecology of carbon cycling in ice-associated ecosystems. In this study, PHA producers and PHA synthase genes were examined under the extreme environmental conditions of sea ice and cold seawater to find evidence of PHA production in an environment requiring adaptation to high salinity and cold temperatures. Sea ice and cold estuarine water samples were collected from the northern Baltic Sea and evidence of PHA production was gathered, using microscopy with Nile Blue A staining of PHA-granules and PCR assays detecting PHA-synthesis genes. The PHA granules and PHA synthases were found at all sampling locations, in both sea ice and water, and throughout the sampling period spanning over 10 years. Our study shows, for the first time, that PHA synthesis occurs in Baltic Sea cold-adapted bacteria in their natural environment, which makes the Baltic Sea and its cold environments an interesting choice in the quest for PHA-synthesizing bacteria and synthesis genes. PMID:25280551

  3. Scientific Discovery for All

    ERIC Educational Resources Information Center

    Zaikowski, Lori; Lichtman, Paul; Quarless, Duncan

    2007-01-01

    The scientific discovery process comes alive for 70 minority students each year at Uniondale High School in New York where students have won top awards for "in-house" projects. Uniondale High School is in a middle-income school district where over 95% of students are from minority groups. Founded in 2000, the Uniondale High School Research Program…

  4. Discovery Farms Dazey Waterway Site

    The North Dakota Discovery Farms Dazey waterway site 1 is located southeast of Dazey, North Dakota. In 2008, the Dazey Farm became the second farm in the North Dakota Discovery Farms project. The farm has been owned and operated by Kim and Denise Amann and their family since 1955....

  5. De novo transcriptome assembly of Ipomoea nil using Illumina sequencing for gene discovery and SSR marker identification.

    PubMed

    Wei, Changhe; Tao, Xiang; Li, Ming; He, Bin; Yan, Lang; Tan, Xuemei; Zhang, Yizheng

    2015-10-01

    Ipomoea nil is widely used as an ornamental plant due to its abundance of flower color, but the limited transcriptome and genomic data hinder research on it. Using the Illumina platform, transcriptome profiling of I. nil was performed through high-throughput sequencing, which was proven to be a rapid and cost-effective means to characterize gene content. Our goal is to use the resulting information to facilitate the relevant research on flowering and flower color formation in I. nil. In total, 268 million unique Illumina RNA-Seq reads were produced and used in the transcriptome assembly. These reads were assembled into 220,117 contigs, of which 137,307 contigs were annotated using the GO and KEGG databases. Based on the result of functional annotations, a total of 89,781 contigs were assigned 455,335 GO term annotations. Meanwhile, 17,418 contigs were identified with pathway annotation and they were functionally assigned to 144 KEGG pathways. Our transcriptome revealed at least 55 contigs as probable flowering-related genes in I. nil, and we also identified 25 contigs that encode key enzymes in the phenylpropanoid biosynthesis pathway. Based on the analysis of gene expression profiles, in the phenylpropanoid biosynthesis pathway of I. nil, the repression of lignin biosynthesis might lead to the redirection of the metabolic flux into anthocyanin biosynthesis. This may be the most likely reason that I. nil has a high anthocyanin content, especially in its flowers. Additionally, 15,537 simple sequence repeats (SSRs) were detected using the MISA software, and these SSRs will undoubtedly benefit future breeding work. Moreover, the information uncovered in this study will also serve as a valuable resource for understanding the flowering and flower color formation mechanisms in I. nil. PMID:25877516

  6. Transcriptome Analysis of the Oriental River Prawn, Macrobrachium nipponense Using 454 Pyrosequencing for Discovery of Genes and Markers

    PubMed Central

    Ma, Keyi; Qiu, Gaofeng; Feng, Jianbin; Li, Jiale

    2012-01-01

    Background: The oriental river prawn, Macrobrachium nipponense, is an economically and nutritionally important species of the Palaemonidae family of decapod crustaceans. As it is a non-model organism, no whole-genome sequence is available to date. Transcriptomic information is also scarce for this species. In this study, we performed de novo transcriptome sequencing to produce the first comprehensive expressed sequence tag (EST) dataset for M. nipponense using high-throughput sequencing technologies. Methodology and Principal Findings: Total RNA was isolated from eyestalk, gill, heart, ovary, testis, hepatopancreas, muscle, and embryos at the cleavage, gastrula, nauplius and zoea stages. Equal quantities of RNA from each tissue and stage were pooled to construct a cDNA library. Using 454 pyrosequencing technology, we generated a total of 984,204 high quality reads (338.59 Mb) with an average length of 344 bp. Clustering and assembly of these reads produced a non-redundant set of 81,411 unique sequences, comprising 42,551 contigs and 38,860 singletons. All of the unique sequences were assigned to the molecular function (30,425), cellular component (44,112) and biological process (67,679) categories by GO analysis. Potential genes and their functions were predicted by KEGG pathway mapping and COG analysis. Based on our sequence analysis and published literature, many putative genes involved in sex determination, including DMRT1, FTZ-F1, FOXL2, FEM1 and other potentially important candidate genes, were identified for the first time in this prawn. Furthermore, 6,689 SSRs and 18,107 high-confidence SNPs were identified in this EST dataset. Conclusions: The transcriptome provides an invaluable new data resource for functional genomics and future biological research in M. nipponense. The molecular markers identified in this study will provide a material basis for future genetic linkage and quantitative trait loci analyses, and will be essential for accelerating aquaculture breeding programs with this species. PMID:22745820

  7. Gene Discovery for Enzymes Involved in Limonene Modification or Utilization by the Mountain Pine Beetle-Associated Pathogen Grosmannia clavigera

    PubMed Central

    Wang, Ye; Lim, Lynette; Madilao, Lina; Lah, Ljerka; Bohlmann, Joerg

    2014-01-01

    To successfully colonize and eventually kill pine trees, Grosmannia clavigera (Gs cryptic species), the main fungal pathogen associated with the mountain pine beetle (Dendroctonus ponderosae), has developed multiple mechanisms to overcome host tree chemical defenses, of which terpenoids are a major component. In addition to a monoterpene efflux system mediated by a recently discovered ABC transporter, Gs has genes that are highly induced by monoterpenes and that encode enzymes that modify or utilize monoterpenes [especially (+)-limonene]. We showed that pine-inhabiting Ophiostomale fungi are tolerant to monoterpenes, but only a few, including Gs, are known to utilize monoterpenes as a carbon source. Gas chromatography-mass spectrometry (GC-MS) revealed that Gs can modify (+)-limonene through various oxygenation pathways, producing carvone, p-mentha-2,8-dienol, perillyl alcohol, and isopiperitenol. It can also degrade (+)-limonene through the C-1-oxygenated pathway, producing limonene-1,2-diol as the most abundant intermediate. Transcriptome sequencing (RNA-seq) data indicated that Gs may utilize limonene 1,2-diol through beta-oxidation and then valine and tricarboxylic acid (TCA) metabolic pathways. The data also suggested that at least two gene clusters, located in genome contigs 108 and 161, were highly induced by monoterpenes and may be involved in monoterpene degradation processes. Further, gene knockouts indicated that limonene degradation required two distinct Baeyer-Villiger monooxygenases (BVMOs), an epoxide hydrolase and an enoyl coenzyme A (enoyl-CoA) hydratase. Our work provides information on enzyme-mediated limonene utilization or modification and a more comprehensive understanding of the interaction between an economically important fungal pathogen and its host's defense chemicals. PMID:24837377

  8. Gene discovery for enzymes involved in limonene modification or utilization by the mountain pine beetle-associated pathogen Grosmannia clavigera.

    PubMed

    Wang, Ye; Lim, Lynette; Madilao, Lina; Lah, Ljerka; Bohlmann, Joerg; Breuil, Colette

    2014-08-01

    To successfully colonize and eventually kill pine trees, Grosmannia clavigera (Gs cryptic species), the main fungal pathogen associated with the mountain pine beetle (Dendroctonus ponderosae), has developed multiple mechanisms to overcome host tree chemical defenses, of which terpenoids are a major component. In addition to a monoterpene efflux system mediated by a recently discovered ABC transporter, Gs has genes that are highly induced by monoterpenes and that encode enzymes that modify or utilize monoterpenes [especially (+)-limonene]. We showed that pine-inhabiting Ophiostomale fungi are tolerant to monoterpenes, but only a few, including Gs, are known to utilize monoterpenes as a carbon source. Gas chromatography-mass spectrometry (GC-MS) revealed that Gs can modify (+)-limonene through various oxygenation pathways, producing carvone, p-mentha-2,8-dienol, perillyl alcohol, and isopiperitenol. It can also degrade (+)-limonene through the C-1-oxygenated pathway, producing limonene-1,2-diol as the most abundant intermediate. Transcriptome sequencing (RNA-seq) data indicated that Gs may utilize limonene 1,2-diol through beta-oxidation and then valine and tricarboxylic acid (TCA) metabolic pathways. The data also suggested that at least two gene clusters, located in genome contigs 108 and 161, were highly induced by monoterpenes and may be involved in monoterpene degradation processes. Further, gene knockouts indicated that limonene degradation required two distinct Baeyer-Villiger monooxygenases (BVMOs), an epoxide hydrolase and an enoyl coenzyme A (enoyl-CoA) hydratase. Our work provides information on enzyme-mediated limonene utilization or modification and a more comprehensive understanding of the interaction between an economically important fungal pathogen and its host's defense chemicals. PMID:24837377

  9. Automated conserved non-coding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses

    PubMed Central

    Turco, Gina; Schnable, James C.; Pedersen, Brent; Freeling, Michael

    2013-01-01

    Conserved non-coding sequences (CNS) are islands of non-coding sequence that, like protein coding exons, show less divergence in sequence between related species than functionless DNA. Several CNSs have been demonstrated experimentally to function as cis-regulatory regions. However, the specific functions of most CNSs remain unknown. Previous searches for CNS in plants have either anchored on exons and only identified nearby sequences or required years of painstaking manual annotation. Here we present an open source tool that can accurately identify CNSs between any two related species with sequenced genomes, including both those immediately adjacent to exons and distal sequences separated by >12 kb of non-coding sequence. We have used this tool to characterize new motifs, associate CNSs with additional functions, and identify previously undetected genes encoding RNA and protein in the genomes of five grass species. We provide a list of 15,363 orthologous CNSs conserved across all grasses tested. We were also able to identify regulatory sequences present in the common ancestor of grasses that have been lost in one or more extant grass lineages. Lists of orthologous gene pairs and associated CNSs are provided for reference inbred lines of arabidopsis, Japonica rice, foxtail millet, sorghum, brachypodium, and maize. PMID:23874343

  10. Lignification in Sugarcane: Biochemical Characterization, Gene Discovery, and Expression Analysis in Two Genotypes Contrasting for Lignin Content

    PubMed Central

    Bottcher, Alexandra; Cesarino, Igor; Brombini dos Santos, Adriana; Vicentini, Renato; Mayer, Juliana Lischka Sampaio; Vanholme, Ruben; Morreel, Kris; Goeminne, Geert; Moura, Jullyana Cristina Magalhães Silva; Nobile, Paula Macedo; Carmello-Guerreiro, Sandra Maria; Antonio dos Anjos, Ivan; Creste, Silvana; Boerjan, Wout; Landell, Marcos Guimarães de Andrade; Mazzafera, Paulo

    2013-01-01

    Sugarcane (Saccharum spp.) is currently one of the most efficient crops in the production of first-generation biofuels. However, the bagasse represents an additional abundant lignocellulosic resource that has the potential to increase the ethanol production per plant. To achieve a more efficient conversion of bagasse into ethanol, a better understanding of the main factors affecting biomass recalcitrance is needed. Because several studies have shown a negative effect of lignin on saccharification yield, the characterization of lignin biosynthesis, structure, and deposition in sugarcane is an important goal. Here, we present, to our knowledge, the first systematic study of lignin deposition during sugarcane stem development, using histological, biochemical, and transcriptional data derived from two sugarcane genotypes with contrasting lignin contents. Lignin amount and composition were determined in rind (outer) and pith (inner) tissues throughout stem development. In addition, the phenolic metabolome was analyzed by ultra-high-performance liquid chromatography-mass spectrometry, which allowed the identification of 35 compounds related to the phenylpropanoid pathway and monolignol biosynthesis. Furthermore, the Sugarcane EST Database was extensively surveyed to identify lignin biosynthetic gene homologs, and the expression of all identified genes during stem development was determined by quantitative reverse transcription-polymerase chain reaction. Our data provide, to our knowledge, the first in-depth characterization of lignin biosynthesis in sugarcane and form the baseline for the rational metabolic engineering of sugarcane feedstock for bioenergy purposes. PMID:24144790

  11. Natural product discovery: past, present, and future.

    PubMed

    Katz, Leonard; Baltz, Richard H

    2016-03-01

    Microorganisms have provided abundant sources of natural products which have been developed as commercial products for human medicine, animal health, and plant crop protection. In the early years of natural product discovery from microorganisms (The Golden Age), new antibiotics were found with relative ease from low-throughput fermentation and whole cell screening methods. Later, molecular genetic and medicinal chemistry approaches were applied to modify and improve the activities of important chemical scaffolds, and more sophisticated screening methods were directed at target disease states. In the 1990s, the pharmaceutical industry moved to high-throughput screening of synthetic chemical libraries against many potential therapeutic targets, including new targets identified from the human genome sequencing project, largely to the exclusion of natural products, and discovery rates dropped dramatically. Nonetheless, natural products continued to provide key scaffolds for drug development. In the current millennium, it was discovered from genome sequencing that microbes with large genomes have the capacity to produce about ten times as many secondary metabolites as was previously recognized. Indeed, the most gifted actinomycetes have the capacity to produce around 30-50 secondary metabolites. With the precipitous drop in cost for genome sequencing, it is now feasible to sequence thousands of actinomycete genomes to identify the "biosynthetic dark matter" as sources for the discovery of new and novel secondary metabolites. Advances in bioinformatics, mass spectrometry, proteomics, transcriptomics, metabolomics and gene expression are driving the new field of microbial genome mining for applications in natural product discovery and development. PMID:26739136

  12. First Discovery of Two Polyketide Synthase Genes for Mitorubrinic Acid and Mitorubrinol Yellow Pigment Biosynthesis and Implications in Virulence of Penicillium marneffei

    PubMed Central

    Leung, Chris K. F.; Wong, Samson S. Y.; Lau, Susanna K. P.; Yuen, Kwok-Yung

    2012-01-01

    Background: The genome of P. marneffei, the most important thermal dimorphic fungus causing respiratory, skin and systemic mycosis in China and Southeast Asia, possesses 23 polyketide synthase (PKS) genes and 2 polyketide synthase nonribosomal peptide synthase hybrid (PKS-NRPS) genes, which is of high diversity compared to other thermal dimorphic pathogenic fungi. We hypothesized that the yellow pigment in the mold form of P. marneffei could also be synthesized by one or more PKS genes. Methodology/Principal Findings: All 23 PKS and 2 PKS-NRPS genes of P. marneffei were systematically knocked down. A loss of the yellow pigment was observed in the mold form of the pks11 knockdown, pks12 knockdown and pks11pks12 double knockdown mutants. Sequence analysis showed that PKS11 and PKS12 are fungal non-reducing PKSs. Ultra high performance liquid chromatography-photodiode array detector/electrospray ionization-quadrupole time of flight-mass spectrometry (MS) and MS/MS analysis of the culture filtrates of wild type P. marneffei and the pks11 knockdown, pks12 knockdown and pks11pks12 double knockdown mutants showed that the yellow pigment is composed of mitorubrinic acid and mitorubrinol. The survival of mice challenged with the pks11 knockdown, pks12 knockdown and pks11pks12 double knockdown mutants was significantly better than that of mice challenged with wild type P. marneffei (P<0.05). There was also a statistically significant decrease in survival of pks11 knockdown, pks12 knockdown and pks11pks12 double knockdown mutants compared to wild type P. marneffei in both J774 and THP1 macrophages (P<0.05). Conclusions/Significance: The yellow pigment of the mold form of P. marneffei is composed of mitorubrinol and mitorubrinic acid. This represents the first discovery of PKS genes responsible for mitorubrinol and mitorubrinic acid biosynthesis. pks12 and pks11 are probably used sequentially in the biosynthesis of mitorubrinol and mitorubrinic acid. Mitorubrinol and mitorubrinic acid are virulence factors of P. marneffei that improve its intracellular survival in macrophages. PMID:23094121

  13. Erratum: The Extragalactic Distance Scale Key Project. III. The Discovery of Cepheids and a New Distance to M101 Using the Hubble Space Telescope

    NASA Astrophysics Data System (ADS)

    Kelson, Daniel D.; Illingworth, Garth D.; Freedman, Wendy F.; Graham, John A.; Hill, Robert; Madore, Barry F.; Saha, Abhijit; Stetson, Peter B.; Kennicutt, Robert C., Jr.; Mould, Jeremy R.; Hughes, Shaun M.; Ferrarese, Laura; Phelps, Randy; Turner, Anne; Cook, Kem H.; Ford, Holland; Hoessel, John G.; Huchra, John

    1997-03-01

    In the paper "The Extragalactic Distance Scale Key Project. III. The Discovery of Cepheids and a New Distance to M101 Using the Hubble Space Telescope" by Daniel D. Kelson, Garth D. Illingworth, Wendy F. Freedman, John A. Graham, Robert Hill, Barry F. Madore, Abhijit Saha, Peter B. Stetson, Robert C. Kennicutt, Jr., Jeremy R. Mould, Shaun M. Hughes, Laura Ferrarese, Randy Phelps, Anne Turner, Kem H. Cook, Holland Ford, John G. Hoessel, and John Huchra (ApJ, 463, 26 [1996]), two of the tables are in error. The magnitudes in Tables B1 and B2, in Appendix B, are ordered incorrectly. As a result, the Julian dates are not associated with their correct Cepheid magnitudes. We have now corrected these data, and updated versions of the tables are available on the World Wide Web. The tables are available in ASCII format at our Key Project site (http://www.ipac.caltech.edu/H0kp/) and will appear in volume 7 of the AAS CDROM. PostScript and paper copies are also available from the first author (http://www.ucolick.org/~kelson/H0/home.html or kelson@ucolick.org).

  14. Discovery of miRNAs and Their Corresponding miRNA Genes in Atlantic Cod (Gadus morhua): Use of Stable miRNAs as Reference Genes Reveals Subgroups of miRNAs That Are Highly Expressed in Particular Organs

    PubMed Central

    Andreassen, Rune; Rangnes, Fredrik; Sivertsen, Maria; Chiang, Michelle; Tran, Michelle; Worren, Merete Molton

    2016-01-01

    Background: Atlantic cod (Gadus morhua) is among the economically most important species in the northern Atlantic Ocean and a model species for studying development of the immune system in vertebrates. MicroRNAs (miRNAs) are an abundant class of small RNA molecules that regulate fundamental biological processes at the post-transcriptional level. Detailed knowledge about a species' miRNA repertoire is necessary to study how the miRNA transcriptome modulates gene expression. We have therefore discovered and characterized mature miRNAs and their corresponding miRNA genes in Atlantic cod. We have also performed a validation study to identify suitable reference genes for RT-qPCR analysis of miRNA expression in Atlantic cod. Finally, we utilized the newly characterized miRNA repertoire and the dedicated RT-qPCR method to reveal miRNAs that are highly expressed in certain organs. Results: The discovery analysis revealed 490 mature miRNAs (401 unique sequences) along with precursor sequences and genomic location of the miRNA genes. Twenty-six of these were novel miRNA genes. Validation studies ranked gmo-miR-17-1-5p or the two-gene combination gmo-miR25-3p and gmo-miR210-5p as most suitable qPCR reference genes. Analysis by RT-qPCR revealed 45 miRNAs with significantly higher expression in tissues from one or a few organs. Comparisons to other vertebrates indicate that some of these miRNAs may regulate processes like growth, lipid metabolism, immune response to microbial infections and scar damage repair. Three teleost-specific and three novel Atlantic cod miRNAs were among the differentially expressed miRNAs. Conclusions: The number of known mature miRNAs was considerably increased by our identification of miRNAs and miRNA genes in Atlantic cod. This will benefit further functional studies of miRNA expression using deep sequencing methods. The validation study showed that stable miRNAs are suitable reference genes for RT-qPCR analysis of miRNA expression. Applying RT-qPCR, we have identified several miRNAs likely to have important regulatory functions in particular organs. PMID:27128807

  15. The HUBBLE SPACE TELESCOPE Extragalactic Distance Scale Key Project. XXIII. The Discovery of Cepheids in NGC 3319

    NASA Astrophysics Data System (ADS)

    Sakai, Shoko; Ferrarese, Laura; Kennicutt, Robert C.; Graham, John A.; Silbermann, N. A.; Mould, Jeremy R.; Freedman, Wendy L.; Bresolin, Fabio; Ford, Holland C.; Gibson, Brad K.; Han, Mingsheng; Harding, Paul; Hoessel, John G.; Huchra, John P.; Hughes, Shaun M.; Illingworth, Garth D.; Kelson, Daniel; Macri, Lucas; Madore, Barry F.; Phelps, Randy L.; Saha, Abhijit; Sebo, Kim M.; Stetson, Peter B.; Turner, Anne

    1999-10-01

    The distance to NGC 3319 has been determined from Cepheid variable stars as part of the Hubble Space Telescope Key Project on the Extragalactic Distance Scale. Thirteen and four epochs of observations, using filters F555W (V) and F814W (I), respectively, were made with the Wide Field Planetary Camera 2. Thirty-three Cepheid variables between periods of 8 and 47 days were discovered. Adopting a Large Magellanic Cloud distance modulus of 18.50 +/- 0.10 mag and extinction of E(V-I)=0.13 mag, a true reddening-corrected distance modulus (based on an analysis employing the ALLFRAME software package) of 30.78 +/- 0.14 (random) +/- 0.10 (systematic) mag and the extinction of E(V-I)=0.06 mag were determined for NGC 3319. This galaxy is the last galaxy observed for the HST H0 Key Project. Based on observations with the NASA/ESA Hubble Space Telescope, obtained at the Space Telescope Science Institute, operated by AURA, Inc., under NASA contract NAS 5-26555.

  16. The Hubble Space Telescope Extragalactic Distance Scale Key Project. 1: The discovery of Cepheids and a new distance to M81

    NASA Technical Reports Server (NTRS)

    Freedman, Wendy L.; Hughes, Shaun M.; Madore, Barry F.; Mould, Jeremy R.; Lee, Myung Gyoon; Stetson, Peter; Kennicutt, Robert C.; Turner, Anne; Ferrarese, Laura; Ford, Holland

    1994-01-01

    We report on the discovery of 30 new Cepheids in the nearby galaxy M81 based on observations using the Hubble Space Telescope (HST). The periods of these Cepheids lie in the range of 10-55 days, based on 18 independent epochs using the HST wide-band F555W filter. The HST F555W and F785LP data have been transformed to the Cousins standard V and I magnitude system using a ground-based calibration. Apparent period-luminosity relations at V and I were constructed, from which apparent distance moduli were measured with respect to assumed values of mu(sub 0) = 18.50 mag and E(B - V) = 0.10 mag for the Large Magellanic Cloud. The difference in the apparent V and I moduli yields a measure of the difference in the total mean extinction between the M81 and the LMC Cepheid samples. A low total mean extinction to the M81 sample of E(B - V) = 0.03 +/- 0.05 mag is obtained. The true distance modulus to M81 is determined to be 27.80 +/- 0.20 mag, corresponding to a distance of 3.63 +/- 0.34 Mpc. These data illustrate that with an optimal (power-law) sampling strategy, the HST provides a powerful tool for the discovery of extragalactic Cepheids and their application to the distance scale. M81 is the first calibrating galaxy in the target sample of the HST Key Project on the Extragalactic Distance Scale, the ultimate aim of which is to provide a value of the Hubble constant to 10% accuracy.

  17. Novel enabling technologies of gene isolation and plant transformation for improved crop protection

    SciTech Connect

    Torok, Tamas

    2013-02-04

    Meeting the needs of agricultural producers requires the continued development of improved transgenic crop protection products. The completed project focused on developing novel enabling technologies of gene discovery and plant transformation to facilitate the generation of such products.

  18. Gene discovery, evolutionary affinity and molecular detection of Oxyspirura petrowi, an eye worm parasite of game birds

    PubMed Central

    2013-01-01

    Background: Oxyspirura petrowi appears to be emerging as a nematode parasite that could negatively impact Northern Bobwhite quail individuals and populations within Texas and other regions of the United States. Despite this eye worm's potential importance in the conservation of wild quail, little is known about the general biology and genome composition of O. petrowi. To fill the knowledge gap, we performed a small scale random genome sequence survey, sequenced its 18S rRNA and the intergenic region between the 18S and 28S rRNA genes, studied its phylogenetic affinity, and developed a PCR protocol for the detection of this eye worm. Results: We have generated ~240 kb of genome sequence data derived from 348 clones by a random genome survey of an O. petrowi genomic library. The eye worm genome is AT-rich (i.e., 62.2% AT-content), and contains a high number of microsatellite sequences. The discovered genes encode a wide range of proteins including hypothetical proteins, enzymes, and nematode-specific proteins. Phylogenetic analysis based on 18S rRNA sequences indicates that the Spiruroidea is paraphyletic, in which Oxyspirura and its closely related species are sisters to the filarial nematodes. We have also developed a PCR protocol based on the ITS2 sequence that allows sensitive and specific detection of eye worm DNA in feces. Using this newly developed protocol, we have determined that ~28% to 33% of the fecal samples collected from Northern Bobwhites and Scaled Quail in Texas in the spring of 2013 are O. petrowi positive. Conclusions: The O. petrowi genome is rich in microsatellite sequences that may be used in future genotyping and molecular fingerprinting analysis. This eye worm is evolutionarily close to the filarial nematodes, implying that therapeutic strategies for filariasis such as Loa loa would be referential in developing treatments for the Thelazoidea parasites. Our qPCR-based survey has confirmed that O. petrowi infection is of potential concern to quail managers in Texas. PMID:24144118
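
    As an illustration of how an AT-content figure such as the 62.2% above is computed, here is a minimal sketch. It is not the authors' pipeline; the function name and the example sequence are made up, and the choice to ignore ambiguous bases such as N is an assumption of this sketch.

```python
def at_content(seq):
    """Return the percentage of A/T among unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are ignored (an assumption of this sketch).
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    total = at + gc
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * at / total


# Made-up sequence, for illustration only.
example = "ATTATAGCATNNTAAGC"
print(f"{at_content(example):.1f}% AT")
```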

  19. Asymmetric Transcript Discovery by RNA-seq in C. elegans Blastomeres Identifies neg-1, a Gene Important for Anterior Morphogenesis

    PubMed Central

    Osborne Nishimura, Erin; Zhang, Jay C.; Werts, Adam D.; Goldstein, Bob; Lieb, Jason D.

    2015-01-01

    After fertilization but prior to the onset of zygotic transcription, the C. elegans zygote cleaves asymmetrically to create the anterior AB and posterior P1 blastomeres, each of which goes on to generate distinct cell lineages. To understand how patterns of RNA inheritance and abundance arise after this first asymmetric cell division, we pooled hand-dissected AB and P1 blastomeres and performed RNA-seq. Our approach identified over 200 asymmetrically abundant mRNA transcripts. We confirmed symmetric or asymmetric abundance patterns for a subset of these transcripts using smFISH. smFISH also revealed heterogeneous subcellular patterning of the P1-enriched transcripts chs-1 and bpl-1. We screened transcripts enriched in a given blastomere for embryonic defects using RNAi. The gene neg-1 (F32D1.6) encoded an AB-enriched (anterior) transcript and was required for proper morphology of anterior tissues. In addition, analysis of the asymmetric transcripts yielded clues regarding the post-transcriptional mechanisms that control cellular mRNA abundance during asymmetric cell divisions, which are common in developing organisms. PMID:25875092

  20. Cys-loop ligand-gated ion channel gene discovery in the Locusta migratoria manilensis through the neuron transcriptome.

    PubMed

    Wang, Xin; Meng, Xiangkun; Liu, Chuanjun; Gao, Hongli; Zhang, Yixi; Liu, Zewen

    2015-05-01

    As an ideal model, Locusta migratoria manilensis (Meyen) has been widely used in the study of endocrinological and neurobiological processes. Here we created a large transcriptome of the locust neurons, which enriched ion channels whose potential for functional genetic experiments is currently limited. With high-throughput Illumina sequencing technology, we obtained more than 50 million raw reads, which were assembled into 61,056 unique sequences with an average size of 737 bp. Among the unigenes, a total of 24,884 sequences had significant similarities with proteins in the five public databases (NR, SwissProt, GO, COG and KEGG) with a cut-off E-value of 10^-5 using BLASTx. Moreover, the potential genes of the cys-loop ligand-gated ion channels (LGICs) were manually curated, including 39 putative nicotinic acetylcholine receptors (nAChRs), 6 putative γ-aminobutyric acid (GABA) gated anion channels, 21 putative glutamate-gated chloride channels (GluCls) and 1 histamine-gated chloride channel (HisCl). In addition, the full-length sequences of 11 nAChR subunits (9 alpha and 2 beta) were obtained by the RACE technique, which should be helpful to further studies on nAChR neurochemistry and pharmacology. To our knowledge, this is the first study to characterize the locust neuron transcriptome, which will provide a useful resource especially for future studies on the neuro-function and behavior of the locust. PMID:25701599
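
    The E-value cut-off step described above can be sketched as a filter over BLAST tabular output. This is an illustration only, not the authors' code: it assumes the standard 12-column tabular layout (query ID in column 1, E-value in column 11), and the function name and sample rows are hypothetical.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-5  # cut-off quoted in the abstract


def queries_with_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return the set of query IDs with at least one hit at or below the cutoff.

    Assumes 12-column tab-separated BLAST output: column 1 is the query ID
    and column 11 is the E-value. Blank and comment lines are skipped.
    """
    hits = set()
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        if float(row[10]) <= cutoff:
            hits.add(row[0])
    return hits


# Made-up rows, for illustration only.
sample = "\n".join([
    "unigene1\tprotA\t85.0\t200\t30\t0\t1\t600\t1\t200\t1e-50\t250",
    "unigene2\tprotB\t40.0\t50\t30\t0\t1\t150\t1\t50\t0.5\t30",
    "unigene3\tprotC\t60.0\t120\t48\t0\t1\t360\t1\t120\t2e-6\t90",
])
print(sorted(queries_with_hits(sample)))
```

    Counting the size of the returned set gives the number of unigenes with significant similarity under the chosen cut-off.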

  1. Developmental gene discovery in a hemimetabolous insect: de novo assembly and annotation of a transcriptome for the cricket Gryllus bimaculatus.

    PubMed

    Zeng, Victor; Ewen-Campen, Ben; Horch, Hadley W; Roth, Siegfried; Mito, Taro; Extavour, Cassandra G

    2013-01-01

    Most genomic resources available for insects represent the Holometabola, which are insects that undergo complete metamorphosis like beetles and flies. In contrast, the Hemimetabola (direct developing insects), representing the basal branches of the insect tree, have very few genomic resources. We have therefore created a large and publicly available transcriptome for the hemimetabolous insect Gryllus bimaculatus (cricket), a well-developed laboratory model organism whose potential for functional genetic experiments is currently limited by the absence of genomic resources. cDNA was prepared using mRNA obtained from adult ovaries containing all stages of oogenesis, and from embryo samples on each day of embryogenesis. Using 454 Titanium pyrosequencing, we sequenced over four million raw reads, and assembled them into 21,512 isotigs (predicted transcripts) and 120,805 singletons with an average coverage per base pair of 51.3. We annotated the transcriptome manually for over 400 conserved genes involved in embryonic patterning, gametogenesis, and signaling pathways. BLAST comparison of the transcriptome against the NCBI non-redundant protein database (nr) identified significant similarity to nr sequences for 55.5% of transcriptome sequences, and suggested that the transcriptome may contain 19,874 unique transcripts. For predicted transcripts without significant similarity to known sequences, we assessed their similarity to other orthopteran sequences, and determined that these transcripts contain recognizable protein domains, largely of unknown function. We created a searchable, web-based database to allow public access to all raw, assembled and annotated data. This database is to our knowledge the largest de novo assembled and annotated transcriptome resource available for any hemimetabolous insect. We therefore anticipate that these data will contribute significantly to more effective and higher-throughput deployment of molecular analysis tools in Gryllus. 
PMID:23671567

  2. Developmental Gene Discovery in a Hemimetabolous Insect: De Novo Assembly and Annotation of a Transcriptome for the Cricket Gryllus bimaculatus

    PubMed Central

    Zeng, Victor; Ewen-Campen, Ben; Horch, Hadley W.; Roth, Siegfried; Mito, Taro; Extavour, Cassandra G.

    2013-01-01

    Most genomic resources available for insects represent the Holometabola, which are insects that undergo complete metamorphosis like beetles and flies. In contrast, the Hemimetabola (direct developing insects), representing the basal branches of the insect tree, have very few genomic resources. We have therefore created a large and publicly available transcriptome for the hemimetabolous insect Gryllus bimaculatus (cricket), a well-developed laboratory model organism whose potential for functional genetic experiments is currently limited by the absence of genomic resources. cDNA was prepared using mRNA obtained from adult ovaries containing all stages of oogenesis, and from embryo samples on each day of embryogenesis. Using 454 Titanium pyrosequencing, we sequenced over four million raw reads, and assembled them into 21,512 isotigs (predicted transcripts) and 120,805 singletons with an average coverage per base pair of 51.3. We annotated the transcriptome manually for over 400 conserved genes involved in embryonic patterning, gametogenesis, and signaling pathways. BLAST comparison of the transcriptome against the NCBI non-redundant protein database (nr) identified significant similarity to nr sequences for 55.5% of transcriptome sequences, and suggested that the transcriptome may contain 19,874 unique transcripts. For predicted transcripts without significant similarity to known sequences, we assessed their similarity to other orthopteran sequences, and determined that these transcripts contain recognizable protein domains, largely of unknown function. We created a searchable, web-based database to allow public access to all raw, assembled and annotated data. This database is to our knowledge the largest de novo assembled and annotated transcriptome resource available for any hemimetabolous insect. We therefore anticipate that these data will contribute significantly to more effective and higher-throughput deployment of molecular analysis tools in Gryllus. 
PMID:23671567

  3. Discovery of precursor and mature microRNAs and their putative gene targets using high-throughput sequencing in pineapple (Ananas comosus var. comosus).

    PubMed

    Yusuf, Noor Hydayaty Md; Ong, Wen Dee; Redwan, Raimi Mohamed; Latip, Mariam Abd; Kumar, S Vijay

    2015-10-15

    MicroRNAs (miRNAs) are a class of small, endogenous non-coding RNAs that negatively regulate gene expression, resulting in the silencing of target mRNA transcripts through mRNA cleavage or translational inhibition. MiRNAs play significant roles in various biological and physiological processes in plants. However, the miRNA-mediated gene regulatory network in pineapple, the model tropical non-climacteric fruit, remains largely unexplored. Here, we report a complete list of pineapple mature miRNAs obtained from high-throughput small RNA sequencing and precursor miRNAs (pre-miRNAs) obtained from ESTs. Two small RNA libraries were constructed from pineapple fruits and leaves, respectively, using Illumina's Solexa technology. Sequence similarity analysis using miRBase revealed 579,179 reads homologous to 153 miRNAs from 41 miRNA families. In addition, a pineapple fruit transcriptome library consisting of approximately 30,000 EST contigs constructed using Solexa sequencing was used for the discovery of pre-miRNAs. In all, four pre-miRNAs were identified (MIR156, MIR399, MIR444 and MIR2673). Furthermore, the same pineapple transcriptome was used to dissect the function of the miRNAs in pineapple by predicting their putative targets in conjunction with their regulatory networks. In total, 23 metabolic pathways were found to be regulated by miRNAs in pineapple. The use of high-throughput sequencing in pineapples to unveil the presence of miRNAs and their regulatory pathways provides insight into the repertoire of miRNA regulation used exclusively in this non-climacteric model plant. PMID:26115767

  4. Systems biology discoveries using non-human primate pluripotent stem and germ cells: novel gene and genomic imprinting interactions as well as unique expression patterns

    PubMed Central

    2010-01-01

    The study of pluripotent stem cells has generated much interest in both biology and medicine. Understanding the fundamentals of biological decisions, including what permits a cell to maintain pluripotency, that is, its ability to self-renew and thereby remain immortal, or to differentiate into multiple types of cells, is of profound importance. For clinical applications, pluripotent cells, including both embryonic stem cells and adult stem cells, have been proposed for cell replacement therapy for a number of human diseases and disorders, including Alzheimer's, Parkinson's, spinal cord injury and diabetes. One challenge in their usage for such therapies is understanding the mechanisms that allow the maintenance of pluripotency and controlling the specific differentiation into required functional target cells. Because of regulatory restrictions and biological feasibilities, there are many crucial investigations that are just impossible to perform using pluripotent stem cells (PSCs) from humans (for example, direct comparisons among panels of inbred embryonic stem cells from prime embryos obtained from pedigreed and fertile donors; genomic analysis of parent versus progeny PSCs and their identical differentiated tissues; intraspecific chimera analyses for pluripotency testing; and so on). However, PSCs from nonhuman primates are being investigated to bridge these knowledge gaps between discoveries in mice and vital information necessary for appropriate clinical evaluations. In this review, we consider the mRNAs and novel genes with unique expression and imprinting patterns that were discovered using systems biology approaches with primate pluripotent stem and germ cells. PMID:20699013

  5. Generalized and specific neurocognitive deficits in psychotic disorders: utility for evaluating pharmacological treatment effects and as intermediate phenotypes for gene discovery.

    PubMed

    Reilly, James L; Sweeney, John A

    2014-05-01

    A growing body of research suggests that schizophrenia and bipolar disorder share overlapping clinical, neurobiological, and genetic features, raising important questions about the boundaries and distinctiveness of these 2 major psychiatric disorders. A generalized cognitive impairment has long been understood to be a core feature of schizophrenia. More recently, it has become apparent that cognitive impairment also occurs in bipolar disorder, particularly in those patients with a history of psychotic symptoms. Whether a generalized deficit exists across a spectrum of psychotic disorders is less clearly established. Additionally, in the context of a broad impairment, it remains a significant challenge to identify deficits in specific cognitive processes that may have distinct neurochemical or regional brain substrates and linkages to particular risk-associated genetic factors. In this article, we review the findings from neuropsychological studies across a spectrum that includes schizophrenia, schizoaffective and bipolar disorders, and conclude the available evidence strongly supports that a generalized deficit is present across psychotic disorders that differs in severity more so than form. We then consider the implications of generalized and specific deficits in psychosis for 2 areas of research--the evaluation of pharmacological treatments targeting cognitive deficits, and the investigation of cognitive intermediate phenotypes in family genetic studies. Examples from the literature that touch on the relevance of the generalized deficit in these contexts are provided, as well as consideration for the continued need to identify specific impairments that are separable from the generalized deficit in order to advance drug and gene discovery. PMID:24574307

  6. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools

    PubMed Central

    Quast, Christian; Pruesse, Elmar; Yilmaz, Pelin; Gerken, Jan; Schweer, Timmy; Yarza, Pablo; Peplies, Jörg; Glöckner, Frank Oliver

    2013-01-01

    SILVA (from Latin silva, forest, http://www.arb-silva.de) is a comprehensive web resource for up to date, quality-controlled databases of aligned ribosomal RNA (rRNA) gene sequences from the Bacteria, Archaea and Eukaryota domains and supplementary online services. The referred database release 111 (July 2012) contains 3 194 778 small subunit and 288 717 large subunit rRNA gene sequences. Since the initial description of the project, substantial new features have been introduced, including advanced quality control procedures, an improved rRNA gene aligner, online tools for probe and primer evaluation and optimized browsing, searching and downloading on the website. Furthermore, the extensively curated SILVA taxonomy and the new non-redundant SILVA datasets provide an ideal reference for high-throughput classification of data from next-generation sequencing approaches. PMID:23193283

  7. Genetic Predictors of Adverse Radiotherapy Effects: The Gene-PARE project

    SciTech Connect

    Ho, Alice Y.; Atencio, David P.; Peters, Sheila; Stock, Richard G.; Cesaretti, Jamie A.; Green, Sheryl; Formenti, Silvia C.; Haffty, Bruce; Drumea, Karen; Leitzin, Larisa M.D.; Kuten, Abraham; Azria, David; Ozsahin, Mahmut; Overgaard, Jens; Andreassen, Christian N.; Trop, Cynthia S.; Park, Janelle; Rosenstein, Barry S.

    2006-07-01

    Purpose: The development of adverse effects resulting from the radiotherapy of cancer limits the use of this treatment modality. The validation of a test capable of predicting which patients would be most likely to develop adverse responses to radiation treatment, based on the possession of specific genetic variants, would therefore be of value. The purpose of the Genetic Predictors of Adverse Radiotherapy Effects (Gene-PARE) project is to help achieve this goal. Methods and Materials: A continuously expanding biorepository has been created consisting of frozen lymphocytes and DNA isolated from patients treated with radiotherapy. In conjunction with this biorepository, a database is maintained with detailed clinical information pertaining to diagnosis, treatment, and outcome. The DNA samples are screened using denaturing high performance liquid chromatography (DHPLC) and the Surveyor nuclease assay for variants in ATM, TGFB1, XRCC1, XRCC3, SOD2, and hHR21. It is anticipated that additional genes that control the biologic response to radiation will be screened in future work. Results: Evidence has been obtained that possession of variants in genes, the products of which play a role in radiation response, is predictive for the development of adverse effects after radiotherapy. Conclusions: It is anticipated that the Gene-PARE project will yield information that will allow radiation oncologists to use genetic data to optimize treatment on an individual basis.

  8. Space Discovery.

    ERIC Educational Resources Information Center

    Blackman, Joan

    1998-01-01

    Describes one teacher's experience taking Space Discovery courses that were sponsored by the United States Space Foundation (USSF). These courses examine the history of space science, theory of orbits and rocketry, the effects of living in outer space on humans, and space weather. (DDR)

  9. Systems biology impact on antiepileptic drug discovery.

    PubMed

    Margineanu, Doru Georg

    2012-02-01

    Systems biology (SB), a recent trend in bioscience research to consider the complex interactions in biological systems from a holistic perspective, sees the disease as a disturbed network of interactions, rather than alteration of single molecular component(s). SB-relying network pharmacology replaces the prevailing focus on specific drug-receptor interaction and the corollary of rational drug design of "magic bullets", by the search for multi-target drugs that would act on biological networks as "magic shotguns". Epilepsy being a multi-factorial, polygenic and dynamic pathology, SB approach appears particularly fit and promising for antiepileptic drug (AED) discovery. In fact, long before the advent of SB, AED discovery already involved some SB-like elements. A reported SB project aimed to find out new drug targets in epilepsy relies on a relational database that integrates clinical information, recordings from deep electrodes and 3D-brain imagery with histology and molecular biology data on modified expression of specific genes in the brain regions displaying spontaneous epileptic activity. Since hitting a single target does not treat complex diseases, a proper pharmacological promiscuity might impart on an AED the merit of being multi-potent. However, multi-target drug discovery entails the complicated task of optimizing multiple activities of compounds, while having to balance drug-like properties and to control unwanted effects. Specific design tools for this new approach in drug discovery barely emerge, but computational methods making reliable in silico predictions of poly-pharmacology did appear, and their progress might be quite rapid. The current move away from reductionism into network pharmacology allows expecting that a proper integration of the intrinsic complexity of epileptic pathology in AED discovery might result in literally anti-epileptic drugs. PMID:22055355

  10. Discovery and molecular mapping of a new gene conferring resistance to stem rust, Sr53, derived from Aegilops geniculata and characterization of spontaneous translocation stocks with reduced alien chromatin

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This study reports the discovery and molecular mapping of a resistance gene effective against stem rust races RKQQC and TTKSK (Ug99) derived from Aegilops geniculata (2n=4x=28, UgUgMgMg). Two populations from the crosses TA5599 (T5DL-5MgL.5MgS)/TA3809 (ph1b mutant in Chinese Spring background) and T...

  11. Cosmic Discovery

    NASA Astrophysics Data System (ADS)

    Harwit, Martin

    1984-04-01

    In the remarkable opening section of this book, a well-known Cornell astronomer gives precise thumbnail histories of the 43 basic cosmic discoveries - stars, planets, novae, pulsars, comets, gamma-ray bursts, and the like - that form the core of our knowledge of the universe. Many of them, he points out, were made accidentally and outside the mainstream of astronomical research and funding. This observation leads him to speculate on how many more major phenomena there might be and how they might be most effectively sought out in a field now dominated by large instruments and complex investigative modes and observational conditions. The book also examines discovery in terms of its political, financial, and sociological context - the role of new technologies and of industry and the military in revealing new knowledge; and methods of funding, of peer review, and of allotting time on our largest telescopes. It concludes with specific recommendations for organizing astronomy in ways that will best lead to the discovery of the many - at least sixty - phenomena that Harwit estimates are still waiting to be found.

  12. Characterization of Capsicum annuum Genetic Diversity and Population Structure Based on Parallel Polymorphism Discovery with a 30K Unigene Pepper GeneChip

    PubMed Central

    Hill, Theresa A.; Ashrafi, Hamid; Reyes-Chin-Wo, Sebastian; Yao, JiQiang; Stoffel, Kevin; Truco, Maria-Jose; Kozik, Alexander; Michelmore, Richard W.; Van Deynze, Allen

    2013-01-01

    The widely cultivated pepper, Capsicum spp., important as a vegetable and spice crop world-wide, is one of the most diverse crops. To enhance breeding programs, a detailed characterization of Capsicum diversity including morphological, geographical and molecular data is required. Currently, molecular data characterizing Capsicum genetic diversity is limited. The development and application of high-throughput genome-wide markers in Capsicum will facilitate more detailed molecular characterization of germplasm collections, genetic relationships, and the generation of ultra-high density maps. We have developed the Pepper GeneChip® array from Affymetrix for polymorphism detection and expression analysis in Capsicum. Probes on the array were designed from 30,815 unigenes assembled from expressed sequence tags (ESTs). Our array design provides a maximum redundancy of 13 probes per base pair position allowing integration of multiple hybridization values per position to detect single position polymorphism (SPP). Hybridization of genomic DNA from 40 diverse C. annuum lines, used in breeding and research programs, and a representative from three additional cultivated species (C. frutescens, C. chinense and C. pubescens) detected 33,401 SPP markers within 13,323 unigenes. Among the C. annuum lines, 6,426 SPPs covering 3,818 unigenes were identified. An estimated three-fold reduction in diversity was detected in non-pungent compared with pungent lines; however, we were able to detect 251 highly informative markers across these C. annuum lines. In addition, an 8.7 cM region without polymorphism was detected around Pun1 in non-pungent C. annuum. An analysis of genetic relatedness and diversity using the software Structure revealed clustering of the germplasm which was confirmed with statistical support by principal components analysis (PCA) and phylogenetic analysis. This research demonstrates the effectiveness of parallel high-throughput discovery and application of genome-wide transcript-based markers to assess genetic and genomic features among Capsicum annuum. PMID:23409153

  13. The role of the cut gene in the specification of central projections by sensory axons in Drosophila.

    PubMed

    Merritt, D J; Hawken, A; Whitington, P M

    1993-04-01

    Mutations in the cut gene transform sense organs in Drosophila embryos from external sensory (es) receptors to chordotonal (ch) organs. We have investigated whether their central axonal projections are also transformed. Following Lucifer yellow injection of the sensory neuron, wild-type es and ch organs show characteristic, different projection patterns in the CNS. Transformed es neurons in cut embryos are variable in their projection patterns: some resemble wild-type es neurons, others ch neurons, while yet others are unlike either of these. We conclude that the cut gene influences axonal projections, although its action as a simple modality switch is open to question. Additional genes could be involved in the specification of the central axonal projection of the transformed neurons. PMID:8386531

  14. Construction and Evaluation of Normalized cDNA Libraries Enriched with Full-Length Sequences for Rapid Discovery of New Genes from Sisal (Agave sisalana Perr.) Different Developmental Stages

    PubMed Central

    Zhou, Wen-Zhao; Zhang, Yan-Mei; Lu, Jun-Ying; Li, Jun-Feng

    2012-01-01

    To provide a resource of sisal-specific expressed sequence data and facilitate this powerful approach in new gene research, the preparation of normalized cDNA libraries enriched with full-length sequences is necessary. Four libraries were produced with RNA pooled from multiple Agave sisalana tissues, using the SMART™ method and duplex-specific nuclease (DSN), to increase the efficiency of normalization and maximize the number of independent genes. This procedure kept the proportion of full-length cDNAs in the subtracted/normalized libraries and dramatically enhanced the discovery of new genes. Sequencing of 3875 cDNA clones from the libraries revealed 3320 unigenes with an average insert length of about 1.2 kb, indicating that the non-redundancy of the libraries was about 85.7%. These unigene functions were predicted by comparing their sequences to functional domain databases and extensively annotated with Gene Ontology (GO) terms. Comparative analysis of sisal unigenes and other plant genomes revealed that four putative MADS-box genes and a knotted-like homeobox (knox) gene were obtained from a total of 1162 full-length transcripts. Furthermore, real-time PCR showed that the characteristics of their transcripts mainly depended on the tight expression regulation of a number of genes during leaf and flower development. Analysis of individual library sequence data indicated that the pooled-tissue approach was highly effective in discovering new genes and preparing libraries for efficient deep sequencing. PMID:23202944
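
    The quoted non-redundancy figure follows directly from the two counts in the abstract. The sketch below assumes non-redundancy is defined as unigenes divided by sequenced clones, which is consistent with the reported numbers but is not stated explicitly in the abstract; the function name is hypothetical.

```python
def non_redundancy_percent(unigenes, clones_sequenced):
    """Percentage of sequenced clones that represent distinct unigenes.

    Assumes non-redundancy = unigenes / clones sequenced (an interpretation
    consistent with the figures in the abstract).
    """
    if clones_sequenced <= 0:
        raise ValueError("clones_sequenced must be positive")
    return 100.0 * unigenes / clones_sequenced


# Counts quoted in the abstract: 3320 unigenes from 3875 clones.
print(f"{non_redundancy_percent(3320, 3875):.1f}%")
```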

  15. TOXICOGENOMICS DRUG DISCOVERY AND THE PATHOLOGIST

    EPA Science Inventory

    Toxicogenomics, drug discovery, and pathologist.

    The field of toxicogenomics, which currently focuses on the application of large-scale differential gene expression (DGE) data to toxicology, is starting to influence drug discovery and development in the pharmaceutical indu...

  16. Gene Discovery and Expression Profile Analysis through Sequencing of Expressed Sequence Tags from Different Developmental Stages of the Chytridiomycete Blastocladiella emersonii†

    PubMed Central

    Ribichich, Karina F.; Salem-Izacc, Silvia M.; Georg, Raphaela C.; Vêncio, Ricardo Z. N.; Navarro, Luci D.; Gomes, Suely L.

    2005-01-01

    Blastocladiella emersonii is an aquatic fungus of the chytridiomycete class which diverged early from the fungal lineage and is notable for the morphogenetic processes which occur during its life cycle. Its particular taxonomic position makes this fungus an interesting system to be considered when investigating phylogenetic relationships and studying the biology of lower fungi. To contribute to the understanding of the complexity of the B. emersonii genome, we present here a survey of expressed sequence tags (ESTs) from various stages of the fungal development. Nearly 20,000 cDNA clones from 10 different libraries were partially sequenced from their 5′ end, yielding 16,984 high-quality ESTs. These ESTs were assembled into 4,873 putative transcripts, of which 48% presented no matches with existing sequences in public databases. As a result of Gene Ontology (GO) project annotation, 1,680 ESTs (35%) were classified into biological processes of the GO structure, with transcription and RNA processing, protein biosynthesis, and transport as prevalent processes. We also report full-length sequences, useful for construction of molecular phylogenies, and several ESTs that showed high similarity with known proteins, some of which were not previously described in fungi. Furthermore, we analyzed the expression profile (digital Northern analysis) of each transcript throughout the life cycle of the fungus using Bayesian statistics. The in silico approach was validated by Northern blot analysis with good agreement between the two methodologies. PMID:15701807

  17. The Extragalactic Distance Scale Key Project III. The discovery of Cepheids and a New Distance to M101 Using the Hubble Space Telescope

    NASA Technical Reports Server (NTRS)

    Kelson, Daniel D.; Madore, Barry

    1994-01-01

    We report on the discovery of 29 Cepheid variables in the galaxy M101 after using the original Wide Field Camera (WFC 1) and the new Wide Field and Planetary Camera (WFPC 2) on the Hubble Space Telescope (HST) to observe a field in M101 at 14 independent epochs in F555W.

  18. Wildlife Discovery.

    ERIC Educational Resources Information Center

    Silverman, Beth; And Others

    This pocket folder of instructional materials is designed to introduce youths aged 9 to 12 to the nature and needs of wildlife and to give children the opportunity to search for wildlife and their signs. The document includes a member's guide, a leader's guide, field record forms, and wildlife project materials. The illustrated 4-H member's guide…

  19. De novo assembly of the pepper transcriptome (Capsicum annuum): a benchmark for in silico discovery of SNPs, SSRs and candidate genes

    PubMed Central

    2012-01-01

    Background Molecular breeding of pepper (Capsicum spp.) can be accelerated by developing DNA markers associated with transcriptomes in breeding germplasm. Before the advent of next generation sequencing (NGS) technologies, the majority of sequencing data were generated by the Sanger sequencing method. By leveraging Sanger EST data, we have generated a wealth of genetic information for pepper including thousands of SNPs and Single Position Polymorphic (SPP) markers. To complement and enhance these resources, we applied NGS to three pepper genotypes: Maor, Early Jalapeño and Criollo de Morelos-334 (CM334) to identify SNPs and SSRs in the assembly of these three genotypes. Results Two pepper transcriptome assemblies were developed with different purposes. The first reference sequence, assembled by CAP3 software, comprises 31,196 contigs from >125,000 Sanger-EST sequences that were mainly derived from a Korean F1-hybrid line, Bukang. Overlapping probes were designed for 30,815 unigenes to construct a pepper Affymetrix GeneChip® microarray for whole genome analyses. In addition, custom Python scripts were used to identify 4,236 SNPs in contigs of the assembly. A total of 2,489 simple sequence repeats (SSRs) were identified from the assembly, and primers were designed for the SSRs. Annotation of contigs using Blast2GO software resulted in information for 60% of the unigenes in the assembly. The second transcriptome assembly was constructed from more than 200 million Illumina Genome Analyzer II reads (80–120 nt) using a combination of Velvet, CLC workbench and CAP3 software packages. BWA, SAMtools and in-house Perl scripts were used to identify SNPs among three pepper genotypes. The SNPs were filtered to be at least 50 bp from any intron-exon junctions as well as flanking SNPs. More than 22,000 high-quality putative SNPs were identified. 
Using the MISA software, 10,398 SSR markers were also identified within the Illumina transcriptome assembly and primers were designed for the identified markers. The assembly was annotated by Blast2GO and 14,740 (12%) of annotated contigs were associated with functional proteins. Conclusions Before the availability of a pepper genome sequence, assembling transcriptomes of this economically important crop was required to generate thousands of high-quality molecular markers that could be used in breeding programs. In order to have a better understanding of the assembled sequences and to identify candidate genes underlying QTLs, we annotated the contigs of the Sanger-EST and Illumina transcriptome assemblies. This and other information has been curated in a database dedicated to the pepper project. PMID:23110314
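
    The SNP filtering rule mentioned in this abstract (SNPs at least 50 bp from any intron-exon junction and from flanking SNPs) can be illustrated with a minimal sketch. The authors report using BWA, SAMtools and in-house Perl scripts; the Python function below is not their code. The function name, the coordinate representation (integer positions on a single contig) and the example positions are all hypothetical, chosen only to show the distance rule.

```python
from typing import List


def filter_snps(snp_positions: List[int],
                junctions: List[int],
                min_dist: int = 50) -> List[int]:
    """Keep SNPs that lie at least `min_dist` bp from every junction
    and from every neighbouring SNP (hypothetical helper, single contig)."""
    snps = sorted(snp_positions)
    kept = []
    for i, pos in enumerate(snps):
        # Too close to an intron-exon junction?
        near_junction = any(abs(pos - j) < min_dist for j in junctions)
        # Too close to the previous or next SNP on the contig?
        near_prev = i > 0 and pos - snps[i - 1] < min_dist
        near_next = i + 1 < len(snps) and snps[i + 1] - pos < min_dist
        if not (near_junction or near_prev or near_next):
            kept.append(pos)
    return kept


# Made-up positions: 100 and 120 are too close to each other,
# 300 is within 50 bp of the junction at 310.
print(filter_snps([100, 120, 300, 500, 560], junctions=[310, 800]))
```

    A distance of exactly 50 bp passes the filter here, which is one reading of "at least 50 bp"; a real pipeline would also need to handle multiple contigs and strand-aware junction coordinates.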

  20. Harry Stottlemeier's Discovery [Revised Edition].

    ERIC Educational Resources Information Center

    Lipman, Matthew

    "Harry Stottlemeier's Discovery" is the student book for the project in philosophical thinking described in SO 008 123-126. It offers a model of dialogue -- both of children with one another and of children with adults. The story is set among a classroom of children who begin to understand the basics of logical reasoning when Harry, who isn't…

  1. DISCOVERY IN THE URBAN SPRAWL.

    ERIC Educational Resources Information Center

    HYMOVITZ, LEON

    FOR A CULTURAL ENRICHMENT PROJECT ("DISCOVERY") IN A DISADVANTAGED PHILADELPHIA HIGH SCHOOL, ATTENDANCE AT MUSIC, ART, AND THEATER EVENTS EARNED POINTS TOWARD A CERTIFICATE. THE STUDENTS ELECTED THE EVENTS FROM A PREPARED LIST OF ACTIVITIES, WHICH OFTEN WERE MADE PART OF THE ACADEMIC PROGRAM AND THE SCHOOL ASSEMBLIES. AS WELL AS OFFERING…

  2. How well do HapMap haplotypes identify common haplotypes of genes? A comparison with haplotypes of 334 genes resequenced in the environmental genome project.

    PubMed

    Taylor, Jack A; Xu, Zong-Li; Kaplan, Norman L; Morris, Richard W

    2006-01-01

    One of the goals of the International HapMap Project is the identification of common haplotypes in genes. However, HapMap uses an incomplete catalogue of single nucleotide polymorphisms (SNPs) and might miss some common haplotypes. We examined this issue using data from the Environmental Genome Project (EGP) which resequenced 335 genes in 90 people, and thus, has a nearly complete catalogue of gene SNPs. The EGP identified a total of 45,243 SNPs, of which 10,780 were common SNPs (minor allele frequency ≥0.1). Using EGP common SNP genotype data, we identified 1,459 haplotypes with frequency ≥0.05 and we use these as "benchmark" haplotypes. HapMap release 16 had genotype information for 1,573 of 10,780 (15%) EGP common SNPs. Using these SNPs, we identified common HapMap haplotypes (frequency ≥0.05) in each of the four HapMap ethnic groups. To compare common HapMap haplotypes to EGP benchmark haplotypes, we collapsed benchmark haplotypes to the set of 1,573 SNPs. Ninety-eight percent of the collapsed benchmark haplotypes could be found as common HapMap haplotypes in one or more of the four HapMap ethnic groups. However, collapsing benchmark haplotypes to the set of SNPs available in HapMap resulted in a loss of haplotype information: 545 of 1,459 (37%) benchmark haplotypes were uniquely identified, and only 25% of genes had all their benchmark haplotypes uniquely identified. We resampled the EGP data to examine the effect of increasing the number of HapMap SNPs to 5 million, and estimate that approximately 40% of common SNPs in genes will be sampled and that half of the genes will have sufficient SNPs to identify all common haplotypes. This inability to distinguish common haplotypes of genes may result in loss of power when examining haplotype-disease association. PMID:16434598
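
    The idea of "collapsing" benchmark haplotypes to the subset of SNPs that were genotyped, and then asking which haplotypes remain uniquely identified, can be shown with a toy sketch. This is not the authors' analysis code: the function names, the string encoding of haplotypes (one allele character per SNP) and the four example haplotypes are assumptions made purely for illustration.

```python
from collections import Counter
from typing import List, Sequence


def collapse(haplotype: str, typed_sites: Sequence[int]) -> str:
    """Reduce a haplotype to the alleles at the genotyped SNP indices."""
    return "".join(haplotype[i] for i in typed_sites)


def uniquely_identified(haplotypes: List[str],
                        typed_sites: Sequence[int]) -> List[str]:
    """Return the haplotypes whose collapsed form is shared with no other."""
    collapsed = {h: collapse(h, typed_sites) for h in haplotypes}
    counts = Counter(collapsed.values())
    return [h for h in haplotypes if counts[collapsed[h]] == 1]


# Toy benchmark haplotypes over four SNPs (invented data).
benchmark = ["ACGT", "ACGA", "TCGT", "TGCA"]

# With only the first two SNPs genotyped, "ACGT" and "ACGA" both
# collapse to "AC" and can no longer be told apart.
print(uniquely_identified(benchmark, typed_sites=[0, 1]))
```

    With all four sites typed, every toy haplotype is uniquely identified; dropping sites merges some of them, which is the loss of haplotype information the abstract quantifies.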

  3. De novo Transcriptome Assembly of Common Wild Rice (Oryza rufipogon Griff.) and Discovery of Drought-Response Genes in Root Tissue Based on Transcriptomic Data

    PubMed Central

    Zhang, Jing-wen; Wang, Yan-yan; Li, Wei-min; Peng, Yu-fa; Yuan, Qian-hua; Pei, Xin-wu

    2015-01-01

    Background The perennial O. rufipogon (common wild rice), which is considered to be the ancestor of Asian cultivated rice species, contains many useful genetic resources, including drought resistance genes. However, few studies have identified the drought resistance and tissue-specific genes in common wild rice. Results In this study, transcriptome sequencing libraries were constructed, including drought-treated roots (DR) and control leaves (CL) and roots (CR). Using Illumina sequencing technology, we generated 16.75 gigabases of high-quality sequence data for common wild rice and conducted de novo assembly and annotation of genes without prior genome information. These reads were assembled into 119,332 unigenes with an average length of 715 bp. A total of 88,813 distinct sequences (74.42% of unigenes) significantly matched known genes in the NCBI NT database. Differentially expressed gene (DEG) analysis showed that 3617 genes were up-regulated and 4171 genes were down-regulated in the CR library compared with the CL library. Among the DEGs, 535 genes were expressed in roots but not in shoots. A similar comparison between the DR and CR libraries showed that 1393 genes were up-regulated and 315 genes were down-regulated in the DR library compared with the CR library. Finally, 37 genes that were specifically expressed in roots were screened after comparing the DEGs identified in the above-described analyses. Conclusion This study provides a transcriptome sequence resource for common wild rice plants and establishes a digital gene expression profile of wild rice plants under drought conditions using the assembled transcriptome data as a reference. Several tissue-specific and drought-stress-related candidate genes were identified, representing a fully characterized transcriptome and providing a valuable resource for genetic and genomic studies in plants. PMID:26134138

  4. Novel interferon delta genes in mammals: cloning of one gene from the sheep, two genes expressed by the horse conceptus and discovery of related sequences in several taxa by genomic database screening.

    PubMed

    Cochet, Marielle; Vaiman, Daniel; Lefèvre, François

    2009-03-15

    Interferons (IFNs) are cytokines of vertebrates with many biological effects including antiviral, immunoregulatory and antiproliferative activities. Among them, mammalian type I IFNs represent a large family of related proteins, mainly virus-inducible, divided into 10 distinct subfamilies named alpha, beta, omega, delta, epsilon, alphaomega, nu, tau, kappa and zeta (or Limitin). Some type I IFN subfamilies are physiologically expressed by the conceptus during early pregnancy in ungulates. This is the case in ruminants with IFN-tau (which triggers the maintenance of the maternal corpus luteum during early pregnancy) and in the pig with IFN-delta, a type I IFN that was, until now, only described in this species (Lefèvre, F. and Boulay, V., 1993; Lefèvre, F., Guillomot, M., D'Andrea, S., Battegay, S. and La Bonnardière, C., 1998a) and whose biological role in early maternal-fetal interactions is unknown. We present here definitive evidence that IFN-delta is actually more widely represented among mammals. We report the cloning of three genes coding for non-porcine and biologically active IFN-deltas: one from the sheep, named OvIFN-delta, and two from the horse, named EqIFN-delta1 and EqIFN-delta2. Interestingly, OvIFN-delta (139 aa) appears to be the shortest natural type I IFN presently known. Moreover, we identified by genomic database screening nineteen potentially functional IFN-delta genes from various species belonging to the four mammalian lineages (Afrotheria, Xenarthra, Euarchontoglires and Laurasiatheria). These novel type I IFNs display a high rate of identity with previously known porcine IFN-deltas. Phylogenetic analysis indicates that the newly defined IFN-delta family includes murine Limitin/IFN-zeta, their closest neighbor, but is clearly distinct from all other type I families. 
We also show that, although OvIFN-delta gene transcripts are not detectable in Day-14 and Day-15 sheep conceptuses, the two equine genes are expressed by the horse conceptus tissues at the beginning of pregnancy (Day 16 and Day 22). This suggests that, similar to the pig, IFN-delta could play an important role in maternal-fetal interactions during early pregnancy in the horse. PMID:19110041

  5. STS-114: Discovery Impromptu Briefing

    NASA Technical Reports Server (NTRS)

    2005-01-01

    Dr. Griffin, NASA Administrator, is accompanied by members of the U.S. Congress in this STS-114 Discovery Impromptu briefing. The members present include: Sherwood Boehlert, House Science Committee Chairman, Senator Hutchinson, Sheila Jackson, 18th Congressional District Texas, Al Green, 9th Congressional District, Representative Jim Davis, Florida, and Gene Green, 29th District, Texas. Griffin talks about the problem that occurred with the external fuel tank sensor of the Space Shuttle Discovery and the effort NASA is pursuing to track the problem, and identify the root cause. He answers questions from the news media about the next steps for the Space Shuttle Discovery, time frame for the launch, and activities for the astronauts for the next few days.

  6. The Status, Quality, and Expansion of the NIH Full-Length cDNA Project: The Mammalian Gene Collection (MGC)

    PubMed Central

    2004-01-01

    The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5′-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline. PMID:15489334

  7. Evolution of hedgehog and hedgehog-related genes, their origin from Hog proteins in ancestral eukaryotes and discovery of a novel Hint motif

    PubMed Central

    Bürglin, Thomas R

    2008-01-01

    Background The Hedgehog (Hh) signaling pathway plays important roles in human and animal development as well as in carcinogenesis. Hh molecules have been found in both protostomes and deuterostomes, but curiously the nematode Caenorhabditis elegans lacks a bona-fide Hh. Instead a series of Hh-related proteins are found, which share the Hint/Hog domain with Hh, but have distinct N-termini. Results We performed extensive genome searches in species such as the cnidarian Nematostella vectensis and several nematodes to gain further insights into Hh evolution. We found six genes in N. vectensis with a relationship to Hh: two Hh genes, one gene with a Hh N-terminal domain fused to a von Willebrand factor type A domain (VWA), and three genes containing Hint/Hog domains with distinct novel N-termini. In the nematode Brugia malayi we find the same types of hh-related genes as in C. elegans. In the more distantly related Enoplea nematodes Xiphinema and Trichinella spiralis we find a bona-fide Hh. In addition, T. spiralis also has a quahog gene like C. elegans, and there are several additional hh-related genes, some of which have secreted N-terminal domains of only 15 to 25 residues. Examination of other Hh pathway components revealed that T. spiralis - like C. elegans - lacks some of these components. Extending our search to all eukaryotes, we recovered genes containing a Hog domain similar to Hh from many different groups of protists. In addition, we identified a novel Hint gene family present in many eukaryote groups that encodes a VWA domain fused to a distinct Hint domain we call Vint. Further members of a poorly characterized Hint family were also retrieved from bacteria. Conclusion In Cnidaria and nematodes the evolution of hh genes occurred in parallel to the evolution of other genes that contain a Hog domain but have different N-termini. 
The fact that Hog genes comprising a secreted N-terminus and a Hog domain are found in many protists indicates that this gene family must have arisen in very early eukaryotic evolution, and gave rise eventually to hh and hh-related genes in animals. The results indicate a hitherto unsuspected ability of Hog domain encoding genes to evolve new N-termini. In one instance in Cnidaria, the Hh N-terminal signaling domain is associated with a VWA domain and lacks a Hog domain, suggesting a modular mode of evolution also for the N-terminal domain. The Hog domain proteins, the inteins and VWA-Vint proteins are three families of Hint domain proteins that evolved in parallel in eukaryotes. PMID:18334026

  8. Discovery of rare variants for complex phenotypes.

    PubMed

    Kosmicki, Jack A; Churchhouse, Claire L; Rivas, Manuel A; Neale, Benjamin M

    2016-06-01

    With the rise of sequencing technologies, it is now feasible to assess the role rare variants play in the genetic contribution to complex trait variation. While some of the earlier targeted sequencing studies successfully identified rare variants of large effect, unbiased gene discovery using exome sequencing has experienced limited success for complex traits. Nevertheless, rare variant association studies have demonstrated that rare variants do contribute to phenotypic variability, but sample sizes will likely have to be even larger than those of common variant association studies to be powered for the detection of genes and loci. Large-scale sequencing efforts of tens of thousands of individuals, such as the UK10K Project and aggregation efforts such as the Exome Aggregation Consortium, have made great strides in advancing our knowledge of the landscape of rare variation, but there remain many considerations when studying rare variation in the context of complex traits. We discuss these considerations in this review, presenting a broad range of topics at a high level as an introduction to rare variant analysis in complex traits including the issues of power, study design, sample ascertainment, de novo variation, and statistical testing approaches. Ultimately, as sequencing costs continue to decline, larger sequencing studies will yield clearer insights into the biological consequence of rare mutations and may reveal which genes play a role in the etiology of complex traits. PMID:27221085

  9. High-resolution mapping of the S-locus in Turnera leads to the discovery of three genes tightly associated with the S-alleles.

    PubMed

    Labonne, Jonathan J D; Goultiaeva, Alina; Shore, Joel S

    2009-06-01

    While the breeding system known as distyly has been used as a model system in genetics and evolutionary biology for over a century, the genes determining this system remain unknown. To positionally clone genes determining distyly, a high-resolution map of the S-locus region of Turnera has been constructed using segregation data from 2,013 backcross progeny. We discovered three putative genes tightly linked with the S-locus. An N-acetyltransferase (TkNACE) flanks the S-locus at 0.35 cM while a sulfotransferase (TkST1) and a non-LTR retroelement (TsRETRO) show complete linkage to the S-locus. An assay of population samples of six species revealed that TsRETRO, initially discovered in diploid Turnera subulata, is also associated with the S-allele in tetraploid T. subulata and diploid Turnera scabra. The sulfotransferase gene shows some level of differential expression in long versus short styles, indicating it might be involved in some aspect of distyly. The complete linkage of TkST1 and TsRETRO to the S-locus suggests that both genes may reside within, or in the immediate vicinity of, the S-locus. Chromosome walking has been initiated using one of the genes discovered in the present study to identify the genes determining distyly. PMID:19283410

  10. Code-Assisted Discovery of TAL Effector Targets in Bacterial Leaf Streak of Rice Reveals Contrast with Bacterial Blight and a Novel Susceptibility Gene

    PubMed Central

    Cernadas, Raul A.; Doyle, Erin L.; Niño-Liu, David O.; Wilkins, Katherine E.; Bancroft, Timothy; Wang, Li; Schmidt, Clarice L.; Caldo, Rico; Yang, Bing; White, Frank F.; Nettleton, Dan; Wise, Roger P.; Bogdanove, Adam J.

    2014-01-01

    Bacterial leaf streak of rice, caused by Xanthomonas oryzae pv. oryzicola (Xoc) is an increasingly important yield constraint in this staple crop. A mesophyll colonizer, Xoc differs from X. oryzae pv. oryzae (Xoo), which invades xylem to cause bacterial blight of rice. Both produce multiple distinct TAL effectors, type III-delivered proteins that transactivate effector-specific host genes. A TAL effector finds its target(s) via a partially degenerate code whereby the modular effector amino acid sequence identifies nucleotide sequences to which the protein binds. Virulence contributions of some Xoo TAL effectors have been shown, and their relevant targets, susceptibility (S) genes, identified, but the role of TAL effectors in leaf streak is uncharacterized. We used host transcript profiling to compare leaf streak to blight and to probe functions of Xoc TAL effectors. We found that Xoc and Xoo induce almost completely different host transcriptional changes. Roughly one in three genes upregulated by the pathogens is preceded by a candidate TAL effector binding element. Experimental analysis of the 44 such genes predicted to be Xoc TAL effector targets verified nearly half, and identified most others as false predictions. None of the Xoc targets is a known bacterial blight S gene. Mutational analysis revealed that Tal2g, which activates two genes, contributes to lesion expansion and bacterial exudation. Use of designer TAL effectors discriminated a sulfate transporter gene as the S gene. Across all targets, basal expression tended to be higher than genome-average, and induction moderate. Finally, machine learning applied to real vs. falsely predicted targets yielded a classifier that recalled 92% of the real targets with 88% precision, providing a tool for better target prediction in the future. Our study expands the number of known TAL effector targets, identifies a new class of S gene, and improves our ability to predict functional targeting. PMID:24586171
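
    The recall and precision figures quoted for the target classifier follow the standard definitions: recall is the fraction of real targets the classifier recovers, and precision is the fraction of predicted targets that are real. The sketch below only illustrates those definitions. It is not the authors' classifier, the function name is hypothetical, and the label vectors are invented toy data rather than the study's results.

```python
from typing import Sequence, Tuple


def precision_recall(truth: Sequence[bool],
                     predicted: Sequence[bool]) -> Tuple[float, float]:
    """Compute (precision, recall) for binary labels of equal length."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)      # real, predicted real
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)  # false, predicted real
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)  # real, missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy example: 4 real targets and 4 false predictions (invented labels).
truth = [True, True, True, True, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False]
print(precision_recall(truth, predicted))
```

    In this toy case three of four real targets are recovered and three of four positive calls are correct, so both values are 0.75.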

  11. Discovery of Gene Cluster for Mycosporine-Like Amino Acid Biosynthesis from Actinomycetales Microorganisms and Production of a Novel Mycosporine-Like Amino Acid by Heterologous Expression

    PubMed Central

    Miyamoto, Kiyoko T.; Komatsu, Mamoru; Ikeda, Haruo

    2014-01-01

    Mycosporines and mycosporine-like amino acids (MAAs), including shinorine (mycosporine-glycine-serine) and porphyra-334 (mycosporine-glycine-threonine), are UV-absorbing compounds produced by cyanobacteria, fungi, and marine micro- and macroalgae. These MAAs have the ability to protect these organisms from damage by environmental UV radiation. Although no reports have described the production of MAAs and the corresponding genes involved in MAA biosynthesis from Gram-positive bacteria to date, genome mining of the Gram-positive bacterial database revealed that two microorganisms belonging to the order Actinomycetales, Actinosynnema mirum DSM 43827 and Pseudonocardia sp. strain P1, possess a gene cluster homologous to the biosynthetic gene clusters identified from cyanobacteria. When the two strains were grown in liquid culture, Pseudonocardia sp. accumulated a very small amount of MAA-like compound in a medium-dependent manner, whereas A. mirum did not produce MAAs under any culture conditions, indicating that the biosynthetic gene cluster of A. mirum was in a cryptic state in this microorganism. In order to characterize these biosynthetic gene clusters, each biosynthetic gene cluster was heterologously expressed in an engineered host, Streptomyces avermitilis SUKA22. Since the resultant transformants carrying the entire biosynthetic gene cluster controlled by an alternative promoter produced mainly shinorine, this is the first confirmation of a biosynthetic gene cluster for MAA from Gram-positive bacteria. Furthermore, S. avermitilis SUKA22 transformants carrying the biosynthetic gene cluster for MAA of A. mirum accumulated not only shinorine and porphyra-334 but also a novel MAA. Structure elucidation revealed that the novel MAA is mycosporine-glycine-alanine, which substitutes l-alanine for the l-serine of shinorine. PMID:24907338

  12. Discovery of gene cluster for mycosporine-like amino acid biosynthesis from Actinomycetales microorganisms and production of a novel mycosporine-like amino acid by heterologous expression.

    PubMed

    Miyamoto, Kiyoko T; Komatsu, Mamoru; Ikeda, Haruo

    2014-08-01

    Mycosporines and mycosporine-like amino acids (MAAs), including shinorine (mycosporine-glycine-serine) and porphyra-334 (mycosporine-glycine-threonine), are UV-absorbing compounds produced by cyanobacteria, fungi, and marine micro- and macroalgae. These MAAs have the ability to protect these organisms from damage by environmental UV radiation. Although no reports have described the production of MAAs and the corresponding genes involved in MAA biosynthesis from Gram-positive bacteria to date, genome mining of the Gram-positive bacterial database revealed that two microorganisms belonging to the order Actinomycetales, Actinosynnema mirum DSM 43827 and Pseudonocardia sp. strain P1, possess a gene cluster homologous to the biosynthetic gene clusters identified from cyanobacteria. When the two strains were grown in liquid culture, Pseudonocardia sp. accumulated a very small amount of MAA-like compound in a medium-dependent manner, whereas A. mirum did not produce MAAs under any culture conditions, indicating that the biosynthetic gene cluster of A. mirum was in a cryptic state in this microorganism. In order to characterize these biosynthetic gene clusters, each biosynthetic gene cluster was heterologously expressed in an engineered host, Streptomyces avermitilis SUKA22. Since the resultant transformants carrying the entire biosynthetic gene cluster controlled by an alternative promoter produced mainly shinorine, this is the first confirmation of a biosynthetic gene cluster for MAA from Gram-positive bacteria. Furthermore, S. avermitilis SUKA22 transformants carrying the biosynthetic gene cluster for MAA of A. mirum accumulated not only shinorine and porphyra-334 but also a novel MAA. Structure elucidation revealed that the novel MAA is mycosporine-glycine-alanine, which substitutes l-alanine for the l-serine of shinorine. PMID:24907338

  13. A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus

    PubMed Central

    Marques, M Carmen; Alonso-Cantabrana, Hugo; Forment, Javier; Arribas, Raquel; Alamar, Santiago; Conejero, Vicente; Perez-Amador, Miguel A

    2009-01-01

    Background Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation. Results We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis. 
Conclusion The new EST collection denotes an important step towards the identification of all genes in the citrus genome. Furthermore, public availability of the cDNA clones generated in this study, and not only their sequence, enables testing of the biological function of the genes represented in the collection. Expression of the citrus SEP3 homologue, CitrSEP, in Arabidopsis results in early flowering, along with other phenotypes resembling the over-expression of the Arabidopsis SEPALLATA genes. Our findings suggest that the members of the SEP gene family play similar roles in these quite distant plant species. PMID:19747386

  14. Discovery of a Strongly-Interrelated Gene Network in Corals under Constant Darkness by Correlation Analysis after Wavelet Transform on Complex Network Model

    PubMed Central

    Zhou, Xilong; Liu, Xuefeng; Zhang, Zhaobao; Wang, Xumin; Liu, Tao; Liu, Guiming

    2014-01-01

    Coral reefs occupy a relatively small portion of sea area, yet serve as a crucial source of biodiversity by establishing harmonious ecosystems with marine plants and animals. Previous research has mainly focused on screening several key genes induced by stress. Here we propose a novel method, correlation analysis after wavelet transform on a complex network model, to explore the effect of light on gene expression in the coral Acropora millepora based on microarray data. In this method, the wavelet transform and the concept of a complex network were adopted, and 50 key genes with large differences were finally captured, including both annotated genes and novel genes without accurate annotation. These results shed light on our understanding of the coral's response to light changes and the genome-wide interaction among genes under the control of biorhythm, and hence help us to better protect coral reef ecosystems. Further studies are needed to explore how functional connections are related to structural connections, and how connectivity arises from the interactions within and between different systems. The method introduced in this study for analyzing microarray data will allow researchers to explore genome-wide interaction networks with their own datasets and understand the relevant biological processes. PMID:24651851

  15. Transcriptome Analysis and Discovery of Genes Involved in Immune Pathways from Coelomocytes of Sea Cucumber (Apostichopus japonicus) after Vibrio splendidus Challenge.

    PubMed

    Gao, Qiong; Liao, Meijie; Wang, Yingeng; Li, Bin; Zhang, Zheng; Rong, Xiaojun; Chen, Guiping; Wang, Lan

    2015-01-01

    Vibrio splendidus is identified as one of the major pathogenic factors for the skin ulceration syndrome in sea cucumber (Apostichopus japonicus), which has vastly limited the development of the sea cucumber culture industry. In order to screen the immune genes involved in the response to Vibrio splendidus challenge in sea cucumber and explore the molecular mechanism of this process, the related transcriptome and gene expression profiling of resistant and susceptible biotypes of sea cucumber with Vibrio splendidus challenge were collected for analysis. A total of 319,455,942 trimmed reads were obtained, which were assembled into 186,658 contigs. After that, 89,891 representative contigs (without isoform) were clustered. The analysis of the gene expression profiling identified 358 differentially expressed genes (DEGs) in the bacterial-resistant group, and 102 DEGs in the bacterial-susceptible group, compared with the control group. According to the reported references and annotation information from BLAST, GO and KEGG, 30 putative bacterial-resistant genes and 19 putative bacterial-susceptible genes were identified from DEGs. The qRT-PCR results were consistent with the RNA-Seq results. Furthermore, many DEGs were involved in immune signaling related pathways, such as Endocytosis, Lysosome, MAPK, Chemokine and the ERBB signaling pathway. PMID:26193268

  16. Transcriptome Analysis and Discovery of Genes Involved in Immune Pathways from Coelomocytes of Sea Cucumber (Apostichopus japonicus) after Vibrio splendidus Challenge

    PubMed Central

    Gao, Qiong; Liao, Meijie; Wang, Yingeng; Li, Bin; Zhang, Zheng; Rong, Xiaojun; Chen, Guiping; Wang, Lan

    2015-01-01

    Vibrio splendidus is identified as one of the major pathogenic factors for the skin ulceration syndrome in sea cucumber (Apostichopus japonicus), which has vastly limited the development of the sea cucumber culture industry. In order to screen the immune genes involved in Vibrio splendidus challenge in sea cucumber and explore the molecular mechanism of this process, the related transcriptome and gene expression profiling of resistant and susceptible biotypes of sea cucumber with Vibrio splendidus challenge were collected for analysis. A total of 319,455,942 trimmed reads were obtained, which were assembled into 186,658 contigs. After that, 89,891 representative contigs (without isoform) were clustered. The analysis of the gene expression profiling identified 358 differentially expressed genes (DEGs) in the bacterial-resistant group, and 102 DEGs in the bacterial-susceptible group, compared with the control group. According to the reported references and annotation information from BLAST, GO and KEGG, 30 putative bacterial-resistant genes and 19 putative bacterial-susceptible genes were identified from DEGs. The qRT-PCR results were consistent with the RNA-Seq results. Furthermore, many DEGs were involved in immune signaling related pathways, such as Endocytosis, Lysosome, MAPK, Chemokine and the ERBB signaling pathway. PMID:26193268

  17. Use of the site-specific retargeting jump-in platform cell line to support biologic drug discovery.

    PubMed

    Butler, Robin; Hornigold, David; Huang, Ling; Huntington, Catherine; London, Tim; Dillon, Janette; Tigue, Natalie J; Rossi, Alessandra; Naylor, Jacqueline; Wilkinson, Trevor

    2015-04-01

    Biologics represent a fast-growing class of therapeutics in the pharmaceutical sector. Discovery of therapeutic antibodies and characterization of peptides can necessitate high expression of the target gene requiring the generation of clonal stably transfected cell lines. Traditional challenges of stable cell line transfection include gene silencing and cell-to-cell variability. Our inability to control these can present challenges in lead isolation. Recent progress in site-specific targeting of transgene to specific genomic loci has transformed the ability to generate stably transfected mammalian cell lines. In this article, we describe how the use of the Jump-In platform (Life Technologies, Carlsbad, CA) has been applied to drug discovery projects. It can easily and rapidly generate homogeneous high-expressing cell pools with a high degree of reproducibility. Their use in cell-based screening to identify specific binders, identify binding to relevant species variants, or detect functionally relevant therapeutic antibodies is central in driving drug discovery. PMID:25534831

  18. Analysis of expressed sequence tags from Actinidia: applications of a cross species EST database for gene discovery in the areas of flavor, health, color and ripening

    PubMed Central

    Crowhurst, Ross N; Gleave, Andrew P; MacRae, Elspeth A; Ampomah-Dwamena, Charles; Atkinson, Ross G; Beuning, Lesley L; Bulley, Sean M; Chagne, David; Marsh, Ken B; Matich, Adam J; Montefiori, Mirco; Newcomb, Richard D; Schaffer, Robert J; Usadel, Björn; Allan, Andrew C; Boldingh, Helen L; Bowen, Judith H; Davy, Marcus W; Eckloff, Rheinhart; Ferguson, A Ross; Fraser, Lena G; Gera, Emma; Hellens, Roger P; Janssen, Bart J; Klages, Karin; Lo, Kim R; MacDiarmid, Robin M; Nain, Bhawana; McNeilage, Mark A; Rassam, Maysoon; Richardson, Annette C; Rikkerink, Erik HA; Ross, Gavin S; Schröder, Roswitha; Snowden, Kimberley C; Souleyre, Edwige JF; Templeton, Matt D; Walton, Eric F; Wang, Daisy; Wang, Mindy Y; Wang, Yanming Y; Wood, Marion; Wu, Rongmei; Yauk, Yar-Khing; Laing, William A

    2008-01-01

    Background: Kiwifruit (Actinidia spp.) are a relatively new, but economically important crop grown in many different parts of the world. Commercial success is driven by the development of new cultivars with novel consumer traits including flavor, appearance, healthful components and convenience. To increase our understanding of the genetic diversity and gene-based control of these key traits in Actinidia, we have produced a collection of 132,577 expressed sequence tags (ESTs). Results: The ESTs were derived mainly from four Actinidia species (A. chinensis, A. deliciosa, A. arguta and A. eriantha) and fell into 41,858 non-redundant clusters (18,070 tentative consensus sequences and 23,788 EST singletons). Analysis of flavor and fragrance-related gene families (acyltransferases and carboxylesterases) and pathways (terpenoid biosynthesis) is presented in comparison with a chemical analysis of the compounds present in Actinidia including esters, acids, alcohols and terpenes. ESTs are identified for most genes in color pathways controlling chlorophyll degradation and carotenoid biosynthesis. In the health area, data are presented on the ESTs involved in ascorbic acid and quinic acid biosynthesis showing not only that genes for many of the steps in these pathways are represented in the database, but that genes encoding some critical steps are absent. In the convenience area, genes related to different stages of fruit softening are identified. Conclusion: This large EST resource will allow researchers to undertake the tremendous challenge of understanding the molecular basis of genetic diversity in the Actinidia genus as well as provide an EST resource for comparative fruit genomics. The various bioinformatics analyses we have undertaken demonstrate the extent of coverage of ESTs for genes encoding different biochemical pathways in Actinidia. PMID:18655731

  19. Discoveries of nicotinamide riboside as a nutrient and conserved NRK genes establish a Preiss-Handler independent route to NAD+ in fungi and humans.

    PubMed

    Bieganowski, Pawel; Brenner, Charles

    2004-05-14

    NAD+ is essential for life in all organisms, both as a coenzyme for oxidoreductases and as a source of ADPribosyl groups used in various reactions, including those that retard aging in experimental systems. Nicotinic acid and nicotinamide were defined as the vitamin precursors of NAD+ in Elvehjem's classic discoveries of the 1930s. The accepted view of eukaryotic NAD+ biosynthesis, that all anabolism flows through nicotinic acid mononucleotide, was challenged experimentally and revealed that nicotinamide riboside is an unanticipated NAD+ precursor in yeast. Nicotinamide riboside kinases from yeast and humans essential for this pathway were identified and found to be highly specific for phosphorylation of nicotinamide riboside and the cancer drug tiazofurin. Nicotinamide riboside was discovered as a nutrient in milk, suggesting that nicotinamide riboside is a useful compound for elevation of NAD+ levels in humans. PMID:15137942

  20. Discovery and Characterization of a Silent Gene Cluster that Produces Azaphilones from Aspergillus niger ATCC 1015 Reveal a Hydroxylation-Mediated Pyran-Ring Formation

    PubMed Central

    Zabala, Angelica O.; Xu, Wei; Chooi, Yit-Heng; Tang, Yi

    2012-01-01

    SUMMARY: Azaphilones are a class of fungal metabolites characterized by a highly oxygenated pyrano-quinone bicyclic core and exhibit a broad range of bioactivities. While widespread among various fungi, their biosynthesis has not been thoroughly elucidated. By activation of a silent (aza) gene cluster in Aspergillus niger ATCC 1015, we have discovered six new azaphilone compounds, azanigerones A-F (1, 3-7). Transcriptional analysis and deletion of a key polyketide synthase (PKS) gene further confirmed the involvement of the aza gene cluster. The biosynthetic pathway was shown to involve the convergent actions of a highly reducing and a non-reducing PKS. Most significantly, in vitro reaction of a key flavin-dependent monooxygenase encoded in the cluster with an early benzaldehyde intermediate revealed its roles in hydroxylation and pyran-ring formation to afford the characteristic bicyclic core shared by azaphilones. PMID:22921072

  1. Whole-exome Sequencing and an iPSC-Derived Cardiomyocyte Model Provides a Powerful Platform for Gene Discovery in Left Ventricular Hypertrophy

    PubMed Central

    Zhi, D.; Irvin, M. R.; Gu, C. C.; Stoddard, A. J.; Lorier, R.; Matter, A.; Rao, D. C.; Srinivasasainagendra, V.; Tiwari, H. K.; Turner, A.; Broeckel, U.; Arnett, D. K.

    2012-01-01

    Rationale: Left ventricular hypertrophy (LVH) is a heritable predictor of cardiovascular disease, particularly in blacks. Objective: Determine the feasibility of combining evidence from two distinct but complementary experimental approaches to identify novel genetic predictors of increased LV mass. Methods: Whole-exome sequencing (WES) was conducted in seven African-American sibling trios ascertained on high average familial LV mass indexed to height (LVMHT) using Illumina HiSeq technology. Identified missense or nonsense (MS/NS) mutations were examined for association with LVMHT using linear mixed models adjusted for age, sex, body weight, and familial relationship. To functionally assess WES findings, human induced pluripotent stem cell-derived cardiomyocytes (induced pluripotent stem cell-CM) were stimulated to induce hypertrophy; mRNA sequencing (RNA-seq) was used to determine gene expression differences associated with hypertrophy onset. Statistically significant findings under both experimental approaches identified LVH candidate genes. Candidate genes were further prioritized by seven supportive criteria that included additional association tests (two criteria), regional linkage evidence in the larger HyperGEN cohort (one criterion), and publicly available gene and variant based annotations (four criteria). Results: WES reads covered 91% of the target capture region (of size 37.2 MB) with an average coverage of 65×. WES identified 31,426 MS/NS mutations among the 21 individuals. A total of 295 MS/NS variants in 265 genes were associated with LVMHT with q-value <0.25. Of the 265 WES genes, 44 were differentially expressed (P < 0.05) in hypertrophied cells. Among the 44 candidate genes identified, 5, including HLA-B, HTT, MTSS1, SLC5A12, and THBS1, met 3 of 7 supporting criteria. THBS1 encodes an adhesive glycoprotein that promotes matrix preservation in pressure-overload LVH. 
THBS1 gene expression was 34% higher in hypertrophied cells (P = 0.0003) and a predicted conserved and damaging NS variant in exon 13 (A2099G) was significantly associated with LVMHT (P = 4 × 10⁻⁶). Conclusion: Combining evidence from cutting-edge genetic and cellular experiments can enable identification of novel LVH risk loci. PMID:22654895
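    The abstract filters variants at a q-value below 0.25 but does not say how the q-values were computed. As an illustration only, the sketch below uses the Benjamini-Hochberg adjustment, one common way of turning p-values into false-discovery-rate-adjusted values (Storey's q-value is another). The p-values are made up for the example.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical association p-values (not from the study).
pvals = [0.001, 0.02, 0.03, 0.3, 0.8]
qvals = bh_adjust(pvals)
selected = [p for p, q in zip(pvals, qvals) if q < 0.25]
print(qvals)
print(selected)
```

    With these inputs the three smallest p-values pass the 0.25 cut-off. A threshold this permissive accepts that up to roughly a quarter of the selected variants may be false positives, which suits a screening step followed by independent functional evidence.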

  2. Discovery and characterization of the first non-coding RNA that regulates gene expression, micF RNA: A historical perspective

    PubMed Central

    Delihas, Nicholas

    2015-01-01

    The first evidence that RNA can function as a regulator of gene expression came from experiments with prokaryotes in the 1980s. It was shown that Escherichia coli micF is an independent gene, has its own promoter, and encodes a small non-coding RNA that base pairs with and inhibits translation of a target messenger RNA in response to environmental stress conditions. The micF RNA was isolated, sequenced and shown to be a primary transcript. In vitro experiments showed binding to the target ompF mRNA. Secondary structure probing revealed an imperfect micF RNA/ompF RNA duplex interaction and the presence of a non-canonical base pair. Several transcription factors, including OmpR, regulate micF transcription in response to environmental factors. micF has also been found in other bacterial species. More recently, Gerhart Wagner and Jörg Vogel showed pleiotropic effects and found that micF inhibits expression of multiple target mRNAs; importantly, one is the global regulatory gene lrp. In addition, micF RNA was found to interact with its targets in different ways; it either inhibits ribosome binding or induces degradation of the message. Thus the concept and initial experimental evidence that RNA can regulate gene expression was born with prokaryotes. PMID:26629310

  3. Targeted discovery of single-nucleotide polymorphisms in an unmarked wheat chromosomal region containing the Hessian fly resistance gene H33

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The highly effective Hessian fly-resistance gene, H33, was introgressed from durum wheat into common wheat and genetically mapped to chromosome 3AS, in previous research. However, H33 located to a region that is well-known to be devoid of molecular markers, with the closest flanking simple sequence ...

  4. Code-assisted discovery of TAL effector targets in bacterial leaf streak of rice reveals contrast with bacterial blight and a novel susceptibility gene

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Transcription activator-like (TAL) effectors found in Xanthomonas spp. promote bacterial growth and plant susceptibility by binding specific DNA sequences or, effector-binding elements (EBEs), and inducing host gene expression. In this study, we have found substantially different transcriptional pro...

  5. Discovery of candidate disease genes in ENU-induced mouse mutants by large-scale sequencing, including a splice-site mutation in nucleoredoxin

    Technology Transfer Automated Retrieval System (TEKTRAN)

    An accurate and precisely annotated genome assembly is a fundamental requirement for functional genomic analysis. Here, the complete DNA sequence and gene annotation of mouse Chromosome 11 was used to test the efficacy of large-scale sequencing for mutation identification. We re-sequenced the 14,000...

  6. Transcriptome sequencing and annotation of the microalgae Dunaliella tertiolecta: Pathway description and gene discovery for production of next-generation biofuels

    PubMed Central

    2011-01-01

    Background: Biodiesel or ethanol derived from lipids or starch produced by microalgae may overcome many of the sustainability challenges previously ascribed to petroleum-based fuels and first generation plant-based biofuels. The paucity of microalgae genome sequences, however, limits gene-based biofuel feedstock optimization studies. Here we describe the sequencing and de novo transcriptome assembly for the non-model microalgae species, Dunaliella tertiolecta, and identify pathways and genes of importance related to biofuel production. Results: Next generation DNA pyrosequencing technology applied to D. tertiolecta transcripts produced 1,363,336 high quality reads with an average length of 400 bases. Following quality and size trimming, ~ 45% of the high quality reads were assembled into 33,307 isotigs with a 31-fold coverage and 376,482 singletons. Assembled sequences and singletons were subjected to BLAST similarity searches and annotated with Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology (KO) identifiers. These analyses identified the majority of lipid and starch biosynthesis and catabolism pathways in D. tertiolecta. Conclusions: The construction of metabolic pathways involved in the biosynthesis and catabolism of fatty acids, triacylglycerols, and starch in D. tertiolecta as well as the assembled transcriptome provide a foundation for the molecular genetics and functional genomics required to direct metabolic engineering efforts that seek to enhance the quantity and character of microalgae-based biofuel feedstock. PMID:21401935

  7. Discovery of a Dicer-Independent, Cell-Type Dependent Alternate Targeting Sequence Generator: Implications in Gene Silencing & Pooled RNAi Screens

    PubMed Central

    Bhinder, Bhavneet; Shum, David; Li, Mu; Ibáñez, Glorymar; Vlassov, Alexander V.; Magdaleno, Susan; Djaballah, Hakim

    2014-01-01

    There is an acceptance that plasmid-based delivery of interfering RNA always generates the intended targeting sequences in cells, making it as specific as its synthetic counterpart. However, recent studies have reported on cellular inefficiencies of the former, especially in light of emerging gene discordance at inter-screen level and across formats. Focusing primarily on the TRC plasmid-based shRNA hairpins, we reasoned that alleged specificities were perhaps compromised due to altered processing, resulting in a multitude of random interfering sequences. For this purpose, we opted to study the processing of hairpin TRCN#40273 targeting CTTN, which showed activity in a miRNA-21 gain-of-function shRNA screen but was inactive when used as an siRNA duplex. Using a previously described walk-through method, we identified 36 theoretical cleavage variants resulting in 78 potential siRNA duplexes targeting 53 genes. We synthesized and tested all of them. Surprisingly, six duplexes targeting ASH1L, DROSHA, GNG7, PRKCH, THEM4, and WDR92 scored as active. qRT-PCR analysis on hairpin-transduced reporter cells confirmed knockdown of all six genes, besides CTTN, revealing a surprising 7-gene signature perturbation by this one single hairpin. We expanded our qRT-PCR studies to 26 additional cell lines and observed unique knockdown profiles associated with each cell line tested, even for those lacking a functional DICER1 gene, suggesting no obvious dependence on dicer for shRNA hairpin processing, contrary to published models. Taken together, we report on a novel dicer independent, cell-type dependent mechanism for non-specific RNAi gene silencing we coin Alternate Targeting Sequence Generator (ATSG). 
In summary, ATSG adds another dimension to the already complex interpretation of RNAi screening data, and provides for the first time strong evidence in support of arrayed screening, and questions the scientific merits of performing pooled RNAi screens, where deconvolution of up to genome-scale pools is indispensable for target identification. PMID:24987961

  8. Discovery of a Novel Immune Gene Signature with Profound Prognostic Value in Colorectal Cancer: A Model of Cooperativity Disorientation Created in the Process from Development to Cancer

    PubMed Central

    An, Ning; Shi, Xiaoyu; Zhang, Yueming; Lv, Ning; Feng, Lin; Di, Xuebing; Han, Naijun; Wang, Guiqi

    2015-01-01

    Immune response-related genes play a major role in colorectal carcinogenesis by mediating inflammation or immune-surveillance evasion. Although remarkable progress has been made to investigate the underlying mechanism, the understanding of the complicated carcinogenesis process was enormously hindered by large-scale tumor heterogeneity. Development and carcinogenesis share striking similarities in their cellular behavior and underlying molecular mechanisms. The association between embryonic development and carcinogenesis makes embryonic development a viable reference model for studying cancer thereby circumventing the potentially misleading complexity of tumor heterogeneity. Here we proposed that the immune genes, responsible for intra-immune cooperativity disorientation (defined in this study as disruption of developmental expression correlation patterns during carcinogenesis), probably contain untapped prognostic resource of colorectal cancer. In this study, we determined the mRNA expression profile of 137 human biopsy samples, including samples from different stages of human colonic development, colorectal precancerous progression and colorectal cancer samples, among which 60 were also used to generate miRNA expression profile. We originally established Spearman correlation transition model to quantify the cooperativity disorientation associated with the transition from normal to precancerous to cancer tissue, in conjunction with miRNA-mRNA regulatory network and machine learning algorithm to identify genes with prognostic value. Finally, a 12-gene signature was extracted, whose prognostic value was evaluated using Kaplan–Meier survival analysis in five independent datasets. 
Using the log-rank test, the 12-gene signature was closely related to overall survival in four datasets (GSE17536, n = 177, p = 0.0054; GSE17537, n = 55, p = 0.0039; GSE39582, n = 562, p = 0.13; GSE39084, n = 70, p = 0.11), and significantly associated with disease-free survival in four datasets (GSE17536, n = 177, p = 0.0018; GSE17537, n = 55, p = 0.016; GSE39582, n = 557, p = 4.4e-05; GSE14333, n = 226, p = 0.032). Cox regression analysis confirmed that the 12-gene signature was an independent factor in predicting colorectal cancer patient’s overall survival (hazard ratio: 1.759; 95% confidence interval: 1.126–2.746; p = 0.013), as well as disease-free survival (hazard ratio: 2.116; 95% confidence interval: 1.324–3.380; p = 0.002). PMID:26325386
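    For readers unfamiliar with the Kaplan–Meier analysis mentioned above, the product-limit estimator can be sketched in a few lines. This is a generic textbook illustration, not the authors' pipeline; the follow-up times and event flags are invented, and the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient
    events: True if the event (e.g. death) was observed, False if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Patients censored at t still count as at risk at t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Hypothetical follow-up data in months (not from the study).
times = [2, 3, 3, 5, 8, 8]
events = [True, True, False, True, False, True]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 4))
```

    Comparing two such curves (for example, high-risk versus low-risk signature groups) is what the log-rank test does. An established survival-analysis library would normally be used for that step rather than hand-written code.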

  9. Developing Each Child's Potential: The Discovery Program.

    ERIC Educational Resources Information Center

    Schulthes, Diane; Wolosky, Jamie

    1998-01-01

    Describes a middle school program that focuses on creating educational experiences that foster life-long learning for all students. The Discovery Program provides a range of differentiated teaching/learning activities, including research competitions, language-arts studies, social-action projects, art exhibitions, mathematics projects, and…

  10. Discovery of toxin-encoding genes from the false viper Macropisthodon rudis, a rear-fanged snake, by transcriptome analysis of venom gland.

    PubMed

    Zhang, Zhixiao; Zhang, Xi; Hu, Tingsong; Zhou, Weiguo; Cui, Qinghua; Tian, Jing; Zheng, Ying; Fan, Quanshui

    2015-11-01

    Although rear-fanged snakes are often considered as non-threatening to humans, some species are lethal or medically hazardous. The toxin components and bioactivities of front-fanged snakes have been extensively studied; however, only limited research has explored the venoms of rear-fanged snakes. The false viper, Macropisthodon rudis, is widespread in southern China, but little is known about the toxins that this snake produces. Here, we analyzed the transcriptome of the venom gland of M. rudis using high-throughput sequencing with an Illumina HiSeq 2000. The raw data were assembled and annotated using public databases. The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and gene ontology (GO) were analyzed. Using sequence comparisons, snake venom metalloproteinases (SVMPs) and a phosphodiesterase (PDE) were discovered in the venom gland of M. rudis. PMID:26403866

  11. Insights into Hepatopancreatic Functions for Nutrition Metabolism and Ovarian Development in the Crab Portunus trituberculatus: Gene Discovery in the Comparative Transcriptome of Different Hepatopancreas Stages

    PubMed Central

    Liu, Zhijun; Zheng, Huajun; Cheng, Yongxu

    2014-01-01

    The crustacean hepatopancreas has different functions including absorption, storage of nutrients and vitellogenesis during growth, and ovarian development. However, genetic information on the biological functions of the crustacean hepatopancreas during such processes is limited. The swimming crab, Portunus trituberculatus, is a commercially important species for both aquaculture and fisheries in the Asia-Pacific region. This study compared the transcriptome in the hepatopancreas of female P. trituberculatus during the growth and ovarian maturation stages by 454 high-throughput pyrosequencing and bioinformatics. The goal was to discover genes in the hepatopancreas involved in food digestion, nutrition metabolism and ovarian development, and to identify patterns of gene expression during growth and ovarian maturation. Our transcriptome produced 303,450 reads with an average length of 351 bp, and the high quality reads were assembled into 21,635 contigs and 31,844 singlets. Based on BLASTP searches of the deduced protein sequences, there were 7,762 contigs and 4,098 singlets with functional annotation. Further analysis revealed 33,427 unigenes with ORFs, including 17,388 contigs and 16,039 singlets in the hepatopancreas, while only 7,954 unigenes (5,691 contigs and 2,263 singlets) with the predicted protein sequences were annotated with biological functions. The deduced protein sequences were assigned to 3,734 GO terms, 25 COG categories and 294 specific pathways. Furthermore, there were 14, 534, and 22 identified unigenes involved in food digestion, nutrition metabolism and ovarian development, respectively. 212 differentially expressed genes (DEGs) were found between the growth and endogenous stage of the hepatopancreas, while there were 382 DEGs between the endogenous and exogenous stage hepatopancreas. 
Our results not only enhance the understanding of crustacean hepatopancreatic functions during growth and ovarian development, but also represent a basis for further research on new genes and functional genomics of P. trituberculatus or closely related species. PMID:24454766

  12. Discovery of a brain promoter from the human transferrin gene and its utilization for development of transgenic mice that express human apolipoprotein E alleles.

    PubMed Central

    Bowman, B H; Jansen, L; Yang, F; Adrian, G S; Zhao, M; Atherton, S S; Buchanan, J M; Greene, R; Walter, C; Herbert, D C

    1995-01-01

    Transgenic mice carrying heterologous genes directed by a 670-bp segment of the regulatory sequence from the human transferrin (TF) gene demonstrated high expression in brain. Mice carrying the chimeric 0.67kbTF-CAT gene expressed TF-CAT in neurons and glial cells of the nucleus basalis, the cerebrum, corpus callosum, cerebellum, and hippocampus. In brains from two independent TF-CAT transgenic founder lines, copy number of TF-CAT mRNA exceeded the number of mRNA transcripts encoding either mouse endogenous transferrin or mouse endogenous amyloid precursor protein. In two transgenic founder lines, the chloramphenicol acetyltransferase (CAT) protein synthesized from the TF-CAT mRNA was estimated to be 0.10-0.15% of the total soluble proteins of the brain. High expression observed in brain indicates that the 0.67kbTF promoter is a promising director of brain expression of heterologous genes. Therefore, the promoter has been used to express the three common human apolipoprotein E (apoE) alleles in transgenic mouse brains. The apoE alleles have been implicated in the expression of Alzheimer disease, and the human apoE isoforms are reported to interact with different affinities to the brain beta-amyloid and tau protein in vitro. Results of this study demonstrate high expression and production of human apoE proteins in transgenic mouse brains. The model may be used to characterize the interaction of human apoE isoforms with other brain proteins and provide information helpful in designing therapeutic strategies for Alzheimer disease. PMID:8618855

  13. PTF SN discovery report, July 2012

    NASA Astrophysics Data System (ADS)

    Gal-Yam, Avishay; Ben-Ami, Sagi; Arcavi, I.; Yaron, O.; Nugent, Peter; Sesar, B.; Cao, Y.; Silverman, J.; Clubb, K.; Filippenko, A. V.; Cenko, S. B.; Parrent, J.; Maguire, K.; Sullivan, M.

    2012-08-01

    The PTF (ATEL #1964, #3253; http://www.astro.caltech.edu/ptf/ ; Law et al. 2009, PASP, 121, 1395; Rau et al. 2009, PASP, 121, 1334) reports the discovery of 14 new supernovae. PTF discoveries are made by autonomous PTF software (Bloom et al. 2011, http://adsabs.harvard.edu/abs/2011arXiv1106.5491B ), as well as by the Galaxy Zoo Supernova Project (Smith et al. 2011, MNRAS, 412, 1309; http://supernova.galaxyzoo.org ).

  14. PTF SN discovery report, August 2012

    NASA Astrophysics Data System (ADS)

    Arcavi, Iair; Gal-Yam, A.; Ben-Ami, S.; Yaron, O.; Nugent, P.; Horesh, A.; Cao, Y.; Bellm, E.; Fynbo, J.; Wiis, J.; Olesen, J.; Engedal, L.; Larsen, A.; Kasliwal, M.; Pan, Y.-C.; Graham, M.; Parrent, J.; Quimby, R.; PTF Team

    2012-08-01

    The PTF (ATEL #1964, #3253; http://www.astro.caltech.edu/ptf; Law et al. 2009, PASP, 121, 1395; Rau et al. 2009, PASP, 121, 1334) reports the discovery of 12 new supernovae. PTF discoveries are made by autonomous PTF software (Bloom et al. 2011, http://adsabs.harvard.edu/abs/2011arXiv1106.5491B ), as well as by the Galaxy Zoo Supernova Project (Smith et al. 2011, MNRAS, 412, 1309; http://supernova.galaxyzoo.org ).

  15. PTF SN discovery report, July 2012

    NASA Astrophysics Data System (ADS)

    Gal-Yam, Avishay; Ben-Ami, Sagi; Yaron, O.; Arcavi, I.; Nugent, Peter; Levitan, D.; Perley, D.; Kulkarni, S. R.; Sesar, B.; Cao, Y.; Bellm, E.; Barlow, T.; Silverman, J.; Clubb, K.; Miller, A.; Fox, O.; Pan, Y.-C.; Maguire, K.; Sullivan, M.; Walker, E.; Kasliwal, M.; Graham, M.; Parrent, J.

    2012-07-01

    The PTF (ATEL #1964, #3253; http://www.astro.caltech.edu/ptf/ ; Law et al. 2009, PASP, 121, 1395; Rau et al. 2009, PASP, 121, 1334) reports the discovery of 20 new supernovae. PTF discoveries are made by autonomous PTF software (Bloom et al. 2011, http://adsabs.harvard.edu/abs/2011arXiv1106.5491B ), as well as by the Galaxy Zoo Supernova Project (Smith et al. 2011, MNRAS, 412, 1309; http://supernova.galaxyzoo.org ).

  16. PTF SN discovery report, October 9, 2012

    NASA Astrophysics Data System (ADS)

    Gal-Yam, A.; Nugent, P.; Cao, Y.; Levitan, D.; Hallinan, G.; Kyne, G.; Silverman, J.; Clubb, K.; Miller, A.; Fox, O.; Suzuki, N.; Quimby, R.

    2012-10-01

    The PTF (ATEL #1964, #3253; http://www.astro.caltech.edu/ptf; Law et al. 2009, PASP, 121, 1395; Rau et al. 2009, PASP, 121, 1334) reports the discovery of 9 new supernovae. PTF discoveries are made by autonomous PTF software (Bloom et al. 2011, http://adsabs.harvard.edu/abs/2011arXiv1106.5491B ), as well as by the Galaxy Zoo Supernova Project (Smith et al. 2011, MNRAS, 412, 1309; http://supernova.galaxyzoo.org ).

  17. PTF SN discovery report, April 2012

    NASA Astrophysics Data System (ADS)

    Gal-Yam, Avishay; Arcavi, I.; Ben-Ami, S.; Yaron, O.; Nugent, P.; Levitan, D.; Simonian, G.; Sesar, B.; Cao, Y.; Horesh, A.; Bellm, E.; Silverman, J.; Miller, A.; Cenko, S. B.; Clubb, K. I.; Filippenko, A. V.; Shivvers, I.; Kasliwal, M.; Parrent, J.; Maguire, K.; Pan, Y.-C.

    2012-05-01

    The PTF (ATEL #1964, #3253; http://www.astro.caltech.edu/ptf/; Law et al. 2009, PASP, 121, 1395; Rau et al. 2009, PASP, 121, 1334) reports the discovery of 19 new supernovae. PTF discoveries are made by autonomous PTF software (Bloom et al. 2011, http://adsabs.harvard.edu/abs/2011arXiv1106.5491B ), as well as by the Galaxy Zoo Supernova Project (Smith et al. 2011, MNRAS, 412, 1309; http://supernova.galaxyzoo.org ).

  18. PTF SN discovery report, March 2012

    NASA Astrophysics Data System (ADS)

    Gal-Yam, Avishay; Arcavi, I.; Ben-Ami, S.; Yaron, O.; Nugent, P.; Levitan, D.; Simonian, G.; Sesar, B.; Cao, Y.; Horesh, A.; Bellm, E.; Silverman, J.; Miller, A.; Cenko, S. B.; Clubb, K. I.; Filippenko, A. V.; Shivvers, I.; Kasliwal, M.; Parrent, J.; Maguire, K.; Pan, Y.-C.

    2012-05-01

    The PTF (ATEL #1964, #3253; http://www.astro.caltech.edu/ptf/; Law et al. 2009, PASP, 121, 1395; Rau et al. 2009, PASP, 121, 1334) reports the discovery of 26 new supernovae. PTF discoveries are made by autonomous PTF software (Bloom et al. 2011, http://adsabs.harvard.edu/abs/2011arXiv1106.5491B ), as well as by the Galaxy Zoo Supernova Project (Smith et al. 2011, MNRAS, 412, 1309; http://supernova.galaxyzoo.org ).

  19. PTF SN discovery report, September 8, 2012

    NASA Astrophysics Data System (ADS)

    Gal-Yam, A.; Nugent, P.; Walker, E.; Cenko, S. B.; Fox, O.

    2012-09-01

    The PTF (ATEL #1964, #3253; http://www.astro.caltech.edu/ptf ; Law et al. 2009, PASP, 121, 1395; Rau et al. 2009, PASP, 121, 1334) reports the discovery of 8 new supernovae. PTF discoveries are made by autonomous PTF software (Bloom et al. 2011, http://adsabs.harvard.edu/abs/2011arXiv1106.5491B ), as well as by the Galaxy Zoo Supernova Project (Smith et al. 2011, MNRAS, 412, 1309; http://supernova.galaxyzoo.org ).

  20. INTEGRATING COMPUTATIONAL PROTEIN FUNCTION PREDICTION INTO DRUG DISCOVERY INITIATIVES

    PubMed Central

    Grant, Marianne A.

    2014-01-01

    Pharmaceutical researchers must evaluate vast numbers of protein sequences and formulate innovative strategies for identifying valid targets and discovering leads against them as a way of accelerating drug discovery. The ever increasing number and diversity of novel protein sequences identified by genomic sequencing projects and the success of worldwide structural genomics initiatives have spurred great interest and impetus in the development of methods for accurate, computationally empowered protein function prediction and active site identification. Previously, in the absence of direct experimental evidence, homology-based protein function annotation remained the gold-standard for in silico analysis and prediction of protein function. However, with the continued exponential expansion of sequence databases, this approach is not always applicable, as fewer query protein sequences demonstrate significant homology to protein gene products of known function. As a result, several non-homology based methods for protein function prediction that are based on sequence features, structure, evolution, biochemical and genetic knowledge have emerged. Herein, we review current bioinformatic programs and approaches for protein function prediction/annotation and discuss their integration into drug discovery initiatives. The development of such methods to annotate protein functional sites and their application to large protein functional families is crucial to successfully utilizing the vast amounts of genomic sequence information available to drug discovery and development processes. PMID:25530654

  1. The K2-ESPRINT Project I: Discovery of the Disintegrating Rocky Planet K2-22b with a Cometary Head and Leading Tail

    NASA Astrophysics Data System (ADS)

    Sanchis-Ojeda, R.; Rappaport, S.; Pallè, E.; Delrez, L.; DeVore, J.; Gandolfi, D.; Fukui, A.; Ribas, I.; Stassun, K. G.; Albrecht, S.; Dai, F.; Gaidos, E.; Gillon, M.; Hirano, T.; Holman, M.; Howard, A. W.; Isaacson, H.; Jehin, E.; Kuzuhara, M.; Mann, A. W.; Marcy, G. W.; Miles-Páez, P. A.; Montañés-Rodríguez, P.; Murgas, F.; Narita, N.; Nowak, G.; Onitsuka, M.; Paegert, M.; Van Eylen, V.; Winn, J. N.; Yu, L.

    2015-10-01

    We present the discovery of a transiting exoplanet candidate in the K2 Field-1 with an orbital period of 9.1457 hr: K2-22b. The highly variable transit depths, ranging from ∼0% to 1.3%, are suggestive of a planet that is disintegrating via the emission of dusty effluents. We characterize the host star as an M-dwarf with Teff ≃ 3800 K. We have obtained ground-based transit measurements with several 1-m class telescopes and with the GTC. These observations (1) improve the transit ephemeris; (2) confirm the variable nature of the transit depths; (3) indicate variations in the transit shapes; and (4) demonstrate clearly that at least on one occasion the transit depths were significantly wavelength dependent. The latter three effects tend to indicate extinction of starlight by dust rather than by any combination of solid bodies. The K2 observations yield a folded light curve with lower time resolution but with substantially better statistical precision compared with the ground-based observations. We detect a significant “bump” just after the transit egress, and a less significant bump just prior to transit ingress. We interpret these bumps in the context of a planet that is not only likely streaming a dust tail behind it, but also has a more prominent leading dust trail that precedes it. This effect is modeled in terms of dust grains that can escape to beyond the planet's Hill sphere and effectively undergo “Roche lobe overflow,” even though the planet's surface is likely underfilling its Roche lobe by a factor of 2.

  2. A statistical framework for genome-wide discovery of biomarker splice variations with GeneChip Human Exon 1.0 ST Arrays.

    PubMed

    Yoshida, Ryo; Numata, Kazuyuki; Imoto, Seiya; Nagasaki, Masao; Doi, Atsushi; Ueno, Kazuko; Miyano, Satoru

    2006-01-01

    Alternative splicing is an important regulatory mechanism that generates multiple mRNA transcripts, which are translated into functionally diverse proteins. According to current studies, aberrant transcripts due to splicing mutations are known to cause 15% of genetic diseases. Therefore, understanding the regulatory mechanisms of alternative splicing is essential for identifying potential biomarkers for several types of human diseases. Most recently, the advent of the GeneChip Human Exon 1.0 ST Array has enabled us to measure genome-wide expression profiles of over one million exons. With this new microarray platform, analysis of functional gene expression can be extended to detect not only differentially expressed genes, but also a set of specific splicing events that are differentially observed between one or more experimental conditions, e.g. tumor or normal control cells. In this study, we address the statistical problems of identifying differentially observed splicing variations from exon expression profiles. The proposed method is organized according to the following process: (1) Data preprocessing for removing systematic biases from the probe intensities. (2) Whole transcript analysis with the analysis of variance (ANOVA) to identify a set of loci that cause alternative splicing related to a certain disease. We test the proposed statistical approach on exon expression profiles of colorectal carcinoma. The applicability is verified and discussed in relation to the existing biological knowledge. This paper intends to highlight the potential role of statistical analysis of all exon microarray data. Our work is an important first step toward the development of more advanced statistical technology. Supplementary information and materials are available from http://bonsai.ims.u-tokyo.ac.jp/~yoshidar/IBSB2006_ExonArray.htm. PMID:17503359

  3. Identification of Genome-Wide Variants and Discovery of Variants Associated with Brassica rapa Clubroot Resistance Gene Rcr1 through Bulked Segregant RNA Sequencing

    PubMed Central

    Yu, Fengqun; Zhang, Xingguo; Huang, Zhen; Chu, Mingguang; Song, Tao; Falk, Kevin C.; Deora, Abhinandan; Chen, Qilin; Zhang, Yan; McGregor, Linda; Gossen, Bruce D.; McDonald, Mary Ruth; Peng, Gary

    2016-01-01

    Clubroot, caused by Plasmodiophora brassicae, is an important disease on Brassica species worldwide. A clubroot resistance gene, Rcr1, with efficacy against pathotype 3 of P. brassicae, was previously mapped to chromosome A03 of B. rapa in pak choy cultivar “Flower Nabana”. In the current study, resistance to pathotypes 2, 5 and 6 was shown to be associated with the Rcr1 region on chromosome A03. Bulked segregant RNA sequencing was performed and short read sequences were assembled into 10 chromosomes of the B. rapa reference genome v1.5. For the resistant (R) bulks, a total of 351.8 million (M) sequences, 30,836.5 million bases (Mb) in length, produced 120-fold coverage of the reference genome. For the susceptible (S) bulks, 322.9 M sequences, 28,216.6 Mb in length, produced 109-fold coverage. In total, 776.2 K single nucleotide polymorphisms (SNPs) and 122.2 K insertions/deletions (InDels) in R bulks and 762.8 K SNPs and 118.7 K InDels in S bulks were identified; each chromosome had about 87% SNPs and 13% InDels, with 78% monomorphic and 22% polymorphic variants between the R and S bulks. Polymorphic variants on each chromosome were usually below 23%, but made up 34% of the variants on chromosome A03. There were 35 genes annotated in the Rcr1 target region and variants were identified in 21 genes. The numbers of polymorphic variants differed significantly among the genes. Four of them encode Toll-Interleukin-1 receptor / nucleotide-binding site / leucine-rich-repeat proteins; Bra019409 and Bra019410 harbored the higher numbers of polymorphic variants, which indicates that they are more likely candidates of Rcr1. Fourteen SNP markers in the target region were genotyped using the Kompetitive Allele Specific PCR method and were confirmed to associate with Rcr1. Selected SNP markers were analyzed with 26 recombinants obtained from a segregating population consisting of 1587 plants, indicating that they were completely linked to Rcr1. Nine SNP markers were used for marker-assisted introgression of Rcr1 into B. napus canola from B. rapa, with 100% accuracy in this study. PMID:27078023

  4. Identification of Genome-Wide Variants and Discovery of Variants Associated with Brassica rapa Clubroot Resistance Gene Rcr1 through Bulked Segregant RNA Sequencing.

    PubMed

    Yu, Fengqun; Zhang, Xingguo; Huang, Zhen; Chu, Mingguang; Song, Tao; Falk, Kevin C; Deora, Abhinandan; Chen, Qilin; Zhang, Yan; McGregor, Linda; Gossen, Bruce D; McDonald, Mary Ruth; Peng, Gary

    2016-01-01

    Clubroot, caused by Plasmodiophora brassicae, is an important disease on Brassica species worldwide. A clubroot resistance gene, Rcr1, with efficacy against pathotype 3 of P. brassicae, was previously mapped to chromosome A03 of B. rapa in pak choy cultivar "Flower Nabana". In the current study, resistance to pathotypes 2, 5 and 6 was shown to be associated with the Rcr1 region on chromosome A03. Bulked segregant RNA sequencing was performed and short read sequences were assembled into 10 chromosomes of the B. rapa reference genome v1.5. For the resistant (R) bulks, a total of 351.8 million (M) sequences, 30,836.5 million bases (Mb) in length, produced 120-fold coverage of the reference genome. For the susceptible (S) bulks, 322.9 M sequences, 28,216.6 Mb in length, produced 109-fold coverage. In total, 776.2 K single nucleotide polymorphisms (SNPs) and 122.2 K insertions/deletions (InDels) in R bulks and 762.8 K SNPs and 118.7 K InDels in S bulks were identified; each chromosome had about 87% SNPs and 13% InDels, with 78% monomorphic and 22% polymorphic variants between the R and S bulks. Polymorphic variants on each chromosome were usually below 23%, but made up 34% of the variants on chromosome A03. There were 35 genes annotated in the Rcr1 target region and variants were identified in 21 genes. The numbers of polymorphic variants differed significantly among the genes. Four of them encode Toll-Interleukin-1 receptor / nucleotide-binding site / leucine-rich-repeat proteins; Bra019409 and Bra019410 harbored the higher numbers of polymorphic variants, which indicates that they are more likely candidates of Rcr1. Fourteen SNP markers in the target region were genotyped using the Kompetitive Allele Specific PCR method and were confirmed to associate with Rcr1. Selected SNP markers were analyzed with 26 recombinants obtained from a segregating population consisting of 1587 plants, indicating that they were completely linked to Rcr1. Nine SNP markers were used for marker-assisted introgression of Rcr1 into B. napus canola from B. rapa, with 100% accuracy in this study. PMID:27078023

  5. Role of ChIP-seq in the discovery of transcription factor binding sites, differential gene regulation mechanism, epigenetic marks and beyond

    PubMed Central

    Mundade, Rasika; Ozer, Hatice Gulcin; Wei, Han; Prabhu, Lakshmi; Lu, Tao

    2014-01-01

    Many biologically significant processes, such as cell differentiation and cell cycle progression, gene transcription and DNA replication, chromosome stability and epigenetic silencing, depend on the crucial interactions between cellular proteins and DNA. Chromatin immunoprecipitation (ChIP) is an important experimental technique for studying interactions between specific proteins and DNA in the cell and determining their localization on a specific genomic locus. In recent years, the combination of ChIP with second generation DNA-sequencing technology (ChIP-seq) has allowed precise genomic functional assays. This review addresses the important applications of ChIP-seq with an emphasis on its role in genome-wide mapping of transcription factor binding sites, the revelation of underlying molecular mechanisms of differential gene regulation that are governed by specific transcription factors, and the identification of epigenetic marks. Furthermore, we also describe the ChIP-seq data analysis workflow and a perspective for the exciting potential advancement of ChIP-seq technology in the future. PMID:25486472

  6. Role of ChIP-seq in the discovery of transcription factor binding sites, differential gene regulation mechanism, epigenetic marks and beyond.

    PubMed

    Mundade, Rasika; Ozer, Hatice Gulcin; Wei, Han; Prabhu, Lakshmi; Lu, Tao

    2014-01-01

    Many biologically significant processes, such as cell differentiation and cell cycle progression, gene transcription and DNA replication, chromosome stability and epigenetic silencing, depend on the crucial interactions between cellular proteins and DNA. Chromatin immunoprecipitation (ChIP) is an important experimental technique for studying interactions between specific proteins and DNA in the cell and determining their localization on a specific genomic locus. In recent years, the combination of ChIP with second generation DNA-sequencing technology (ChIP-seq) has allowed precise genomic functional assays. This review addresses the important applications of ChIP-seq with an emphasis on its role in genome-wide mapping of transcription factor binding sites, the revelation of underlying molecular mechanisms of differential gene regulation that are governed by specific transcription factors, and the identification of epigenetic marks. Furthermore, we also describe the ChIP-seq data analysis workflow and a perspective for the exciting potential advancement of ChIP-seq technology in the future. PMID:25486472

  7. Transcriptome Analysis of Androgenic Gland for Discovery of Novel Genes from the Oriental River Prawn, Macrobrachium nipponense, Using Illumina Hiseq 2000

    PubMed Central

    Jin, Shubo; Fu, Hongtuo; Zhou, Qiao; Sun, Shengming; Jiang, Sufei; Xiong, Yiwei; Gong, Yongsheng; Qiao, Hui; Zhang, Wenyi

    2013-01-01

    Background The oriental river prawn, Macrobrachium nipponense, is an important aquaculture species in China and throughout Asia. The androgenic gland produces hormones that play crucial roles in sexual differentiation to maleness. This study is the first de novo M. nipponense transcriptome analysis using cDNA prepared from mRNA isolated from the androgenic gland. Illumina/Solexa was used for sequencing. Methodology and Principal Finding The total volume of RNA sample was more than 5 µg. We generated 70,853,361 high quality reads after eliminating adapter sequences and filtering out low-quality reads. A total of 78,408 isosequences were obtained by clustering and assembly of the clean reads, producing 57,619 non-redundant transcripts with an average length of 1244.19 bp. In total, 70,702 isosequences were matched to the Nr database; additional analyses were performed by GO (33,203), KEGG (17,868), and COG analyses (13,817), identifying the potential genes and their functions. A total of 47 sex-determination related gene families were identified from the M. nipponense androgenic gland transcriptome based on the functional annotation of non-redundant transcripts and comparisons with the published literature. Furthermore, a total of 40 candidate novel genes were found that may contribute to sex-determination, based on their extremely high expression levels in the androgenic gland compared to other sex glands. Further, 437 SSRs and 65,535 high-confidence SNPs were identified in this EST dataset, from which 14 EST-SSR markers have been isolated. Conclusion Our study provides new sequence information for M. nipponense, which will be the basis for further genetic studies on decapod crustaceans. More importantly, this study dramatically improves understanding of sex-determination mechanisms, and advances sex-determination research in all crustacean species. The huge number of potential SSR and SNP markers isolated from the transcriptome may shed light on research in many fields, including the evolution and molecular ecology of Macrobrachium species. PMID:24204682

  8. Transcriptome Analysis for Identification of Genes Related to Gonad Differentiation, Growth, Immune Response and Marker Discovery in The Turbot (Scophthalmus maximus)

    PubMed Central

    Ma, Deyou; Ma, Aijun; Huang, Zhihui; Wang, Guangning; Wang, Ting; Xia, Dandan; Ma, Benhe

    2016-01-01

    Background Turbot Scophthalmus maximus is an economically important species extensively aquacultured in China. A genetic selection program is necessary and urgent for the sustainable development of this industry, requiring ever more genomic background knowledge. Transcriptome sequencing is an excellent alternative way to identify transcripts involved in specific biological processes and exploit a considerable quantity of molecular markers when no genome sequences are available. In this study, a comprehensive transcript dataset for major tissues of S. maximus was produced on the basis of an Illumina platform. Results Total RNA was isolated from liver, spleen, kidney, cerebrum, gonad (testis and ovary) and muscle. Equal quantities of RNA from each type of tissue were pooled to construct two cDNA libraries (male and female). Using the Illumina paired-end sequencing technology, nearly 44.22 million clean reads of 100 bp in length were generated and then assembled into 106,643 contigs, of which 71,107 were named unigenes with an average length of 892 bp after the elimination of redundancies. Of these, 24,052 unigenes (33.83% of the total) were successfully annotated. GO, KEGG pathway mapping and COG analysis were performed to predict potential genes and their functions. Based on our sequence analysis and published documents, many candidate genes with fundamental roles in sex determination and gonad differentiation (dmrt1), growth (ghrh, myf5, prl/prlr) and immune response (TLR1/TLR21/TLR22, IL-15/IL-34), were identified for the first time in this species. In addition, a large number of credible genetic markers, including 21,192 SSRs and 8,642 SNPs, were identified in the present dataset. Conclusion This informative transcriptome provides valuable new data to increase genomic resources of Scophthalmus maximus. Future studies of corresponding gene functions will be very useful for the management of reproduction, growth and disease control in turbot aquaculture breeding programs. The molecular markers identified in this database will aid in genetic linkage analyses, mapping of quantitative trait loci, and acceleration of marker assisted selection programs. PMID:26925843

  9. Sequencing, De Novo Assembly, and Annotation of the Transcriptome of the Endangered Freshwater Pearl Bivalve, Cristaria plicata, Provides Novel Insights into Functional Genes and Marker Discovery

    PubMed Central

    Kang, Se Won; Hwang, Hee-Ju; Park, So Young; Park, Eun Bi; Chung, Jong Min; Song, Dae Kwon; Kim, Changmu; Kim, Soonok; Lee, Jun Sang; Han, Yeon Soo; Park, Hong Seog; Lee, Yong Seok

    2016-01-01

    Background The freshwater mussel Cristaria plicata (Bivalvia: Eulamellibranchia: Unionidae), is an economically important species in molluscan aquaculture due to its use in pearl farming. The species has been listed as endangered in South Korea due to the loss of natural habitats caused by anthropogenic activities. The decreasing population and a lack of genomic information on the species are concerning for environmentalists and conservationists. In this study, we conducted a de novo transcriptome sequencing and annotation analysis of C. plicata using Illumina HiSeq 2500 next-generation sequencing (NGS) technology, the Trinity assembler, and bioinformatics databases to prepare a sustainable resource for the identification of candidate genes involved in immunity, defense, and reproduction. Results The C. plicata transcriptome analysis included a total of 286,152,584 raw reads and 281,322,837 clean reads. The de novo assembly identified a total of 453,931 contigs and 374,794 non-redundant unigenes with average lengths of 731.2 and 737.1 bp, respectively. Furthermore, 100% coverage of C. plicata mitochondrial genes within two unigenes supported the quality of the assembler. In total, 84,274 unigenes showed homology to entries in at least one database, and 23,246 unigenes were allocated to one or more Gene Ontology (GO) terms. The most prominent GO biological process, cellular component, and molecular function categories (level 2) were cellular process, membrane, and binding, respectively. A total of 4,776 unigenes were mapped to 123 biological pathways in the KEGG database. Based on the GO terms and KEGG annotation, the unigenes were suggested to be involved in immunity, stress responses, sex-determination, and reproduction. A total of 17,251 cDNA simple sequence repeats (cSSRs) were identified from 61,141 unigenes (size of >1 kb) with the most abundant being dinucleotide repeats. Conclusions This dataset represents the first transcriptome analysis of the endangered mollusc, C. plicata. The transcriptome provides a comprehensive sequence resource for the conservation of genetic information in this species and enrichment of the genetic database. The development of molecular markers will assist in the genetic improvement of C. plicata. PMID:26872384

  10. Deciphering the colon cancer genes--report of the InSiGHT-Human Variome Project Workshop, UNESCO, Paris 2010.

    PubMed

    Kohonen-Corish, Maija R J; Macrae, Finlay; Genuardi, Maurizio; Aretz, Stefan; Bapat, Bharati; Bernstein, Inge T; Burn, John; Cotton, Richard G H; den Dunnen, Johan T; Frebourg, Thierry; Greenblatt, Marc S; Hofstra, Robert; Holinski-Feder, Elke; Lappalainen, Ilkka; Lindblom, Annika; Maglott, Donna; Møller, Pål; Morreau, Hans; Möslein, Gabriela; Sijmons, Rolf; Spurdle, Amanda B; Tavtigian, Sean; Tops, Carli M J; Weber, Thomas K; de Wind, Niels; Woods, Michael O

    2011-04-01

    The Human Variome Project (HVP) has established a pilot program with the International Society for Gastrointestinal Hereditary Tumours (InSiGHT) to compile all inherited variation affecting colon cancer susceptibility genes. An HVP-InSiGHT Workshop was held on May 10, 2010, prior to the HVP Integration and Implementation Meeting at UNESCO in Paris, to review the progress of this pilot program. A wide range of topics were covered, including issues relating to genotype-phenotype data submission to the InSiGHT Colon Cancer Gene Variant Databases (chromium.liacs.nl/LOVD2/colon_cancer/home.php). The meeting also canvassed the recent exciting developments in models to evaluate the pathogenicity of unclassified variants using in silico data, tumor pathology information, and functional assays, and made further plans for the future progress and sustainability of the pilot program. PMID:21387463

  11. Studying human disease genes in Caenorhabditis elegans: a molecular genetics laboratory project.

    PubMed

    Cox-Paulson, Elisabeth A; Grana, Theresa M; Harris, Michelle A; Batzli, Janet M

    2012-01-01

    Scientists routinely integrate information from various channels to explore topics under study. We designed a 4-wk undergraduate laboratory module that used a multifaceted approach to study a question in molecular genetics. Specifically, students investigated whether Caenorhabditis elegans can be a useful model system for studying genes associated with human disease. In a large-enrollment, sophomore-level laboratory course, groups of three to four students were assigned a gene associated with either breast cancer (brc-1), Wilson disease (cua-1), ovarian dysgenesis (fshr-1), or colon cancer (mlh-1). Students compared observable phenotypes of wild-type C. elegans and C. elegans with a homozygous deletion in the assigned gene. They confirmed the genetic deletion with nested polymerase chain reaction and performed a bioinformatics analysis to predict how the deletion would affect the encoded mRNA and protein. Students also performed RNA interference (RNAi) against their assigned gene and evaluated whether RNAi caused a phenotype similar to that of the genetic deletion. As a capstone activity, students prepared scientific posters in which they presented their data, evaluated whether C. elegans was a useful model system for studying their assigned genes, and proposed future directions. Assessment showed gains in understanding genotype versus phenotype, RNAi, common bioinformatics tools, and the utility of model organisms. PMID:22665589

  12. Human lung project: evaluating variance of gene expression in the human lung.

    PubMed

    Gruber, Michael P; Coldren, Christopher D; Woolum, Malcolm D; Cosgrove, Gregory P; Zeng, Chan; Barn, Anna E; Moore, Mark D; Cool, Carlyne D; Worthen, G Scott; Brown, Kevin K; Geraci, Mark W

    2006-07-01

    Nondiseased tissue is an important reference for microarray studies of pulmonary disease. We obtained 23 single lungs from multiorgan donors at time of procurement. Donors varied in age, sex, smoking history, and ethnicity. Lungs were dissected into upper and lower lobe peripheral sections for RNA extraction. Microarray analysis was performed using Affymetrix Hu-133 Plus 2.0 arrays. We observed that the relative variability of gene expression increased rapidly from technical (lowest), to regional, to population (highest). In addition, age and sex have measurable effects on gene expression. Gene expression variability is heterogeneously distributed among biologic categories. We conclude that gene expression variability is greater between individuals than within individuals and that population variability is the most important factor in the study design of microarray experiments of the human lung. Classes of genes with high population variability are biologically important and provide a novel perspective into lung physiology and pathobiology. Our study represents the first comprehensive analysis of nondiseased lung tissue. The generation of this robust dataset has important implications for the design and implementation of future comparative expression analysis with pulmonary disease states. PMID:16498083

  13. Parallel discovery of Alzheimer's therapeutics.

    PubMed

    Lo, Andrew W; Ho, Carole; Cummings, Jayna; Kosik, Kenneth S

    2014-06-18

    As the prevalence of Alzheimer's disease (AD) grows, so do the costs it imposes on society. Scientific, clinical, and financial interests have focused current drug discovery efforts largely on the single biological pathway that leads to amyloid deposition. This effort has resulted in slow progress and disappointing outcomes. Here, we describe a "portfolio approach" in which multiple distinct drug development projects are undertaken simultaneously. Although a greater upfront investment is required, the probability of at least one success should be higher with "multiple shots on goal," increasing the efficiency of this undertaking. However, our portfolio simulations show that the risk-adjusted return on investment of parallel discovery is insufficient to attract private-sector funding. Nevertheless, the future cost savings of an effective AD therapy to Medicare and Medicaid far exceed this investment, suggesting that government funding is both essential and financially beneficial. PMID:24944190

  14. STS-92 Discovery Launch

    NASA Technical Reports Server (NTRS)

    2000-01-01

    Viewed from across the waters of Banana Creek, clouds of smoke and steam are illuminated by the flames from Space Shuttle Discovery's perfect on-time launch at 7:17 p.m. EDT. Discovery carries a crew of seven on a construction flight to the International Space Station. Discovery also carries a payload that includes the Integrated Truss Structure Z-1, first of 10 trusses that will form the backbone of the Space Station, and the third Pressurized Mating Adapter that will provide a Shuttle docking port for solar array installation on the sixth Station flight and Lab installation on the seventh Station flight. Discovery's landing is expected Oct. 22 at 2:10 p.m. EDT.

  15. A transcriptomic analysis of striped catfish (Pangasianodon hypophthalmus) in response to salinity adaptation: De novo assembly, gene annotation and marker discovery.

    PubMed

    Thanh, Nguyen Minh; Jung, Hyungtaek; Lyons, Russell E; Chand, Vincent; Tuan, Nguyen Viet; Thu, Vo Thi Minh; Mather, Peter

    2014-06-01

    The striped catfish (Pangasianodon hypophthalmus) culture industry in the Mekong Delta in Vietnam has developed rapidly over the past decade. The culture industry now faces some significant challenges, however, especially those related to climate change impacts, notably predicted extensive saltwater intrusion into many low topographical coastal provinces across the Mekong Delta. This problem highlights a need for development of culture stocks that can tolerate more saline culture environments as a response to expansion of saline water-intruded land. While a traditional artificial selection program can potentially address this need, understanding the genomic basis of salinity tolerance can assist development of more productive culture lines. The current study applied a transcriptomic approach using Ion PGM technology to generate expressed sequence tag (EST) resources from the intestine and swim bladder of striped catfish reared at a salinity level of 9 ppt, which showed the best growth performance. Total sequence data generated was 467.8 Mbp, consisting of 4,116,424 reads with an average length of 112 bp. De novo assembly generated 51,188 contigs and allowed identification of 16,116 putative genes based on the GenBank non-redundant database. GO annotation, KEGG pathway mapping, and functional annotation of the EST sequences recovered a wide diversity of biological functions and processes. In addition, more than 11,600 simple sequence repeats were also detected. This is the first comprehensive analysis of a striped catfish transcriptome, and provides a valuable genomic resource for future selective breeding programs and functional or evolutionary studies of genes that influence salinity tolerance in this important culture species. PMID:24841517

  16. Discovery of a novel functional leptin protein (LEP) in zebra finches: evidence for the existence of an authentic avian leptin gene predominantly expressed in the brain and pituitary.

    PubMed

    Huang, Guian; Li, Juan; Wang, Hongning; Lan, Xinyu; Wang, Yajun

    2014-09-01

    Leptin (LEP) is reported to play important roles in controlling energy balance in vertebrates, including birds. However, it remains an open question whether an authentic "LEP gene" exists and functions in birds. Here, we identified and characterized a LEP gene (zebra finch LEP [zbLEP]) encoding a 172-amino acid precursor in zebra finches. Despite zbLEP showing limited amino acid sequence identity (26%-29%) to human and mouse LEPs, synteny analysis proved that zbLEP is orthologous to mammalian LEP. Using a pAH32 luciferase reporter system and Western blot analysis, we demonstrated that the recombinant zbLEP protein could potently activate finch and chicken LEP receptors (zbLEPR; cLEPR) expressed in human embryonic kidney 293 cells and enhance signal transducer and activator of transcription 3 phosphorylation, further indicating that zbLEP is a functional ligand for avian LEPRs. Interestingly, quantitative real-time RT-PCR revealed that zbLEP mRNA is expressed nearly exclusively in the pituitary and various brain regions but is undetectable in adipose tissue and liver, whereas zbLEPR mRNA is widely expressed in the adult finch tissues examined, with abundant expression noted in the pituitary, implying that unlike mammalian LEP, finch LEP may not act as an adipocyte-derived signal to control energy balance. As in finches, a LEP highly homologous to zbLEP was also identified in the budgerigar genome. Strikingly, finch and budgerigar LEPs show little homology with the chicken LEP (cLEP) previously reported, suggesting that the so-called cLEP is incorrect. Collectively, our data provide convincing evidence for the existence of an authentic functional LEP in avian species and suggest an important role for brain- and pituitary-derived LEP in vertebrates. PMID:24823393

  17. Antarctic macrozooplankton of the southwest Atlantic sector and Bellingshausen Sea: Baseline historical distributions (Discovery Investigations, 1928-1935) related to temperature and food, with projections for subsequent ocean warming

    NASA Astrophysics Data System (ADS)

    Mackey, A. P.; Atkinson, A.; Hill, S. L.; Ward, P.; Cunningham, N. J.; Johnston, N. M.; Murphy, E. J.

    2012-01-01

    Since the Discovery Investigations of the 1920s and 1930s, seawater temperatures have increased in the Atlantic sector by ~1 °C, greater than the global mean rise. The aims of this paper were first to rescue the Discovery macrozooplankton data, second to provide quantitative "baseline" distribution maps, relating these to indices of temperature and food. Our third aim was to use the relationships we derived between abundance and temperature to project the potential effect of a 1 °C warming on the Discovery era distribution patterns. Based on the 1 m ringnet data retrieved from 615 stations (Nov-March), four taxa comprised >90% of the Antarctic macrozooplankton abundance: Rhincalanus gigas, Thysanoessa spp., Euphausia superba, and Chaetognaths. Most of the taxa, especially the more abundant ones, were warm water species penetrating into Antarctica and thus total macrozooplankton abundance decreased about 100-fold from 50°S to 70°S. While temperature correlated best with distribution at this large scale, food availability (proxied by a present-day satellite-based Chlorophyll a climatology) had a secondary effect, with the major euphausiids Euphausia superba and Thysanoessa spp. concentrated in high chl a areas. A modelled uniform 1 °C temperature rise produced a poleward shift for all taxa, but the Antarctic continent blocked this re-adjustment for the high latitude species, constricting their predicted range. More widespread polar/sub-polar species were predicted to increase their penetration into Antarctica by 4-12° in latitude, whereas the poleward shift in potential range of sub-Antarctic taxa was limited by the steep temperature gradient across the Antarctic Polar Front (APF). However, within the Scotia Sea the relatively warm temperatures of the northern Antarctic Zone, abundant food due to iron fertilisation and intense eddy activity provide a "gateway" for northern species to penetrate south of the APF. Our model predictions, based on measured distributional ranges and observed temperature increases, provide a yardstick with which to compare modern day data compilations and assess the potential effects of future temperature increases.

  18. Drosophila olfactory local interneurons and projection neurons derive from a common neuroblast lineage specified by the empty spiracles gene

    PubMed Central

    Das, Abhijit; Sen, Sonia; Lichtneckert, Robert; Okada, Ryuichi; Ito, Kei; Rodrigues, Veronica; Reichert, Heinrich

    2008-01-01

    Background: Encoding of olfactory information in insects occurs in the antennal lobe where the olfactory receptor neurons interact with projection neurons and local interneurons in a complex sensory processing circuitry. While several studies have addressed the developmental mechanisms involved in specification and connectivity of olfactory receptor neurons and projection neurons in Drosophila, the local interneurons are far less well understood. Results: In this study, we use genetic marking techniques combined with antibody labelling and neuroblast ablation to analyse lineage specific aspects of local interneuron development. We find that a large set of local interneurons labelled by the GAL4-LN1 (NP1227) and GAL4-LN2 (NP2426) lines arise from the lateral neuroblast, which has also been shown to generate uniglomerular projection neurons. Moreover, we find that a remarkable diversity of local interneuron cell types with different glomerular innervation patterns and neurotransmitter expression derives from this lineage. We analyse the birth order of these two distinct neuronal types by generating MARCM (mosaic analysis with a repressible cell marker) clones at different times during larval life. This analysis shows that local interneurons arise throughout the proliferative cycle of the lateral neuroblast beginning in the embryo, while uniglomerular projection neurons arise later during the second larval instar. The lateral neuroblast requires the function of the cephalic gap gene empty spiracles for the development of olfactory interneurons. In empty spiracles null mutant clones, most of the local interneurons and lateral projection neurons are lacking. These findings reveal similarities in the development of local interneurons and projection neurons in the olfactory system of Drosophila. Conclusion: We find that the lateral neuroblast of the deutocerebrum gives rise to a large and remarkably diverse set of local interneurons as well as to projection neurons in the antennal lobe. Moreover, we show that specific combinations of these two neuron types are produced in specific time windows in this neuroblast lineage. The development of both these cell types in this lineage requires the function of the empty spiracles gene. PMID:19055770

  19. A roadmap for natural product discovery based on large-scale genomics and metabolomics.

    PubMed

    Doroghazi, James R; Albright, Jessica C; Goering, Anthony W; Ju, Kou-San; Haines, Robert R; Tchalukov, Konstantin A; Labeda, David P; Kelleher, Neil L; Metcalf, William W

    2014-11-01

    Actinobacteria encode a wealth of natural product biosynthetic gene clusters, whose systematic study is complicated by numerous repetitive motifs. By combining several metrics, we developed a method for the global classification of these gene clusters into families (GCFs) and analyzed the biosynthetic capacity of Actinobacteria in 830 genome sequences, including 344 obtained for this project. The GCF network, comprising 11,422 gene clusters grouped into 4,122 GCFs, was validated in hundreds of strains by correlating confident mass spectrometric detection of known small molecules with the presence or absence of their established biosynthetic gene clusters. The method also linked previously unassigned GCFs to known natural products, an approach that will enable de novo, bioassay-free discovery of new natural products using large data sets. Extrapolation from the 830-genome data set reveals that Actinobacteria encode hundreds of thousands of future drug leads, and the strong correlation between phylogeny and GCFs frames a roadmap to efficiently access them. PMID:25262415

  20. A Roadmap for Natural Product Discovery Based on Large-Scale Genomics and Metabolomics

    PubMed Central

    Doroghazi, James R.; Albright, Jessica C.; Goering, Anthony W.; Ju, Kou-San; Haines, Robert R.; Tchalukov, Konstantin A.; Labeda, David P.; Kelleher, Neil L.; Metcalf, William W.

    2014-01-01

    Actinobacteria encode a wealth of natural product biosynthetic gene clusters (NPGCs), whose systematic study is complicated by numerous repetitive motifs. By combining several metrics we developed a method for global classification of these gene clusters into families (GCFs) and analyzed the biosynthetic capacity of Actinobacteria in 830 genome sequences, including 344 obtained for this project. The GCF network, comprised of 11,422 gene clusters grouped into 4,122 GCFs, was validated in hundreds of strains by correlating confident mass spectrometric detection of known small molecules with the presence/absence of their established biosynthetic gene clusters. The method also linked previously unassigned GCFs to known natural products, an approach that will enable de novo, bioassay-free discovery of novel natural products using large data sets. Extrapolation from the 830-genome dataset reveals that Actinobacteria encode hundreds of thousands of future drug leads, while the strong correlation between phylogeny and GCFs frames a roadmap to efficiently access them. PMID:25262415

  1. Using Comparative Genomics to Drive New Discoveries in Microbiology

    PubMed Central

    Haft, Daniel H.

    2015-01-01

    Bioinformatics looks to many microbiologists like a service industry. In this view, annotation starts with what is known from experiments in the lab, makes reasonable inferences of which genes match other genes in function, builds databases to make all that we know accessible, but creates nothing truly new. Experiments lead, then biocuration and computational biology follow. But the astounding success of genome sequencing is changing the annotation paradigm. Every genome sequenced is an intercepted coded message from the microbial world, and as all cryptographers know, it is easier to decode a thousand messages than a single message. Some biology is best discovered not by phenomenology, but by decoding genome content, forming hypotheses, and doing the first few rounds of validation computationally. Through such reasoning, a role and function may be assigned to a protein with no sequence similarity to any protein yet studied. Experimentation can follow after the discovery to cement and to extend the findings. Unfortunately, this approach remains so unfamiliar to most bench scientists that lab work and comparative genomics typically segregate to different teams working on unconnected projects. This review will discuss several themes in comparative genomics as a discovery method, including highly derived data, use of patterns of design to reason by analogy, and in silico testing of computationally generated hypotheses. PMID:25617609

  2. Discovery and Classification in Astronomy

    NASA Astrophysics Data System (ADS)

    Dick, Steven J.

    2013-10-01

    Preface; Abbreviations; Introduction: the natural history of the heavens and the natural history of discovery; Part I. Entrée: 1. The Pluto affair; Part II. Narratives of Discovery: 2. Moons, rings, and asteroids: discovery in the realm of the planets; 3. In Herschel's gardens: nebulous discoveries in the realm of the stars; 4. Dwarfs, giants, and planets (again!): the discovery of the stars themselves; 5. Galaxies, quasars, and clusters: discovery in the realm of the galaxies; Part III. Patterns of Discovery: 6. The structure of discovery; 7. The varieties of discovery; 8. Discovery and classification; Part IV. Drivers of Discovery: 9. Technology and theory as drivers of discovery; Part V. The Synthesis of Discovery: 10. Luxuriant gardens and the master narrative; 11. The meaning of discovery; Appendix I; Appendix II.

  3. Metagenomic biomarker discovery and explanation.

    PubMed

    Segata, Nicola; Izard, Jacques; Waldron, Levi; Gevers, Dirk; Miropolsky, Larisa; Garrett, Wendy S; Huttenhower, Curtis

    2011-01-01

    This study describes and validates a new method for metagenomic biomarker discovery by way of class comparison, tests of biological consistency and effect size estimation. This addresses the challenge of finding organisms, genes, or pathways that consistently explain the differences between two or more microbial communities, which is a central problem to the study of metagenomics. We extensively validate our method on several microbiomes and a convenient online interface for the method is provided at http://huttenhower.sph.harvard.edu/lefse/. PMID:21702898

  4. Discovery of Western European R1b1a2 Y Chromosome Variants in 1000 Genomes Project Data: An Online Community Approach

    PubMed Central

    Rocca, Richard A.; Magoon, Gregory; Reynolds, David F.; Krahn, Thomas; Tilroe, Vincent O.; Op den Velde Boots, Peter M.; Grierson, Andrew J.

    2012-01-01

    The authors have used an online community approach, and tools that were readily available via the Internet, to discover genealogically and therefore phylogenetically relevant Y-chromosome polymorphisms within core haplogroup R1b1a2-L11/S127 (rs9786076). Presented here is the analysis of 135 unrelated L11 derived samples from the 1000 Genomes Project. We were able to discover new variants and build a much more complex phylogenetic relationship for L11 sub-clades. Many of the variants were further validated using PCR amplification and Sanger sequencing. The identification of these new variants will help further the understanding of population history including patrilineal migrations in Western and Central Europe where R1b1a2 is the most frequent haplogroup. The fine-grained phylogenetic tree we present here will also help to refine historical genetic dating studies. Our findings demonstrate the power of citizen science for analysis of whole genome sequence data. PMID:22911832

  6. Environmental Regulation of Plant Gene Expression: An Rt-qPCR Laboratory Project for an Upper-Level Undergraduate Biochemistry or Molecular Biology Course

    ERIC Educational Resources Information Center

    Eickelberg, Garrett J.; Fisher, Alison J.

    2013-01-01

    We present a novel laboratory project employing "real-time" RT-qPCR to measure the effect of environment on the expression of the "FLOWERING LOCUS C" gene, a key regulator of floral timing in "Arabidopsis thaliana" plants. The project requires four 3-hr laboratory sessions and is aimed at upper-level undergraduate…

  7. System for Information Discovery

    Energy Science and Technology Software Center (ESTSC)

    1998-09-25

    SID characterizes natural language based documents so that they may be related and retrieved based on content similarity. This technology processes textual documents, autonomously identifies the major topics of the document set, and constructs an interpretable, high dimensional representation of each document. SID also provides the ability to interactively reweight representations based on user need, so users may analyze the dataset from multiple points of view. The particular advantages SID offers are speed, data compression, flexibility in representation, and incremental processing. SPIRE consists of software for visual analysis of text-based information sources. This technology enables users to make discoveries about the content of very large sets of textual documents without requiring the user to read or presort the documents. It employs algorithms for text and word proximity analysis to identify the key themes within the documents. The results of this analysis are projected onto a visual spatial proximity display (Galaxies or Themescape) where document proximity represents the degree of relatedness of theme.

  8. Mapping Our Genes--The Genome Projects: How Big, How Fast?

    ERIC Educational Resources Information Center

    Congress of the U.S., Washington, DC. Office of Technology Assessment.

    Scientific and technical journals in biology and medicine in recent years have extensively covered a debate about whether and how to determine the function and order of human genes on human chromosomes and when to determine the sequence of molecular building blocks that comprise DNA in those chromosomes. In 1987, these issues rose to become part…

  9. Studying Human Disease Genes in "Caenorhabditis Elegans": A Molecular Genetics Laboratory Project

    ERIC Educational Resources Information Center

    Cox-Paulson, Elisabeth A.; Grana, Theresa M.; Harris, Michelle A.; Batzli, Janet M.

    2012-01-01

    Scientists routinely integrate information from various channels to explore topics under study. We designed a 4-wk undergraduate laboratory module that used a multifaceted approach to study a question in molecular genetics. Specifically, students investigated whether "Caenorhabditis elegans" can be a useful model system for studying genes

  10. Genomic instability, driver genes and cell selection: Projections from cancer to stem cells.

    PubMed

    Ben-David, Uri

    2015-04-01

    Cancer cells and stem cells share many traits, including a tendency towards genomic instability. Human cancers exhibit tumor-specific genomic aberrations, which often affect their malignancy and drug response. During their culture propagation, human pluripotent stem cells (hPSCs) also acquire characteristic genomic aberrations, which may have significant impact on their molecular and cellular phenotypes. These aberrations vary in size from single nucleotide alterations to copy number alterations to whole chromosome gains. A prominent challenge in both cancer and stem cell research is to identify "driver aberrations" that confer a selection advantage, and "driver genes" that underlie the recurrence of these aberrations. Following principles that are already well-established in cancer research, candidate driver genes have also been suggested in hPSCs. Experimental validation of the functional role of such candidates can uncover whether these are bona fide driver genes. The identification of driver genes may bring us closer to a mechanistic understanding of the genomic instability of stem cells. Guided by terminologies and methodologies commonly applied in cancer research, such understanding may have important ramifications for both stem cell and cancer biology. This article is part of a Special Issue entitled: Stress as a fundamental theme in cell plasticity. PMID:25132386

  11. Discovery of a new gene pool and a high genetic diversity of the chestnut blight fungus Cryphonectria parasitica in Caucasian Georgia.

    PubMed

    Prospero, S; Lutz, A; Tavadze, B; Supatashvili, A; Rigling, D

    2013-12-01

    In this study, we investigated the population genetic structure and possible origins of the plant pathogen Cryphonectria parasitica in Caucasian Georgia, a region within the centre of origin of the host species Castanea sativa. A total of 427 C. parasitica isolates from nine populations were genotyped at 10 microsatellite loci. A high genetic diversity was detected, but the overall Georgian population was dominated by three haplotypes which were present in most individual populations. Two of them have not been previously found in Europe. Bayesian clustering analysis and principal component analysis could not identify their source population, neither in Asia nor in North America. On the other hand, one haplotype is frequent in Central Europe and probably naturally invaded Caucasian Georgia from neighbouring Turkey. Seventy-three haplotypes were unique to specific populations, and 66 of them were represented by a single isolate. Allele patterns suggest that most of these haplotypes emerged locally through sexual recombination between haplotypes of the Georgian and the central European gene pool. Due to the high incidence of haplotypes not otherwise present in Europe, Caucasian Georgia represents an additional source of diversity for the European C. parasitica population. PMID:23994123

  12. Recent advances in candidate-gene and whole-genome approaches to the discovery of anthelmintic resistance markers and the description of drug/receptor interactions

    PubMed Central

    Kotze, Andrew C.; Hunt, Peter W.; Skuce, Philip; von Samson-Himmelstjerna, Georg; Martin, Richard J.; Sager, Heinz; Krücken, Jürgen; Hodgkinson, Jane; Lespine, Anne; Jex, Aaron R.; Gilleard, John S.; Beech, Robin N.; Wolstenholme, Adrian J.; Demeler, Janina; Robertson, Alan P.; Charvet, Claude L.; Neveu, Cedric; Kaminsky, Ronald; Rufener, Lucien; Alberich, Melanie; Menez, Cecile; Prichard, Roger K.

    2014-01-01

    Anthelmintic resistance has a great impact on livestock production systems worldwide, is an emerging concern in companion animal medicine, and represents a threat to our ongoing ability to control human soil-transmitted helminths. The Consortium for Anthelmintic Resistance and Susceptibility (CARS) provides a forum for scientists to meet and discuss the latest developments in the search for molecular markers of anthelmintic resistance. Such markers are important for detecting drug resistant worm populations, and indicating the likely impact of the resistance on drug efficacy. The molecular basis of resistance is also important for understanding how anthelmintics work, and how drug resistant populations arise. Changes to target receptors, drug efflux and other biological processes can be involved. This paper reports on the CARS group meeting held in August 2013 in Perth, Australia. The latest knowledge on the development of molecular markers for resistance to each of the principal classes of anthelmintics is reviewed. The molecular basis of resistance is best understood for the benzimidazole group of compounds, and we examine recent work to translate this knowledge into useful diagnostics for field use. We examine recent candidate-gene and whole-genome approaches to understanding anthelmintic resistance and identify markers. We also look at drug transporters in terms of providing both useful markers for resistance, as well as opportunities to overcome resistance through the targeting of the transporters themselves with inhibitors. Finally, we describe the tools available for the application of the newest high-throughput sequencing technologies to the study of anthelmintic resistance. PMID:25516826

  13. A Genome-wide Analysis of Admixture in Uyghurs and a High-Density Admixture Map for Disease-Gene Discovery

    PubMed Central

    Xu, Shuhua; Jin, Li

    2008-01-01

    Following up on our previous study, we conducted a genome-wide analysis of admixture for two Uyghur population samples (HGDP-UG and PanAsia-UG), collected from the northern and southern regions of Xinjiang in China, respectively. Both HGDP-UG and PanAsia-UG showed a substantial admixture of East-Asian (EAS) and European (EUR) ancestries, with an empirical estimation of ancestry contribution of 53:47 (EAS:EUR) and 48:52 for HGDP-UG and PanAsia-UG, respectively. The effective admixture time under a model with a single pulse of admixture was estimated as 110 generations and 129 generations, or admixture events occurred about 2200 and 2580 years ago for HGDP-UG and PanAsia-UG, respectively, assuming an average of 20 yr per generation. Despite Uyghurs' earlier history compared to other admixture populations, admixture mapping holds promise for this population, because of its large size and its mixture of ancestry from different continents. We screened multiple databases and identified a genome-wide single-nucleotide polymorphism panel that can distinguish EAS and EUR ancestry of chromosomal segments in Uyghurs. The panel contains 8150 ancestry-informative markers (AIMs) showing large frequency differences between EAS and EUR populations (FST > 0.25, mean FST = 0.43) but small frequency differences (7999 AIMs validated) within both populations (FST < 0.05, mean FST < 0.01). We evaluated the effectiveness of this admixture map for localizing disease genes in two Uyghur populations. To our knowledge, our map constitutes the first practical resource for admixture mapping in Uyghurs, and it will enable studies of diseases showing differences in genetic risk between EUR and EAS populations. PMID:18760393
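
    The arithmetic and marker thresholds quoted in the abstract above can be checked with a minimal sketch. The function names, argument names, and the sample FST values passed in are hypothetical illustrations, not the authors' code or data; only the 20 yr per generation assumption and the FST cut-offs (> 0.25 between populations, < 0.05 within) come from the abstract.

```python
def generations_to_years(generations, years_per_generation=20):
    """Convert an admixture time in generations to years.

    The default of 20 years per generation is the assumption stated
    in the abstract.
    """
    return generations * years_per_generation


def is_ancestry_informative(fst_between, fst_within_eas, fst_within_eur,
                            between_min=0.25, within_max=0.05):
    """Apply the abstract's AIM thresholds to one marker (hypothetical helper).

    A marker passes when EAS and EUR differ strongly (FST > between_min)
    while each ancestral group is internally homogeneous (FST < within_max).
    """
    return (fst_between > between_min
            and fst_within_eas < within_max
            and fst_within_eur < within_max)


if __name__ == "__main__":
    print(generations_to_years(110))  # HGDP-UG estimate: 2200 years
    print(generations_to_years(129))  # PanAsia-UG estimate: 2580 years
    # Illustrative FST values only, not taken from the study's data.
    print(is_ancestry_informative(0.43, 0.01, 0.01))
```

    Under these assumptions, the two conversions reproduce the 2200 and 2580 years reported in the abstract.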

  14. Identification of Human Ether-à-go-go Related Gene Modulators by Three Screening Platforms in an Academic Drug-Discovery Setting

    PubMed Central

    Huang, Xi-Ping; Mangano, Thomas; Hufeisen, Sandy; Setola, Vincent

    2010-01-01

    The human Ether-à-go-go related gene (hERG) potassium channel is responsible for the rapid delayed rectifier potassium current that plays a critical role in the repolarization of cardiomyocytes during the cardiac action potential. In humans, inhibition of hERG by drugs can prolong the electrocardiographic QT interval, which, in rare instances, leads to ventricular arrhythmia and sudden cardiac death. As such, several medications that block hERG channels in vitro have been withdrawn from the market due to QT prolongation and arrhythmias. The current FDA guidelines recommend that drug candidates destined for human use be evaluated for potential hERG activity (www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm074963.pdf). Here, we employed automated planar patch clamp (APPC), high-throughput fluorescent Tl+ flux, and moderate-throughput [3H]dofetilide competition binding assays to characterize a panel of 49 drugs for their activities at the hERG channel. Notably, we used the same HEK293-hERG cell line for all assays, facilitating comparisons of hERG potencies across screening platforms. In general, hERG inhibitors were most potent in APPC assays, intermediately potent in [3H]dofetilide binding assays, and least potent in Tl+ flux assays. Binding affinity constants (pKi values) and Tl+ flux potencies (pEC50 values) correlated well with APPC pEC50 values. Further, the inhibitory potencies of many known hERG inhibitors in APPC matched literature values from manual and/or automated patch clamp systems. We also developed a novel fluorescent Tl+ flux assay to measure the effects of drugs that modulate hERG trafficking and surface expression. PMID:21158687

  15. Radiation Detection Material Discovery Initiative at PNNL

    NASA Astrophysics Data System (ADS)

    Milbrath, Brian

    2006-05-01

    Today's security threats are being met with 30-year-old radiation technology. Discovery of new radiation detection materials is currently a slow and Edisonian process. With heightened concerns over nuclear proliferation, terrorism and unconventional warfare, an alternative strategy for identification and development of potential radiation detection materials must be adopted. Through the Radiation Detection Materials Discovery Initiative, PNNL focuses on the science-based discovery of next generation materials for radiation detection by addressing three "grand challenges": fundamental understanding of radiation detection, identification of new materials, and accelerating the discovery process. The new initiative has eight projects addressing these challenges, which will be described, including early work, paths forward and the opportunities for collaboration.

  16. THE BIOCATALYTIC DESULFURIZATION PROJECT

    SciTech Connect

    Steven E. Bonde; David Nunn

    2003-01-01

    During the first quarter of the Biological Desulfurization project several activities were pursued. A project kickoff meeting was held at the Diversa facility in San Diego, CA. Activities that were in process before the meeting and begun afterwards by Diversa Corporation and Petro Star Inc. include: Technology transfer in the form of information generated by Enchira to Diversa, the purchase and installation of equipment by Diversa, development of synthetic methods and preparation of organo-sulfur substrates for use in determining enzyme activities, production of extract via Petro Star's CED process, detailed analysis of Petro Star Inc. diesel and CED extract, and several activities in molecular biology. Diversa Corporation, in the area of molecular biology, engaged in several activities in support of the task list of the contract. These included: construction of a genomic library; development and utilization of a sequence-based gene discovery effort; and a parallel discovery approach based on functional expression of enzymes with the ability to oxidize organosulfur compounds. Biodesulfurization genes have already been identified and are being sequenced and subcloned for expression in heterologous biological hosts. Diversa has evaluated and adapted assays developed by Enchira used to assess the activities of DBT and DBTO2 monooxygenases. Finally, Diversa personnel have developed two novel selection/screen strategies for the improvement of biocatalyst strains by directed evolution.

  17. Serendipity and Scientific Discovery.

    ERIC Educational Resources Information Center

    Rosenman, Martin F.

    1988-01-01

    The discovery of penicillin is cited in a discussion of the role of serendipity as it relates to scientific discovery. The importance of sagacity as a personality trait is noted. Successful researchers have questioning minds, are willing to view data from several perspectives, and recognize and appreciate the unexpected. (JW)

  18. Friends' Discovery Camp

    ERIC Educational Resources Information Center

    Seymour, Seth

    2008-01-01

    This article features Friends' Discovery Camp, a program that allows children with and without autism spectrum disorder to learn and play together. In Friends' Discovery Camp, campers take part in sensory-rich experiences, ranging from hands-on activities and performing arts to science experiments and stories teaching social skills. Now in its 7th…

  19. Decades of Discovery

    DOE R&D Accomplishments Database

    2011-06-01

    For the past two-and-a-half decades, the Office of Science at the U.S. Department of Energy has been at the forefront of scientific discovery. Over 100 important discoveries supported by the Office of Science are represented in this document.

  20. The human decorin gene: Intron-exon organization, discovery of two alternatively spliced exons in the 5' untranslated region, and mapping of the gene to chromosome 12q23

    SciTech Connect

    Danielson, K.G.; Fazzio, A.; Cohen, I.; Cannizzaro, L.A.; Eichstetter, I.; Iozzo, R.V.

    1993-01-01

    Decorin is a chondroitin/dermatan sulfate proteoglycan expressed by most vascular and avascular connective tissues and, because of its ability to interact with collagen and growth factors, has been implicated in the control of matrix assembly and cellular growth. To understand the molecular mechanisms involved in regulating its tissue expression, we have isolated a number of genomic clones encoding the complete decorin gene. The human decorin gene spans over 38 kb of continuous DNA sequence and contains eight exons and very large introns, two of which are 5.4 and > 13.2 kb. We have discovered two alternatively spliced leader exons, exons Ia and Ib, in the 5' untranslated region. These exons were identified by cloning and sequencing cDNAs obtained by polymerase chain reaction amplification of a fibroblast cDNA library. Using Northern blotting or reverse transcriptase PCR, we detected the two leader exons in a variety of mRNAs isolated from human cell lines and tissues. Interestingly, sequences highly (74-87%) homologous to exons Ia and Ib are found in the 5' untranslated region of avian and bovine decorin, respectively. This high degree of conservation among species suggests regulatory functions for these leader exons. In the 3' untranslated region there are several polyadenylation sites, and at least two of these sites could give rise to the transcripts of ~1.6 and ~1.9 kb, typically detected in a variety of tissues and cells. Using a genomic clone as the labeled probe and in situ hybridization of human metaphase chromosomes, we have mapped the decorin gene to the discrete region of human chromosome 12q23. This study provides the molecular basis for discerning the transcriptional control of the decorin gene and offers the opportunity to investigate genetic disorders linked to this important human gene. 57 refs., 11 figs., 3 tabs.

  1. EXPOSING STUDENTS AND TEACHERS TO SCIENCE WITH SHORT TERM BARLEY GENE MAPPING PROJECTS.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Many universities sponsor science research programs during the summers to provide hands-on laboratory experience to high school students and teachers. Our objective was to design a project that exposes the students to the full range of research, from developing and testing a hypothesis through prese...

  2. "Eureka, Eureka!" Discoveries in Science

    ERIC Educational Resources Information Center

    Agarwal, Pankaj

    2011-01-01

    Accidental discoveries have been of significant value in the progress of science. Although accidental discoveries are more common in pharmacology and chemistry, other branches of science have also benefited from such discoveries. While most discoveries are the result of persistent research, famous accidental discoveries provide a fascinating…

  4. Cancer biomarker discovery and validation

    PubMed Central

    Goossens, Nicolas; Nakagawa, Shigeki; Sun, Xiaochen; Hoshida, Yujin

    2015-01-01

    With the emergence of genomic profiling technologies and selective molecular targeted therapies, biomarkers play an increasingly important role in the clinical management of cancer patients. Single gene/protein or multi-gene “signature”-based assays have been introduced to measure specific molecular pathway deregulations that guide therapeutic decision-making as predictive biomarkers. Genome-based prognostic biomarkers are also available for several cancer types for potential incorporation into clinical prognostic staging systems or practice guidelines. However, there is still a large gap between initial biomarker discovery studies and their clinical translation due to the challenges in the process of cancer biomarker development. In this review we summarize the steps of biomarker development, highlight key issues in successful validation and implementation, and overview representative examples in the oncology field. We also discuss regulatory issues and future perspectives in the era of big data analysis and precision medicine. PMID:26213686

  5. Unsupervised Discovery of Subspace Trends.

    PubMed

    Xu, Yan; Qiu, Peng; Roysam, Badrinath

    2015-10-01

    This paper presents unsupervised algorithms for discovering previously unknown subspace trends in high-dimensional data sets without the benefit of prior information. A subspace trend is a sustained pattern of gradual/progressive changes within an unknown subset of feature dimensions. A fundamental challenge to subspace trend discovery is the presence of irrelevant data dimensions, noise, outliers, and confusion from multiple subspace trends driven by independent factors that are mixed in with each other. These factors can obscure the trends in conventional dimension reduction & projection based data visualizations. To overcome these limitations, we propose a novel graph-theoretic neighborhood similarity measure for detecting concordant progressive changes across data dimensions. Using this measure, we present an unsupervised algorithm for trend-relevant feature selection, subspace trend discovery, quantification of trend strength, and validation. Our method successfully identified verifiable subspace trends in diverse synthetic and real-world biomedical datasets. Visualizations derived from the selected trend-relevant features revealed biologically meaningful hidden subspace trend(s) that were obscured by irrelevant features and noise. Although our examples are drawn from the biological domain, the proposed algorithm is broadly applicable to exploratory analysis of high-dimensional data including visualization, hypothesis generation, knowledge discovery, and prediction in diverse other applications. PMID:26353189

  6. Final report on LDRD project : elucidating performance of proton-exchange-membrane fuel cells via computational modeling with experimental discovery and validation.

    SciTech Connect

    Wang, Chao Yang (Pennsylvania State University, University Park, PA); Pasaogullari, Ugur (Pennsylvania State University, University Park, PA); Noble, David R.; Siegel, Nathan P.; Hickner, Michael A.; Chen, Ken Shuang

    2006-11-01

    In this report, we document the accomplishments in our Laboratory Directed Research and Development project in which we employed a technical approach of combining experiments with computational modeling and analyses to elucidate the performance of hydrogen-fed proton exchange membrane fuel cells (PEMFCs). In the first part of this report, we document our focused efforts on understanding water transport in and removal from a hydrogen-fed PEMFC. Using a transparent cell, we directly visualized the evolution and growth of liquid-water droplets at the gas diffusion layer (GDL)/gas flow channel (GFC) interface. We further carried out a detailed experimental study to observe, via direct visualization, the formation, growth, and instability of water droplets at the GDL/GFC interface using a specially designed apparatus, which simulates the cathode operation of a PEMFC. We developed a simplified model, based on our experimental observation and data, for predicting the onset of water-droplet instability at the GDL/GFC interface. Using a state-of-the-art neutron imaging instrument available at NIST (National Institute of Standards and Technology), we probed liquid-water distribution inside an operating PEMFC under a variety of operating conditions and investigated effects of evaporation due to local heating by waste heat on water removal. Moreover, we developed computational models for analyzing the effects of the micro-porous layer on net water transport across the membrane and of GDL anisotropy on the temperature and water distributions in the cathode of a PEMFC. We further developed a two-phase model based on the multiphase mixture formulation for predicting the liquid saturation, pressure drop, and flow maldistribution across the PEMFC cathode channels. In the second part of this report, we document our efforts on modeling the electrochemical performance of PEMFCs. We developed a constitutive model for predicting proton conductivity in polymer electrolyte membranes and compared model prediction with experimental data obtained in our laboratory and from literature. Moreover, we developed a one-dimensional analytical model for predicting electrochemical performance of an idealized PEMFC with small surface over-potentials. Furthermore, we developed a multi-dimensional computer model, which is based on the finite-element method and a fully-coupled implicit solution scheme via Newton's technique, for simulating the performance of PEMFCs. We demonstrated the utility of our finite-element model by comparing the computed current density distribution and overall polarization with those measured using a segmented cell. In the last part of this report, we document an exploratory experimental study on MEA (membrane electrode assembly) degradation.

  7. Breeding Specialty Starch Maize Using Exotic Genetic Resources for Gene Discovery of Novel Alleles and Modifiers with Materials Generated from the USDA-ARS GEM Project

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Amylomaize VII, a class of High Amylose Maize with at least 70% of the kernel starch composed of the linear amylose polymer, has had numerous food and industrial applications including the manufacturing of biodegradable plastics, adhesives and candies. More recently it has been found to be a signi...

  8. Where is the Discovery in Literature-Based Discovery?

    NASA Astrophysics Data System (ADS)

    Kostoff, R. N.

    This chapter addresses the core of literature-based discovery (LBD), namely, what is discovery and how is the generation of discovery confirmed. The chapter starts with definitions of discovery and innovation, especially in the LBD context, and then proceeds to describe radical discovery and LBD. It then describes the vetting necessary to confirm the presence of discovery. Finally, the chapter concludes with a few examples where use of more comprehensive vetting techniques would have been prudent before discovery was reported. The LBD focus is on open discovery systems (start with a problem, discover a solution, or vice versa) exclusively.

  9. Exons, Introns and Talking Genes: The Science Behind the Human Genome Project

    SciTech Connect

    Jacobson, K.B.

    1993-01-01

    This book presents in simple terms the basis of molecular genetics and how it is used to obtain an understanding of the human genome. The author's central focus is the transition of genetics from statistics to experimental manipulations, and he offers analogies that help readers visualize the genome, thereby avoiding conventional scientific presentations. He illustrates how genetics is used in scientific laboratories, in courtrooms, and in hospitals. Little is presented about the complex social and ethical issues raised by the Human Genome Project.

  10. Purposive discovery of operations

    NASA Technical Reports Server (NTRS)

    Sims, Michael H.; Bresina, John L.

    1992-01-01

    The Generate, Prune & Prove (GPP) methodology for discovering definitions of mathematical operators is introduced. GPP is a task within the IL exploration discovery system. We developed GPP for use in the discovery of mathematical operators with a wider class of representations than was possible with the previous methods by Lenat and by Shen. GPP utilizes the purpose for which an operator is created to prune the possible definitions. The relevant search spaces are immense and there exists insufficient information for a complete evaluation of the purpose constraint, so it is necessary to perform a partial evaluation of the purpose (i.e., pruning) constraint. The constraint is first transformed so that it is operational with respect to the partial information, and then it is applied to examples in order to test the generated candidates for an operator's definition. In the GPP process, once a candidate definition survives this empirical prune, it is passed on to a theorem prover for formal verification. We describe the application of this methodology to the (re)discovery of the definition of multiplication for Conway numbers, a discovery which is difficult for human mathematicians. We successfully model this discovery process utilizing information which was reasonably available at the time of Conway's original discovery. As part of this discovery process, we reduce the size of the search space from a computationally intractable size to 3468 elements.
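
    The generate and prune steps described above can be illustrated with a toy sketch. It is not the GPP system itself: the candidate grammar, the purpose constraint (an operator that behaves like repeated addition on small non-negative integers), and all function names are assumptions made for illustration, and the final "prove" step is only indicated in a comment.

```python
from itertools import product

def generate_candidates():
    """Generate step: enumerate candidate definitions (hypothetical toy grammar)."""
    return {
        "a + b": lambda a, b: a + b,
        "a * b": lambda a, b: a * b,
        "a - b": lambda a, b: a - b,
        "max(a, b)": lambda a, b: max(a, b),
        "a * b + a": lambda a, b: a * b + a,
    }

def satisfies_purpose(op, examples):
    """Prune step: partial, empirical check of an assumed purpose constraint.

    Assumed purpose: op(a, 0) == 0 and op(a, b + 1) == op(a, b) + a.
    Only the supplied examples are checked, so this is not a proof.
    """
    return all(
        op(a, 0) == 0 and op(a, b + 1) == op(a, b) + a
        for a, b in examples
    )

def generate_and_prune(examples):
    """Return the names of candidates that survive the empirical prune."""
    return [
        name
        for name, op in generate_candidates().items()
        if satisfies_purpose(op, examples)
    ]

examples = list(product(range(4), repeat=2))
survivors = generate_and_prune(examples)
# In a GPP-style pipeline, each survivor would next go to a theorem prover.
print(survivors)
```

    Pruning on examples is cheap and removes most candidates, so the expensive formal verification only runs on the few definitions that remain.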

  11. Targeting Protein Tyrosine Phosphatases for Anticancer Drug Discovery

    PubMed Central

    Scott, Latanya M.; Lawrence, Harshani R.; Sebti, Saïd M.; Lawrence, Nicholas J.; Wu, Jie

    2010-01-01

    Protein tyrosine phosphatases (PTPs) are a diverse family of enzymes encoded by 107 genes in the human genome. Together with protein tyrosine kinases (PTKs), PTPs regulate various cellular activities essential for the initiation and maintenance of malignant phenotypes. While PTK inhibitors are now used routinely for cancer treatment, the PTP inhibitor development field is still in the discovery phase. In this article, the suitability of targeting PTPs for novel anticancer drug discovery is discussed. Examples are presented for PTPs that have been targeted for anticancer drug discovery as well as potential new PTP targets for novel anticancer drug discovery. PMID:20337577

  12. Viral surveillance and discovery.

    PubMed

    Lipkin, Walter Ian; Firth, Cadhla

    2013-04-01

    The field of virus discovery has burgeoned with the advent of high throughput sequencing platforms and bioinformatics programs that enable rapid identification and molecular characterization of known and novel agents, investments in global microbial surveillance that include wildlife and domestic animals as well as humans, and recognition that viruses may be implicated in chronic as well as acute diseases. Here we review methods for viral surveillance and discovery, strategies and pitfalls in linking discoveries to disease, and identify opportunities for improvements in sequencing instrumentation and analysis, the use of social media and medical informatics that will further advance clinical medicine and public health. PMID:23602435

  13. The Greatest Mathematical Discovery?

    SciTech Connect

    Bailey, David H.; Borwein, Jonathan M.

    2010-05-12

    What mathematical discovery more than 1500 years ago: (1) Is one of the greatest, if not the greatest, single discovery in the field of mathematics? (2) Involved three subtle ideas that eluded the greatest minds of antiquity, even geniuses such as Archimedes? (3) Was fiercely resisted in Europe for hundreds of years after its discovery? (4) Even today, in historical treatments of mathematics, is often dismissed with scant mention, or else is ascribed to the wrong source? Answer: Our modern system of positional decimal notation with zero, together with the basic arithmetic computational schemes, which were discovered in India about 500 CE.
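
    As a small illustration of positional notation with zero, the sketch below converts between a digit list and its value. The function names are hypothetical and not taken from the report; the point is that each digit's contribution depends on its position, and that zero is needed as a placeholder (205 differs from 25).

```python
def digits_to_value(digits, base=10):
    """Evaluate a most-significant-first digit list in the given base."""
    value = 0
    for d in digits:
        if not 0 <= d < base:
            raise ValueError(f"digit {d} is out of range for base {base}")
        # Shift the running value one position left, then add the new digit.
        value = value * base + d
    return value

def value_to_digits(n, base=10):
    """Return the most-significant-first digits of a non-negative integer."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return [0]
    digits = []
    while n:
        n, remainder = divmod(n, base)
        digits.append(remainder)
    return digits[::-1]

print(digits_to_value([2, 0, 5]))  # the zero holds the tens position
print(value_to_digits(205))
```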

  14. A Discovery Walk in Natural Science.

    ERIC Educational Resources Information Center

    Schenberg, Samuel; And Others

    This booklet is the correlated script for a 48-frame color filmstrip designed to expand the environmental experiences of urban school children and their teachers. Included throughout the script are related projects that encourage discovery activities inside and out of the classroom. Based upon the knowledge that over 75 percent of all United…

  15. YODA: The young observant discovery agent

    SciTech Connect

    Shen, W.M.; Adibi, J.; Cho, Bonghan

    1996-12-31

    The YODA project at USC/ISI consists of a group of young researchers who share a passion for autonomous systems that can bootstrap their knowledge of real environments by exploration, experimentation, learning, and discovery. Our goal is to create a mobile agent that can autonomously learn from its environment based on its own actions, percepts, and missions.

  16. Teachers Learn About Inquiry/Discovery Approaches.

    ERIC Educational Resources Information Center

    Elliott, John

    Curriculum reformers in the United Kingdom who have expressed concern with the failure of the research, development, and diffusion model to implement inquiry/discovery learning have tended to offer a problem-solving approach to foster innovation at the classroom level. This approach is illustrated in the Ford Teaching Project, sponsored by the…

  17. PINE Discovery Box, 101 Stimulating Ideas.

    ERIC Educational Resources Information Center

    Busch, Phyllis S.

    This manual is intended for use with the PINE (Projects in Imaginative Nature Education) discovery box in elementary school conservation education. The box contains 21 natural specimens which can serve as the starting point for simple student investigations. Specimens and activities are keyed for grade level. For each item, background information…

  18. The Learning Discovery

    ERIC Educational Resources Information Center

    Prout, Joan

    1975-01-01

    The learning discovery of youngsters is a do-it-yourself teaching method for clerical, administrative, and accountant trainees at the Bankside House headquarters of the Central Electricity Generating Board's South Eastern Region, London. (Author)

  19. The requirements discovery process

    SciTech Connect

    Bahill, A.T.; Dean, F.F.

    1997-02-01

    Cost and schedule overruns are often caused by poor requirements that are produced by people who do not understand the requirement process. This paper provides a high-level overview of the requirements discovery process.

  20. The Discovery of Noggin.

    ERIC Educational Resources Information Center

    Oppenheimer, Steven B.

    1995-01-01

    Discusses recently published work that appears to have many of the answers to the question of how the nervous system develops. Focuses on the discovery of what is believed to be neural inducer, a protein called noggin. (LZ)

  1. The Alabama Drug Discovery Alliance: A Collaborative Partnership to Facilitate Academic Drug Discovery

    PubMed Central

    Everts, Maaike; Knight, W. Blaine; Harris, David R.; Secrist, John A.; Whitley, Richard J.

    2011-01-01

    The Alabama Drug Discovery Alliance is a collaboration between the University of Alabama at Birmingham and Southern Research Institute that aims to support the discovery and development of therapeutic molecules that address an unmet medical need. The alliance builds on the expertise present at both institutions and has the dedicated commitment of their respective technology transfer and intellectual property offices to guide any commercial opportunities that may arise from the supported efforts. Although most projects involve high throughput screening, projects at any stage in the drug discovery and development pathway are eligible for support. Irrespective of the target and stage of any project, well-functioning interdisciplinary teams are crucial to a project’s progress. These teams consist of investigators with a wide variety of expertise from both institutions to contribute to the program’s success. PMID:21448756

  2. Accelerating scientific discovery : 2007 annual report.

    SciTech Connect

    Beckman, P.; Dave, P.; Drugan, C.

    2008-11-14

    As a gateway for scientific discovery, the Argonne Leadership Computing Facility (ALCF) works hand in hand with the world's best computational scientists to advance research in a diverse span of scientific domains, ranging from chemistry, applied mathematics, and materials science to engineering physics and life sciences. Sponsored by the U.S. Department of Energy's (DOE) Office of Science, researchers are using the IBM Blue Gene/L supercomputer at the ALCF to study and explore key scientific problems that underlie important challenges facing our society. For instance, a research team at the University of California-San Diego/SDSC is studying the molecular basis of Parkinson's disease. The researchers plan to use the knowledge they gain to discover new drugs to treat the disease and to identify risk factors for other diseases that are equally prevalent. Likewise, scientists from Pratt & Whitney are using the Blue Gene to understand the complex processes within aircraft engines. Expanding our understanding of jet engine combustors is the secret to improved fuel efficiency and reduced emissions. Lessons learned from the scientific simulations of jet engine combustors have already led Pratt & Whitney to newer designs with unprecedented reductions in emissions, noise, and cost of ownership. ALCF staff members provide in-depth expertise and assistance to those using the Blue Gene/L and optimizing user applications. Both the Catalyst and Applications Performance Engineering and Data Analytics (APEDA) teams support users' projects. In addition to working with scientists running experiments on the Blue Gene/L, we have become a nexus for the broader global community. In partnership with the Mathematics and Computer Science Division at Argonne National Laboratory, we have created an environment where the world's most challenging computational science problems can be addressed. Our expertise in high-end scientific computing enables us to provide guidance for applications that are transitioning to petascale as well as to produce software that facilitates their development, such as the MPICH library, which provides a portable and efficient implementation of the MPI standard (the prevalent programming model for large-scale scientific applications), and the PETSc toolkit, which provides a programming paradigm that eases the development of many scientific applications on high-end computers.

  3. Rapid Annotation of Anonymous Sequences from Genome Projects Using Semantic Similarities and a Weighting Scheme in Gene Ontology

    PubMed Central

    Fontana, Paolo; Cestaro, Alessandro; Velasco, Riccardo; Formentin, Elide; Toppo, Stefano

    2009-01-01

    Background: Large-scale sequencing projects have now become routine lab practice and this has led to the development of a new generation of tools involving function prediction methods, bringing the latter back to the fore. The advent of Gene Ontology, with its structured vocabulary and paradigm, has provided computational biologists with an appropriate means for this task. Methodology: We present here a novel method called ARGOT (Annotation Retrieval of Gene Ontology Terms) that is able to quickly process thousands of sequences for functional inference. The tool exploits for the first time an integrated approach which combines clustering of GO terms, based on their semantic similarities, with a weighting scheme which assesses retrieved hits sharing a certain number of biological features with the sequence to be annotated. These hits may be obtained by different methods and in this work we have based ARGOT processing on BLAST resul