Science.gov

Sample records for accelerate gene discovery

  1. Accelerating the Rate of Astronomical Discovery

    NASA Astrophysics Data System (ADS)

    This meeting marks the International Year of Astronomy by reviewing the extent to which astronomers are achieving the optimal rate of astronomical discovery. Can we identify and overcome the limits to progress? What steps can be taken to accelerate the rate of expansion of astronomical knowledge? What lessons can be learnt both from the recent and distant past? As the public announcements regarding the 2009 IYA have emphasized, new astronomical discoveries are currently being made at an extraordinary rate, while the invention of the telescope ushered in an equally momentous "golden age of discovery" 400 years ago. The meeting addresses a range of potential limits to progress (paradigmatic, technological, organizational, and political), examining each issue both from modern and historical perspectives, and drawing lessons to guide future progress. The program focusses on how astronomy actually progresses, using careful historical studies and real data, rather than anecdotes and folklore.

  2. Accelerating novel candidate gene discovery in neurogenetic disorders via whole-exome sequencing of prescreened multiplex consanguineous families.

    PubMed

    Alazami, Anas M; Patel, Nisha; Shamseldin, Hanan E; Anazi, Shamsa; Al-Dosari, Mohammed S; Alzahrani, Fatema; Hijazi, Hadia; Alshammari, Muneera; Aldahmesh, Mohammed A; Salih, Mustafa A; Faqeih, Eissa; Alhashem, Amal; Bashiri, Fahad A; Al-Owain, Mohammed; Kentab, Amal Y; Sogaty, Sameera; Al Tala, Saeed; Temsah, Mohamad-Hani; Tulbah, Maha; Aljelaify, Rasha F; Alshahwan, Saad A; Seidahmed, Mohammed Zain; Alhadid, Adnan A; Aldhalaan, Hesham; AlQallaf, Fatema; Kurdi, Wesam; Alfadhel, Majid; Babay, Zainab; Alsogheer, Mohammad; Kaya, Namik; Al-Hassnan, Zuhair N; Abdel-Salam, Ghada M H; Al-Sannaa, Nouriya; Al Mutairi, Fuad; El Khashab, Heba Y; Bohlega, Saeed; Jia, Xiaofei; Nguyen, Henry C; Hammami, Rakad; Adly, Nouran; Mohamed, Jawahir Y; Abdulwahab, Firdous; Ibrahim, Niema; Naim, Ewa A; Al-Younes, Banan; Meyer, Brian F; Hashem, Mais; Shaheen, Ranad; Xiong, Yong; Abouelhoda, Mohamed; Aldeeri, Abdulrahman A; Monies, Dorota M; Alkuraya, Fowzan S

    2015-01-13

    Our knowledge of disease genes in neurological disorders is incomplete. With the aim of closing this gap, we performed whole-exome sequencing on 143 multiplex consanguineous families in whom known disease genes had been excluded by autozygosity mapping and candidate gene analysis. This prescreening step led to the identification of 69 recessive genes not previously associated with disease, of which 33 are here described (SPDL1, TUBA3E, INO80, NID1, TSEN15, DMBX1, CLHC1, C12orf4, WDR93, ST7, MATN4, SEC24D, PCDHB4, PTPN23, TAF6, TBCK, FAM177A1, KIAA1109, MTSS1L, XIRP1, KCTD3, CHAF1B, ARV1, ISCA2, PTRH2, GEMIN4, MYOCD, PDPR, DPH1, NUP107, TMEM92, EPB41L4A, and FAM120AOS). We also encountered instances in which the phenotype departed significantly from the established clinical presentation of a known disease gene. Overall, a likely causal mutation was identified in >73% of our cases. This study contributes to the global effort toward a full compendium of disease genes affecting brain function.

  3. Choosing experiments to accelerate collective discovery

    SciTech Connect

    Rzhetsky, Andrey; Foster, Jacob G.; Foster, Ian T.; Evans, James A.

    2015-11-24

    Scientists perform a tiny subset of all possible experiments. What characterizes the experiments they choose? What are the consequences of those choices for the pace of scientific discovery? We model scientific knowledge as a network and science as a sequence of experiments designed to gradually uncover it. By analyzing millions of biomedical articles published over 30 y, we find that biomedical scientists pursue conservative research strategies exploring the local neighborhood of central, important molecules. Although such strategies probably serve scientific careers, we show that they slow scientific advance, especially in mature fields, where more risk and less redundant experimentation would accelerate discovery of the network. Lastly, we also consider institutional arrangements that could help science pursue these more efficient strategies.

  4. Choosing experiments to accelerate collective discovery

    DOE PAGES

    Rzhetsky, Andrey; Foster, Jacob G.; Foster, Ian T.; Evans, James A.

    2015-11-24

    Scientists perform a tiny subset of all possible experiments. What characterizes the experiments they choose? What are the consequences of those choices for the pace of scientific discovery? We model scientific knowledge as a network and science as a sequence of experiments designed to gradually uncover it. By analyzing millions of biomedical articles published over 30 y, we find that biomedical scientists pursue conservative research strategies exploring the local neighborhood of central, important molecules. Although such strategies probably serve scientific careers, we show that they slow scientific advance, especially in mature fields, where more risk and less redundant experimentation would accelerate discovery of the network. Lastly, we also consider institutional arrangements that could help science pursue these more efficient strategies.

  5. Accelerating Scientific Discovery Through Computation and Visualization

    PubMed Central

    Sims, James S.; Hagedorn, John G.; Ketcham, Peter M.; Satterfield, Steven G.; Griffin, Terence J.; George, William L.; Fowler, Howland A.; am Ende, Barbara A.; Hung, Howard K.; Bohn, Robert B.; Koontz, John E.; Martys, Nicos S.; Bouldin, Charles E.; Warren, James A.; Feder, David L.; Clark, Charles W.; Filla, B. James; Devaney, Judith E.

    2000-01-01

    The rate of scientific discovery can be accelerated through computation and visualization. This acceleration results from the synergy of expertise, computing tools, and hardware for enabling high-performance computation, information science, and visualization that is provided by a team of computation and visualization scientists collaborating in a peer-to-peer effort with the research scientists. In the context of this discussion, high performance refers to capabilities beyond the current state of the art in desktop computing. To be effective in this arena, a team comprising a critical mass of talent, parallel computing techniques, visualization algorithms, advanced visualization hardware, and a recurring investment is required to stay beyond the desktop capabilities. This article describes, through examples, how the Scientific Applications and Visualization Group (SAVG) at NIST has utilized high performance parallel computing and visualization to accelerate (1) condensate modeling, (2) fluid flow in porous materials and in other complex geometries, (3) flows in suspensions, (4) x-ray absorption, (5) dielectric breakdown modeling, and (6) dendritic growth in alloys. PMID:27551642

  6. Accelerators for Discovery Science and Security applications

    NASA Astrophysics Data System (ADS)

    Todd, A. M. M.; Bluem, H. P.; Jarvis, J. D.; Park, J. H.; Rathke, J. W.; Schultheiss, T. J.

    2015-05-01

    Several Advanced Energy Systems (AES) accelerator projects that span applications in Discovery Science and Security are described. The design and performance of the IR and THz free electron laser (FEL) at the Fritz-Haber-Institut der Max-Planck-Gesellschaft in Berlin, which is now an operating user facility for physical chemistry research in molecular and cluster spectroscopy as well as surface science, are highlighted. The device was designed to meet challenging specifications, including a final energy adjustable in the range of 15-50 MeV, low longitudinal emittance (<50 keV-psec) and transverse emittance (<20 π mm-mrad), at more than 200 pC bunch charge with a micropulse repetition rate of 1 GHz and a macropulse length of up to 15 μs. Secondly, we will describe an ongoing effort to develop an ultrafast electron diffraction (UED) source that is scheduled for completion in 2015 with prototype testing taking place at the Brookhaven National Laboratory (BNL) Accelerator Test Facility (ATF). This tabletop X-band system will find application in time-resolved chemical imaging and as a resource for drug-cell interaction analysis. A third active area at AES is accelerators for security applications, where we will cover some top-level aspects of THz and X-ray systems that are under development and in testing for stand-off and portal detection.

  7. Metagenomics and novel gene discovery

    PubMed Central

    Culligan, Eamonn P; Sleator, Roy D; Marchesi, Julian R; Hill, Colin

    2014-01-01

    Metagenomics provides a means of assessing the total genetic pool of all the microbes in a particular environment, in a culture-independent manner. It has revealed unprecedented diversity in microbial community composition, which is further reflected in the encoded functional diversity of the genomes, a large proportion of which consists of novel genes. Herein, we review both sequence-based and functional metagenomic methods to uncover novel genes and outline some of the associated problems of each type of approach, as well as potential solutions. Furthermore, we discuss the potential for metagenomic biotherapeutic discovery, with a particular focus on the human gut microbiome and finally, we outline how the discovery of novel genes may be used to create bioengineered probiotics. PMID:24317337

  8. Accelerating Gene Discovery by Phenotyping Whole-Genome Sequenced Multi-mutation Strains and Using the Sequence Kernel Association Test (SKAT).

    PubMed

    Timbers, Tiffany A; Garland, Stephanie J; Mohan, Swetha; Flibotte, Stephane; Edgley, Mark; Muncaster, Quintin; Au, Vinci; Li-Leger, Erica; Rosell, Federico I; Cai, Jerry; Rademakers, Suzanne; Jansen, Gert; Moerman, Donald G; Leroux, Michel R

    2016-08-01

    Forward genetic screens represent powerful, unbiased approaches to uncover novel components in any biological process. Such screens suffer from a major bottleneck, however, namely the cloning of corresponding genes causing the phenotypic variation. Reverse genetic screens have been employed as a way to circumvent this issue, but can often be limited in scope. Here we demonstrate an innovative approach to gene discovery. Using C. elegans as a model system, we used a whole-genome sequenced multi-mutation library, from the Million Mutation Project, together with the Sequence Kernel Association Test (SKAT), to rapidly screen for and identify genes associated with a phenotype of interest, namely defects in dye-filling of ciliated sensory neurons. Such anomalies in dye-filling are often associated with the disruption of cilia, organelles which in humans are implicated in sensory physiology (including vision, smell and hearing), development and disease. Beyond identifying several well characterised dye-filling genes, our approach uncovered three genes not previously linked to ciliated sensory neuron development or function. From these putative novel dye-filling genes, we confirmed the involvement of BGNT-1.1 in ciliated sensory neuron function and morphogenesis. BGNT-1.1 functions at the trans-Golgi network of sheath cells (glia) to influence dye-filling and cilium length, in a cell non-autonomous manner. Notably, BGNT-1.1 is the orthologue of human B3GNT1/B4GAT1, a glycosyltransferase associated with Walker-Warburg syndrome (WWS). WWS is a multigenic disorder characterised by muscular dystrophy as well as brain and eye anomalies. Together, our work unveils an effective and innovative approach to gene discovery, and provides the first evidence that B3GNT1-associated Walker-Warburg syndrome may be considered a ciliopathy. PMID:27508411

  9. Accelerating Gene Discovery by Phenotyping Whole-Genome Sequenced Multi-mutation Strains and Using the Sequence Kernel Association Test (SKAT)

    PubMed Central

    Garland, Stephanie J.; Mohan, Swetha; Flibotte, Stephane; Muncaster, Quintin; Cai, Jerry; Rademakers, Suzanne; Moerman, Donald G.; Leroux, Michel R.

    2016-01-01

    Forward genetic screens represent powerful, unbiased approaches to uncover novel components in any biological process. Such screens suffer from a major bottleneck, however, namely the cloning of corresponding genes causing the phenotypic variation. Reverse genetic screens have been employed as a way to circumvent this issue, but can often be limited in scope. Here we demonstrate an innovative approach to gene discovery. Using C. elegans as a model system, we used a whole-genome sequenced multi-mutation library, from the Million Mutation Project, together with the Sequence Kernel Association Test (SKAT), to rapidly screen for and identify genes associated with a phenotype of interest, namely defects in dye-filling of ciliated sensory neurons. Such anomalies in dye-filling are often associated with the disruption of cilia, organelles which in humans are implicated in sensory physiology (including vision, smell and hearing), development and disease. Beyond identifying several well characterised dye-filling genes, our approach uncovered three genes not previously linked to ciliated sensory neuron development or function. From these putative novel dye-filling genes, we confirmed the involvement of BGNT-1.1 in ciliated sensory neuron function and morphogenesis. BGNT-1.1 functions at the trans-Golgi network of sheath cells (glia) to influence dye-filling and cilium length, in a cell non-autonomous manner. Notably, BGNT-1.1 is the orthologue of human B3GNT1/B4GAT1, a glycosyltransferase associated with Walker-Warburg syndrome (WWS). WWS is a multigenic disorder characterised by muscular dystrophy as well as brain and eye anomalies. Together, our work unveils an effective and innovative approach to gene discovery, and provides the first evidence that B3GNT1-associated Walker-Warburg syndrome may be considered a ciliopathy. PMID:27508411

  10. Accelerating scientific discovery : 2007 annual report.

    SciTech Connect

    Beckman, P.; Dave, P.; Drugan, C.

    2008-11-14

    As a gateway for scientific discovery, the Argonne Leadership Computing Facility (ALCF) works hand in hand with the world's best computational scientists to advance research in a diverse span of scientific domains, ranging from chemistry, applied mathematics, and materials science to engineering physics and life sciences. Sponsored by the U.S. Department of Energy's (DOE) Office of Science, researchers are using the IBM Blue Gene/L supercomputer at the ALCF to study and explore key scientific problems that underlie important challenges facing our society. For instance, a research team at the University of California-San Diego/SDSC is studying the molecular basis of Parkinson's disease. The researchers plan to use the knowledge they gain to discover new drugs to treat the disease and to identify risk factors for other diseases that are equally prevalent. Likewise, scientists from Pratt & Whitney are using the Blue Gene to understand the complex processes within aircraft engines. Expanding our understanding of jet engine combustors is the secret to improved fuel efficiency and reduced emissions. Lessons learned from the scientific simulations of jet engine combustors have already led Pratt & Whitney to newer designs with unprecedented reductions in emissions, noise, and cost of ownership. ALCF staff members provide in-depth expertise and assistance to those using the Blue Gene/L and optimizing user applications. Both the Catalyst and Applications Performance Engineering and Data Analytics (APEDA) teams support the users' projects. In addition to working with scientists running experiments on the Blue Gene/L, we have become a nexus for the broader global community. In partnership with the Mathematics and Computer Science Division at Argonne National Laboratory, we have created an environment where the world's most challenging computational science problems can be addressed. Our expertise in high-end scientific computing enables us to provide guidance for applications

  11. Choosing experiments to accelerate collective discovery.

    PubMed

    Rzhetsky, Andrey; Foster, Jacob G; Foster, Ian T; Evans, James A

    2015-11-24

    A scientist's choice of research problem affects his or her personal career trajectory. Scientists' combined choices affect the direction and efficiency of scientific discovery as a whole. In this paper, we infer preferences that shape problem selection from patterns of published findings and then quantify their efficiency. We represent research problems as links between scientific entities in a knowledge network. We then build a generative model of discovery informed by qualitative research on scientific problem selection. We map salient features from this literature to key network properties: an entity's importance corresponds to its degree centrality, and a problem's difficulty corresponds to the network distance it spans. Drawing on millions of papers and patents published over 30 years, we use this model to infer the typical research strategy used to explore chemical relationships in biomedicine. This strategy generates conservative research choices focused on building up knowledge around important molecules. These choices become more conservative over time. The observed strategy is efficient for initial exploration of the network and supports scientific careers that require steady output, but is inefficient for science as a whole. Through supercomputer experiments on a sample of the network, we study thousands of alternatives and identify strategies much more efficient at exploring mature knowledge networks. We find that increased risk-taking and the publication of experimental failures would substantially improve the speed of discovery. We consider institutional shifts in grant making, evaluation, and publication that would help realize these efficiencies.
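
    The two network properties named in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the toy graph, the node names, and the helper functions `degree_centrality` and `network_distance` are all hypothetical, chosen only to show how an entity's importance (degree centrality) and a problem's difficulty (network distance between two entities) might be computed on a small knowledge network.

```python
from collections import deque

# Hypothetical toy knowledge network (an assumption for illustration only):
# nodes are scientific entities, edges are already-published relationships.
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("D", "E"), ("E", "F")]


def build_graph(edges):
    """Build an undirected adjacency map from a list of node pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def degree_centrality(graph, node):
    """Importance proxy: fraction of the other nodes this node is linked to."""
    return len(graph[node]) / (len(graph) - 1)


def network_distance(graph, source, target):
    """Difficulty proxy: shortest-path length via breadth-first search.

    Returns None when the two entities are not connected.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph[node]:
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


graph = build_graph(EDGES)
importance_of_a = degree_centrality(graph, "A")        # well-connected hub
difficulty_b_to_f = network_distance(graph, "B", "F")  # distant pair of entities
```

    In this toy setting, a "conservative" research choice would link a high-centrality node such as A to something nearby, while a riskier choice would test a relationship between distant nodes such as B and F.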

  12. Choosing experiments to accelerate collective discovery

    PubMed Central

    Rzhetsky, Andrey; Foster, Jacob G.; Foster, Ian T.

    2015-01-01

    A scientist’s choice of research problem affects his or her personal career trajectory. Scientists’ combined choices affect the direction and efficiency of scientific discovery as a whole. In this paper, we infer preferences that shape problem selection from patterns of published findings and then quantify their efficiency. We represent research problems as links between scientific entities in a knowledge network. We then build a generative model of discovery informed by qualitative research on scientific problem selection. We map salient features from this literature to key network properties: an entity’s importance corresponds to its degree centrality, and a problem’s difficulty corresponds to the network distance it spans. Drawing on millions of papers and patents published over 30 years, we use this model to infer the typical research strategy used to explore chemical relationships in biomedicine. This strategy generates conservative research choices focused on building up knowledge around important molecules. These choices become more conservative over time. The observed strategy is efficient for initial exploration of the network and supports scientific careers that require steady output, but is inefficient for science as a whole. Through supercomputer experiments on a sample of the network, we study thousands of alternatives and identify strategies much more efficient at exploring mature knowledge networks. We find that increased risk-taking and the publication of experimental failures would substantially improve the speed of discovery. We consider institutional shifts in grant making, evaluation, and publication that would help realize these efficiencies. PMID:26554009

  13. Independent Gene Discovery and Testing

    ERIC Educational Resources Information Center

    Palsule, Vrushalee; Coric, Dijana; Delancy, Russell; Dunham, Heather; Melancon, Caleb; Thompson, Dennis; Toms, Jamie; White, Ashley; Shultz, Jeffry

    2010-01-01

    A clear understanding of basic gene structure is critical when teaching molecular genetics, the central dogma and the biological sciences. We sought to create a gene-based teaching project to improve students' understanding of gene structure and to integrate this into a research project that can be implemented by instructors at the secondary level…

  14. Bioinformatics approaches to cancer gene discovery.

    PubMed

    Narayanan, Ramaswamy

    2007-01-01

    The Cancer Gene Anatomy Project (CGAP) database of the National Cancer Institute has thousands of known and novel expressed sequence tags (ESTs). These ESTs, derived from diverse normal and tumor cDNA libraries, offer an attractive starting point for cancer gene discovery. Data-mining the CGAP database led to the identification of ESTs that were predicted to be specific to select solid tumors. Two genes from these efforts were taken to proof of concept for diagnostic and therapeutics indications of cancer. Microarray technology was used in conjunction with bioinformatics to understand the mechanism of one of the targets discovered. These efforts provide an example of gene discovery by using bioinformatics approaches. The strengths and weaknesses of this approach are discussed in this review.

  15. Human brain evolution: From gene discovery to phenotype discovery

    PubMed Central

    Preuss, Todd M.

    2012-01-01

    The rise of comparative genomics and related technologies has added important new dimensions to the study of human evolution. Our knowledge of the genes that underwent expression changes or were targets of positive selection in human evolution is rapidly increasing, as is our knowledge of gene duplications, translocations, and deletions. It is now clear that the genetic differences between humans and chimpanzees are far more extensive than previously thought; their genomes are not 98% or 99% identical. Despite the rapid growth in our understanding of the evolution of the human genome, our understanding of the relationship between genetic changes and phenotypic changes is tenuous. This is true even for the most intensively studied gene, FOXP2, which underwent positive selection in the human terminal lineage and is thought to have played an important role in the evolution of human speech and language. In part, the difficulty of connecting genes to phenotypes reflects our generally poor knowledge of human phenotypic specializations, as well as the difficulty of interpreting the consequences of genetic changes in species that are not amenable to invasive research. On the positive side, investigations of FOXP2, along with genomewide surveys of gene-expression changes and selection-driven sequence changes, offer the opportunity for “phenotype discovery,” providing clues to human phenotypic specializations that were previously unsuspected. What is more, at least some of the specializations that have been proposed are amenable to testing with noninvasive experimental techniques appropriate for the study of humans and apes. PMID:22723367

  16. SpS5: Accelerating the Rate of Astronomical Discovery

    NASA Astrophysics Data System (ADS)

    Norris, Ray P.

    2010-11-01

    Special Session 5 on Accelerating the Rate of Astronomical Discovery addressed a range of potential limits to progress: paradigmatic, technological, organizational, and political. It examined each issue both from modern and historical perspectives, and drew lessons to guide future progress. A number of issues were identified which may regulate the flow of discoveries, such as the balance between large strongly-focussed projects and instruments, designed to answer the most fundamental questions confronting us, and the need to maintain a creative environment with room for unorthodox thinkers and bold, high risk, projects. Also important is the need to maintain historical and cultural perspectives, and the need to engage the minds of the most brilliant young people on the planet, regardless of their background, ethnicity, gender, or geography.

  17. Cancer gene discovery using digital differential display.

    PubMed

    Scheurle, D; DeYoung, M P; Binninger, D M; Page, H; Jahanzeb, M; Narayanan, R

    2000-08-01

    The Cancer Gene Anatomy Project database of the National Cancer Institute has thousands of expressed sequences, both known and novel, in the form of expressed sequence tags (ESTs). These ESTs, derived from diverse normal and tumor cDNA libraries, offer an attractive starting point for cancer gene discovery. Using a data-mining tool called Digital Differential Display (DDD) from the Cancer Gene Anatomy Project database, ESTs from six different solid tumor types (breast, colon, lung, ovary, pancreas, and prostate) were analyzed for differential expression. An electronic expression profile and chromosomal map position of these hits were generated from the Unigene database. The hits were categorized into major classes of genes including ribosomal proteins, enzymes, cell surface molecules, secretory proteins, adhesion molecules, and immunoglobulins and were found to be differentially expressed in these tumor-derived libraries. Genes known to be up-regulated in prostate, breast, and pancreatic carcinomas were discovered by DDD, demonstrating the utility of this technique. Two hundred known genes and 500 novel sequences were discovered to be differentially expressed in these select tumor-derived libraries. Test genes were validated for expression specificity by reverse transcription-PCR, providing a proof of concept for gene discovery by DDD. A comprehensive database of hits can be accessed at http://www.fau.edu/cmbb/publications/cancergenes.htm. This solid tumor DDD database should facilitate target identification for cancer diagnostics and therapeutics.

  18. From the Discovery of Radioactivity to the First Accelerator Experiments

    NASA Astrophysics Data System (ADS)

    Walter, Michael

    The chapter reviews the historical phases of cosmic ray research from the very beginning around 1900 until the 1940s, when the first particle accelerators replaced cosmic particles as the source of elementary particle interactions. In contrast to the discovery of X-rays or the ionising α-, β- and γ-rays, it was an arduous path to the definite acceptance of the new radiation. The starting point was the explanation that air becomes conductive through the ionising radiation of radioactive elements in the surroundings. In the following years the penetration power of the radiation was studied, with the result that there seemed to be a component harder than the known γ-rays. In 1912 Victor F. Hess performed the key experiment with a hydrogen balloon. With three detectors he measured an increase of ionisation up to altitudes of 5300 m and discovered the extraterrestrial penetrating radiation. The next phase is characterised by W. Kolhörster's confirmation in 1914, doubts by R.A. Millikan and others, as well as the spectacular re-discovery of cosmic rays by Millikan in 1926. With the invention of new detectors such as the cloud chamber and the Geiger-Müller counter, and of the coincidence method, the properties of cosmic rays could be investigated. One of the striking results was the discovery that cosmic rays are of corpuscular nature. The broad research activities starting at the end of the 1920s were the beginning of a scientific success story that none of the early protagonists might have imagined. In 1932 C.D. Anderson discovered the antiparticle of the electron. It was the birth of elementary particle physics. Four years later the muon was discovered, which was for many years wrongly assumed to be the carrier of the short range nuclear force predicted by H. Yukawa. One of the last highlights before the particle accelerators took over this field of fundamental research was the discovery of the Yukawa particle. In photographic emulsions exposed to cosmic particles the pion was found in 1947. This

  19. Cancer gene discovery: exploiting insertional mutagenesis

    PubMed Central

    Ranzani, Marco; Annunziato, Stefano; Adams, David J.; Montini, Eugenio

    2013-01-01

    Insertional mutagenesis has been utilized as a functional forward genetics screen for the identification of novel genes involved in the pathogenesis of human cancers. Different insertional mutagens have been successfully used to reveal new cancer genes. For example, retroviruses (RVs) are integrating viruses with the capacity to induce the deregulation of genes in the neighborhood of the insertion site. RVs have been employed for more than 30 years to identify cancer genes in the hematopoietic system and mammary gland. Similarly, another tool that has revolutionized cancer gene discovery is the cut-and-paste transposons. These DNA elements have been engineered to contain strong promoters and stop cassettes that may function to perturb gene expression upon integration proximal to genes. In addition, complex mouse models characterized by tissue-restricted activity of transposons have been developed to identify oncogenes and tumor suppressor genes that control the development of a wide range of solid tumor types, extending beyond those tissues accessible using RV-based approaches. Most recently, lentiviral vectors (LVs) have appeared on the scene for use in cancer gene screens. LVs are replication defective integrating vectors that have the advantage of being able to infect non-dividing cells, in a wide range of cell types and tissues. In this review, we describe the various insertional mutagens focusing on their advantages/limitations and we discuss the new and promising tools that will improve the insertional mutagenesis screens of the future. PMID:23928056

  20. Pathway-driven discovery of epilepsy genes

    PubMed Central

    Noebels, Jeffrey

    2016-01-01

    Epilepsy genes deliver critical insights into the molecular control of brain synchronization and are revolutionizing our understanding and treatment of the disease. The epilepsy-associated genome is rapidly expanding, and two powerful complementary approaches, isolation of de novo exome variants in patients and targeted mutagenesis in model systems, account for the steep increase. In sheer number, the tally of genes linked to seizures will likely match that of cancer and exceed it in biological diversity. The proteins act within most intracellular compartments and span the molecular determinants of firing and wiring in the developing brain. Every facet of neurotransmission, from dendritic spine to exocytotic machinery, is in play, and defects of synaptic inhibition are over-represented. The contributions of somatic mutations and noncoding microRNAs are also being explored. The functional spectrum of established epilepsy genes and the arrival of rapid, precise technologies for genome editing now provide a robust scaffold to prioritize hypothesis-driven discovery and further populate this genetic proto-map. Although each gene identified offers translational potential to stratify patient care, the complexity of individual variation and covert actions of genetic modifiers may confound single-gene solutions for the clinical disorder. In vivo genetic deconstruction of epileptic networks, ex vivo validation of variant profiles in patient-derived induced pluripotent stem cells, in silico variant modeling and modifier gene discovery, now in their earliest stages, will help clarify individual patterns. Because seizures stand at the crossroads of all neuronal synchronization disorders in the developing and aging brain, the neurobiological analysis of epilepsy-associated genes provides an extraordinary gateway to new insights into higher cortical function. PMID:25710836

  1. Pathway-driven discovery of epilepsy genes.

    PubMed

    Noebels, Jeffrey

    2015-03-01

    Epilepsy genes deliver critical insights into the molecular control of brain synchronization and are revolutionizing our understanding and treatment of the disease. The epilepsy-associated genome is rapidly expanding, and two powerful complementary approaches, isolation of de novo exome variants in patients and targeted mutagenesis in model systems, account for the steep increase. In sheer number, the tally of genes linked to seizures will likely match that of cancer and exceed it in biological diversity. The proteins act within most intracellular compartments and span the molecular determinants of firing and wiring in the developing brain. Every facet of neurotransmission, from dendritic spine to exocytotic machinery, is in play, and defects of synaptic inhibition are over-represented. The contributions of somatic mutations and noncoding microRNAs are also being explored. The functional spectrum of established epilepsy genes and the arrival of rapid, precise technologies for genome editing now provide a robust scaffold to prioritize hypothesis-driven discovery and further populate this genetic proto-map. Although each gene identified offers translational potential to stratify patient care, the complexity of individual variation and covert actions of genetic modifiers may confound single-gene solutions for the clinical disorder. In vivo genetic deconstruction of epileptic networks, ex vivo validation of variant profiles in patient-derived induced pluripotent stem cells, in silico variant modeling and modifier gene discovery, now in their earliest stages, will help clarify individual patterns. Because seizures stand at the crossroads of all neuronal synchronization disorders in the developing and aging brain, the neurobiological analysis of epilepsy-associated genes provides an extraordinary gateway to new insights into higher cortical function. PMID:25710836

  2. Pathway-driven discovery of epilepsy genes.

    PubMed

    Noebels, Jeffrey

    2015-03-01

    Epilepsy genes deliver critical insights into the molecular control of brain synchronization and are revolutionizing our understanding and treatment of the disease. The epilepsy-associated genome is rapidly expanding, and two powerful complementary approaches, isolation of de novo exome variants in patients and targeted mutagenesis in model systems, account for the steep increase. In sheer number, the tally of genes linked to seizures will likely match that of cancer and exceed it in biological diversity. The proteins act within most intracellular compartments and span the molecular determinants of firing and wiring in the developing brain. Every facet of neurotransmission, from dendritic spine to exocytotic machinery, is in play, and defects of synaptic inhibition are over-represented. The contributions of somatic mutations and noncoding microRNAs are also being explored. The functional spectrum of established epilepsy genes and the arrival of rapid, precise technologies for genome editing now provide a robust scaffold to prioritize hypothesis-driven discovery and further populate this genetic proto-map. Although each gene identified offers translational potential to stratify patient care, the complexity of individual variation and covert actions of genetic modifiers may confound single-gene solutions for the clinical disorder. In vivo genetic deconstruction of epileptic networks, ex vivo validation of variant profiles in patient-derived induced pluripotent stem cells, in silico variant modeling and modifier gene discovery, now in their earliest stages, will help clarify individual patterns. Because seizures stand at the crossroads of all neuronal synchronization disorders in the developing and aging brain, the neurobiological analysis of epilepsy-associated genes provides an extraordinary gateway to new insights into higher cortical function.

  3. Accelerating drug discovery via organs-on-chips

    PubMed Central

    Chan, Chung Yu; Huang, Po-Hsun; Guo, Feng; Ding, Xiaoyun; Kapur, Vivek; Mai, John D.

    2014-01-01

    Considerable advances have been made in the development of micro-physiological systems that seek to faithfully replicate the complexity and functionality of animal and human physiology in research laboratories. Sometimes referred to as “organs-on-chips”, these systems provide key insights into physiological or pathological processes associated with health maintenance and disease control, and serve as powerful platforms for new drug development and toxicity screening. In this Focus article, we review the state-of-the-art designs and examples for developing multiple “organs-on-chips”, and discuss the potential of this emerging technology to enhance our understanding of human physiology, and to transform and accelerate the drug discovery and pre-clinical testing process. This Focus article highlights some of the recent technological advances in this field, along with the challenges that must be addressed for these technologies to fully realize their potential. PMID:24193241

  4. Obesity and diabetes gene discovery approaches.

    PubMed

    Walder, K; Segal, D; Jowett, J; Blangero, J; Collier, G R

    2003-01-01

    New treatments are currently required for the common metabolic diseases obesity and type 2 diabetes. The identification of physiological and biochemical factors that underlie the metabolic disturbances observed in obesity and type 2 diabetes is a key step in developing better therapeutic outcomes. The discovery of new genes and pathways involved in the pathogenesis of these diseases is critical to this process; however, identifying genes that contribute to the risk of developing these diseases is a significant challenge because obesity and type 2 diabetes are complex diseases with many genetic and environmental causes. A number of diverse approaches have been used to discover and validate potential new targets for obesity and diabetes. To date, DNA-based approaches using candidate gene and genome-wide linkage analysis have had limited success in identifying genomic regions or genes involved in the development of these diseases. Recent advances in the ability to evaluate linkage analysis data from large family pedigrees using variance-components-based linkage analysis show great promise in robustly identifying genomic regions associated with the development of obesity and diabetes. RNA-based technologies such as cDNA microarrays have identified many genes differentially expressed in tissues of healthy and diseased subjects. Using a combined approach, we are endeavouring to focus attention on differentially expressed genes located in chromosomal regions previously linked with obesity and/or diabetes. Using this strategy, we have identified Beacon as a potential new target for obesity and diabetes.

  5. Biomarker Gene Signature Discovery Integrating Network Knowledge

    PubMed Central

    Cun, Yupeng; Fröhlich, Holger

    2012-01-01

    Discovery of prognostic and diagnostic biomarker gene signatures for diseases, such as cancer, is seen as a major step towards better personalized medicine. During the last decade, various methods, mainly from the machine learning and statistics domains, have been proposed for that purpose. However, one important obstacle to making gene signatures a standard tool in clinical diagnosis is the typically low reproducibility of these signatures, combined with the difficulty of achieving a clear biological interpretation. For that reason, in recent years there has been growing interest in approaches that try to integrate information from molecular interaction networks. Here we review the current state of research in this field by giving an overview of the approaches proposed so far. PMID:24832044

  6. Accelerated discovery of materials for solar fuel cells at JCAP

    NASA Astrophysics Data System (ADS)

    Mitrovic, Slobodan; Cornell, Earl; Gregoire, John; Haber, Joel; Kan, Kevin; Lin, Sean; Liu, Xiaonao; Marcin, Martin; Soedarmadji, Edward; Suram, Santosh; Xiang, Chengxiang; Jin, Jian

    2013-03-01

    The High-Throughput Experimentation (HTE) group at the Joint Center for Artificial Photosynthesis has a formidable mission: to provide accelerated discovery of new photon absorbers and heterogeneous (photo)catalysts for solar fuel cells at a rate far beyond anything attempted in materials science to date. The HTE pipeline includes material synthesis, screening and characterization. Within the first year of operations, our fabrication capabilities have risen to 100,000 samples per day using combinatorial inkjet printing. Such a high rate of sample production sets daunting requirements for screening methods. We are developing and testing methods for fast bandgap measurements, using colorimetry and UV-vis spectroscopy. Material thickness and roughness are determined by confocal chromatic spectroscopy. Catalytic activity is screened through a massively parallel bubble screen and a fast scanning droplet (photo)electrochemical cell. Concurrently, on the characterization side of the pipeline, we are developing protocols for high-throughput determination of phase and structure (XRD), surface composition and chemistry (XPS), surface area measurement, etc. This work was performed at the Joint Center for Artificial Photosynthesis, a DOE Energy Innovation Hub, supported through the Office of Science of the U.S. Department of Energy under Award No. DE-SC0004993.

  7. Antibiotic resistance gene discovery in food-producing animals.

    PubMed

    Allen, Heather K

    2014-06-01

    Numerous environmental reservoirs contribute to the widespread antibiotic resistance problem in human pathogens. One environmental reservoir of particular importance is the intestinal bacteria of food-producing animals. In this review I examine recent discoveries of antibiotic resistance genes in agricultural animals. Two types of antibiotic resistance gene discoveries will be discussed: the use of classic microbiological and molecular techniques, such as culturing and PCR, to identify known genes not previously reported in animals; and the application of high-throughput technologies, such as metagenomics, to identify novel genes and gene transfer mechanisms. These discoveries confirm that antibiotics should be limited to prudent uses.

  8. Standardized Plant Disease Evaluations will Enhance Resistance Gene Discovery

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Gene discovery and marker development using DNA based tools require plant populations with well-documented phenotypes. Related crops such as apples and pears may share a number of genes, for example resistance to common diseases, and data mining in one crop may reveal genes for the other. However, u...

  9. Antibiotic resistance gene discovery in food-producing animals

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Numerous environmental reservoirs contribute to the widespread antibiotic resistance problem in human pathogens. One environmental reservoir of particular importance is the intestinal bacteria of food-producing animals. In this review I examine recent discoveries of antibiotic resistance genes in ...

  10. Cloud computing approaches to accelerate drug discovery value chain.

    PubMed

    Garg, Vibhav; Arora, Suchir; Gupta, Chitra

    2011-12-01

    Continued advancements in technology have helped high throughput screening (HTS) evolve from a linear to a parallel approach by performing system level screening. Advanced experimental methods used for HTS at various steps of drug discovery (i.e. target identification, target validation, lead identification and lead validation) can generate data of the order of terabytes. As a consequence, there is a pressing need to store, manage, mine and analyze this data to identify informational tags. This need in turn challenges computer scientists to offer matching hardware and software infrastructure while managing the varying degree of desired computational power. Therefore, the potential of "On-Demand Hardware" and "Software as a Service (SAAS)" delivery mechanisms cannot be denied. This on-demand computing, largely referred to as Cloud Computing, is now transforming drug discovery research. Also, integration of Cloud computing with parallel computing is certainly expanding its footprint in the life sciences community. Its speed, efficiency and cost effectiveness have made cloud computing a 'good to have tool' for researchers, providing them significant flexibility and allowing them to focus on the 'what' of science and not the 'how'. Once it reaches maturity, Discovery-Cloud would be best suited to manage drug discovery and clinical development data generated using advanced HTS techniques, hence supporting the vision of personalized medicine.

  11. Integrating virtual screening and combinatorial chemistry for accelerated drug discovery.

    PubMed

    López-Vallejo, Fabian; Caulfield, Thomas; Martínez-Mayorga, Karina; Giulianotti, Marc A; Nefzi, Adel; Houghten, Richard A; Medina-Franco, Jose L

    2011-07-01

    Virtual screening is increasingly being used in drug discovery programs with a growing number of successful applications. Experimental methodologies developed to speed up the drug discovery processes include high-throughput screening and combinatorial chemistry. The complementarities between computational and experimental screenings have been recognized and reviewed in the literature. Computational methods have also been used in the combinatorial chemistry field, in particular in library design. However, the integration of computational and combinatorial chemistry screenings has been attempted only recently. Combinatorial libraries (experimental or virtual) represent a notable source of chemically related compounds. Advances in combinatorial chemistry and deconvolution strategies have enabled the rapid exploration of novel and dense regions in the chemical space. The present review is focused on the integration of virtual and experimental screening of combinatorial libraries. Applications of virtual screening to discover novel anticancer agents and our ongoing efforts towards the integration of virtual screening and combinatorial chemistry are also discussed.

  12. Discovery of Tumor Suppressor Gene Function.

    ERIC Educational Resources Information Center

    Oppenheimer, Steven B.

    1995-01-01

    This is an update of a 1991 review on tumor suppressor genes written at a time when understanding of how the genes work was limited. A recent major breakthrough in the understanding of the function of tumor suppressor genes is discussed. (LZ)

  13. Accelerating Scientific Discovery Through Computation and Visualization II

    PubMed Central

    Sims, James S.; George, William L.; Satterfield, Steven G.; Hung, Howard K.; Hagedorn, John G.; Ketcham, Peter M.; Griffin, Terence J.; Hagstrom, Stanley A.; Franiatte, Julien C.; Bryant, Garnett W.; Jaskólski, W.; Martys, Nicos S.; Bouldin, Charles E.; Simmons, Vernon; Nicolas, Oliver P.; Warren, James A.; am Ende, Barbara A.; Koontz, John E.; Filla, B. James; Pourprix, Vital G.; Copley, Stefanie R.; Bohn, Robert B.; Peskin, Adele P.; Parker, Yolanda M.; Devaney, Judith E.

    2002-01-01

    This is the second in a series of articles describing a wide variety of projects at NIST that synergistically combine physical science and information science. It describes, through examples, how the Scientific Applications and Visualization Group (SAVG) at NIST has utilized high performance parallel computing, visualization, and machine learning to accelerate research. The examples include scientific collaborations in the following areas: (1) High Precision Energies for few electron atomic systems, (2) Flows of suspensions, (3) X-ray absorption, (4) Molecular dynamics of fluids, (5) Nanostructures, (6) Dendritic growth in alloys, (7) Screen saver science, (8) genetic programming. PMID:27446728

  14. The development of algorithms for parallel knowledge discovery using graphics accelerators

    NASA Astrophysics Data System (ADS)

    Zieliński, Paweł; Mulawka, Jan

    2011-10-01

    The paper examines selected knowledge discovery algorithms. Different implementations have been verified on parallel platforms, including graphics accelerators using CUDA technology, multi-core microprocessors using OpenMP and many graphics accelerators. The results have been compared in terms of performance and scalability. Different types of data representation were also tested. The capabilities of both platforms are discussed using three classification algorithms: k-nearest neighbors, support vector machines and logistic regression.
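
Of the three classifiers named above, k-nearest neighbors is the simplest to sketch. The example below is a plain, single-threaded Python illustration, not the paper's CUDA or OpenMP implementation; the function name `knn_predict` and the toy data are hypothetical. The distance from a query to each training point is computed independently of the others, which is the property that makes the algorithm a natural fit for parallel hardware.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Each distance is independent of the others; a parallel version
    # would compute them concurrently instead of in this loop.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy data, invented for illustration only.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(points, labels, (0.2, 0.1)))
print(knn_predict(points, labels, (5.5, 5.2)))
```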

  15. A Discovery Lab for Studying Gene Regulation.

    ERIC Educational Resources Information Center

    Moss, Robert

    1997-01-01

    Presents a laboratory in which students are provided with cultures of three bacterial strains. Using the results, students will determine which of the strains corresponds to a mutant lacking a particular functional gene. (DDR)

  16. SNP marker discovery in koala TLR genes.

    PubMed

    Cui, Jian; Frankham, Greta J; Johnson, Rebecca N; Polkinghorne, Adam; Timms, Peter; O'Meally, Denis; Cheng, Yuanyuan; Belov, Katherine

    2015-01-01

    Toll-like receptors (TLRs) play a crucial role in the early defence against invading pathogens, yet our understanding of TLRs in marsupial immunity is limited. Here, we describe the characterisation of nine TLRs from a koala immune tissue transcriptome and one TLR from a draft sequence of the koala genome and the subsequent development of an assay to study genetic diversity in these genes. We surveyed genetic diversity in 20 koalas from New South Wales, Australia and showed that one gene, TLR10, is monomorphic, while the other nine TLR genes have between two and 12 alleles. Forty SNPs (16 non-synonymous) were identified across the ten TLR genes. These markers provide a springboard to future studies on innate immunity in the koala, a species under threat from two major infectious diseases.

  17. SNP marker discovery in koala TLR genes.

    PubMed

    Cui, Jian; Frankham, Greta J; Johnson, Rebecca N; Polkinghorne, Adam; Timms, Peter; O'Meally, Denis; Cheng, Yuanyuan; Belov, Katherine

    2015-01-01

    Toll-like receptors (TLRs) play a crucial role in the early defence against invading pathogens, yet our understanding of TLRs in marsupial immunity is limited. Here, we describe the characterisation of nine TLRs from a koala immune tissue transcriptome and one TLR from a draft sequence of the koala genome and the subsequent development of an assay to study genetic diversity in these genes. We surveyed genetic diversity in 20 koalas from New South Wales, Australia and showed that one gene, TLR10, is monomorphic, while the other nine TLR genes have between two and 12 alleles. Forty SNPs (16 non-synonymous) were identified across the ten TLR genes. These markers provide a springboard to future studies on innate immunity in the koala, a species under threat from two major infectious diseases. PMID:25799012

  18. Biomedical discovery acceleration, with applications to craniofacial development.

    PubMed

    Leach, Sonia M; Tipney, Hannah; Feng, Weiguo; Baumgartner, William A; Kasliwal, Priyanka; Schuyler, Ronald P; Williams, Trevor; Spritz, Richard A; Hunter, Lawrence

    2009-03-01

    The profusion of high-throughput instruments and the explosion of new results in the scientific literature, particularly in molecular biomedicine, is both a blessing and a curse to the bench researcher. Even knowledgeable and experienced scientists can benefit from computational tools that help navigate this vast and rapidly evolving terrain. In this paper, we describe a novel computational approach to this challenge, a knowledge-based system that combines reading, reasoning, and reporting methods to facilitate analysis of experimental data. Reading methods extract information from external resources, either by parsing structured data or using biomedical language processing to extract information from unstructured data, and track knowledge provenance. Reasoning methods enrich the knowledge that results from reading by, for example, noting two genes that are annotated to the same ontology term or database entry. Reasoning is also used to combine all sources into a knowledge network that represents the integration of all sorts of relationships between a pair of genes, and to calculate a combined reliability score. Reporting methods combine the knowledge network with a congruent network constructed from experimental data and visualize the combined network in a tool that facilitates the knowledge-based analysis of that data. An implementation of this approach, called the Hanalyzer, is demonstrated on a large-scale gene expression array dataset relevant to craniofacial development. The use of the tool was critical in the creation of hypotheses regarding the roles of four genes never previously characterized as involved in craniofacial development; each of these hypotheses was validated by further experimental work.

  19. Biomedical Discovery Acceleration, with Applications to Craniofacial Development

    PubMed Central

    Feng, Weiguo; Baumgartner, William A.; Kasliwal, Priyanka; Schuyler, Ronald P.; Williams, Trevor; Spritz, Richard A.; Hunter, Lawrence

    2009-01-01

    The profusion of high-throughput instruments and the explosion of new results in the scientific literature, particularly in molecular biomedicine, is both a blessing and a curse to the bench researcher. Even knowledgeable and experienced scientists can benefit from computational tools that help navigate this vast and rapidly evolving terrain. In this paper, we describe a novel computational approach to this challenge, a knowledge-based system that combines reading, reasoning, and reporting methods to facilitate analysis of experimental data. Reading methods extract information from external resources, either by parsing structured data or using biomedical language processing to extract information from unstructured data, and track knowledge provenance. Reasoning methods enrich the knowledge that results from reading by, for example, noting two genes that are annotated to the same ontology term or database entry. Reasoning is also used to combine all sources into a knowledge network that represents the integration of all sorts of relationships between a pair of genes, and to calculate a combined reliability score. Reporting methods combine the knowledge network with a congruent network constructed from experimental data and visualize the combined network in a tool that facilitates the knowledge-based analysis of that data. An implementation of this approach, called the Hanalyzer, is demonstrated on a large-scale gene expression array dataset relevant to craniofacial development. The use of the tool was critical in the creation of hypotheses regarding the roles of four genes never previously characterized as involved in craniofacial development; each of these hypotheses was validated by further experimental work. PMID:19325874

  20. Schizophrenia risk genes: Implications for future drug development and discovery.

    PubMed

    O'Connell, Garret; Lawrie, Stephen M; McIntosh, Andrew M; Hall, Jeremy

    2011-06-15

    Present-day development of improved treatments for schizophrenia is hindered by uncertain models of disease, inter-individual response variability in clinical trials and a paucity of sensitive measures of treatment effects. Findings from genetic research emphasize the potential for schizophrenia risk genes to help develop focused treatments, discover new drug targets and provide markers of clinical subtypes. Advances in genetic technologies also provide novel modes of drug discovery in schizophrenia such as transcriptomics, epigenetics and transgenic animal models. In this review, we discuss proven and proposed ways risk genes can be used to enhance the development and discovery of treatments for schizophrenia and highlight key studies in these approaches. PMID:21093417

  1. Crizotinib: from discovery to accelerated development to front-line treatment.

    PubMed

    Blackhall, F; Cappuzzo, F

    2016-09-01

    Non-small-cell lung cancer (NSCLC) is associated with a poor prognosis and low survival rates, providing a strong rationale for the development of new treatment options. The discovery of ALK gene rearrangements in a subset of NSCLC specimens and the identification and development of the first-in-class ALK inhibitor crizotinib provided a personalised treatment option for patients with advanced ALK-positive NSCLC. Crizotinib demonstrated rapid and durable responses in advanced ALK-positive NSCLC patients in phase I and II studies, leading to accelerated FDA approval. Subsequent evaluation in phase III studies showed that crizotinib improved progression-free survival compared with platinum-based doublet chemotherapy in previously untreated patients and compared with pemetrexed or docetaxel in previously treated patients. Crizotinib was shown to have an acceptable safety profile and also to improve quality of life and symptom scores. Overall, crizotinib has been shown to provide a valuable first- and second-line treatment option and is now the first-line standard of care for patients with advanced ALK-positive NSCLC. PMID:27573754

  2. GWATCH: a web platform for automated gene association discovery analysis

    PubMed Central

    2014-01-01

    Background: As genome-wide sequence analyses for complex human disease determinants are expanding, it is increasingly necessary to develop strategies to promote discovery and validation of potential disease-gene associations. Findings: Here we present a dynamic web-based platform – GWATCH – that automates and facilitates four steps in genetic epidemiological discovery: 1) Rapid gene association search and discovery analysis of large genome-wide datasets; 2) Expanded visual display of gene associations for genome-wide variants (SNPs, indels, CNVs), including Manhattan plots, 2D and 3D snapshots of any gene region, and a dynamic genome browser illustrating gene association chromosomal regions; 3) Real-time validation/replication of candidate or putative genes suggested from other sources, limiting Bonferroni genome-wide association study (GWAS) penalties; 4) Open data release and sharing by eliminating privacy constraints (The National Human Genome Research Institute (NHGRI) Institutional Review Board (IRB), informed consent, The Health Insurance Portability and Accountability Act (HIPAA) of 1996 etc.) on unabridged results, which allows for open access comparative and meta-analysis. Conclusions: GWATCH is suitable for both GWAS and whole genome sequence association datasets. We illustrate the utility of GWATCH with three large genome-wide association studies for HIV-AIDS resistance genes screened in large multicenter cohorts; however, association datasets from any study can be uploaded and analyzed by GWATCH. PMID:25374661
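
As a rough illustration of why validating a short list of candidate genes limits the Bonferroni penalty mentioned in step 3: the Bonferroni correction divides the significance level by the number of tests, so a small candidate set faces a far less stringent per-test threshold than a genome-wide scan. The sketch below is not GWATCH code; the function name and the test counts are hypothetical and chosen only for illustration.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under the Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Hypothetical test counts, for illustration only.
genome_wide = bonferroni_threshold(0.05, 1_000_000)  # scan of one million variants
candidate = bonferroni_threshold(0.05, 20)           # replication of 20 candidate variants

print(f"genome-wide threshold:   {genome_wide:.1e}")
print(f"candidate-set threshold: {candidate:.1e}")
```

With these assumed counts, a variant needs a p-value several orders of magnitude smaller to pass the genome-wide threshold than to pass the candidate-set threshold.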

  3. Standardized plant disease evaluations will enhance resistance gene discovery

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Gene discovery and marker development using DNA-based tools require plant populations with well documented phenotypes. If dissimilar phenotype evaluation methods or data scoring techniques are employed with different crops, or at different labs for the same crops, then data mining for genetic marker...

  4. Modern plant metabolomics: Advanced natural product gene discoveries, improved technologies, and future prospects

    SciTech Connect

    Sumner, Lloyd W.; Lei, Zhentian; Nikolau, Basil J.; Saito, Kazuki

    2014-10-24

    Plant metabolomics has matured and modern plant metabolomics has accelerated gene discoveries and the elucidation of a variety of plant natural product biosynthetic pathways. This study highlights specific examples of the discovery and characterization of novel genes and enzymes associated with the biosynthesis of natural products such as flavonoids, glucosinolates, terpenoids, and alkaloids. Additional examples of the integration of metabolomics with genome-based functional characterizations of plant natural products that are important to modern pharmaceutical technology are also reviewed. This article also provides a substantial review of recent technical advances in mass spectrometry imaging, nuclear magnetic resonance imaging, integrated LC-MS-SPE-NMR for metabolite identifications, and x-ray crystallography of microgram quantities for structural determinations. The review closes with a discussion on the future prospects of metabolomics related to crop species and herbal medicine.

  5. Modern plant metabolomics: advanced natural product gene discoveries, improved technologies, and future prospects.

    PubMed

    Sumner, Lloyd W; Lei, Zhentian; Nikolau, Basil J; Saito, Kazuki

    2015-02-01

    Plant metabolomics has matured and modern plant metabolomics has accelerated gene discoveries and the elucidation of a variety of plant natural product biosynthetic pathways. This review covers the approximate period of 2000 to 2014, and highlights specific examples of the discovery and characterization of novel genes and enzymes associated with the biosynthesis of natural products such as flavonoids, glucosinolates, terpenoids, and alkaloids. Additional examples of the integration of metabolomics with genome-based functional characterizations of plant natural products that are important to modern pharmaceutical technology are also reviewed. This article also provides a substantial review of recent technical advances in mass spectrometry imaging, nuclear magnetic resonance imaging, integrated LC-MS-SPE-NMR for metabolite identifications, and X-ray crystallography of microgram quantities for structural determinations. The review closes with a discussion on the future prospects of metabolomics related to crop species and herbal medicine.

  6. Modern plant metabolomics: Advanced natural product gene discoveries, improved technologies, and future prospects

    DOE PAGES

    Sumner, Lloyd W.; Lei, Zhentian; Nikolau, Basil J.; Saito, Kazuki

    2014-10-24

    Plant metabolomics has matured and modern plant metabolomics has accelerated gene discoveries and the elucidation of a variety of plant natural product biosynthetic pathways. This study highlights specific examples of the discovery and characterization of novel genes and enzymes associated with the biosynthesis of natural products such as flavonoids, glucosinolates, terpenoids, and alkaloids. Additional examples of the integration of metabolomics with genome-based functional characterizations of plant natural products that are important to modern pharmaceutical technology are also reviewed. This article also provides a substantial review of recent technical advances in mass spectrometry imaging, nuclear magnetic resonance imaging, integrated LC-MS-SPE-NMR for metabolite identifications, and x-ray crystallography of microgram quantities for structural determinations. The review closes with a discussion on the future prospects of metabolomics related to crop species and herbal medicine.

  7. Open Access Target Validation Is a More Efficient Way to Accelerate Drug Discovery

    PubMed Central

    Lee, Wen Hwa

    2015-01-01

    There is a scarcity of novel treatments to address many unmet medical needs. Industry and academia are finally coming to terms with the fact that the prevalent models and incentives for innovation in early stage drug discovery are failing to promote progress quickly enough. Here we will examine how an open model of precompetitive public–private research partnership is enabling efficient derisking and acceleration in the early stages of drug discovery, whilst also widening the range of communities participating in the process, such as patient and disease foundations. PMID:26042736

  8. Novel venom gene discovery in the platypus

    PubMed Central

    2010-01-01

    Background: To date, few peptides in the complex mixture of platypus venom have been identified and sequenced, in part due to the limited amounts of platypus venom available to study. We have constructed and sequenced a cDNA library from an active platypus venom gland to identify the remaining components. Results: We identified 83 novel putative platypus venom genes from 13 toxin families, which are homologous to known toxins from a wide range of vertebrates (fish, reptiles, insectivores) and invertebrates (spiders, sea anemones, starfish). A number of these are expressed in tissues other than the venom gland, and at least three of these families (those with homology to toxins from distant invertebrates) may play non-toxin roles. Thus, further functional testing is required to confirm venom activity. However, the presence of similar putative toxins in such widely divergent species provides further evidence for the hypothesis that there are certain protein families that are selected preferentially during evolution to become venom peptides. We have also used homology with known proteins to speculate on the contributions of each venom component to the symptoms of platypus envenomation. Conclusions: This study represents a step towards fully characterizing the first mammal venom transcriptome. We have found similarities between putative platypus toxins and those of a number of unrelated species, providing insight into the evolution of mammalian venom. PMID:20920228

  9. Discovery of pinoresinol reductase genes in sphingomonads.

    PubMed

    Fukuhara, Y; Kamimura, N; Nakajima, M; Hishiyama, S; Hara, H; Kasai, D; Tsuji, Y; Narita-Yamada, S; Nakamura, S; Katano, Y; Fujita, N; Katayama, Y; Fukuda, M; Kajita, S; Masai, E

    2013-01-10

    Bacterial genes for the degradation of major dilignols produced in lignifying xylem are expected to be useful tools for the structural modification of lignin in plants. For this purpose, we isolated pinZ involved in the conversion of pinoresinol from Sphingobium sp. strain SYK-6. pinZ showed 43-77% identity at the amino acid level with bacterial NmrA-like proteins of unknown function, a subgroup of atypical short chain dehydrogenases/reductases, but revealed only 15-21% identity with plant pinoresinol/lariciresinol reductases. PinZ completely converted racemic pinoresinol to lariciresinol, showing a specific activity of 46±3 U/mg in the presence of NADPH at 30°C. In contrast, the activity for lariciresinol was negligible. This substrate preference is similar to that of a pinoresinol reductase, AtPrR1, of Arabidopsis thaliana; however, the specific activity of PinZ toward (±)-pinoresinol was significantly higher than that of AtPrR1. The roles of pinZ and a pinZ ortholog of Novosphingobium aromaticivorans DSM 12444 were also characterized.

  10. Beegle: from literature mining to disease-gene discovery.

    PubMed

    ElShal, Sarah; Tranchevent, Léon-Charles; Sifrim, Alejandro; Ardeshirdavani, Amin; Davis, Jesse; Moreau, Yves

    2016-01-29

    Disease-gene identification is a challenging process that has multiple applications within functional genomics and personalized medicine. Typically, this process involves both finding genes known to be associated with the disease (through literature search) and carrying out preliminary experiments or screens (e.g. linkage or association studies, copy number analyses, expression profiling) to determine a set of promising candidates for experimental validation. This requires extensive time and monetary resources. We describe Beegle, an online search and discovery engine that attempts to simplify this process by automating the typical approaches. It starts by mining the literature to quickly extract a set of genes known to be linked with a given query, then it integrates the learning methodology of Endeavour (a gene prioritization tool) to train a genomic model and rank a set of candidate genes to generate novel hypotheses. In a realistic evaluation setup, Beegle has an average recall of 84% in the top 100 returned genes as a search engine, which improves the discovery engine by 12.6% in the top 5% prioritized genes. Beegle is publicly available at http://beegle.esat.kuleuven.be/.

  11. Beegle: from literature mining to disease-gene discovery.

    PubMed

    ElShal, Sarah; Tranchevent, Léon-Charles; Sifrim, Alejandro; Ardeshirdavani, Amin; Davis, Jesse; Moreau, Yves

    2016-01-29

    Disease-gene identification is a challenging process that has multiple applications within functional genomics and personalized medicine. Typically, this process involves both finding genes known to be associated with the disease (through literature search) and carrying out preliminary experiments or screens (e.g. linkage or association studies, copy number analyses, expression profiling) to determine a set of promising candidates for experimental validation. This requires extensive time and monetary resources. We describe Beegle, an online search and discovery engine that attempts to simplify this process by automating the typical approaches. It starts by mining the literature to quickly extract a set of genes known to be linked with a given query, then it integrates the learning methodology of Endeavour (a gene prioritization tool) to train a genomic model and rank a set of candidate genes to generate novel hypotheses. In a realistic evaluation setup, Beegle has an average recall of 84% in the top 100 returned genes as a search engine, which improves the discovery engine by 12.6% in the top 5% prioritized genes. Beegle is publicly available at http://beegle.esat.kuleuven.be/. PMID:26384564

  12. Accelerated Discovery in Photocatalysis using a Mechanism-Based Screening Method.

    PubMed

    Hopkinson, Matthew N; Gómez-Suárez, Adrián; Teders, Michael; Sahoo, Basudev; Glorius, Frank

    2016-03-18

    Herein, we report a conceptually novel mechanism-based screening approach to accelerate discovery in photocatalysis. In contrast to most screening methods, which consider reactions as discrete entities, this approach instead focuses on a single constituent mechanistic step of a catalytic reaction. Using luminescence spectroscopy to investigate the key quenching step in photocatalytic reactions, an initial screen of 100 compounds led to the discovery of two promising substrate classes. Moreover, a second, more focused screen provided mechanistic insights useful in developing proof-of-concept reactions. Overall, this fast and straightforward approach both facilitated the discovery and aided the development of new light-promoted reactions and suggests that mechanism-based screening strategies could become useful tools in the hunt for new reactivity.

  13. Accelerated Discovery in Photocatalysis using a Mechanism-Based Screening Method.

    PubMed

    Hopkinson, Matthew N; Gómez-Suárez, Adrián; Teders, Michael; Sahoo, Basudev; Glorius, Frank

    2016-03-18

    Herein, we report a conceptually novel mechanism-based screening approach to accelerate discovery in photocatalysis. In contrast to most screening methods, which consider reactions as discrete entities, this approach instead focuses on a single constituent mechanistic step of a catalytic reaction. Using luminescence spectroscopy to investigate the key quenching step in photocatalytic reactions, an initial screen of 100 compounds led to the discovery of two promising substrate classes. Moreover, a second, more focused screen provided mechanistic insights useful in developing proof-of-concept reactions. Overall, this fast and straightforward approach both facilitated the discovery and aided the development of new light-promoted reactions and suggests that mechanism-based screening strategies could become useful tools in the hunt for new reactivity. PMID:27000485

  14. INTEGRATE: gene fusion discovery using whole genome and transcriptome data

    PubMed Central

    Zhang, Jin; White, Nicole M.; Schmidt, Heather K.; Fulton, Robert S.; Tomlinson, Chad; Warren, Wesley C.; Wilson, Richard K.; Maher, Christopher A.

    2016-01-01

    While next-generation sequencing (NGS) has become the primary technology for discovering gene fusions, we are still faced with the challenge of ensuring that causative mutations are not missed while minimizing false positives. Currently, there are many computational tools that predict structural variations (SV) and gene fusions using whole genome (WGS) and transcriptome sequencing (RNA-seq) data separately. However, as both WGS and RNA-seq have their limitations when used independently, we hypothesize that the orthogonal validation from integrating both data could generate a sensitive and specific approach for detecting high-confidence gene fusion predictions. Fortunately, decreasing NGS costs have resulted in a growing quantity of patients with both data available. Therefore, we developed a gene fusion discovery tool, INTEGRATE, that leverages both RNA-seq and WGS data to reconstruct gene fusion junctions and genomic breakpoints by split-read mapping. To evaluate INTEGRATE, we compared it with eight additional gene fusion discovery tools using the well-characterized breast cell line HCC1395 and peripheral blood lymphocytes derived from the same patient (HCC1395BL). The predictions subsequently underwent a targeted validation leading to the discovery of 131 novel fusions in addition to the seven previously reported fusions. Overall, INTEGRATE only missed six out of the 138 validated fusions and had the highest accuracy of the nine tools evaluated. Additionally, we applied INTEGRATE to 62 breast cancer patients from The Cancer Genome Atlas (TCGA) and found multiple recurrent gene fusions including a subset involving estrogen receptor. Taken together, INTEGRATE is a highly sensitive and accurate tool that is freely available for academic use. PMID:26556708

  15. Plug-and-Play Benzylisoquinoline Alkaloid Biosynthetic Gene Discovery in Engineered Yeast.

    PubMed

    Morris, J S; Dastmalchi, M; Li, J; Chang, L; Chen, X; Hagel, J M; Facchini, P J

    2016-01-01

    Benzylisoquinoline alkaloid (BIA) metabolism has been the focus of a considerable research effort over the past half-century, primarily because of the pharmaceutical importance of several compounds produced by opium poppy (Papaver somniferum). Advancements in genomics technologies have substantially accelerated the rate of gene discovery over the past decade, such that most biosynthetic enzymes involved in the formation of the major alkaloids of opium poppy have now been isolated and partially characterized. Not unexpectedly, the availability of all perceived biosynthetic genes has facilitated the reconstitution of several BIA pathways in microbial hosts, including yeast (Saccharomyces cerevisiae). Product yields are currently insufficient to consider the commercial production of high-value BIAs, such as morphine. However, the rudimentary success demonstrated by the uncomplicated and routine assembly of a multitude of characterized BIA biosynthetic genes provides a valuable gene discovery tool for the rapid functional identification of the plethora of gene candidates available through increasingly accessible genomic, transcriptomic, and proteomic databases. BIA biosynthetic gene discovery represents a substantial research opportunity largely owing to the wealth of existing enzyme data mostly obtained from a single plant species. Functionally novel enzymes and variants with potential metabolic engineering applications can be considered the primary targets. Selection of candidates from sequence repositories is facilitated by the monophyletic relationship among biosynthetic genes belonging to a wide range of enzyme families, such as the numerous cytochromes P450 and AdoMet-dependent O- and N-methyltransferases that operate in BIA metabolism. We describe methods for the rapid functional screening of uncharacterized gene candidates encoding potential BIA biosynthetic enzymes using yeast strains engineered to perform selected metabolic conversions. As an initial

  16. Mitigating false-positive associations in rare disease gene discovery.

    PubMed

    Akle, Sebastian; Chun, Sung; Jordan, Daniel M; Cassa, Christopher A

    2015-10-01

    Clinical sequencing is expanding, but causal variants are still not identified in the majority of cases. These unsolved cases can aid in gene discovery when individuals with similar phenotypes are identified in systems such as the Matchmaker Exchange. We describe risks for gene discovery in this growing set of unsolved cases. In a set of rare disease cases with the same phenotype, it is not difficult to find two individuals with the same phenotype that carry variants in the same gene. We quantify the risk of false-positive association in a cohort of individuals with the same phenotype, using the prior probability of observing a variant in each gene from over 60,000 individuals (Exome Aggregation Consortium). Based on the number of individuals with a genic variant, cohort size, specific gene, and mode of inheritance, we calculate a P value that the match represents a true association. A match in two of 10 patients in MECP2 is statistically significant (P = 0.0014), whereas a match in TTN would not reach significance, as expected (P > 0.999). Finally, we analyze the probability of matching in clinical exome cases to estimate the number of cases needed to identify genes related to different disorders. We offer Rare Disease Match, an online tool to mitigate the uncertainty of false-positive associations. PMID:26378430
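
    The abstract above does not spell out its statistical model, so the following is only a minimal sketch of the general idea, assuming a plain binomial model: if each individual independently carries a qualifying variant in a gene with prior probability p, the chance of seeing at least k carriers among n unrelated cases is a binomial tail. The function name binomial_tail and the example priors are hypothetical; they are not taken from the Exome Aggregation Consortium or from Rare Disease Match, and the sketch ignores mode of inheritance.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p).

    n: cohort size (number of cases with the same phenotype)
    k: number of cases carrying a variant in the gene of interest
    p: assumed per-individual probability of carrying such a variant
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if k <= 0:
        return 1.0  # at least zero carriers is certain
    if k > n:
        return 0.0  # cannot have more carriers than individuals
    # Sum the exact binomial probabilities for k, k+1, ..., n carriers.
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Hypothetical priors: a gene that rarely carries variants versus one
    # that carries them in most people. These numbers are illustrative only.
    for label, prior in [("rarely variant gene", 0.001), ("often variant gene", 0.9)]:
        print(label, binomial_tail(n=10, k=2, p=prior))
```

    Under this simplified model, a match in two of 10 cases is unlikely by chance when the prior is small and almost guaranteed when the prior is large, which is the qualitative contrast the abstract draws between MECP2 and TTN.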

  17. Mitigating false-positive associations in rare disease gene discovery.

    PubMed

    Akle, Sebastian; Chun, Sung; Jordan, Daniel M; Cassa, Christopher A

    2015-10-01

    Clinical sequencing is expanding, but causal variants are still not identified in the majority of cases. These unsolved cases can aid in gene discovery when individuals with similar phenotypes are identified in systems such as the Matchmaker Exchange. We describe risks for gene discovery in this growing set of unsolved cases. In a set of rare disease cases with the same phenotype, it is not difficult to find two individuals with the same phenotype that carry variants in the same gene. We quantify the risk of false-positive association in a cohort of individuals with the same phenotype, using the prior probability of observing a variant in each gene from over 60,000 individuals (Exome Aggregation Consortium). Based on the number of individuals with a genic variant, cohort size, specific gene, and mode of inheritance, we calculate a P value that the match represents a true association. A match in two of 10 patients in MECP2 is statistically significant (P = 0.0014), whereas a match in TTN would not reach significance, as expected (P > 0.999). Finally, we analyze the probability of matching in clinical exome cases to estimate the number of cases needed to identify genes related to different disorders. We offer Rare Disease Match, an online tool to mitigate the uncertainty of false-positive associations.

  18. Discovery of a widely distributed toxin biosynthetic gene cluster

    PubMed Central

    Lee, Shaun W.; Mitchell, Douglas A.; Markley, Andrew L.; Hensler, Mary E.; Gonzalez, David; Wohlrab, Aaron; Dorrestein, Pieter C.; Nizet, Victor; Dixon, Jack E.

    2008-01-01

    Bacteriocins represent a large family of ribosomally produced peptide antibiotics. Here we describe the discovery of a widely conserved biosynthetic gene cluster for the synthesis of thiazole and oxazole heterocycles on ribosomally produced peptides. These clusters encode a toxin precursor and all necessary proteins for toxin maturation and export. Using the toxin precursor peptide and heterocycle-forming synthetase proteins from the human pathogen Streptococcus pyogenes, we demonstrate the in vitro reconstitution of streptolysin S activity. We provide evidence that the synthetase enzymes, as predicted from our bioinformatics analysis, introduce heterocycles onto precursor peptides, thereby providing molecular insight into the chemical structure of streptolysin S. Furthermore, our studies reveal that the synthetase exhibits relaxed substrate specificity and modifies toxin precursors from both related and distant species. Given our findings, it is likely that the discovery of similar peptidic toxins will rapidly expand to existing and emerging genomes. PMID:18375757

  19. OpenZika: An IBM World Community Grid Project to Accelerate Zika Virus Drug Discovery

    PubMed Central

    Perryman, Alexander L.; Horta Andrade, Carolina

    2016-01-01

    The Zika virus outbreak in the Americas has caused global concern. To help accelerate this fight against Zika, we launched the OpenZika project. OpenZika is an IBM World Community Grid Project that uses distributed computing on millions of computers and Android devices to run docking experiments, in order to dock tens of millions of drug-like compounds against crystal structures and homology models of Zika proteins (and other related flavivirus targets). This will enable the identification of new candidates that can then be tested in vitro, to advance the discovery and development of new antiviral drugs against the Zika virus. The docking data is being made openly accessible so that all members of the global research community can use it to further advance drug discovery studies against Zika and other related flaviviruses. PMID:27764115

  20. Microbial genome mining for accelerated natural products discovery: is a renaissance in the making?

    PubMed Central

    Bachmann, Brian O; Van Lanen, Steven G; Baltz, Richard H

    2014-01-01

    Microbial genome mining is a rapidly developing approach to discover new and novel secondary metabolites for drug discovery. Many advances have been made in the past decade to facilitate genome mining, and these are reviewed in this Special Issue of the Journal of Industrial Microbiology and Biotechnology. In this Introductory Review, we discuss the concept of genome mining and why it is important for the revitalization of natural product discovery; what microbes show the most promise for focused genome mining; how microbial genomes can be mined; how genome mining can be leveraged with other technologies; how progress on genome mining can be accelerated; and who should fund future progress in this promising field. We direct interested readers to more focused reviews on the individual topics in this Special Issue for more detailed summaries on the current state-of-the-art. PMID:24342967

  1. Gene discovery of modular diterpene metabolism in nonmodel systems.

    PubMed

    Zerbe, Philipp; Hamberger, Björn; Yuen, Macaire M S; Chiang, Angela; Sandhu, Harpreet K; Madilao, Lina L; Nguyen, Anh; Hamberger, Britta; Bach, Søren Spanner; Bohlmann, Jörg

    2013-06-01

    Plants produce over 10,000 different diterpenes of specialized (secondary) metabolism, and fewer diterpenes of general (primary) metabolism. Specialized diterpenes may have functions in ecological interactions of plants with other organisms and also benefit humanity as pharmaceuticals, fragrances, resins, and other industrial bioproducts. Examples of high-value diterpenes are taxol and forskolin pharmaceuticals or ambroxide fragrances. Yields and purity of diterpenes obtained from natural sources or by chemical synthesis are often insufficient for large-volume or high-end applications. Improvement of agricultural or biotechnological diterpene production requires knowledge of biosynthetic genes and enzymes. However, specialized diterpene pathways are extremely diverse across the plant kingdom, and most specialized diterpenes are taxonomically restricted to a few plant species, genera, or families. Consequently, there is no single reference system to guide gene discovery and rapid annotation of specialized diterpene pathways. Functional diversification of genes and plasticity of enzyme functions of these pathways further complicate correct annotation. To address this challenge, we used a set of 10 different plant species to develop a general strategy for diterpene gene discovery in nonmodel systems. The approach combines metabolite-guided transcriptome resources, custom diterpene synthase (diTPS) and cytochrome P450 reference gene databases, phylogenies, and, as shown for select diTPSs, single and coupled enzyme assays using microbial and plant expression systems. In the 10 species, we identified 46 new diTPS candidates and over 400 putatively terpenoid-related P450s in a resource of nearly 1 million predicted transcripts of diterpene-accumulating tissues. Phylogenetic patterns of lineage-specific blooms of genes guided functional characterization. PMID:23613273

  2. Gene discovery of modular diterpene metabolism in nonmodel systems.

    PubMed

    Zerbe, Philipp; Hamberger, Björn; Yuen, Macaire M S; Chiang, Angela; Sandhu, Harpreet K; Madilao, Lina L; Nguyen, Anh; Hamberger, Britta; Bach, Søren Spanner; Bohlmann, Jörg

    2013-06-01

    Plants produce over 10,000 different diterpenes of specialized (secondary) metabolism, and fewer diterpenes of general (primary) metabolism. Specialized diterpenes may have functions in ecological interactions of plants with other organisms and also benefit humanity as pharmaceuticals, fragrances, resins, and other industrial bioproducts. Examples of high-value diterpenes are taxol and forskolin pharmaceuticals or ambroxide fragrances. Yields and purity of diterpenes obtained from natural sources or by chemical synthesis are often insufficient for large-volume or high-end applications. Improvement of agricultural or biotechnological diterpene production requires knowledge of biosynthetic genes and enzymes. However, specialized diterpene pathways are extremely diverse across the plant kingdom, and most specialized diterpenes are taxonomically restricted to a few plant species, genera, or families. Consequently, there is no single reference system to guide gene discovery and rapid annotation of specialized diterpene pathways. Functional diversification of genes and plasticity of enzyme functions of these pathways further complicate correct annotation. To address this challenge, we used a set of 10 different plant species to develop a general strategy for diterpene gene discovery in nonmodel systems. The approach combines metabolite-guided transcriptome resources, custom diterpene synthase (diTPS) and cytochrome P450 reference gene databases, phylogenies, and, as shown for select diTPSs, single and coupled enzyme assays using microbial and plant expression systems. In the 10 species, we identified 46 new diTPS candidates and over 400 putatively terpenoid-related P450s in a resource of nearly 1 million predicted transcripts of diterpene-accumulating tissues. Phylogenetic patterns of lineage-specific blooms of genes guided functional characterization.

  3. Accelerated oral nanomedicine discovery from miniaturized screening to clinical production exemplified by paediatric HIV nanotherapies

    PubMed Central

    Giardiello, Marco; Liptrott, Neill J.; McDonald, Tom O.; Moss, Darren; Siccardi, Marco; Martin, Phil; Smith, Darren; Gurjar, Rohan; Rannard, Steve P.; Owen, Andrew

    2016-01-01

    Considerable scope exists to vary the physical and chemical properties of nanoparticles, with subsequent impact on biological interactions; however, no accelerated process to access large nanoparticle material space is currently available, hampering the development of new nanomedicines. In particular, no clinically available nanotherapies exist for HIV populations and conventional paediatric HIV medicines are poorly available; one current paediatric formulation utilizes high ethanol concentrations to solubilize lopinavir, a poorly soluble antiretroviral. Here we apply accelerated nanomedicine discovery to generate a potential aqueous paediatric HIV nanotherapy, with clinical translation and regulatory approval for human evaluation. Our rapid small-scale screening approach yields large libraries of solid drug nanoparticles (160 individual components) targeting oral dose. Screening uses 1 mg of drug compound per library member and iterative pharmacological and chemical evaluation establishes potential candidates for progression through to clinical manufacture. The wide applicability of our strategy has implications for multiple therapy development programmes. PMID:27767027

  4. Genome-enabled Discovery of Carbon Sequestration Genes

    SciTech Connect

    Tuskan, Gerald A; Tschaplinski, Timothy J; Kalluri, Udaya C; Yin, Tongming; Yang, Xiaohan; Zhang, Xinye; Engle, Nancy L; Ranjan, Priya; Basu, Manojit M; Gunter, Lee E; Jawdy, Sara; Martin, Madhavi Z; Campbell, Alina S; DiFazio, Stephen P; Davis, John M; Hinchee, Maud; Pinnacchio, Christa; Meilan, R; Busov, V.; Strauss, S

    2009-01-01

    The fate of carbon below ground is likely to be a major factor determining the success of carbon sequestration strategies involving plants. Despite their importance, molecular processes controlling belowground C allocation and partitioning are poorly understood. This project is leveraging the Populus trichocarpa genome sequence to discover genes important to C sequestration in plants and soils. The focus is on the identification of genes that provide key control points for the flow and chemical transformations of carbon in roots, concentrating on genes that control the synthesis of chemical forms of carbon that result in slower turnover rates of soil organic matter (i.e., increased recalcitrance). We propose to enhance carbon allocation and partitioning to roots by 1) modifying the auxin signaling pathway and the invertase family, which controls sucrose metabolism; 2) increasing root proliferation through transgenesis with genes known to control fine root proliferation (e.g., ANT); 3) increasing the production of recalcitrant C metabolites by identifying genes controlling secondary C metabolism through a major mQTL-based gene discovery effort; and 4) increasing aboveground productivity by enhancing drought tolerance to achieve maximum C sequestration. This broad, integrated approach is aimed at ultimately enhancing root biomass as well as root detritus longevity, providing the best prospects for significant enhancement of belowground C sequestration.

  5. Neuropsychiatric genomics in precision medicine: diagnostics, gene discovery, and translation

    PubMed Central

    Need, Anna C.; Goldstein, David B.

    2016-01-01

    Only a few years after its development, next-generation sequencing is rapidly becoming an essential part of clinical care for patients with serious neurological conditions, especially in the diagnosis of early-onset and severe presentations. Beyond this diagnostic role, there has been an explosion in definitive gene discovery in a range of neuropsychiatric diseases. This is providing new pointers to underlying disease biology and is beginning to outline a new framework for genetic stratification of neuropsychiatric disease, with clear relevance to both individual treatment optimization and clinical trial design. Here, we outline these developments and chart the expected impact on the treatment of neurological, neurodevelopmental, and psychiatric disease. PMID:27757059

  6. Inflammatory bowel disease gene discovery. CRADA final report

    SciTech Connect

    1997-09-09

    The ultimate goal of this project is to identify the human gene(s) responsible for the disorder known as IBD. The work was planned in two phases. The desired products resulting from Phase 1 were BAC clone(s) containing the genetic marker(s) identified by gene/Networks, Inc. as potentially linked to IBD, plasmid subclones of those BAC(s), and new genetic markers developed from these plasmid subclones. The newly developed markers would be genotyped by gene/Networks, Inc. to ascertain evidence for linkage or non-linkage of IBD to this region. If non-linkage was indicated, the project would move to investigation of other candidate chromosomal regions. Where linkage was indicated, the project would move to Phase 2, in which a physical map of the candidate region(s) would be developed. The products of this phase would be contig(s) of BAC clones in the region exhibiting linkage to IBD, as well as plasmid subclones of the BACs and further genetic marker development. There would also be continued genotyping with new polymorphic markers during this phase. It was anticipated that clones identified and developed during these two phases would provide the physical resources for eventual disease gene discovery.

  7. Psychiatric gene discoveries shape evidence on ADHD's biology.

    PubMed

    Thapar, A; Martin, J; Mick, E; Arias Vásquez, A; Langley, K; Scherer, S W; Schachar, R; Crosbie, J; Williams, N; Franke, B; Elia, J; Glessner, J; Hakonarson, H; Owen, M J; Faraone, S V; O'Donovan, M C; Holmans, P

    2016-09-01

    A strong motivation for undertaking psychiatric gene discovery studies is to provide novel insights into unknown biology. Although attention-deficit hyperactivity disorder (ADHD) is highly heritable, and large, rare copy number variants (CNVs) contribute to risk, little is known about its pathogenesis and it remains commonly misunderstood. We assembled and pooled five ADHD and control CNV data sets from the United Kingdom, Ireland, United States of America, Northern Europe and Canada. Our aim was to test for enrichment of neurodevelopmental gene sets, implicated by recent exome-sequencing studies of (a) schizophrenia and (b) autism as a means of testing the hypothesis that common pathogenic mechanisms underlie ADHD and these other neurodevelopmental disorders. We also undertook hypothesis-free testing of all biological pathways. We observed significant enrichment of individual genes previously found to harbour schizophrenia de novo non-synonymous single-nucleotide variants (SNVs; P=5.4 × 10⁻⁴) and targets of the Fragile X mental retardation protein (P=0.0018). No enrichment was observed for activity-regulated cytoskeleton-associated protein (P=0.23) or N-methyl-D-aspartate receptor (P=0.74) post-synaptic signalling gene sets previously implicated in schizophrenia. Enrichment of ADHD CNV hits for genes impacted by autism de novo SNVs (P=0.019 for non-synonymous SNV genes) did not survive Bonferroni correction. Hypothesis-free testing yielded several highly significantly enriched biological pathways, including ion channel pathways. Enrichment findings were robust to multiple testing corrections and to sensitivity analyses that excluded the most significant sample. The findings reveal that CNVs in ADHD converge on biologically meaningful gene clusters, including ones now established as conferring risk of other neurodevelopmental disorders.

  8. Psychiatric gene discoveries shape evidence on ADHD's biology.

    PubMed

    Thapar, A; Martin, J; Mick, E; Arias Vásquez, A; Langley, K; Scherer, S W; Schachar, R; Crosbie, J; Williams, N; Franke, B; Elia, J; Glessner, J; Hakonarson, H; Owen, M J; Faraone, S V; O'Donovan, M C; Holmans, P

    2016-09-01

    A strong motivation for undertaking psychiatric gene discovery studies is to provide novel insights into unknown biology. Although attention-deficit hyperactivity disorder (ADHD) is highly heritable, and large, rare copy number variants (CNVs) contribute to risk, little is known about its pathogenesis and it remains commonly misunderstood. We assembled and pooled five ADHD and control CNV data sets from the United Kingdom, Ireland, United States of America, Northern Europe and Canada. Our aim was to test for enrichment of neurodevelopmental gene sets, implicated by recent exome-sequencing studies of (a) schizophrenia and (b) autism as a means of testing the hypothesis that common pathogenic mechanisms underlie ADHD and these other neurodevelopmental disorders. We also undertook hypothesis-free testing of all biological pathways. We observed significant enrichment of individual genes previously found to harbour schizophrenia de novo non-synonymous single-nucleotide variants (SNVs; P=5.4 × 10⁻⁴) and targets of the Fragile X mental retardation protein (P=0.0018). No enrichment was observed for activity-regulated cytoskeleton-associated protein (P=0.23) or N-methyl-D-aspartate receptor (P=0.74) post-synaptic signalling gene sets previously implicated in schizophrenia. Enrichment of ADHD CNV hits for genes impacted by autism de novo SNVs (P=0.019 for non-synonymous SNV genes) did not survive Bonferroni correction. Hypothesis-free testing yielded several highly significantly enriched biological pathways, including ion channel pathways. Enrichment findings were robust to multiple testing corrections and to sensitivity analyses that excluded the most significant sample. The findings reveal that CNVs in ADHD converge on biologically meaningful gene clusters, including ones now established as conferring risk of other neurodevelopmental disorders. PMID:26573769

  9. Psychiatric gene discoveries shape evidence on ADHD's biology

    PubMed Central

    Thapar, A; Martin, J; Mick, E; Arias Vásquez, A; Langley, K; Scherer, S W; Schachar, R; Crosbie, J; Williams, N; Franke, B; Elia, J; Glessner, J; Hakonarson, H; Owen, M J; Faraone, S V; O'Donovan, M C; Holmans, P

    2016-01-01

    A strong motivation for undertaking psychiatric gene discovery studies is to provide novel insights into unknown biology. Although attention-deficit hyperactivity disorder (ADHD) is highly heritable, and large, rare copy number variants (CNVs) contribute to risk, little is known about its pathogenesis and it remains commonly misunderstood. We assembled and pooled five ADHD and control CNV data sets from the United Kingdom, Ireland, United States of America, Northern Europe and Canada. Our aim was to test for enrichment of neurodevelopmental gene sets, implicated by recent exome-sequencing studies of (a) schizophrenia and (b) autism as a means of testing the hypothesis that common pathogenic mechanisms underlie ADHD and these other neurodevelopmental disorders. We also undertook hypothesis-free testing of all biological pathways. We observed significant enrichment of individual genes previously found to harbour schizophrenia de novo non-synonymous single-nucleotide variants (SNVs; P=5.4 × 10^-4) and targets of the Fragile X mental retardation protein (P=0.0018). No enrichment was observed for activity-regulated cytoskeleton-associated protein (P=0.23) or N-methyl-D-aspartate receptor (P=0.74) post-synaptic signalling gene sets previously implicated in schizophrenia. Enrichment of ADHD CNV hits for genes impacted by autism de novo SNVs (P=0.019 for non-synonymous SNV genes) did not survive Bonferroni correction. Hypothesis-free testing yielded several highly significantly enriched biological pathways, including ion channel pathways. Enrichment findings were robust to multiple testing corrections and to sensitivity analyses that excluded the most significant sample. The findings reveal that CNVs in ADHD converge on biologically meaningful gene clusters, including ones now established as conferring risk of other neurodevelopmental disorders. PMID:26573769
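
    The Bonferroni step mentioned in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the five P-values quoted above form one family of tests at a family-wise alpha of 0.05; the abstract does not state how many tests the authors actually corrected for, and the function name and labels are hypothetical.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag which P-values fall below the Bonferroni threshold alpha / m."""
    threshold = alpha / len(p_values)  # m = number of tests in the family
    return {label: p < threshold for label, p in p_values.items()}


# P-values as quoted in the abstract. Grouping them into a single family
# of five tests is an assumption made only for this example.
reported = {
    "schizophrenia de novo SNV genes": 5.4e-4,
    "FMRP targets": 0.0018,
    "ARC gene set": 0.23,
    "NMDAR gene set": 0.74,
    "autism de novo SNV genes": 0.019,
}

result = bonferroni_significant(reported)
for label, passes in result.items():
    print(f"{label}: {'passes' if passes else 'fails'} Bonferroni threshold")
```

    Under these assumptions the threshold is 0.05 / 5 = 0.01, so the schizophrenia and FMRP gene sets pass while the autism gene set (P=0.019) does not, which matches the pattern the abstract describes.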

  10. Genomics-Based Discovery of Plant Genes for Synthetic Biology of Terpenoid Fragrances: A Case Study in Sandalwood oil Biosynthesis.

    PubMed

    Celedon, J M; Bohlmann, J

    2016-01-01

    Terpenoid fragrances are powerful mediators of ecological interactions in nature and have a long history of traditional and modern industrial applications. Plants produce a great diversity of fragrant terpenoid metabolites, which make them a superb source of biosynthetic genes and enzymes. Advances in fragrance gene discovery have enabled new approaches in synthetic biology of high-value speciality molecules toward applications in the fragrance and flavor, food and beverage, cosmetics, and other industries. Rapid developments in transcriptome and genome sequencing of nonmodel plant species have accelerated the discovery of fragrance biosynthetic pathways. In parallel, advances in metabolic engineering of microbial and plant systems have established platforms for synthetic biology applications of some of the thousands of plant genes that underlie fragrance diversity. While many fragrance molecules (eg, simple monoterpenes) are abundant in readily renewable plant materials, some highly valuable fragrant terpenoids (eg, santalols, ambroxides) are rare in nature and interesting targets for synthetic biology. As a representative example for genomics/transcriptomics enabled gene and enzyme discovery, we describe a strategy used successfully for elucidation of a complete fragrance biosynthetic pathway in sandalwood (Santalum album) and its reconstruction in yeast (Saccharomyces cerevisiae). We address questions related to the discovery of specific genes within large gene families and recovery of rare gene transcripts that are selectively expressed in recalcitrant tissues. To substantiate the validity of the approaches, we describe the combination of methods used in the gene and enzyme discovery of a cytochrome P450 in the fragrant heartwood of tropical sandalwood, responsible for the fragrance defining, final step in the biosynthesis of (Z)-santalols. PMID:27480682

  12. Genome Enabled Discovery of Carbon Sequestration Genes in Poplar

    SciTech Connect

    Filichkin, Sergei; Etherington, Elizabeth; Ma, Caiping; Strauss, Steve

    2007-02-22

    The goals of the S.H. Strauss laboratory portion of 'Genome-enabled discovery of carbon sequestration genes in poplar' are (1) to explore the functions of candidate genes using Populus transformation by inserting genes provided by Oak Ridge National Laboratory (ORNL) and the University of Florida (UF) into poplar; (2) to expand the poplar transformation toolkit by developing transformation methods for important genotypes; and (3) to allow induced expression, and efficient gene suppression, in roots and other tissues. As part of the transformation improvement effort, OSU developed transformation protocols for Populus trichocarpa 'Nisqually-1' clone and an early flowering P. alba clone, 6K10. Complete descriptions of the transformation systems were published (Ma et al. 2004, Meilan et al. 2004). Twenty-one 'Nisqually-1' and 622 6K10 transgenic plants were generated. To identify root-predominant promoters, a set of three promoters was tested for their tissue-specific expression patterns in poplar and in Arabidopsis as a model system. A novel gene, ET304, was identified by analyzing a collection of poplar enhancer trap lines generated at OSU (Filichkin et al. 2006a, 2006b). Other promoters include the pGgMT1 root-predominant promoter from Casuarina glauca and the pAtPIN2 promoter from the Arabidopsis root-specific PIN2 gene. OSU tested two induction systems, alcohol- and estrogen-inducible, in multiple poplar transgenics. Ethanol proved to be the more efficient when tested in tissue culture and greenhouse conditions. Two estrogen-inducible systems were evaluated in transgenic Populus, neither of which functioned reliably in tissue culture conditions. GATEWAY-compatible plant binary vectors were designed to compare the silencing efficiency of homologous (direct) RNAi vs. heterologous (transitive) RNAi inverted repeats. A set of genes was targeted for post-transcriptional silencing in the model Arabidopsis system; these include the floral meristem identity gene (APETALA1 or

  14. Sugarcane Functional Genomics: Gene Discovery for Agronomic Trait Development

    PubMed Central

    Menossi, M.; Silva-Filho, M. C.; Vincentz, M.; Van-Sluys, M.-A.; Souza, G. M.

    2008-01-01

    Sugarcane is a highly productive crop used for centuries as the main source of sugar and recently to produce ethanol, a renewable bio-fuel energy source. There is increased interest in this crop due to the impending need to decrease fossil fuel usage. Sugarcane has a highly polyploid genome. Expressed sequence tag (EST) sequencing has significantly contributed to gene discovery and expression studies used to associate function with sugarcane genes. A significant amount of data exists on regulatory events controlling responses to herbivory, drought, and phosphate deficiency, which cause important constraints on yield, and on endophytic bacteria, which are highly beneficial. The means to reduce drought, phosphate deficiency, and herbivory by the sugarcane borer have a negative impact on the environment. Improved tolerance for these constraints is being sought. Sugarcane's ability to accumulate sucrose up to 16% of its culm dry weight is a challenge for genetic manipulation. Genome-based technology such as cDNA microarray data indicates genes associated with sugar content that may be used to develop new varieties improved for sucrose content or for traits that restrict the expansion of the cultivated land. The genes can also be used as molecular markers of agronomic traits in traditional breeding programs. PMID:18273390

  15. Second-generation sequencing for gene discovery in the Brassicaceae.

    PubMed

    Hayward, Alice; Vighnesh, Guru; Delay, Christina; Samian, Mohd Rafizan; Manoli, Sahana; Stiller, Jiri; McKenzie, Megan; Edwards, David; Batley, Jacqueline

    2012-08-01

    The Brassicaceae contains the most diverse collection of agriculturally important crop species of all plant families. Yet, this is one of the few families that do not form functional symbiotic associations with mycorrhizal fungi in the soil for improved nutrient acquisition. The genes involved in this symbiosis were more recently recruited by legumes for symbiotic association with nitrogen-fixing rhizobia bacteria. This study applied second-generation sequencing (SGS) and analysis tools to discover that two such genes, NSP1 (Nodulation Signalling Pathway 1) and NSP2, remain conserved in diverse members of the Brassicaceae despite the absence of these symbioses. We demonstrate the utility of SGS data for the discovery of putative gene homologs and their analysis in complex polyploid crop genomes with little prior sequence information. Furthermore, we show how this data can be applied to enhance downstream reverse genetics analyses. We hypothesize that Brassica NSP genes may function in the root in other plant-microbe interaction pathways that were recruited for mycorrhizal and rhizobial symbioses during evolution. PMID:22765874

  17. The Matchmaker Exchange: a platform for rare disease gene discovery.

    PubMed

    Philippakis, Anthony A; Azzariti, Danielle R; Beltran, Sergi; Brookes, Anthony J; Brownstein, Catherine A; Brudno, Michael; Brunner, Han G; Buske, Orion J; Carey, Knox; Doll, Cassie; Dumitriu, Sergiu; Dyke, Stephanie O M; den Dunnen, Johan T; Firth, Helen V; Gibbs, Richard A; Girdea, Marta; Gonzalez, Michael; Haendel, Melissa A; Hamosh, Ada; Holm, Ingrid A; Huang, Lijia; Hurles, Matthew E; Hutton, Ben; Krier, Joel B; Misyura, Andriy; Mungall, Christopher J; Paschall, Justin; Paten, Benedict; Robinson, Peter N; Schiettecatte, François; Sobreira, Nara L; Swaminathan, Ganesh J; Taschner, Peter E; Terry, Sharon F; Washington, Nicole L; Züchner, Stephan; Boycott, Kym M; Rehm, Heidi L

    2015-10-01

    There are few better examples of the need for data sharing than in the rare disease community, where patients, physicians, and researchers must search for "the needle in a haystack" to uncover rare, novel causes of disease within the genome. Impeding the pace of discovery has been the existence of many small siloed datasets within individual research or clinical laboratory databases and/or disease-specific organizations, hoping for serendipitous occasions when two distant investigators happen to learn they have a rare phenotype in common and can "match" these cases to build evidence for causality. However, serendipity has never proven to be a reliable or scalable approach in science. As such, the Matchmaker Exchange (MME) was launched to provide a robust and systematic approach to rare disease gene discovery through the creation of a federated network connecting databases of genotypes and rare phenotypes using a common application programming interface (API). The core building blocks of the MME have been defined and assembled. Three MME services have now been connected through the API and are available for community use. Additional databases that support internal matching are anticipated to join the MME network as it continues to grow. PMID:26295439

  19. Accelerated Discovery of Thermoelectric Materials: Combinatorial Facility and High-Throughput Measurement of Thermoelectric Power Factor.

    PubMed

    García-Cañadas, Jorge; Adkins, Nicholas J E; McCain, Stephen; Hauptstein, Bastian; Brew, Ashley; Jarvis, David J; Min, Gao

    2016-06-13

    A series of processes have been developed to facilitate the rapid discovery of new promising thermoelectric alloys. A novel combinatorial facility where elements are wire-fed and laser-melted was designed and constructed. Different sample compositions can be achieved by feeding different element wires at specific rates. The composition of all the samples prepared was tested by energy dispersive X-ray spectroscopy (EDS). Then, their thermoelectric properties (power factor) at room temperature were screened in a specially designed new high-throughput setup. After the screening, the thermoelectric properties can be mapped with the possibility of identifying compositional trends. As a proof-of-concept, a promising thermoelectric ternary system, Al-Fe-Ti, has been identified, demonstrating the capability of this accelerated approach. PMID:27186664
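
    The screened quantity, the thermoelectric power factor, is conventionally defined as PF = S^2 σ, where S is the Seebeck coefficient and σ the electrical conductivity. The short sketch below shows how a screening map could be ranked by that quantity. The sample labels and measurement values are hypothetical placeholders, not data from the study, and the function name is an illustrative choice.

```python
def power_factor(seebeck_v_per_k, conductivity_s_per_m):
    """Thermoelectric power factor PF = S^2 * sigma, in W m^-1 K^-2."""
    return seebeck_v_per_k ** 2 * conductivity_s_per_m


# Hypothetical screening results: label -> (Seebeck coefficient in V/K,
# electrical conductivity in S/m). These numbers are invented for illustration.
samples = {
    "sample_A": (100e-6, 1.0e5),
    "sample_B": (40e-6, 5.0e5),
    "sample_C": (150e-6, 2.0e4),
}

# Rank compositions from highest to lowest power factor.
ranked = sorted(
    ((label, power_factor(s, sigma)) for label, (s, sigma) in samples.items()),
    key=lambda item: item[1],
    reverse=True,
)
for label, pf in ranked:
    print(f"{label}: {pf:.2e} W/(m K^2)")
```

    Because S enters squared, a composition with a modest conductivity can still rank well if its Seebeck coefficient is high, which is why mapping the power factor across compositions can reveal trends that neither property shows alone.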

  20. Reformulation of DFT+U as a Pseudohybrid Hubbard Density Functional for Accelerated Materials Discovery

    NASA Astrophysics Data System (ADS)

    Agapito, Luis A.; Curtarolo, Stefano; Buongiorno Nardelli, Marco

    2015-01-01

    The accurate prediction of the electronic properties of materials at a low computational expense is a necessary condition for the development of effective high-throughput quantum-mechanics (HTQM) frameworks for accelerated materials discovery. HTQM infrastructures rely on the predictive capability of density functional theory (DFT), the method of choice for the first-principles study of materials properties. However, DFT suffers from approximations that result in a somewhat inaccurate description of the electronic band structure of semiconductors and insulators. In this article, we introduce ACBN0, a pseudohybrid Hubbard density functional that yields an improved prediction of the band structure of insulators such as transition-metal oxides, as shown for TiO2, MnO, NiO, and ZnO, with only a negligible increase in computational cost.

  1. Big Data and Comparative Effectiveness Research in Radiation Oncology: Synergy and Accelerated Discovery

    PubMed Central

    Trifiletti, Daniel M.; Showalter, Timothy N.

    2015-01-01

    Several advances in large data set collection and processing have the potential to provide a wave of new insights and improvements in the use of radiation therapy for cancer treatment. The era of electronic health records, genomics, and improving information technology resources creates the opportunity to leverage these developments to create a learning healthcare system that can rapidly deliver informative clinical evidence. By merging concepts from comparative effectiveness research with the tools and analytic approaches of “big data,” it is hoped that this union will accelerate discovery, improve evidence for decision making, and increase the availability of highly relevant, personalized information. This combination offers the potential to provide data and analysis that can be leveraged for ultra-personalized medicine and high-quality, cutting-edge radiation therapy. PMID:26697409

  2. Advances in Climate Informatics: Accelerating Discovery in Climate Science with Machine Learning

    NASA Astrophysics Data System (ADS)

    Monteleoni, C.

    2015-12-01

    Despite the scientific consensus on climate change, drastic uncertainties remain. The climate system is characterized by complex phenomena that are imperfectly observed and even more imperfectly simulated. Climate data is Big Data, yet the magnitude of data and climate model output increasingly overwhelms the tools currently used to analyze them. Computational innovation is therefore needed. Machine learning is a cutting-edge research area at the intersection of computer science and statistics, focused on developing algorithms for big data analytics. Machine learning has revolutionized scientific discovery (e.g. Bioinformatics), and spawned new technologies (e.g. Web search). The impact of machine learning on climate science promises to be similarly profound. The goal of the novel interdisciplinary field of Climate Informatics is to accelerate discovery in climate science with machine learning, in order to shed light on urgent questions about climate change. In this talk, I will survey my research group's progress in the emerging field of climate informatics. Our work includes algorithms to improve the combined predictions of the IPCC multi-model ensemble, applications to seasonal and subseasonal prediction, and a data-driven technique to detect and define extreme events.

  3. Protein-Protein Interaction Inhibition (2P2I)-Oriented Chemical Library Accelerates Hit Discovery.

    PubMed

    Milhas, Sabine; Raux, Brigitt; Betzi, Stéphane; Derviaux, Carine; Roche, Philippe; Restouin, Audrey; Basse, Marie-Jeanne; Rebuffet, Etienne; Lugari, Adrien; Badol, Marion; Kashyap, Rudra; Lissitzky, Jean-Claude; Eydoux, Cécilia; Hamon, Véronique; Gourdel, Marie-Edith; Combes, Sébastien; Zimmermann, Pascale; Aurrand-Lions, Michel; Roux, Thomas; Rogers, Catherine; Müller, Susanne; Knapp, Stefan; Trinquet, Eric; Collette, Yves; Guillemot, Jean-Claude; Morelli, Xavier

    2016-08-19

    Protein-protein interactions (PPIs) represent an enormous source of opportunity for therapeutic intervention. We and others have recently pinpointed key rules that will help in identifying the next generation of innovative drugs to tackle this challenging class of targets within the next decade. We used these rules to design an oriented chemical library corresponding to a set of diverse "PPI-like" modulators with cores identified as privileged structures in therapeutics. In this work, we purchased the resulting 1664 structurally diverse compounds and evaluated them on a series of representative protein-protein interfaces with distinct "druggability" potential using homogeneous time-resolved fluorescence (HTRF) technology. For certain PPI classes, analysis of the hit rates revealed enrichment factors of up to 100 compared with nonoriented chemical libraries. This observation correlates with the predicted "druggability" of the targets. A specific focus on selectivity profiles, the three-dimensional (3D) molecular modes of action resolved by X-ray crystallography, and the biological activities of identified hits targeting the well-defined "druggable" bromodomains of the bromo and extraterminal (BET) family are presented as a proof-of-concept. Overall, our present study illustrates the potency of machine learning-based oriented chemical libraries to accelerate the identification of hits targeting PPIs. A generalization of this method to a larger set of compounds will accelerate the discovery of original and potent probes for this challenging class of targets. PMID:27219844
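
    An enrichment factor of this kind is commonly computed as the ratio of the hit rate in the focused library to the hit rate in a reference library. The sketch below assumes that definition; the abstract does not spell out the authors' exact formula. Only the library size of 1664 comes from the abstract, while the hit counts and the reference library size are hypothetical.

```python
def hit_rate(hits, screened):
    """Fraction of screened compounds that were scored as hits."""
    if screened <= 0:
        raise ValueError("screened must be positive")
    return hits / screened


def enrichment_factor(oriented_hits, oriented_screened,
                      reference_hits, reference_screened):
    """Ratio of the oriented-library hit rate to the reference hit rate."""
    reference_rate = hit_rate(reference_hits, reference_screened)
    if reference_rate == 0:
        raise ValueError("reference hit rate is zero; ratio is undefined")
    return hit_rate(oriented_hits, oriented_screened) / reference_rate


# 1664 is the library size given in the abstract. The hit counts and the
# 10,000-compound reference screen are invented for illustration only.
ef = enrichment_factor(oriented_hits=16, oriented_screened=1664,
                       reference_hits=1, reference_screened=10_000)
print(f"enrichment factor: {ef:.1f}")
```

    With these placeholder numbers the ratio comes out a little under 100, the order of magnitude the abstract reports for the most favourable PPI classes.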

  4. Turning publicly available gene expression data into discoveries using gene set context analysis.

    PubMed

    Ji, Zhicheng; Vokes, Steven A; Dang, Chi V; Ji, Hongkai

    2016-01-01

    Gene Set Context Analysis (GSCA) is an open source software package to help researchers use massive amounts of publicly available gene expression data (PED) to make discoveries. Users can interactively visualize and explore gene and gene set activities in 25,000+ consistently normalized human and mouse gene expression samples representing diverse biological contexts (e.g. different cells, tissues and disease types, etc.). By providing one or multiple genes or gene sets as input and specifying a gene set activity pattern of interest, users can query the expression compendium to systematically identify biological contexts associated with the specified gene set activity pattern. In this way, researchers with new gene sets from their own experiments may discover previously unknown contexts of gene set functions and hence increase the value of their experiments. GSCA has a graphical user interface (GUI). The GUI makes the analysis convenient and customizable. Analysis results can be conveniently exported as publication quality figures and tables. GSCA is available at https://github.com/zji90/GSCA. This software significantly lowers the bar for biomedical investigators to use PED in their daily research for generating and screening hypotheses, which was previously difficult because of the complexity, heterogeneity and size of the data.

  5. Computational Materials Science and Chemistry: Accelerating Discovery and Innovation through Simulation-Based Engineering and Science

    SciTech Connect

    Crabtree, George; Glotzer, Sharon; McCurdy, Bill; Roberto, Jim

    2010-07-26

    This report is based on an SC Workshop on Computational Materials Science and Chemistry for Innovation on July 26-27, 2010, to assess the potential of state-of-the-art computer simulations to accelerate understanding and discovery in materials science and chemistry, with a focus on potential impacts in energy technologies and innovation. The urgent demand for new energy technologies has greatly exceeded the capabilities of today's materials and chemical processes. To convert sunlight to fuel, efficiently store energy, or enable a new generation of energy production and utilization technologies requires the development of new materials and processes of unprecedented functionality and performance. New materials and processes are critical pacing elements for progress in advanced energy systems and virtually all industrial technologies. Over the past two decades, the United States has developed and deployed the world's most powerful collection of tools for the synthesis, processing, characterization, and simulation and modeling of materials and chemical systems at the nanoscale, dimensions of a few atoms to a few hundred atoms across. These tools, which include world-leading x-ray and neutron sources, nanoscale science facilities, and high-performance computers, provide an unprecedented view of the atomic-scale structure and dynamics of materials and the molecular-scale basis of chemical processes. For the first time in history, we are able to synthesize, characterize, and model materials and chemical behavior at the length scale where this behavior is controlled. This ability is transformational for the discovery process and, as a result, confers a significant competitive advantage. Perhaps the most spectacular increase in capability has been demonstrated in high performance computing. Over the past decade, computational power has increased by a factor of a million due to advances in hardware and software. This rate of improvement, which shows no sign of abating, has

  6. Technology development for gene discovery and full-length sequencing

    SciTech Connect

    Marcelo Bento Soares

    2004-07-19

    In previous years, with support from the U.S. Department of Energy, we developed methods for construction of normalized and subtracted cDNA libraries, and constructed hundreds of high-quality libraries for production of Expressed Sequence Tags (ESTs). Our clones were made widely available to the scientific community through the IMAGE Consortium, and millions of ESTs were produced from our libraries either by collaborators or by our own sequencing laboratory at the University of Iowa. During this grant period, we focused on (1) the development of a method for preferential cloning of tissue-specific and/or rare transcripts, (2) its utilization to expedite EST-based gene discovery for the NIH Mouse Brain Molecular Anatomy Project, (3) further development and optimization of a method for construction of full-length-enriched cDNA libraries, and (4) modification of a plasmid vector to maximize efficiency of full-length cDNA sequencing by the transposon-mediated approach. It is noteworthy that the technology developed for preferential cloning of rare mRNAs enabled identification of over 2,000 mouse transcripts differentially expressed in the hippocampus. In addition, the method that we optimized for construction of full-length-enriched cDNA libraries was successfully utilized for the production of approximately fifty libraries from the developing mouse nervous system, from which over 2,500 full-ORF-containing cDNAs have been identified and accurately sequenced in their entirety either by our group or by the NIH-Mammalian Gene Collection Program Sequencing Team.

  7. The discovery of the microphthalmia locus and its gene, Mitf

    PubMed Central

    Arnheiter, Heinz

    2010-01-01

    The history of the discovery of the microphthalmia locus and its gene, now called Mitf, is a testament to the triumph of serendipity. Although the first microphthalmia mutation was discovered among the descendants of a mouse that was irradiated for the purpose of mutagenesis, the mutation most likely was not radiation-induced but occurred spontaneously in one of the parents of a later breeding. Although Mitf might eventually have been identified by other molecular genetic techniques, it was first cloned from a chance transgene insertion at the microphthalmia locus. And although Mitf was found to encode a member of a well-known transcription factor family, its analysis might still be in its infancy had Mitf not turned out to be of crucial importance for the physiology and pathology of many distinct organs, including eye, ear, immune system, bone, and skin, and in particular for melanoma. In fact, nearly seven decades of Mitf research have led to many insights about development, function, degeneration, and malignancies of a number of specific cell types, and it is hoped that these insights will one day lead to therapies benefitting those afflicted with diseases originating in these cell types. PMID:20807369

  8. The discovery of the microphthalmia locus and its gene, Mitf.

    PubMed

    Arnheiter, Heinz

    2010-12-01

    The history of the discovery of the microphthalmia locus and its gene, now called Mitf, is a testament to the triumph of serendipity. Although the first microphthalmia mutation was discovered among the descendants of a mouse that was irradiated for the purpose of mutagenesis, the mutation most likely was not radiation-induced but occurred spontaneously in one of the parents of a later breeding. Although Mitf might eventually have been identified by other molecular genetic techniques, it was first cloned from a chance transgene insertion at the microphthalmia locus. And although Mitf was found to encode a member of a well-known transcription factor family, its analysis might still be in its infancy had Mitf not turned out to be of crucial importance for the physiology and pathology of many distinct organs, including eye, ear, immune system, bone, and skin, and in particular for melanoma. In fact, nearly seven decades of Mitf research have led to many insights about development, function, degeneration, and malignancies of a number of specific cell types, and it is hoped that these insights will one day lead to therapies benefitting those afflicted with diseases originating in these cell types.

  9. Risk genes for schizophrenia: translational opportunities for drug discovery.

    PubMed

    Winchester, Catherine L; Pratt, Judith A; Morris, Brian J

    2014-07-01

    Despite intensive research over many years, the treatment of schizophrenia remains a major health issue. Current and emerging treatments for schizophrenia are based upon the classical dopamine and glutamate hypotheses of disease. Existing first- and second-generation antipsychotic drugs based upon the dopamine hypothesis are limited by their inability to treat all symptom domains and their undesirable side effect profiles. Third-generation drugs based upon the glutamate hypothesis of disease are currently under evaluation but are more likely to be used as add-on treatments. Hence there is a large unmet clinical need. A major challenge in neuropsychiatric disease research is the relatively limited knowledge of disease mechanisms. However, as our understanding of the genetic causes of the disease evolves, novel strategies for the development of improved therapeutic agents will become apparent. In this review we consider the current status of knowledge of the genetic basis of schizophrenia, including methods for identifying genetic variants associated with the disorder and how they impact on gene function. Although the genetic architecture of schizophrenia is complex, some targets amenable to pharmacological intervention can be discerned. We conclude that many challenges lie ahead but that the stratification of patients according to biobehavioural constructs that cross existing disease classifications but have common genetic and neurobiological bases offers opportunities for new approaches to effective drug discovery.

  10. Chromosome substitution strains: gene discovery, functional analysis and systems studies

    PubMed Central

    Nadeau, Joseph H.; Forejt, Jiri; Takada, Toyoyuki; Shiroishi, Toshihiko

    2014-01-01

    Laboratory mice are valuable in biomedical research in part because of the extraordinary diversity of genetic resources that are available for studies of complex genetic traits and as models for human biology and disease. Chromosome substitution strains (CSSs) are important in this resource portfolio because of their demonstrated use for gene discovery, genetic and epigenetic studies, functional characterizations, and systems analysis. CSSs are made by replacing a single chromosome in a host strain with the corresponding chromosome from a donor strain. A complete CSS panel involves a total of 22 engineered inbred strains, one for each of the 19 autosomes, one each for the X and Y chromosomes, and one for mitochondria. A genome survey simply involves comparing each phenotype for each of the CSSs with the phenotypes of the host strain. The CSS panels that are available for laboratory mice have been used to dissect a remarkable variety of phenotypes and to characterize an impressive array of disease models. These surveys have revealed considerable phenotypic diversity even among closely related progenitor strains, as well as evidence for strong epistasis and for heritable epigenetic changes. Perhaps most importantly, and presumably because of their unique genetic constitution, in CSSs, and in congenic strains derived from them, the genetic variants underlying quantitative trait loci (QTLs) are readily identified and functionally characterized. Together these studies show that CSSs are an important resource for laboratory mice. PMID:22961226

  11. Natural Product Libraries to Accelerate the High Throughput Discovery of Therapeutic Leads

    PubMed Central

    Johnson, Tyler A.; Sohn, Johann; Inman, Wayne D.; Estee, Samarkand A.; Loveridge, Steven T.; Vervoort, Helene C.; Tenney, Karen; Liu, Junke; Ang, Kenny Kean-Hooi; Ratnam, Joseline; Bray, Walter M.; Gassner, Nadine C.; Shen, Young Y.; Lokey, R. Scott; McKerrow, James H.; Boundy-Mills, Kyria; Nukanto, Arif; Kanti, Atit; Julistiono, Heddy; Kardono, Leonardus B. S.; Bjeldanes, Leonard F.; Crews, Phillip

    2011-01-01

    A high throughput (HT) paradigm generating LC-MS-UV-ELSD based natural product libraries to discover compounds with new bioactivities and/or molecular structures is presented. To validate this methodology an extract of the Indo Pacific marine sponge Cacospongia mycofijiensis was evaluated using assays involving cytoskeletal profiling, tumor cell lines, and parasites. Twelve known compounds were identified including the latrunculins (1–4, 10), fijianolides (5, 8–9), mycothiazole (11), the aignopsanes (6–7) and sacrotride A (12). Compounds 1–4, 5, 8–11 exhibited bioactivity not previously reported against the parasite T. brucei, while 11 showed selectivity for lymphoma (U937) tumor cell lines. Four new compounds were also discovered including: aignopsanoic acid B (13), apo latrunculin T (14), 20-methoxy-fijianolide A (15) and aignopsane ketal (16). Compounds 13 and 16 represent important derivatives of the aignopsane class, 14 exhibited inhibition of T. brucei without disrupting microfilament assembly and 15 demonstrated modest microtubule stabilizing effects. The use of removable well plate libraries to avoid false positives from extracts enriched with only 1–2 major metabolites is also discussed. Overall, these results highlight the advantages of applying modern methods in natural products-based research to accelerate the HT discovery of therapeutic leads and/or new molecular structures using LC-MS-UV-ELSD based libraries. PMID:22129061

  12. The NKI-Rockland Sample: A Model for Accelerating the Pace of Discovery Science in Psychiatry

    PubMed Central

    Nooner, Kate Brody; Colcombe, Stanley J.; Tobe, Russell H.; Mennes, Maarten; Benedict, Melissa M.; Moreno, Alexis L.; Panek, Laura J.; Brown, Shaquanna; Zavitz, Stephen T.; Li, Qingyang; Sikka, Sharad; Gutman, David; Bangaru, Saroja; Schlachter, Rochelle Tziona; Kamiel, Stephanie M.; Anwar, Ayesha R.; Hinz, Caitlin M.; Kaplan, Michelle S.; Rachlin, Anna B.; Adelsberg, Samantha; Cheung, Brian; Khanuja, Ranjit; Yan, Chaogan; Craddock, Cameron C.; Calhoun, Vincent; Courtney, William; King, Margaret; Wood, Dylan; Cox, Christine L.; Kelly, A. M. Clare; Di Martino, Adriana; Petkova, Eva; Reiss, Philip T.; Duan, Nancy; Thomsen, Dawn; Biswal, Bharat; Coffey, Barbara; Hoptman, Matthew J.; Javitt, Daniel C.; Pomara, Nunzio; Sidtis, John J.; Koplewicz, Harold S.; Castellanos, Francisco Xavier; Leventhal, Bennett L.; Milham, Michael P.

    2012-01-01

    The National Institute of Mental Health strategic plan for advancing psychiatric neuroscience calls for an acceleration of discovery and the delineation of developmental trajectories for risk and resilience across the lifespan. To attain these objectives, sufficiently powered datasets with broad and deep phenotypic characterization, state-of-the-art neuroimaging, and genetic samples must be generated and made openly available to the scientific community. The enhanced Nathan Kline Institute-Rockland Sample (NKI-RS) is a response to this need. NKI-RS is an ongoing, institutionally centered endeavor aimed at creating a large-scale (N > 1000), deeply phenotyped, community-ascertained, lifespan sample (ages 6–85 years old) with advanced neuroimaging and genetics. These data will be shared publicly, openly, and prospectively (i.e., on a weekly basis). Herein, we describe the conceptual basis of the NKI-RS, including study design, sampling considerations, and steps to synchronize phenotypic and neuroimaging assessment. Additionally, we describe our process for sharing the data with the scientific community while protecting participant confidentiality, maintaining an adequate database, and certifying data integrity. The pilot phase of the NKI-RS, including challenges in recruiting, characterizing, imaging, and sharing data, is discussed while also explaining how this experience informed the final design of the enhanced NKI-RS. It is our hope that familiarity with the conceptual underpinnings of the enhanced NKI-RS will facilitate harmonization with future data collection efforts aimed at advancing psychiatric neuroscience and nosology. PMID:23087608

  13. Live Cell in Vitro and in Vivo Imaging Applications: Accelerating Drug Discovery

    PubMed Central

    Isherwood, Beverley; Timpson, Paul; McGhee, Ewan J; Anderson, Kurt I; Canel, Marta; Serrels, Alan; Brunton, Valerie G; Carragher, Neil O

    2011-01-01

    Dynamic regulation of specific molecular processes and cellular phenotypes in live cell systems reveals unique insights into cell fate and drug pharmacology that are not gained from traditional fixed endpoint assays. Recent advances in microscopic imaging platform technology combined with the development of novel optical biosensors and sophisticated image analysis solutions have increased the scope of live cell imaging applications in drug discovery. We highlight recent literature examples where live cell imaging has uncovered novel insight into biological mechanism or drug mode-of-action. We survey distinct types of optical biosensors and associated analytical methods for monitoring molecular dynamics, in vitro and in vivo. We describe the recent expansion of live cell imaging into automated target validation and drug screening activities through the development of dedicated brightfield and fluorescence kinetic imaging platforms. We provide specific examples of how temporal profiling of phenotypic response signatures using such kinetic imaging platforms can increase the value of in vitro high-content screening. Finally, we offer a prospective view of how further application and development of live cell imaging technology and reagents can accelerate preclinical lead optimization cycles and enhance the in vitro to in vivo translation of drug candidates. PMID:24310493

  14. Discovery of novel genes and gene isoforms by integrating transcriptomic and proteomic profiling from mouse liver.

    PubMed

    Wu, Peng; Zhang, Hongyu; Lin, Weiran; Hao, Yunwei; Ren, Liangliang; Zhang, Chengpu; Li, Ning; Wei, Handong; Jiang, Ying; He, Fuchu

    2014-05-01

    Comprehensively identifying gene expression at both the transcriptomic and proteomic levels of one tissue is a prerequisite for a deeper understanding of its biological functions. Alternative splicing and RNA editing, two main forms of transcriptional processing, play important roles in transcriptome and proteome diversity and result in multiple isoforms for one gene, which are hard to identify by mass spectrometry (MS)-based proteomics approaches due to the relative lack of isoform information in standard protein databases. In our study, we applied MS and RNA-Seq in parallel to mouse liver tissue and captured a considerable catalogue of both transcripts and proteins that, respectively, covered 60 and 34% of protein-coding genes in Ensembl. We then developed a bioinformatics workflow for building a customized protein database that for the first time included new splicing-derived peptides and RNA-editing-caused peptide variants, allowing us to more completely identify protein isoforms. Using this experimentally determined database, we identified a total of 150 peptides not present in standard biological databases at a false discovery rate of <1%, corresponding to 72 novel splicing isoforms, 43 new genetic regions, and 15 RNA-editing sites. Of these, 11 randomly selected novel events passed experimental verification by PCR and Sanger sequencing. New discoveries of gene products with high confidence at two omics levels demonstrated the robustness and effectiveness of our approach and its potential application to improving genome annotation. All the MS data have been deposited to the iProx ( http://ww.iprox.org ) with the identifier IPX00003601.

  15. Accelerating target discovery using pre-competitive open science—patients need faster innovation more than anyone else

    PubMed Central

    Low, Eric; Bountra, Chas; Lee, Wen Hwa

    2016-01-01

    We are experiencing a new era enabled by unencumbered access to high quality data through the emergence of open science initiatives in the historically challenging area of early stage drug discovery. At the same time, many patient-centric organisations are taking matters into their own hands by participating in, enabling and funding research. Here we present the rationale behind the innovative partnership between the Structural Genomics Consortium (SGC)—an open, pre-competitive pre-clinical research consortium and the research-focused patient organisation Myeloma UK to create a new, comprehensive platform to accelerate the discovery and development of new treatments for multiple myeloma. PMID:27594912

  16. Accelerating target discovery using pre-competitive open science-patients need faster innovation more than anyone else.

    PubMed

    Low, Eric; Bountra, Chas; Lee, Wen Hwa

    2016-01-01

    We are experiencing a new era enabled by unencumbered access to high quality data through the emergence of open science initiatives in the historically challenging area of early stage drug discovery. At the same time, many patient-centric organisations are taking matters into their own hands by participating in, enabling and funding research. Here we present the rationale behind the innovative partnership between the Structural Genomics Consortium (SGC)-an open, pre-competitive pre-clinical research consortium and the research-focused patient organisation Myeloma UK to create a new, comprehensive platform to accelerate the discovery and development of new treatments for multiple myeloma. PMID:27594912

  17. Empowering Accelerated Personal, Professional and Scholarly Discovery among Information Seekers: An Educational Vision

    ERIC Educational Resources Information Center

    Harmon, Glynn

    2013-01-01

    The term discovery applies herein to the successful outcome of inquiry in which a significant personal, professional or scholarly breakthrough or insight occurs, and which is individually or socially acknowledged as a key contribution to knowledge. Since discoveries culminate at fixed points in time, discoveries can serve as an outcome metric for…

  18. Gene Prioritization for Imaging Genetics Studies Using Gene Ontology and a Stratified False Discovery Rate Approach.

    PubMed

    Patel, Sejal; Park, Min Tae M; Chakravarty, M Mallar; Knight, Jo

    2016-01-01

    Imaging genetics is an emerging field in which associations between genes and neuroimaging-based quantitative phenotypes are used to explore the functional role of genes in neuroanatomy and neurophysiology in the context of healthy function and neuropsychiatric disorders. The main obstacle for researchers in the field is the high dimensionality of the data in both the imaging phenotypes and the genetic variants commonly typed. In this article, we develop a novel method that utilizes Gene Ontology, an online database, to select and prioritize certain genes, employing a stratified false discovery rate (sFDR) approach to investigate their associations with imaging phenotypes. sFDR has the potential to increase power in genome wide association studies (GWAS), and is quickly gaining traction as a method for multiple testing correction. Our novel approach addresses the pressing need in genetic research to move beyond candidate gene studies without being overburdened by a loss of power due to multiple testing. As an example of our methodology, we perform a GWAS of hippocampal volume using both the Enhancing NeuroImaging Genetics through Meta-Analysis (ENIGMA2) and the Alzheimer's Disease Neuroimaging Initiative datasets. The analysis of ENIGMA2 data yielded a set of SNPs with sFDR values between 10 and 20%. Our approach demonstrates a potential method to prioritize genes based on biological systems impaired in a disease.

  19. Gene Prioritization for Imaging Genetics Studies Using Gene Ontology and a Stratified False Discovery Rate Approach

    PubMed Central

    Patel, Sejal; Park, Min Tae M.; Chakravarty, M. Mallar; Knight, Jo

    2016-01-01

    Imaging genetics is an emerging field in which associations between genes and neuroimaging-based quantitative phenotypes are used to explore the functional role of genes in neuroanatomy and neurophysiology in the context of healthy function and neuropsychiatric disorders. The main obstacle for researchers in the field is the high dimensionality of the data in both the imaging phenotypes and the genetic variants commonly typed. In this article, we develop a novel method that utilizes Gene Ontology, an online database, to select and prioritize certain genes, employing a stratified false discovery rate (sFDR) approach to investigate their associations with imaging phenotypes. sFDR has the potential to increase power in genome wide association studies (GWAS), and is quickly gaining traction as a method for multiple testing correction. Our novel approach addresses the pressing need in genetic research to move beyond candidate gene studies without being overburdened by a loss of power due to multiple testing. As an example of our methodology, we perform a GWAS of hippocampal volume using both the Enhancing NeuroImaging Genetics through Meta-Analysis (ENIGMA2) and the Alzheimer's Disease Neuroimaging Initiative datasets. The analysis of ENIGMA2 data yielded a set of SNPs with sFDR values between 10 and 20%. Our approach demonstrates a potential method to prioritize genes based on biological systems impaired in a disease. PMID:27092072
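
    The stratified FDR idea in the abstract above can be illustrated with a minimal sketch. It assumes the simplest reading of sFDR: SNPs are split into strata (for example, SNPs in prioritized genes versus all others) and Benjamini-Hochberg q-values are computed within each stratum. The SNP ids, stratum labels, p-values, and function names are hypothetical, and the authors' actual procedure may differ in detail.

```python
from collections import defaultdict


def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def stratified_fdr(pvalues, strata):
    """Compute BH q-values separately within each stratum.

    pvalues: dict mapping SNP id -> association p-value
    strata:  dict mapping SNP id -> stratum label (hypothetical labels)
    """
    groups = defaultdict(list)
    for snp in pvalues:
        groups[strata[snp]].append(snp)
    result = {}
    for snps in groups.values():
        result.update(zip(snps, bh_qvalues([pvalues[s] for s in snps])))
    return result


# Toy data: two SNPs in prioritized genes, four background SNPs.
pvals = {"rs1": 0.001, "rs2": 0.02, "rs3": 0.03,
         "rs4": 0.5, "rs5": 0.8, "rs6": 0.04}
strata = {"rs1": "prioritized", "rs2": "prioritized", "rs3": "background",
          "rs4": "background", "rs5": "background", "rs6": "background"}

sfdr_q = stratified_fdr(pvals, strata)
pooled_q = dict(zip(pvals, bh_qvalues(list(pvals.values()))))
print(sfdr_q["rs2"], pooled_q["rs2"])
```

    In this toy data the prioritized SNP rs2 receives a smaller q-value within its stratum than in the pooled analysis, which is the sense in which stratification can recover power when the prioritized stratum is enriched for true associations.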

  20. cudaMap: a GPU accelerated program for gene expression connectivity mapping

    PubMed Central

    2013-01-01

    Background: Modern cancer research often involves large datasets and the use of sophisticated statistical techniques. Together these add a heavy computational load to the analysis, which is often coupled with issues surrounding data accessibility. Connectivity mapping is an advanced bioinformatic and computational technique dedicated to therapeutics discovery and drug re-purposing around differential gene expression analysis. On a normal desktop PC, it is common for the connectivity mapping task with a single gene signature to take > 2h to complete using sscMap, a popular Java application that runs on standard CPUs (Central Processing Units). Here, we describe new software, cudaMap, which has been implemented using CUDA C/C++ to harness the computational power of NVIDIA GPUs (Graphics Processing Units) to greatly reduce processing times for connectivity mapping. Results: cudaMap can identify candidate therapeutics from the same signature in just over thirty seconds when using an NVIDIA Tesla C2050 GPU. Results from the analysis of multiple gene signatures, which would previously have taken several days, can now be obtained in as little as 10 minutes, greatly facilitating candidate therapeutics discovery with high throughput. We are able to demonstrate dramatic speed differentials between GPU assisted performance and CPU executions as the computational load increases for high accuracy evaluation of statistical significance. Conclusion: Emerging ‘omics’ technologies are constantly increasing the volume of data and information to be processed in all areas of biomedical research. Embracing the multicore functionality of GPUs represents a major avenue of local accelerated computing. cudaMap will make a strong contribution in the discovery of candidate therapeutics by enabling speedy execution of heavy duty connectivity mapping tasks, which are increasingly required in modern cancer research. cudaMap is open source and can be freely downloaded from http
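
    To give a feel for why connectivity mapping is computationally heavy, the following is a simplified CPU sketch of the general idea: score a gene signature against a drug reference profile, then estimate significance by scoring many random signatures. It is not the code of sscMap or cudaMap; the scoring rule (a sum of signed ranks), the gene names, and the function names are illustrative assumptions.

```python
import random


def connection_score(reference_ranks, signature):
    """Sum of signed reference ranks, weighted by signature direction.

    reference_ranks: dict gene -> signed rank in a drug reference profile
                     (positive = up-regulated by the drug)
    signature:       dict gene -> +1 (up in disease) or -1 (down in disease)
    """
    return sum(reference_ranks[gene] * sign for gene, sign in signature.items())


def permutation_pvalue(reference_ranks, signature, n_perm=2000, seed=0):
    """Two-sided empirical p-value from random signatures of the same size."""
    rng = random.Random(seed)
    observed = abs(connection_score(reference_ranks, signature))
    genes = sorted(reference_ranks)
    hits = 0
    # Each random signature is scored independently of the others.
    for _ in range(n_perm):
        random_sig = {g: rng.choice((1, -1))
                      for g in rng.sample(genes, len(signature))}
        if abs(connection_score(reference_ranks, random_sig)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical reference profile: 20 genes with signed ranks -10..-1 and 1..10.
signed_ranks = list(range(-10, 0)) + list(range(1, 11))
reference = {f"g{i}": r for i, r in enumerate(signed_ranks, start=1)}

# Hypothetical disease signature that the drug profile largely reverses.
signature = {"g20": -1, "g19": -1, "g18": -1, "g1": 1}

score = connection_score(reference, signature)
p_value = permutation_pvalue(reference, signature)
print(score, p_value)
```

    A strongly negative score suggests the drug profile opposes the signature. The cost grows with the number of random signatures, reference profiles, and query signatures, and each random signature is scored independently, which is the kind of workload that suits a GPU; whether cudaMap parallelizes exactly this loop is an assumption here.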

  1. Mining gene-centric relationships from literature: the roles of gene mutation and gene expression in supporting drug discovery.

    PubMed

    Tari, Luis; Patel, Jagruti; Küntzer, Jan; Li, Ying; Peng, Zhengwei; Wang, Yuan; Aguiar, Laura; Cai, James

    2014-01-01

    Identifying drug target candidates is an important task for early development throughout the drug discovery process. This process is supported by the development of new high-throughput technologies that enable better understanding of disease mechanism. It becomes critical to facilitate effective analysis of the large amount of biological data. However, with much of the biological knowledge represented in the literature in the form of natural text, analysis and interpretation of high-throughput data has not reached its potential effectiveness. In this paper, we describe our solution in employing text mining as a technique in finding scientific information for target and biomarker discovery from the biomedical literature. Our approach utilises natural language processing techniques to capture linguistic patterns for the extraction of biological knowledge from text. Additionally, we discuss how the extracted knowledge is used for the analysis of biological data such as next-generation sequencing and gene expression data.

  2. Prioritization of neurodevelopmental disease genes by discovery of new mutations.

    PubMed

    Hoischen, Alexander; Krumm, Niklas; Eichler, Evan E

    2014-06-01

    Advances in genome sequencing technologies have begun to revolutionize neurogenetics, allowing the full spectrum of genetic variation to be better understood in relation to disease. Exome sequencing of hundreds to thousands of samples from patients with autism spectrum disorder, intellectual disability, epilepsy and schizophrenia provides strong evidence of the importance of de novo and gene-disruptive events. There are now several hundred new candidate genes and targeted resequencing technologies that allow screening of dozens of genes in tens of thousands of individuals with high specificity and sensitivity. The decision of which genes to pursue depends on many factors, including recurrence, previous evidence of overlap with pathogenic copy number variants, the position of the mutation in the protein, the mutational burden among healthy individuals and membership of the candidate gene in disease-implicated protein networks. We discuss these emerging criteria for gene prioritization and the potential impact on the field of neuroscience. PMID:24866042

  3. Prioritization of neurodevelopmental disease genes by discovery of new mutations

    PubMed Central

    Hoischen, Alexander; Krumm, Niklas; Eichler, Evan E.

    2014-01-01

    Advances in genome sequencing technologies have begun to revolutionize neurogenetics, allowing the full spectrum of genetic variation to be better understood in relationship to disease. Exome sequencing of hundreds to thousands of samples from patients with autism spectrum disorder, intellectual disability, epilepsy, and schizophrenia provides strong evidence of the importance of de novo and gene-disruptive events. There are now several hundred new candidate genes and targeted resequencing technologies that allow screening of dozens of genes in tens of thousands of individuals with high specificity and sensitivity. The decision of which genes to pursue depends on numerous factors including recurrence, prior evidence of overlap with pathogenic copy number variants, the position of the mutation within the protein, the mutational burden among healthy individuals, and membership of the candidate gene within disease-implicated protein networks. We discuss these emerging criteria for gene prioritization and the potential impact on the field of neuroscience. PMID:24866042

  4. New approaches to gene discovery with animal models of obesity and diabetes.

    PubMed

    Collier, Greg; Walder, Ken; De Silva, Andrea; Tenne-Brown, Janette; Sanigorski, Andrew; Segal, David; Kantham, Lakshmi; Augert, Guy

    2002-06-01

    DNA-based approaches to the discovery of genes contributing to the development of type 2 diabetes have not been very successful despite substantial investments of time and money. The multiple gene-gene and gene-environment interactions that influence the development of type 2 diabetes mean that DNA approaches are not the ideal tool for defining the etiology of this complex disease. Gene expression-based technologies may prove to be a more rewarding strategy to identify diabetes candidate genes. There are a number of RNA-based technologies available to identify genes that are differentially expressed in various tissues in type 2 diabetes. These include differential display polymerase chain reaction (ddPCR), suppression subtractive hybridization (SSH), and cDNA microarrays. The power of new technologies to detect differential gene expression is ideally suited to studies utilizing appropriate animal models of human disease. We have shown that the gene expression approach, in combination with an excellent animal model such as the Israeli sand rat (Psammomys obesus), can provide novel genes and pathways that may be important in the disease process and provide novel therapeutic approaches. This paper will describe a new gene discovery, beacon, a novel gene linked with energy intake. As the functional characterization of novel genes discovered in our laboratory using this approach continues, it is anticipated that we will soon be able to compile a definitive list of genes that are important in the development of obesity and type 2 diabetes.

  5. Recurrent and pathological gene fusions in breast cancer: current advances in genomic discovery and clinical implications.

    PubMed

    Veeraraghavan, Jamunarani; Ma, Jiacheng; Hu, Yiheng; Wang, Xiao-Song

    2016-07-01

    Gene fusions have long been considered principally as the oncogenic events of hematologic malignancies, but have recently gained wide attention in solid tumors due to several milestone discoveries and the advancement of deep sequencing technologies. With the progress in deep sequencing studies of breast cancer transcriptomes and genomes, the discovery of recurrent and pathological gene fusions in breast cancer has come into focus. Recently, driven by new deep sequencing studies, several recurrent or pathological gene fusions have been identified in breast cancer, including ESR1-CCDC170, SEC16A-NOTCH1, SEC22B-NOTCH2, and ESR1-YAP1. More importantly, most of these gene fusions are preferentially identified in the more aggressive breast cancers, such as luminal B, basal-like, or endocrine-resistant breast cancer, suggesting recurrent gene fusions as additional key driver events in these tumors other than the known drivers such as the estrogen receptor. In this paper, we have comprehensively summarized the newly identified recurrent or pathological gene fusion events in breast cancer, reviewed the contributions of new genomic and deep sequencing technologies to new fusion discovery and the integrative bioinformatics tools to analyze these data, highlighted the biological relevance and clinical implications of these fusion discoveries, and discussed future directions of gene fusion research in breast cancer. PMID:27372070

  6. Computational discovery of gene modules and regulatory networks.

    PubMed

    Bar-Joseph, Ziv; Gerber, Georg K; Lee, Tong Ihn; Rinaldi, Nicola J; Yoo, Jane Y; Robert, François; Gordon, D Benjamin; Fraenkel, Ernest; Jaakkola, Tommi S; Young, Richard A; Gifford, David K

    2003-11-01

    We describe an algorithm for discovering regulatory networks of gene modules, GRAM (Genetic Regulatory Modules), that combines information from genome-wide location and expression data sets. A gene module is defined as a set of coexpressed genes to which the same set of transcription factors binds. Unlike previous approaches that relied primarily on functional information from expression data, the GRAM algorithm explicitly links genes to the factors that regulate them by incorporating DNA binding data, which provide direct physical evidence of regulatory interactions. We use the GRAM algorithm to describe a genome-wide regulatory network in Saccharomyces cerevisiae using binding information for 106 transcription factors profiled in rich medium conditions and data from over 500 expression experiments. We also present a genome-wide location analysis data set for regulators in yeast cells treated with rapamycin, and use the GRAM algorithm to provide biological insights into this regulatory network.
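
    The module definition in the abstract above (coexpressed genes bound by the same set of transcription factors) can be made concrete with a small sketch. This is not the GRAM algorithm itself, which is more involved; it only groups genes by identical bound-factor sets and keeps those whose expression correlates with the group mean. The thresholds, gene and factor names, and function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def find_modules(binding_pvalues, expression, p_cut=0.001, r_cut=0.8, min_size=2):
    """Group genes by bound-factor set, then keep the coexpressed members.

    binding_pvalues: dict gene -> {factor: binding p-value}
    expression:      dict gene -> list of expression values across conditions
    """
    by_factor_set = defaultdict(list)
    for gene, factor_p in binding_pvalues.items():
        bound = frozenset(tf for tf, p in factor_p.items() if p <= p_cut)
        if bound:
            by_factor_set[bound].append(gene)

    modules = {}
    for bound, genes in by_factor_set.items():
        if len(genes) < min_size:
            continue
        n_cond = len(expression[genes[0]])
        mean = [sum(expression[g][j] for g in genes) / len(genes)
                for j in range(n_cond)]
        core = sorted(g for g in genes if pearson(expression[g], mean) >= r_cut)
        if len(core) >= min_size:
            modules[bound] = core
    return modules


# Toy data: three genes bound by both factors, one of them anti-correlated.
binding = {
    "gene1": {"TF_A": 0.0001, "TF_B": 0.0005},
    "gene2": {"TF_A": 0.0002, "TF_B": 0.0001},
    "gene3": {"TF_A": 0.0003, "TF_B": 0.0004},
    "gene4": {"TF_A": 0.0001, "TF_B": 0.5},
    "gene5": {"TF_A": 0.2, "TF_B": 0.9},
}
expression = {
    "gene1": [1.0, 2.0, 3.0, 4.0],
    "gene2": [1.1, 2.1, 2.9, 4.2],
    "gene3": [4.0, 3.0, 2.0, 1.0],
    "gene4": [0.5, 0.4, 0.6, 0.5],
    "gene5": [2.0, 2.0, 2.1, 1.9],
}

modules = find_modules(binding, expression)
print(modules)
```

    Here gene3 shares the binding pattern of gene1 and gene2 but is excluded from the module because its expression runs opposite to theirs, which shows why binding data and expression data are combined rather than used separately.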

  7. Discovery of Cationic Polymers for Non-viral Gene Delivery using Combinatorial Approaches

    PubMed Central

    Barua, Sutapa; Ramos, James; Potta, Thrimoorthy; Taylor, David; Huang, Huang-Chiao; Montanez, Gabriela; Rege, Kaushal

    2015-01-01

    Gene therapy is an attractive treatment option for diseases of genetic origin, including several cancers and cardiovascular diseases. While viruses are effective vectors for delivering exogenous genes to cells, concerns related to insertional mutagenesis, immunogenicity, lack of tropism, decay and high production costs necessitate the discovery of non-viral methods. Significant efforts have been focused on cationic polymers as non-viral alternatives for gene delivery. Recent studies have employed combinatorial syntheses and parallel screening methods for enhancing the efficacy of gene delivery, biocompatibility of the delivery vehicle, and overcoming cellular level barriers as they relate to polymer-mediated transgene uptake, transport, transcription, and expression. This review summarizes and discusses recent advances in combinatorial syntheses and parallel screening of cationic polymer libraries for the discovery of efficient and safe gene delivery systems. PMID:21843141

  8. Power of deep, all-exon resequencing for discovery of human trait genes

    PubMed Central

    Kryukov, Gregory V.; Shpunt, Alexander; Stamatoyannopoulos, John A.; Sunyaev, Shamil R.

    2009-01-01

    The ability to sequence cost-effectively all of the coding regions of a given individual genome is rapidly approaching, with the potential for whole-genome resequencing not far behind. Initiatives are currently underway to phenotype hundreds of thousands of individuals for major human traits. Here, we determine the power for de novo discovery of genes related to human traits by resequencing all human exons in a clinical population. We analyze the potential of the gene discovery strategy that combines multiple rare variants from the same gene and treats genes, rather than individual alleles, as the units for the association test. By using computer simulations based on deep resequencing data for the European population, we show that genes meaningfully affecting a human trait can be identified in an unbiased fashion, although large sample sizes would be required to achieve substantial power. PMID:19202052
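
    The gene-level strategy described above, in which rare variants from the same gene are combined and the gene is the unit of the association test, can be sketched as follows. The abstract does not specify the test statistic, so this sketch assumes a simple case/control carrier comparison with a one-sided Fisher exact test; the individuals, variants, frequencies, and function names are hypothetical.

```python
from collections import defaultdict
from math import comb


def fisher_one_sided(a, b, c, d):
    """P(at least `a` carriers among cases) for the table [[a, b], [c, d]].

    Rows are cases and controls; columns are carriers and non-carriers.
    """
    n_cases, n_carriers, n = a + b, a + c, a + b + c + d
    upper = min(n_cases, n_carriers)
    tail = sum(comb(n_cases, k) * comb(n - n_cases, n_carriers - k)
               for k in range(a, upper + 1))
    return tail / comb(n, n_carriers)


def gene_burden_test(carried, is_case, variant_gene, variant_freq, max_freq=0.01):
    """Collapse rare variants per gene and test carrier status against case status.

    carried:      dict person -> set of variant ids that person carries
    is_case:      dict person -> True for cases, False for controls
    variant_gene: dict variant id -> gene
    variant_freq: dict variant id -> population allele frequency
    """
    rare_by_gene = defaultdict(set)
    for variant, gene in variant_gene.items():
        if variant_freq[variant] <= max_freq:
            rare_by_gene[gene].add(variant)

    n_cases = sum(is_case.values())
    n_controls = len(is_case) - n_cases
    pvalues = {}
    for gene, rare in rare_by_gene.items():
        a = sum(1 for p, vs in carried.items() if is_case[p] and vs & rare)
        c = sum(1 for p, vs in carried.items() if not is_case[p] and vs & rare)
        pvalues[gene] = fisher_one_sided(a, n_cases - a, c, n_controls - c)
    return pvalues


# Toy cohort: five cases, five controls.
is_case = {f"case{i}": True for i in range(1, 6)}
is_case.update({f"ctrl{i}": False for i in range(1, 6)})
carried = {person: set() for person in is_case}
carried["case1"] = {"v1"}
carried["case2"] = {"v2"}
carried["case3"] = {"v3"}
carried["case4"] = {"v1", "v4"}
carried["ctrl1"] = {"v4"}

variant_gene = {"v1": "GENE1", "v2": "GENE1", "v3": "GENE1", "v4": "GENE2"}
variant_freq = {"v1": 0.004, "v2": 0.002, "v3": 0.001, "v4": 0.30}

pvalues = gene_burden_test(carried, is_case, variant_gene, variant_freq)
print(pvalues)
```

    In this toy cohort no single GENE1 variant is carried by more than two people, yet four of five cases carry some rare GENE1 variant and no control does, so the signal only becomes visible once variants are pooled at the gene level. With realistic effect sizes the evidence per gene is far weaker, consistent with the abstract's point that large samples are needed.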

  9. GENOME-ENABLED DISCOVERY OF CARBON SEQUESTRATION GENES IN POPLAR

    SciTech Connect

    DAVIS J M

    2007-10-11

    Plants utilize carbon by partitioning the reduced carbon obtained through photosynthesis into different compartments and into different chemistries within a cell and subsequently allocating such carbon to sink tissues throughout the plant. Since the phytohormones auxin and cytokinin are known to influence sink strength in tissues such as roots (Skoog & Miller 1957, Nordstrom et al. 2004), we hypothesized that altering the expression of genes that regulate auxin-mediated (e.g., AUX/IAA or ARF transcription factors) or cytokinin-mediated (e.g., RR transcription factors) control of root growth and development would impact carbon allocation and partitioning belowground (Fig. 1 - Renewal Proposal). Specifically, the ARF, AUX/IAA and RR transcription factor gene families mediate the effects of the growth regulators auxin and cytokinin on cell expansion, cell division and differentiation into root primordia. Invertases (IVR), whose transcript abundance is enhanced by both auxin and cytokinin, are critical components of carbon movement and therefore of carbon allocation. Thus, we initiated comparative genomic studies to identify the AUX/IAA, ARF, RR and IVR gene families in the Populus genome that could impact carbon allocation and partitioning. Bioinformatics searches using Arabidopsis gene sequences as queries identified regions with high degrees of sequence similarities in the Populus genome. These Populus sequences formed the basis of our transgenic experiments. Transgenic modification of gene expression involving members of these gene families was hypothesized to have profound effects on carbon allocation and partitioning.

  10. Candidate gene discovery and prioritization in rare diseases.

    PubMed

    Jegga, Anil G

    2014-01-01

    A rare or orphan disorder is any disease that affects a small percentage of the population. Most genes and pathways underlying these disorders remain unknown. High-throughput techniques are frequently applied to detect disease candidate genes. Recent technological advances have made sequencing fast and affordable, but these advantages are accompanied by a data deluge. Furthermore, experimental validation of disease candidate genes is both time-consuming and expensive. Therefore, several computational approaches have been developed to identify the most promising candidates for follow-up studies. Based on the guilt-by-association principle, most of these approaches use prior knowledge about a disease of interest to discover and rank novel candidate genes. This chapter provides a brief overview of some of the in silico strategies for candidate gene prioritization. To demonstrate their utility in rare disease research, typical queries are run on a Web-based suite of computational tools that uses integrated heterogeneous data sources to rank disease candidate genes.

  11. [CSNP discovery by two-dimensional gene scanning (TDGS)].

    PubMed

    Suh, Y

    2001-04-21

    A challenge in the post-genomic era is to use genetic information to correlate individual gene variations (SNPs; single nucleotide polymorphisms, the most common form of genetic variation) with medically important parameters, such as disease susceptibility, individual responses to drugs and prognostic variables. What is missing is a high-throughput technology to identify all possible SNPs in essentially all human genes in population-based studies with high accuracy and speed in a cost-effective manner. Most tests advocated for their high throughput at low cost are actually SNP screening tests. That is, they screen samples for the presence of one or multiple previously identified SNPs. Such screening methods are only useful when all relevant SNPs in the genes of interest are known, which is presently not the case. Moreover, the usefulness of any one SNP varies enormously from population to population. Indeed, for the Korean population, where no information on possible sequence variation is available, a mutational scanning method capable of detecting all possible gene variations will be most useful. Two-Dimensional Gene Scanning (TDGS) is a high-throughput platform that enables the detection of all possible SNPs in an entire gene in one gel under one set of conditions, with high sensitivity and specificity. TDGS is based on automated two-dimensional (2-D) DNA electrophoresis according to size and base pair sequence to detect DNA fragments containing all possible variations. Because the 2-D format permits the analysis of as many as 40 fragments of 250 bp on average in parallel, TDGS allows extensive multiplex PCR (megaplex PCR), i.e. up to 26 fragments in one single reaction, resulting in a significant cost reduction. TDGS tests are designed using a computer program to optimally position PCR primers around the relevant target sequences (exons). A simple automated 2-D instrument allows scanning all exons of a large gene in 8 different samples well within 3 hours.
Using three

  12. Epi4K: Gene discovery in 4,000 genomes

    PubMed Central

    2012-01-01

    A major challenge in epilepsy research is to unravel the complex genetic mechanisms underlying both common and rare forms of epilepsy, as well as the genetic determinants of response to treatment. To accelerate progress in this area, the National Institute of Neurological Disorders and Stroke (NINDS) recently offered funding for the creation of a “Center without Walls” to focus on the genetics of human epilepsy. This article describes Epi4K, the collaborative study supported through this grant mechanism, which aims to analyze the genomes of a minimum of 4,000 subjects with highly selected and well-characterized epilepsy. PMID:22642626

  13. Cryptococcus neoformans virulence gene discovery through insertional mutagenesis.

    PubMed

    Idnurm, Alexander; Reedy, Jennifer L; Nussbaum, Jesse C; Heitman, Joseph

    2004-04-01

    Insertional mutagenesis was applied to Cryptococcus neoformans to identify genes associated with virulence attributes. Using biolistic transformation, we generated 4,300 nourseothricin (NAT)-resistant strains, of which 590 exhibited stable resistance. We focused on mutants with defects in established virulence factors and identified two with reduced growth at 37 degrees C, four with reduced production of the antioxidant pigment melanin, and two with an increased sensitivity to nitric oxide (NO). The NAT insertion and mutant phenotypes were genetically linked in five of eight mutants, and the DNA flanking the insertions was characterized. For the strains with altered growth at 37 degrees C and altered melanin production, mutations were in previously uncharacterized genes, while the two NO-sensitive strains bore insertions in the flavohemoglobin gene FHB1, whose product counters NO stress. Because of the frequent instability of nourseothricin resistance associated with biolistic transformation, Agrobacterium-mediated transformation was tested. This transkingdom DNA delivery approach produced 100% stable nourseothricin-resistant transformants, and three melanin-defective strains were identified from 576 transformants, of which 2 were linked to NAT in segregation analysis. One of these mutants contained a T-DNA insertion in the promoter of the LAC1 (laccase) gene, which encodes a key enzyme required for melanin production, while the second contained an insertion in the promoter of the CLC1 gene, encoding a voltage-gated chloride channel. Clc1 and its homologs are required for ion homeostasis, and in their absence Cu+ transport into the secretory pathway is compromised, depriving laccase and other Cu(+)-dependent proteins of their essential cofactor. The NAT resistance cassette was optimized for cryptococcal codon usage and GC content and was then used to disrupt a mitogen-activated protein kinase gene, a predicted gene, and two putative chloride channel genes to

  14. Java Based Tool To Explore The Discovery Of Dark Energy And The Accelerated Expansion Of The Universe

    NASA Astrophysics Data System (ADS)

    Mijic, Milan; Lim, R.; Hu, Z.; Park, D.; Wells, D.; Wong, F.; Perrault, S.; Shvarts, E.; Levitin, S.; Rios, M.; Kang, E. Y. E.; Longson, T.

    2008-05-01

    The discovery of the accelerated expansion of the universe through observations of high-redshift supernovae, and its implication that Dark Energy is the dominant component of our universe, surely counts as one of the most important moments in the entire history of physics and astronomy. This discovery has great appeal to the general public, both because of the heroic lore of observing distant supernovae and because of the strange relativistic properties of the Dark Energy. To bring this development to non-professionals, the Cal State L.A. Science Visualization project developed an easy-to-use Java-based tool, which may be used in college, pre-college or public science education. The tool uses multimedia presentations, such as graphs or images, to simulate the search for and observations of high-redshift supernovae, and interactively leads the user to discover the fluid content of the generated model universe. Model universes are selected in a semi-random manner, which displays a range of interesting possibilities for the effective equation of state, the shape of the Hubble diagram, or the nature of the expansion. The Java-based tool is deployed through Java Web Start for both high-end and low-end terminal users across platforms.

  15. Discovery and classification of homeobox genes in animal genomes.

    PubMed

    Marlétaz, Ferdinand; Paps, Jordi; Maeso, Ignacio; Holland, Peter W H

    2014-01-01

    The diversification of homeobox genes is of great interest to evolutionary and developmental biology. To generate a catalogue of all homeobox genes within species of interest, it is necessary to sequence complete genomes. It is now possible for small research projects and individual laboratories to determine near-complete genome sequences of animal species. We provide bioinformatic methods for assembling draft genome sequences from any animal species, including read filtering and error correction, plus methods for extracting and classifying all homeobox sequences. PMID:25151154

  16. Literature mining for the discovery of hidden connections between drugs, genes and diseases.

    PubMed

    Frijters, Raoul; van Vugt, Marianne; Smeets, Ruben; van Schaik, René; de Vlieg, Jacob; Alkema, Wynand

    2010-09-23

    The scientific literature represents a rich source for retrieval of knowledge on associations between biomedical concepts such as genes, diseases and cellular processes. A commonly used method to establish relationships between biomedical concepts from literature is co-occurrence. Apart from its use in knowledge retrieval, the co-occurrence method is also well-suited to discover new, hidden relationships between biomedical concepts following a simple ABC-principle, in which A and C have no direct relationship, but are connected via shared B-intermediates. In this paper we describe CoPub Discovery, a tool that mines the literature for new relationships between biomedical concepts. Statistical analysis using ROC curves showed that CoPub Discovery performed well over a wide range of settings and keyword thesauri. We subsequently used CoPub Discovery to search for new relationships between genes, drugs, pathways and diseases. Several of the newly found relationships were validated using independent literature sources. In addition, new predicted relationships between compounds and cell proliferation were validated and confirmed experimentally in an in vitro cell proliferation assay. The results show that CoPub Discovery is able to identify novel associations between genes, drugs, pathways and diseases that have a high probability of being biologically valid. This makes CoPub Discovery a useful tool to unravel the mechanisms behind disease, to find novel drug targets, or to find novel applications for existing drugs.
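
    The ABC-principle described above can be illustrated with a minimal sketch. It assumes each abstract has already been reduced to a set of concept tags; the concept names and the toy corpus are hypothetical, and the statistical scoring that CoPub Discovery applies to co-occurrences is not reproduced here.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence(documents):
    """Map each concept to the set of concepts it shares a document with."""
    neighbors = defaultdict(set)
    for concepts in documents:
        for a, b in combinations(sorted(set(concepts)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def hidden_links(neighbors, a):
    """Return {C: shared B-intermediates} for concepts C never seen with A."""
    candidates = defaultdict(set)
    for b in neighbors[a]:
        for c in neighbors[b]:
            # C must differ from A and must not co-occur with A directly.
            if c != a and c not in neighbors[a]:
                candidates[c].add(b)
    return dict(candidates)


# Hypothetical toy corpus: one set of concept tags per abstract.
toy_documents = [
    {"drug_X", "gene_1"},
    {"gene_1", "disease_Y"},
    {"drug_X", "gene_2"},
    {"gene_2", "disease_Y"},
    {"gene_2", "disease_Z"},
]

links = hidden_links(build_cooccurrence(toy_documents), "drug_X")
for concept, intermediates in sorted(links.items()):
    print(concept, sorted(intermediates))
```

    In this toy corpus drug_X never appears with disease_Y or disease_Z, yet both are reachable through shared gene intermediates; the number of intermediates gives a crude ranking signal.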

  17. Antibody discovery ex vivo accelerated by the LacO/LacI regulatory network.

    PubMed

    Yabuki, Munehisa; Cummings, W Jason; Leppard, John B; Immormino, Robert M; Wood, Christi L; Allison, Daniel S; Gray, Patrick W; Tjoelker, Larry W; Maizels, Nancy

    2012-01-01

    Monoclonal antibodies (mAbs) can be potent and highly specific therapeutics, diagnostics and research reagents. Nonetheless, mAb discovery using current in vivo or in vitro approaches can be costly and time-consuming, with no guarantee of success. We have established a platform for rapid discovery and optimization of mAbs ex vivo. This DTLacO platform derives from a chicken B cell line that has been engineered to enable rapid selection and seamless maturation of high affinity mAbs. We have validated the DTLacO platform by generation of high affinity and specific mAbs to five cell surface targets, the receptor tyrosine kinases VEGFR2 and TIE2, the glycoprotein TROP2, the small TNF receptor family member FN14, and the G protein-coupled receptor FZD10. mAb discovery is rapid and humanization is straightforward, establishing the utility of the DTLacO platform for identification of mAbs for therapeutic and other applications. PMID:22558313

  18. Genomic discovery of potent chromatin insulators for human gene therapy.

    PubMed

    Liu, Mingdong; Maurano, Matthew T; Wang, Hao; Qi, Heyuan; Song, Chao-Zhong; Navas, Patrick A; Emery, David W; Stamatoyannopoulos, John A; Stamatoyannopoulos, George

    2015-02-01

    Insertional mutagenesis and genotoxicity, which usually manifest as hematopoietic malignancy, represent major barriers to realizing the promise of gene therapy. Although insulator sequences that block transcriptional enhancers could mitigate or eliminate these risks, so far no human insulators with high functional potency have been identified. Here we describe a genomic approach for the identification of compact sequence elements that function as insulators. These elements are highly occupied by the insulator protein CTCF, are DNase I hypersensitive and represent only a small minority of the CTCF recognition sequences in the human genome. We show that the elements identified acted as potent enhancer blockers and substantially decreased the risk of tumor formation in a cancer-prone animal model. The elements are small, can be efficiently accommodated by viral vectors and have no detrimental effects on viral titers. The insulators we describe here are expected to increase the safety of gene therapy for genetic diseases.

  19. Gene discovery for facioscapulohumeral muscular dystrophy by machine learning techniques.

    PubMed

    González-Navarro, Félix F; Belanche-Muñoz, Lluís A; Gámez-Moreno, María G; Flores-Ríos, Brenda L; Ibarra-Esquer, Jorge E; López-Morteo, Gabriel A

    2016-04-28

    Facioscapulohumeral muscular dystrophy (FSHD) is a neuromuscular disorder that shows a preference for the facial, shoulder and upper arm muscles. FSHD affects about one in 20-400,000 people, and no effective therapeutic strategies are known to halt disease progression or reverse muscle weakness or atrophy. Many genes may be incorrectly regulated in affected muscle tissue, but the mechanisms responsible for the progressive muscle weakness remain largely unknown. Although machine learning (ML) has made significant inroads in biomedical disciplines such as cancer research, no reports have yet addressed FSHD analysis using ML techniques. This study explores a specific FSHD data set from a ML perspective. We report results showing a very promising small group of genes that clearly separates FSHD samples from healthy samples. In addition to numerical prediction figures, we show data visualizations and biological evidence illustrating the potential usefulness of these results. PMID:26960968

  20. Whole-genome resequencing: changing the paradigms of SNP detection, molecular mapping and gene discovery

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The next generation sequencing (NGS) technologies have opened a wealth of opportunities for plant breeding and genomics research, and changed the paradigms of marker detection, genotyping, and gene discovery. Abundant genomic resources have been generated using a whole genome resequencing (WGR) str...

  1. CHEMICAL GENETICS: LIGAND-BASED DISCOVERY OF GENE FUNCTION

    PubMed Central

    Stockwell, Brent R.

    2011-01-01

    Chemical genetics is the study of gene-product function in a cellular or organismal context using exogenous ligands. In this approach, small molecules that bind directly to proteins are used to alter protein function, enabling a kinetic analysis of the in vivo consequences of these changes. Recent advances have strongly enhanced the power of exogenous ligands such that they can resemble genetic mutations in terms of their general applicability and target specificity. The growing sophistication of this approach raises the possibility of its application to any biological process. PMID:11253651

  2. Network-based gene prediction for Plasmodium falciparum malaria towards genetics-based drug discovery

    PubMed Central

    2015-01-01

    Background Malaria is the most deadly parasitic infectious disease. Existing drug treatments have limited efficacy in malaria elimination, and the complex pathogenesis of the disease is not fully understood. Detecting novel malaria-associated genes not only contributes to revealing the disease pathogenesis, but also facilitates the discovery of new targets for anti-malaria drugs. Methods In this study, we developed a network-based approach to predict malaria-associated genes. We constructed a cross-species network to integrate human-human, parasite-parasite and human-parasite protein interactions. Then we extended the random walk algorithm on this network, and used known malaria genes as the seeds to find novel candidate genes for malaria. Results We validated our algorithms using 77 known malaria genes: 14 human genes and 63 parasite genes were ranked on average within the top 2% and top 4% of the human and parasite genomes, respectively. We also evaluated our method for predicting novel malaria genes using a set of 27 genes with supporting evidence in the literature. Our approach ranked 12 genes within the top 1% and 24 genes within the top 5%. In addition, we demonstrated that top-ranked candidate genes were enriched for drug targets, and identified commonalities underlying top-ranked malaria genes through pathway analysis. In summary, the candidate malaria-associated genes predicted by our data-driven approach have the potential to guide genetics-based anti-malaria drug discovery. PMID:26099491
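
    Seed-based ranking on an interaction network can be sketched with a standard random walk with restart. This is the textbook algorithm, not the authors' extended cross-species version; the node names, the toy network and the restart probability of 0.3 are assumptions made for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score nodes by their proximity to the seed set.

    adjacency: dict mapping every node to a list of its neighbors
               (undirected, so each edge is listed in both directions).
    seeds:     set of known disease genes where the walk restarts.
    """
    nodes = sorted(adjacency)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            nbrs = adjacency[u]
            if not nbrs:
                continue  # isolated nodes pass no probability on
            share = (1.0 - restart) * p[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical cross-species toy network: h* are host proteins, p* parasite proteins.
toy_network = {
    "h1": ["h2", "p1"],
    "h2": ["h1", "h3"],
    "h3": ["h2"],
    "p1": ["h1", "p2"],
    "p2": ["p1"],
}

scores = random_walk_with_restart(toy_network, seeds={"h1"})
ranking = sorted((n for n in scores if n != "h1"), key=lambda n: -scores[n])
print(ranking)
```

    Nodes that are not seeds are ranked by their steady-state score, so direct interactors of a seed rank above proteins two steps away.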

  3. Molecular dynamics-based virtual screening: accelerating the drug discovery process by high-performance computing.

    PubMed

    Ge, Hu; Wang, Yu; Li, Chanjuan; Chen, Nanhao; Xie, Yufang; Xu, Mengyan; He, Yingyan; Gu, Xinchun; Wu, Ruibo; Gu, Qiong; Zeng, Liang; Xu, Jun

    2013-10-28

    High-performance computing (HPC) has become a state strategic technology in a number of countries. One hypothesis is that HPC can accelerate biopharmaceutical innovation. Our experimental data demonstrate that HPC can significantly accelerate biopharmaceutical innovation by employing molecular dynamics-based virtual screening (MDVS). Without using HPC, MDVS for a 10K compound library with tens of nanoseconds of MD simulations requires years of computer time. In contrast, a state-of-the-art HPC system can be 600 times faster than an eight-core PC server in screening a typical drug target (which contains about 40K atoms). Also, careful design of the GPU/CPU architecture can reduce the HPC costs. However, the communication cost of parallel computing is a bottleneck that acts as the main limit on further virtual screening improvements for drug innovations.

  4. Cohesin gene mutations in tumorigenesis: from discovery to clinical significance

    PubMed Central

    Solomon, David A.; Kim, Jung-Sik; Waldman, Todd

    2014-01-01

    Cohesin is a multi-protein complex composed of four core subunits (SMC1A, SMC3, RAD21, and either STAG1 or STAG2) that is responsible for the cohesion of sister chromatids following DNA replication until its cleavage during mitosis thereby enabling faithful segregation of sister chromatids into two daughter cells. Recent cancer genomics analyses have discovered a high frequency of somatic mutations in the genes encoding the core cohesin subunits as well as cohesin regulatory factors (e.g. NIPBL, PDS5B, ESPL1) in a select subset of human tumors including glioblastoma, Ewing sarcoma, urothelial carcinoma, acute myeloid leukemia, and acute megakaryoblastic leukemia. Herein we review these studies including discussion of the functional significance of cohesin inactivation in tumorigenesis and potential therapeutic mechanisms to selectively target cancers harboring cohesin mutations. [BMB Reports 2014; 47(6): 299-310] PMID:24856830

  5. Metagenomics and novel gene discovery: promise and potential for novel therapeutics.

    PubMed

    Culligan, Eamonn P; Sleator, Roy D; Marchesi, Julian R; Hill, Colin

    2014-04-01

    Metagenomics provides a means of assessing the total genetic pool of all the microbes in a particular environment, in a culture-independent manner. It has revealed unprecedented diversity in microbial community composition, which is further reflected in the encoded functional diversity of the genomes, a large proportion of which consists of novel genes. Herein, we review both sequence-based and functional metagenomic methods to uncover novel genes and outline some of the associated problems of each type of approach, as well as potential solutions. Furthermore, we discuss the potential for metagenomic biotherapeutic discovery, with a particular focus on the human gut microbiome and finally, we outline how the discovery of novel genes may be used to create bioengineered probiotics.

  6. Pine Gene Discovery Project - Final Report - 08/31/1997 - 02/28/2001

    SciTech Connect

    Whetten, R. W.; Sederoff, R. R.; Kinlaw, C.; Retzel, E.

    2001-04-30

    Integration of pines into the large scope of plant biology research depends on study of pines in parallel with study of annual plants, and on availability of research materials from pine to plant biologists interested in comparing pine with annual plant systems. The objectives of the Pine Gene Discovery Project were to obtain 10,000 partial DNA sequences of genes expressed in loblolly pine, to determine which of those pine genes were similar to known genes from other organisms, and to make the DNA sequences and isolated pine genes available to plant researchers to stimulate integration of pines into the wider scope of plant biology research. Those objectives have been completed, and the results are available to the public. Requests for pine genes have been received from a number of laboratories that would otherwise not have included pine in their research, indicating that progress is being made toward the goal of integrating pine research into the larger molecular biology research community.

  7. The Alveolate Perkinsus marinus: Biological Insights from EST Gene Discovery

    PubMed Central

    2010-01-01

    Background Perkinsus marinus, a protozoan parasite of the eastern oyster Crassostrea virginica, has devastated natural and farmed oyster populations along the Atlantic and Gulf coasts of the United States. It is classified as a member of the Perkinsozoa, a recently established phylum considered close to the ancestor of ciliates, dinoflagellates, and apicomplexans, and a key taxon for understanding unique adaptations (e.g. parasitism) within the Alveolata. Despite intense parasite pressure, no disease-resistant oysters have been identified and no effective therapies have been developed to date. Results To gain insight into the biological basis of the parasite's virulence and pathogenesis mechanisms, and to identify genes encoding potential targets for intervention, we generated >31,000 5' expressed sequence tags (ESTs) derived from four trophozoite libraries generated from two P. marinus strains. Trimming and clustering of the sequence tags yielded 7,863 unique sequences, some of which carry a spliced leader. Similarity searches revealed that 55% of these had hits in protein sequence databases, of which 1,729 had their best hit with proteins from the chromalveolates (E-value ≤ 1e-5). Some sequences are similar to those proven to be targets for effective intervention in other protozoan parasites, and include not only proteases, antioxidant enzymes, and heat shock proteins, but also those associated with relict plastids, such as acetyl-CoA carboxylase and methyl erythritol phosphate pathway components, and those involved in glycan assembly, protein folding/secretion, and parasite-host interactions. Conclusions Our transcriptome analysis of P. marinus, the first for any member of the Perkinsozoa, contributes new insight into its biology and taxonomic position. It provides a very informative, albeit preliminary, glimpse into the expression of genes encoding functionally relevant proteins as potential targets for chemotherapy, and evidence for the presence of a relict

  8. Prediction and accelerated laboratory discovery of previously unknown 18-electron ABX compounds

    NASA Astrophysics Data System (ADS)

    Gautier, Romain; Zhang, Xiuwen; Hu, Linhua; Yu, Liping; Lin, Yuyuan; Sunde, Tor O. L.; Chon, Danbee; Poeppelmeier, Kenneth R.; Zunger, Alex

    2015-04-01

    Chemists and material scientists have often focused on the properties of previously reported compounds, but neglect numerous unreported but chemically plausible compounds that could have interesting properties. For example, the 18-valence electron ABX family of compounds features examples of topological insulators, thermoelectrics and piezoelectrics, but only 83 out of 483 of these possible compounds have been made. Using first-principles thermodynamics we examined the theoretical stability of the 400 unreported members and predict that 54 should be stable. Of those previously unreported ‘missing’ materials now predicted to be stable, 15 were grown in this study; X-ray studies agreed with the predicted crystal structure in all 15 cases. Among the predicted and characterized properties of the missing compounds are potential transparent conductors, thermoelectric materials and topological semimetals. This integrated process—prediction of functionality in unreported compounds followed by laboratory synthesis and characterization—could be a route to the systematic discovery of hitherto missing, realizable functional materials.

  9. Rising to the challenge: accelerated pace of discovery transforms marine virology.

    PubMed

    Brum, Jennifer R; Sullivan, Matthew B

    2015-03-01

    Marine viruses have important roles in microbial mortality, gene transfer, metabolic reprogramming and biogeochemical cycling. In this Review, we discuss recent technological advances in marine virology including the use of near-quantitative, reproducible metagenomics for large-scale investigation of viral communities and the emergence of gene-based viral ecology. We also describe the reprogramming of microbially driven processes by viral metabolic genes, the identification of novel viruses using cultivation-dependent and cultivation-independent tools, and the potential for modelling studies to provide a framework for studying virus-host interactions. These transformative advances have set a rapid pace in exploring and predicting how marine viruses manipulate and respond to their environment.

  10. Prospective calculation of identification power for individual genes in analyses controlling the false discovery rate.

    PubMed

    Crager, Michael R

    2012-12-01

    Recent work on prospective power and sample size calculations for analyses of high-dimension gene expression data that control the false discovery rate (FDR) focuses on the average power over all the truly nonnull hypotheses, or equivalently, the expected proportion of nonnull hypotheses rejected. Using another characterization of power, we adapt Efron's ([2007] Ann Stat 35:1351-1377) empirical Bayes approach to post hoc power calculation to develop a method for prospective calculation of the "identification power" for individual genes. This is the probability that a gene with a given true degree of association with clinical outcome or state will be included in a set within which the FDR is controlled at a specified level. An example calculation using proportional hazards regression highlights the effects of large numbers of genes with little or no association on the identification power for individual genes with substantial association.
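
    As background for what "a set within which the FDR is controlled at a specified level" means, the following minimal sketch applies the standard Benjamini-Hochberg step-up procedure. It is not the empirical Bayes identification-power method of the abstract, and the p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices of hypotheses rejected at FDR level alpha."""
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: keep the largest rank k with p_(k) <= alpha * k / m.
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Hypothetical per-gene p-values from an association analysis.
toy_p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

selected = benjamini_hochberg(toy_p_values, alpha=0.05)
print(selected)
```

    With ten hypotheses, only the two smallest p-values fall under their rank-dependent thresholds (0.005 and 0.010). The threshold for a given rank shrinks as the number of tested genes grows, which is the effect of many null genes on individual-gene power that the abstract highlights.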

  11. From mouse to humans: discovery of the CACNG2 pain susceptibility gene.

    PubMed

    Nissenbaum, J

    2012-10-01

    Chronic pain is a major healthcare problem that affects the daily lives of millions and carries enormous financial costs. The notorious variability of pain and the lack of efficient pain relief pharmaceuticals pose both genetic and therapeutic challenges. Several genetic approaches aim to dissect pain phenotypes into their genetic components. Gene mapping using model organisms for various pain phenotypes has led to the identification of novel genes affecting susceptibility and response to pain stimuli. Translational studies have succeeded in tying those genes to human pain syndromes, thus suggesting new targets for drug discovery. This short review introduces a perspective on pain genetics and the trajectory from pain phenotype to pain gene, involving fine-mapping strategies, bioinformatic analysis and microarray profiling alongside human association analysis. This integrated approach has led to identification of CACNG2 as a novel neuropathic pain gene affecting pain susceptibility both in mice and humans. It also serves as a prototype for efficient and economic discovery of pain genes. Comparisons to other methods, as well as future directions of pain genetics, are also discussed. PMID:22775325

  13. Global Landscape of a Co-Expressed Gene Network in Barley and its Application to Gene Discovery in Triticeae Crops

    PubMed Central

    Mochida, Keiichi; Uehara-Yamaguchi, Yukiko; Yoshida, Takuhiro; Sakurai, Tetsuya; Shinozaki, Kazuo

    2011-01-01

    Accumulated transcriptome data can be used to investigate regulatory networks of genes involved in various biological systems. Co-expression analysis data sets generated from comprehensively collected transcriptome data sets now represent efficient resources that are capable of facilitating the discovery of genes with closely correlated expression patterns. In order to construct a co-expression network for barley, we analyzed 45 publicly available experimental series, which are composed of 1,347 sets of GeneChip data for barley. On the basis of a gene-to-gene weighted correlation coefficient, we constructed a global barley co-expression network and classified it into clusters of subnetwork modules. The resulting clusters are candidates for functional regulatory modules in the barley transcriptome. To annotate each of the modules, we performed comparative annotation using genes in Arabidopsis and Brachypodium distachyon. On the basis of a comparative analysis between barley and two model species, we investigated functional properties from the representative distributions of the gene ontology (GO) terms. Modules putatively involved in drought stress response and cellulose biogenesis have been identified. These modules are discussed to demonstrate the effectiveness of the co-expression analysis. Furthermore, we applied the data set of co-expressed genes coupled with comparative analysis in attempts to discover potentially Triticeae-specific network modules. These results demonstrate that analysis of the co-expression network of the barley transcriptome together with comparative analysis should promote the process of gene discovery in barley. Furthermore, the insights obtained should be transferable to investigations of Triticeae plants. The associated data set generated in this analysis is publicly accessible at http://coexpression.psc.riken.jp/barley/. PMID:21441235
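
    The idea of linking genes by correlated expression can be sketched as follows. This uses a plain Pearson correlation with a fixed cutoff of 0.9 as a stand-in for the weighted correlation coefficient mentioned in the abstract; the gene names, expression values and threshold are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpression_edges(profiles, threshold=0.9):
    """Return (gene1, gene2, r) for every gene pair with r >= threshold."""
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if r >= threshold:
            edges.append((g1, g2, round(r, 3)))
    return edges


# Hypothetical expression values across four arrays.
toy_profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}

edges = coexpression_edges(toy_profiles)
print(edges)
```

    Only geneA and geneB rise together across the arrays, so they form the single edge; geneC is anti-correlated with both and stays unconnected. Clusters of such edges would correspond to the subnetwork modules described above.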

  14. Accelerated evolution of morph-biased genes in pea aphids.

    PubMed

    Purandare, Swapna R; Bickel, Ryan D; Jaquiery, Julie; Rispe, Claude; Brisson, Jennifer A

    2014-08-01

    Phenotypic plasticity, the production of alternative phenotypes (or morphs) from the same genotype due to environmental factors, results in some genes being expressed in a morph-biased manner. Theoretically, these morph-biased genes experience relaxed selection, the consequence of which is the buildup of slightly deleterious mutations at these genes. Over time, this is expected to result in increased protein divergence at these genes between species and a signature of relaxed purifying selection within species. Here we test these theoretical expectations using morph-biased genes in the pea aphid, a species that produces multiple morphs via polyphenism. We find that morph-biased genes exhibit faster rates of evolution (in terms of dN/dS) relative to unbiased genes and that divergence generally increases with increasing morph bias. Further, genes with expression biased toward rarer morphs (sexual females and males) show faster rates of evolution than genes expressed in the more common morph (asexual females), demonstrating that the amount of time a gene spends being expressed in a morph is associated with its rate of evolution. And finally, we show that genes expressed in the rarer morphs experience decreased purifying selection relative to unbiased genes, suggesting that it is a relaxation of purifying selection that contributes to their faster rates of evolution. Our results provide an important empirical look at the impact of phenotypic plasticity on gene evolution.

  15. Ten years of gene discovery for meiotic event control in rice.

    PubMed

    Luo, Qiong; Li, Yafei; Shen, Yi; Cheng, Zhukuan

    2014-03-20

    Meiosis is the crucial process by which sexually propagating eukaryotes give rise to haploid gametes from diploid cells. Several key processes, like homologous chromosome pairing, synapsis, recombination, and segregation, take place sequentially in meiosis. Although these widely conserved events are under both genetic and epigenetic control, the precise details of the molecular mechanisms are still under investigation. Rice is a good model organism for exploring the molecular mechanisms of meiosis in higher plants. So far, 28 rice meiotic genes have been characterized. In this review, we give an overview of the discovery of rice meiotic genes in the last ten years, with a particular focus on their functions in meiosis. PMID:24656233

  16. Leaf ESTs from Stevia rebaudiana: a resource for gene discovery in diterpene synthesis.

    PubMed

    Brandle, J E; Richman, A; Swanson, A K; Chapman, B P

    2002-11-01

    Expressed sequence tags (ESTs) are providing a new approach to gene discovery in plant secondary metabolism. Stevia rebaudiana Bert. leaves produce high concentrations of diterpene steviol glycosides and should be a rich source of transcripts involved in diterpene synthesis. In order to create a resource for gene discovery and increase our understanding of steviol glycoside biosynthesis, we sequenced 5,548 ESTs from a S. rebaudiana leaf cDNA library. The EST collection was fully annotated based on database search results. ESTs involved in diterpene synthesis were identified using published sequences as electronic probes, by keyword searches of search results, and by differential representation. A significant portion of the ESTs were specific for standard leaf metabolic pathways; energy and primary metabolism represented 17.6% and 13.1% of total transcripts, respectively. Diterpene metabolism in S. rebaudiana represented 1.1% of total transcripts. This study identified candidate genes for 70% of the known steps in the steviol glycoside pathway. One candidate, kaurene oxidase, was the 8th most abundant EST in the collection. Identification of many candidate genes specific to the 1-deoxyxylulose 5-phosphate pathway suggests that the primary source of isopentenyl diphosphate, a precursor of geranylgeranyl diphosphate, is via the non-mevalonic acid pathway. The use of ESTs has greatly facilitated the identification of candidate genes and increased our understanding of diterpene metabolism.

  17. Discovery of New Candidate Genes Related to Brain Development Using Protein Interaction Information

    PubMed Central

    Chen, Lei; Chu, Chen; Kong, Xiangyin; Huang, Tao; Cai, Yu-Dong

    2015-01-01

    Human brain development is a dramatic process composed of a series of complex and fine-tuned spatiotemporal gene expressions. A good comprehension of this process can assist us in developing the potential of our brain. However, we have only limited knowledge about the genes and gene functions that are involved in this biological process. Therefore, a substantial demand remains to discover new brain development-related genes and identify their biological functions. In this study, we aimed to discover new brain development-related genes by building a computational method. We referred to a series of computational methods used to discover new disease-related genes and developed a similar method. In this method, the shortest path algorithm was executed on a weighted graph that was constructed using protein-protein interactions. New candidate genes fell on at least one of the shortest paths connecting two known genes that are related to brain development. A randomization test was then adopted to filter out false-positive discoveries. Of the final identified genes, several have been reported to be associated with brain development, indicating the effectiveness of the method, whereas several of the others may have potential roles in brain development. PMID:25635857
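
    The shortest-path step described above can be sketched as follows. This is an illustrative assumption-laden version, not the authors' code: it uses Dijkstra's algorithm, keeps only one shortest path per pair of known genes (ties are not enumerated), and omits the randomization test. The graph, weights, and gene labels (`K1`, `X`, and so on) are hypothetical.

```python
import heapq
from itertools import combinations

def make_graph(edges):
    """Build an undirected weighted adjacency map from (u, v, weight) triples."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph

def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns one shortest path as a list of nodes."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return []  # target unreachable from source
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

def candidate_genes(graph, known):
    """Collect interior nodes of shortest paths between pairs of known genes."""
    candidates = set()
    for a, b in combinations(sorted(known), 2):
        path = shortest_path(graph, a, b)
        candidates.update(g for g in path[1:-1] if g not in known)
    return candidates

# Hypothetical protein-protein interaction graph (lower weight = stronger link).
ppi = make_graph([
    ("K1", "X", 1.0), ("X", "K2", 1.0),
    ("K1", "Y", 5.0), ("Y", "K2", 5.0),
    ("K2", "Z", 1.0), ("Z", "K3", 1.0),
])
found = candidate_genes(ppi, {"K1", "K2", "K3"})
print(sorted(found))
```

    Here `Y` is not reported because the route through it is longer than the route through `X`.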

  18. Proceedings of the 2013 CINP summit: innovative partnerships to accelerate CNS drug discovery for improved patient care.

    PubMed

    Phillips, Anthony George; Hongaard-Andersen, Peter; Moscicki, Richard A; Sahakian, Barbara; Quirion, Rémi; Krishnan, K Ranga Rama; Race, Tim

    2014-12-25

    Central nervous system (CNS) diseases and, in particular, mental health disorders, are becoming recognized as the health challenge of the 21st century. Currently, at least 10% of the global population is affected by a mental health disorder, a figure that is set to increase year on year. Meanwhile, the rate of development of new CNS drugs has not increased for many years, despite unprecedented levels of investment. In response to this state of affairs, the Collegium Internationale Neuro-Psychopharmacologicum (CINP) convened a summit to discuss ways to reverse this disturbing trend through new partnerships to accelerate CNS drug discovery. The objectives of the Summit were to explore the issues affecting the value chain (i.e. the chain of activities or stakeholders that a company engages in/with to deliver a product to market) in brain research, thereby gaining insights from key stakeholders and developing actions to address unmet needs; to identify achievable objectives to address the issues; to develop action plans to bring about measurable improvements across the value chain and accelerate CNS drug discovery; and finally, to communicate recommendations to governments, the research and development community, and other relevant stakeholders. Summit outputs include the following action plans, aligned to the pressure points within the brain research-drug development value chain: Code of conduct dealing with conflict of interest issues, Prevention, early diagnosis, and treatment, Linking science and regulation, Patient involvement in trial design, definition of endpoints, etc., Novel trial design, Reproduction and confirmation of data, Update of intellectual property (IP) laws to facilitate repurposing and combination therapy (low priority), Large-scale, global patient registries, Editorials on nomenclature, biomarkers, and diagnostic tools, and Public awareness, with brain disease advocates to attend G8 meetings and World Economic Forum (WEF) Annual meetings in

  19. Proceedings of the 2013 CINP Summit: Innovative Partnerships to Accelerate CNS Drug Discovery for Improved Patient Care

    PubMed Central

    Hongaard-Andersen, Peter; Moscicki, Richard A.; Sahakian, Barbara; Quirion, Rémi; Krishnan, K. Ranga Rama; Race, Tim

    2015-01-01

    Central nervous system (CNS) diseases and, in particular, mental health disorders, are becoming recognized as the health challenge of the 21st century. Currently, at least 10% of the global population is affected by a mental health disorder, a figure that is set to increase year on year. Meanwhile, the rate of development of new CNS drugs has not increased for many years, despite unprecedented levels of investment. In response to this state of affairs, the Collegium Internationale Neuro-Psychopharmacologicum (CINP) convened a summit to discuss ways to reverse this disturbing trend through new partnerships to accelerate CNS drug discovery. The objectives of the Summit were to explore the issues affecting the value chain (i.e. the chain of activities or stakeholders that a company engages in/with to deliver a product to market) in brain research, thereby gaining insights from key stakeholders and developing actions to address unmet needs; to identify achievable objectives to address the issues; to develop action plans to bring about measurable improvements across the value chain and accelerate CNS drug discovery; and finally, to communicate recommendations to governments, the research and development community, and other relevant stakeholders. Summit outputs include the following action plans, aligned to the pressure points within the brain research-drug development value chain: Code of conduct dealing with conflict of interest issues, Prevention, early diagnosis, and treatment, Linking science and regulation, Patient involvement in trial design, definition of endpoints, etc., Novel trial design, Reproduction and confirmation of data, Update of intellectual property (IP) laws to facilitate repurposing and combination therapy (low priority), Large-scale, global patient registries, Editorials on nomenclature, biomarkers, and diagnostic tools, and Public awareness, with brain disease advocates to attend G8 meetings and World Economic Forum (WEF) Annual meetings in Davos

  20. Proceedings of the 2013 CINP summit: innovative partnerships to accelerate CNS drug discovery for improved patient care.

    PubMed

    Phillips, Anthony George; Hongaard-Andersen, Peter; Moscicki, Richard A; Sahakian, Barbara; Quirion, Rémi; Krishnan, K Ranga Rama; Race, Tim

    2015-02-01

    Central nervous system (CNS) diseases and, in particular, mental health disorders, are becoming recognized as the health challenge of the 21st century. Currently, at least 10% of the global population is affected by a mental health disorder, a figure that is set to increase year on year. Meanwhile, the rate of development of new CNS drugs has not increased for many years, despite unprecedented levels of investment. In response to this state of affairs, the Collegium Internationale Neuro-Psychopharmacologicum (CINP) convened a summit to discuss ways to reverse this disturbing trend through new partnerships to accelerate CNS drug discovery. The objectives of the Summit were to explore the issues affecting the value chain (i.e. the chain of activities or stakeholders that a company engages in/with to deliver a product to market) in brain research, thereby gaining insights from key stakeholders and developing actions to address unmet needs; to identify achievable objectives to address the issues; to develop action plans to bring about measurable improvements across the value chain and accelerate CNS drug discovery; and finally, to communicate recommendations to governments, the research and development community, and other relevant stakeholders. Summit outputs include the following action plans, aligned to the pressure points within the brain research-drug development value chain: Code of conduct dealing with conflict of interest issues, Prevention, early diagnosis, and treatment, Linking science and regulation, Patient involvement in trial design, definition of endpoints, etc., Novel trial design, Reproduction and confirmation of data, Update of intellectual property (IP) laws to facilitate repurposing and combination therapy (low priority), Large-scale, global patient registries, Editorials on nomenclature, biomarkers, and diagnostic tools, and Public awareness, with brain disease advocates to attend G8 meetings and World Economic Forum (WEF) Annual meetings in

  1. In silico prioritization based on coexpression can aid epileptic encephalopathy gene discovery

    PubMed Central

    Oliver, Karen L.; Lukic, Vesna; Freytag, Saskia; Scheffer, Ingrid E.; Berkovic, Samuel F.

    2016-01-01

    Objective: To evaluate the performance of an in silico prioritization approach that was applied to 179 epileptic encephalopathy candidate genes in 2013 and to expand the application of this approach to the whole genome based on expression data from the Allen Human Brain Atlas. Methods: PubMed searches determined which of the 179 epileptic encephalopathy candidate genes had been validated. For validated genes, it was noted whether they were 1 of the 19 of 179 candidates prioritized in 2013. The in silico prioritization approach was applied genome-wide; all genes were ranked according to their coexpression strength with a reference set (i.e., 51 established epileptic encephalopathy genes) in both adult and developing human brain expression data sets. Candidate genes ranked in the top 10% for both data sets were cross-referenced with genes previously implicated in the epileptic encephalopathies due to a de novo variant. Results: Five of 6 validated epileptic encephalopathy candidate genes were among the 19 prioritized in 2013 (odds ratio = 54, 95% confidence interval [7,∞], p = 4.5 × 10^-5, Fisher exact test); one gene was false negative. A total of 297 genes ranked in the top 10% for both the adult and developing brain data sets based on coexpression with the reference set. Of these, 9 had been previously implicated in the epileptic encephalopathies (FBXO41, PLXNA1, ACOT4, PAK6, GABBR2, YWHAG, NBEA, KNDC1, and SELRC1). Conclusions: We conclude that brain gene coexpression data can be used to assist epileptic encephalopathy gene discovery and propose 9 genes as strong epileptic encephalopathy candidates worthy of further investigation. PMID:27066588
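
    The Fisher exact test reported in the Results can be reproduced approximately from the counts given in the abstract (179 candidates, 19 prioritized, 6 validated, 5 of them prioritized). The sketch below assumes a one-sided ("greater") test and computes the simple sample odds ratio; the published odds ratio of 54 presumably comes from a different estimator, so only the p-value is expected to match closely. The function name is hypothetical.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Returns (sample odds ratio, P(X >= a)) under the hypergeometric null.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = comb(total, row1)
    upper = min(row1, col1)
    p = sum(
        comb(col1, k) * comb(total - col1, row1 - k)
        for k in range(a, upper + 1)
    ) / denom
    odds = (a * d) / (b * c) if b * c else float("inf")
    return odds, p

# Rows: prioritized / not prioritized; columns: validated / not validated.
#   prioritized:     5 validated, 14 not validated  (19 genes)
#   not prioritized: 1 validated, 159 not validated (160 genes)
odds_ratio, p_value = fisher_exact_greater(5, 14, 1, 159)
print(f"sample odds ratio = {odds_ratio:.1f}, one-sided p = {p_value:.2e}")
```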

  2. Gene expression monitoring for gene discovery in models of peripheral and central nervous system differentiation, regeneration, and trauma.

    PubMed

    Farlow, D N; Vansant, G; Cameron, A A; Chang, J; Khoh-Reiter, S; Pham, N L; Wu, W; Sagara, Y; Nicholls, J G; Carlo, D J; Ill, C R

    2000-10-20

    Gene expression monitoring using gene expression microarrays represents an extremely powerful technology for gene discovery in a variety of systems. We describe the results of seven experiments using Incyte GEM technology to compile a proprietary portfolio of data concerning differential gene expression in six different models of neuronal differentiation and regeneration, and recovery from injury or disease. Our first two experiments cataloged genes significantly up- or down-regulated during two phases of the retinoic acid-induced differentiation of the embryonal carcinoma line Ntera-2. To identify genes involved in neuronal regeneration we performed three GEM experiments, which included changes in gene expression in rat dorsal root ganglia during the healing of experimentally injured sciatic nerve, in regenerating neonatal opossum spinal cord, and during lipopolysaccharide stimulation of primary cultures of rat Schwann cells. Finally we have monitored genes involved in the recovery phase of the inflammatory disease of the rat spinal cord, experimental allergic encephalomyelitis, as well as those responsible for protection from oxidative stress in a glutamate-resistant rat hippocampal cell line. Analysis of the results of the approximately 70,000 data points collected is presented. PMID:11074584

  3. Discovery of the faithfulness gene: a model of transmission and transformation of scientific information.

    PubMed

    Green, Eva G T; Clémence, Alain

    2008-09-01

    The purpose of this paper is to study the diffusion and transformation of scientific information in everyday discussions. Based on rumour models and social representations theory, the impact of interpersonal communication and pre-existing beliefs on transmission of the content of a scientific discovery was analysed. In three experiments, a communication chain was simulated to investigate how laypeople make sense of a genetic discovery first published in a scientific outlet, then reported in a mainstream newspaper and finally discussed in groups. Study 1 (N=40) demonstrated a transformation of information when the scientific discovery moved along the communication chain. During successive narratives, scientific expert terminology disappeared while scientific information associated with lay terminology persisted. Moreover, the idea of a discovery of a faithfulness gene emerged. Study 2 (N=70) revealed that transmission of the scientific message varied as a function of attitudes towards genetic explanations of behaviour (pro-genetics vs. anti-genetics). Pro-genetics employed more scientific terminology than anti-genetics. Study 3 (N=75) showed that endorsement of genetic explanations was related to descriptive accounts of the scientific information, whereas rejection of genetic explanations was related to evaluative accounts of the information.

  4. Weighted gene co-expression based biomarker discovery for psoriasis detection.

    PubMed

    Sundarrajan, Sudharsana; Arumugam, Mohanapriya

    2016-11-15

    Psoriasis is a chronic inflammatory disease of the skin with an unknown aetiology. The disease manifests itself as red and silvery scaly plaques distributed over the scalp, lower back and extensor aspects of the limbs. After receiving scant consideration for quite a few years, psoriasis has now become a prominent focus for new drug development. A group of closely connected and differentially co-expressed genes may act in a network and may serve as molecular signatures for an underlying phenotype. Weighted gene coexpression network analysis (WGCNA), a systems biology approach, was used to identify new molecular targets for psoriasis. Gene coexpression relationships were investigated in 58 psoriatic lesional samples, resulting in five gene modules clustered on the basis of their coexpression patterns. The coexpression pattern was validated using three psoriatic datasets. Ten highly connected and informative genes from each module were selected and termed psoriasis-specific hub signatures. A random forest-based binary classifier built using the expression profiles of the signature genes robustly distinguished psoriatic samples from normal samples in the validation set, with an accuracy of 0.95 to 1. These signature genes may serve as potential candidates for biomarker discovery leading to new therapeutic targets. WGCNA, a network-based approach, has provided an alternative path to mine out key controllers and drivers of psoriasis. The study principle from the current work can be extended to other pathological conditions.
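
    The idea of picking highly connected "hub" genes can be illustrated with a small sketch. It assumes the unsigned WGCNA-style adjacency, the absolute correlation raised to a soft-thresholding power (`beta = 6` here is an assumed value, not one taken from the study), and ranks genes by summed adjacency. Module detection and the random forest classifier are not shown; gene names and values are invented.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def connectivity(expr, beta=6):
    """Sum of soft-thresholded adjacencies |r|**beta for every gene."""
    genes = list(expr)
    k = {g: 0.0 for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            a = abs(pearson(expr[g], expr[h])) ** beta
            k[g] += a
            k[h] += a
    return k

def top_hubs(expr, n=10, beta=6):
    """Return the n genes with the highest connectivity."""
    k = connectivity(expr, beta)
    return sorted(k, key=k.get, reverse=True)[:n]

# Toy module: "hub" tracks both g1 and g2, which are uncorrelated with each other.
toy = {
    "g1": [1.0, -1.0, 0.0, 0.0],
    "g2": [0.0, 0.0, 1.0, -1.0],
    "hub": [1.0, -1.0, 1.0, -1.0],
}
print(top_hubs(toy, n=1))
```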

  6. Discovery of nucleotide polymorphisms in the Musa gene pool by Ecotilling.

    PubMed

    Till, Bradley J; Jankowicz-Cieslak, Joanna; Sági, László; Huynh, Owen A; Utsushi, Hiroe; Swennen, Rony; Terauchi, Ryohei; Mba, Chikelu

    2010-11-01

    Musa (banana and plantain) is an important genus for the global export market and in local markets where it provides staple food for approximately 400 million people. Hybridization and polyploidization of several (sub)species, combined with vegetative propagation and human selection, have produced a complex genetic history. We describe the application of the Ecotilling method for the discovery and characterization of nucleotide polymorphisms in diploid and polyploid accessions of Musa. We discovered over 800 novel alleles in 80 accessions. Sequencing and band evaluation show Ecotilling to be a robust and accurate platform for the discovery of polymorphisms in homologous and homeologous gene targets. In the process of validating the method, we identified two single nucleotide polymorphisms that may be deleterious for the function of a gene putatively important for phototropism. Evaluation of heterozygous polymorphism and haplotype blocks revealed a high level of nucleotide diversity in Musa accessions. We further applied a strategy for the simultaneous discovery of heterozygous and homozygous polymorphisms in diploid accessions to rapidly evaluate nucleotide diversity in accessions of the same genome type. This strategy can be used to develop hypotheses for inheritance patterns of nucleotide polymorphisms within and between genome types. We conclude that Ecotilling is suitable for diversity studies in Musa, and that it can be considered for functional genomics studies and as a tool in selecting germplasm for traditional and mutation breeding approaches. PMID:20589365

  7. An endogenous accelerator for viral gene expression confers a fitness advantage.

    PubMed

    Teng, Melissa W; Bolovan-Fritts, Cynthia; Dar, Roy D; Womack, Andrew; Simpson, Michael L; Shenk, Thomas; Weinberger, Leor S

    2012-12-21

    Many signaling circuits face a fundamental tradeoff between accelerating their response speed while maintaining final levels below a cytotoxic threshold. Here, we describe a transcriptional circuitry that dynamically converts signaling inputs into faster rates without amplifying final equilibrium levels. Using time-lapse microscopy, we find that transcriptional activators accelerate human cytomegalovirus (CMV) gene expression in single cells without amplifying steady-state expression levels, and this acceleration generates a significant replication advantage. We map the accelerator to a highly self-cooperative transcriptional negative-feedback loop (Hill coefficient ∼7) generated by homomultimerization of the virus's essential transactivator protein IE2 at nuclear PML bodies. Eliminating the IE2-accelerator circuit reduces transcriptional strength through mislocalization of incoming viral genomes away from PML bodies and carries a heavy fitness cost. In general, accelerators may provide a mechanism for signal-transduction circuits to respond quickly to external signals without increasing steady-state levels of potentially cytotoxic molecules.
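
    The claim that negative feedback can speed up a response without raising the steady state can be illustrated with a generic textbook model. This is only a sketch under stated assumptions, not the authors' model of the IE2 circuit: production is repressed by its own product through a Hill function with coefficient 7 (the only value taken from the abstract), while `BETA`, `K`, and `DELTA` are arbitrary illustrative parameters. The feedback circuit is compared with an unregulated circuit tuned to reach the same steady state.

```python
BETA = 10.0   # assumed maximal production rate
K = 1.0       # assumed repression threshold
H = 7         # Hill coefficient reported in the abstract
DELTA = 1.0   # assumed first-order decay rate

def feedback_rate(x):
    """dx/dt for a self-repressing circuit."""
    return BETA / (1.0 + (x / K) ** H) - DELTA * x

def steady_state(rate, lo=0.0, hi=100.0, iters=200):
    """Bisection for the root of a rate function that decreases with x."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if rate(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def time_to_fraction(rate, x_final, fraction=0.5, dt=1e-4, t_max=50.0):
    """Forward-Euler time for x to rise from 0 to fraction * x_final."""
    x, t = 0.0, 0.0
    target = fraction * x_final
    while x < target and t < t_max:
        x += rate(x) * dt
        t += dt
    return t

x_ss = steady_state(feedback_rate)

def simple_rate(x):
    """Unregulated circuit with constant production and the same steady state."""
    return DELTA * x_ss - DELTA * x

t_feedback = time_to_fraction(feedback_rate, x_ss)
t_simple = time_to_fraction(simple_rate, x_ss)
print(f"steady state = {x_ss:.3f}")
print(f"half-rise time with feedback = {t_feedback:.3f}")
print(f"half-rise time without feedback = {t_simple:.3f}")
```

    With these assumed parameters the self-repressing circuit reaches half of the shared steady state well before the unregulated one, because it starts with a high production rate that shuts off sharply near the threshold.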

  8. In silico analysis of gene expression profiles in the olfactory mucosae of aging senescence-accelerated mice.

    PubMed

    Getchell, Thomas V; Peng, Xuejun; Green, C Paul; Stromberg, Arnold J; Chen, Kuey-Chu; Mattson, Mark P; Getchell, Marilyn L

    2004-08-01

    We utilized high-density Affymetrix oligonucleotide arrays to investigate gene expression in the olfactory mucosae of near age-matched aging senescence-accelerated mice (SAM). The senescence-prone (SAMP) strain has a significantly shorter lifespan than does the senescence-resistant (SAMR) strain. To analyze our data, we applied biostatistical methods that included a correlation analysis to evaluate sources of methodologic and biological variability; a two-sided t-test to identify a subpopulation of Present genes with a biologically relevant P-value <0.05; and a false discovery rate (FDR) analysis adjusted to a stringent 5% level that yielded 127 genes with a P-value of <0.001 that were differentially regulated in near age-matched SAMPs (SAMP-Os; 13.75 months) compared to SAMRs (SAMR-Os, 12.5 months). Volcano plots related the variability in the mean hybridization signals as determined by the two-sided t-test to fold changes in gene expression. The genes were categorized into the six functional groups used previously in gene profiling experiments to identify candidate genes that may be relevant for senescence at the genomic and cellular levels in the aging mouse brain (Lee et al. [2000] Nat Genet 25:294-297) and in the olfactory mucosa (Getchell et al. [2003] Ageing Res Rev 2:211-243), which serves several functions that include chemosensory detection, immune barrier function, xenobiotic metabolism, and neurogenesis. Because SAMR-Os and SAMP-Os have substantially different median lifespans, we related the rate constant alpha in the Gompertz equation on aging to intrinsic as opposed to environmental mechanisms of senescence based on our analysis of genes modulated during aging in the olfactory mucosa. PMID:15248299
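
    The abstract does not say which false discovery rate procedure was used. As one common possibility, the Benjamini-Hochberg step-up procedure at a 5% level can be sketched as follows; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Step-up rule: find the largest rank r with p_(r) <= alpha * r / m and
    reject the r hypotheses with the smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for i in order[:cutoff_rank]:
        rejected[i] = True
    return rejected

# Hypothetical per-gene p-values from a two-sided t-test.
pvals = [0.039, 0.001, 0.205, 0.008, 0.06]
flags = benjamini_hochberg(pvals, alpha=0.05)
print(flags)
```

    Note that the gene with p = 0.039 passes an unadjusted 0.05 cutoff but is not called significant once the FDR adjustment is applied.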

  9. Gene expression, single nucleotide variant and fusion transcript discovery in archival material from breast tumors.

    PubMed

    Norton, Nadine; Sun, Zhifu; Asmann, Yan W; Serie, Daniel J; Necela, Brian M; Bhagwate, Aditya; Jen, Jin; Eckloff, Bruce W; Kalari, Krishna R; Thompson, Kevin J; Carr, Jennifer M; Kachergus, Jennifer M; Geiger, Xochiquetzal J; Perez, Edith A; Thompson, E Aubrey

    2013-01-01

    Advantages of RNA-Seq over array-based platforms are quantitative gene expression and discovery of expressed single nucleotide variants (eSNVs) and fusion transcripts from a single platform, but the sensitivity for each of these characteristics is unknown. We measured gene expression in a set of manually degraded RNAs, nine pairs of matched fresh-frozen, and FFPE RNA isolated from breast tumor with the hybridization-based NanoString nCounter (226 gene panel) and with whole transcriptome RNA-Seq using RiboZeroGold ScriptSeq V2 library preparation kits. We performed correlation analyses of gene expression between samples and across platforms. We then specifically assessed whole transcriptome expression of lincRNA and discovery of eSNVs and fusion transcripts in the FFPE RNA-Seq data. For gene expression in the manually degraded samples, we observed Pearson correlations of >0.94 and >0.80 with NanoString and ScriptSeq protocols, respectively. Gene expression data for matched fresh-frozen and FFPE samples yielded mean Pearson correlations of 0.874 and 0.783 for NanoString (226 genes) and ScriptSeq whole transcriptome protocols, respectively, p < 2 × 10^-16. Specifically for lincRNAs, we observed superb Pearson correlation (0.988) between matched fresh-frozen and FFPE pairs. FFPE samples across NanoString and RNA-Seq platforms gave a mean Pearson correlation of 0.838. In FFPE libraries, we detected 53.4% of high confidence SNVs and 24% of high confidence fusion transcripts. Sensitivity of fusion transcript detection was not overcome by an increase in depth of sequencing up to 3-fold (increase from ~56 to ~159 million reads). Both NanoString and ScriptSeq RNA-Seq technologies yield reliable gene expression data for degraded and FFPE material. The high degree of correlation between NanoString and RNA-Seq platforms suggests discovery-based whole transcriptome studies from FFPE material will produce reliable expression data. The RiboZeroGold ScriptSeq protocol performed

  10. Discovery of five conserved beta-defensin gene clusters using a computational search strategy.

    PubMed

    Schutte, Brian C; Mitros, Joseph P; Bartlett, Jennifer A; Walters, Jesse D; Jia, Hong Peng; Welsh, Michael J; Casavant, Thomas L; McCray, Paul B

    2002-02-19

    The innate immune system includes antimicrobial peptides that protect multicellular organisms from a diverse spectrum of microorganisms. beta-Defensins comprise one important family of mammalian antimicrobial peptides. The annotation of the human genome fails to reveal the expected diversity, and a recent query of the draft sequence with the blast search engine found only one new beta-defensin gene (DEFB3). To define better the beta-defensin gene family, we adopted a genomics approach that uses hmmer, a computational search tool based on hidden Markov models, in combination with blast. This strategy identified 28 new human and 43 new mouse beta-defensin genes in five syntenic chromosomal regions. Within each syntenic cluster, the gene sequences and organization were similar, suggesting each cluster pair arose from a common ancestor and was retained because of conserved functions. Preliminary analysis indicates that at least 26 of the predicted genes are transcribed. These results demonstrate the value of a genomewide search strategy to identify genes with conserved structural motifs. Discovery of these genes represents a new starting point for exploring the role of beta-defensins in innate immunity.

  11. Gene Discovery of Modular Diterpene Metabolism in Nonmodel Systems

    PubMed Central

    Zerbe, Philipp; Hamberger, Björn; Yuen, Macaire M.S.; Chiang, Angela; Sandhu, Harpreet K.; Madilao, Lina L.; Nguyen, Anh; Hamberger, Britta; Bach, Søren Spanner; Bohlmann, Jörg

    2013-01-01

    Plants produce over 10,000 different diterpenes of specialized (secondary) metabolism, and fewer diterpenes of general (primary) metabolism. Specialized diterpenes may have functions in ecological interactions of plants with other organisms and also benefit humanity as pharmaceuticals, fragrances, resins, and other industrial bioproducts. Examples of high-value diterpenes are taxol and forskolin pharmaceuticals or ambroxide fragrances. Yields and purity of diterpenes obtained from natural sources or by chemical synthesis are often insufficient for large-volume or high-end applications. Improvement of agricultural or biotechnological diterpene production requires knowledge of biosynthetic genes and enzymes. However, specialized diterpene pathways are extremely diverse across the plant kingdom, and most specialized diterpenes are taxonomically restricted to a few plant species, genera, or families. Consequently, there is no single reference system to guide gene discovery and rapid annotation of specialized diterpene pathways. Functional diversification of genes and plasticity of enzyme functions of these pathways further complicate correct annotation. To address this challenge, we used a set of 10 different plant species to develop a general strategy for diterpene gene discovery in nonmodel systems. The approach combines metabolite-guided transcriptome resources, custom diterpene synthase (diTPS) and cytochrome P450 reference gene databases, phylogenies, and, as shown for select diTPSs, single and coupled enzyme assays using microbial and plant expression systems. In the 10 species, we identified 46 new diTPS candidates and over 400 putatively terpenoid-related P450s in a resource of nearly 1 million predicted transcripts of diterpene-accumulating tissues. Phylogenetic patterns of lineage-specific blooms of genes guided functional characterization. PMID:23613273

  12. Gene Discovery in the Apicomplexa as Revealed by EST Sequencing and Assembly of a Comparative Gene Database

    PubMed Central

    Li, Li; Brunk, Brian P.; Kissinger, Jessica C.; Pape, Deana; Tang, Keliang; Cole, Robert H.; Martin, John; Wylie, Todd; Dante, Mike; Fogarty, Steven J.; Howe, Daniel K.; Liberator, Paul; Diaz, Carmen; Anderson, Jennifer; White, Michael; Jerome, Maria E.; Johnson, Emily A.; Radke, Jay A.; Stoeckert, Christian J.; Waterston, Robert H.; Clifton, Sandra W.; Roos, David S.; Sibley, L. David

    2003-01-01

    Large-scale EST sequencing projects for several important parasites within the phylum Apicomplexa were undertaken for the purpose of gene discovery. Included were several parasites of medical importance (Plasmodium falciparum, Toxoplasma gondii) and others of veterinary importance (Eimeria tenella, Sarcocystis neurona, and Neospora caninum). A total of 55,192 ESTs, deposited into dbEST/GenBank, were included in the analyses. The resulting sequences have been clustered into nonredundant gene assemblies and deposited into a relational database that supports a variety of sequence and text searches. This database has been used to compare the gene assemblies using BLAST similarity comparisons to the public protein databases to identify putative genes. Of these new entries, ∼15%–20% represent putative homologs with a conservative cutoff of p < 10⁻⁹, thus identifying many conserved genes that are likely to share common functions with other well-studied organisms. Gene assemblies were also used to identify strain polymorphisms, examine stage-specific expression, and identify gene families. An interesting class of genes that are confined to members of this phylum and not shared by plants, animals, or fungi, was identified. These genes likely mediate the novel biological features of members of the Apicomplexa and hence offer great potential for biological investigation and as possible therapeutic targets. [The sequence data from this study have been submitted to the dbEST division of GenBank for Toxoplasma gondii, Plasmodium falciparum, Sarcocystis neurona, Eimeria tenella, and Neospora caninum.] PMID:12618375

  13. Inherited retinal diseases in dogs: advances in gene/mutation discovery

    PubMed Central

    Miyadera, Keiko

    2015-01-01

    Inherited retinal diseases (RDs) are vision-threatening conditions affecting humans as well as many domestic animals. Through many years of clinical studies of the domestic dog population, a wide array of RDs has been phenotypically characterized. Extensive effort to map the causative gene and to identify the underlying mutation followed. Through candidate gene, linkage analysis, genome-wide association studies, and more recently, by means of next-generation sequencing, as many as 31 mutations in 24 genes have been identified as the underlying cause for canine RDs. Most of these genes have been associated with human RDs, providing opportunities to study their roles in the disease pathogenesis and in normal visual function. The canine model has also contributed to developing new treatments such as gene therapy, which has been clinically applied to human patients. Meanwhile, with increasing knowledge of the molecular architecture of RDs in different subpopulations of dogs, the conventional understanding of RDs as a simple monogenic disease is beginning to change. Emerging evidence of modifiers that alter the disease outcome is complicating the interpretation of DNA tests. In this review, advances in the gene/mutation discovery approaches and the emerging genetic complexity of canine RDs are discussed. PMID:26120276

  14. Discovery of a novel imprinted gene by transcriptional analysis of parthenogenetic embryonic stem cells

    PubMed Central

    Sritanaudomchai, Hathaitip; Ma, Hong; Clepper, Lisa; Gokhale, Sumita; Bogan, Randy; Hennebold, Jon; Wolf, Don; Mitalipov, Shoukhrat

    2010-01-01

    BACKGROUND Parthenogenetic embryonic stem cells (PESCs) may have future utilities in cell replacement therapies since they are closely related to the female from which the activated oocyte was obtained. Furthermore, the avoidance of parthenogenetic development in mammals provides the most compelling rationale for the evolution of genomic imprinting, and the biological process of parthenogenesis raises complex issues regarding differential gene expression. METHODS AND RESULTS We describe here homozygous rhesus monkey PESCs derived from a spontaneously duplicated, haploid oocyte genome. Since the effect of homozygosity on PESC pluripotency and differentiation potential is unknown, we assessed the similarities and differences in pluripotency markers and developmental potential by in vitro and in vivo differentiation of homozygous and heterozygous PESCs. To understand the differences in gene expression regulation between parthenogenetic and biparental embryonic stem cells (ESCs), we conducted microarray analysis of genome-wide mRNA profiles of primate PESCs and ESCs derived from fertilized embryos using the Affymetrix Rhesus Macaque Genome array. Several known paternally imprinted genes were in the highly down-regulated group in PESCs compared with ESCs. Furthermore, allele-specific expression analysis of other genes whose expression is also down-regulated in PESCs led to the identification of one novel imprinted gene, inositol polyphosphate-5-phosphatase F (INPP5F), which was exclusively expressed from a paternal allele. CONCLUSION Our findings suggest that PESCs could be used as a model for studying genomic imprinting, and in the discovery of novel imprinted genes. PMID:20522441

  15. Repurposed transcriptomic data facilitate discovery of innate immunity toll-like receptor (TLR) Genes across Lophotrochozoa.

    PubMed

    Halanych, Kenneth M; Kocot, Kevin M

    2014-10-01

    The growing volume of genomic data from across life represents opportunities for deriving valuable biological information from data that were initially collected for another purpose. Here, we use transcriptomes collected for phylogenomic studies to search for toll-like receptor (TLR) genes in poorly sampled lophotrochozoan clades (Annelida, Mollusca, Brachiopoda, Phoronida, and Entoprocta) and one ecdysozoan clade (Priapulida). TLR genes are involved in innate immunity across animals by recognizing potential microbial infection. They have an extracellular leucine-rich repeat (LRR) domain connected to a transmembrane domain and an intracellular toll/interleukin-1 receptor (TIR) domain. Consequently, these genes are important in initiating a signaling pathway to trigger defense. We found at least one TLR ortholog in all but two taxa examined, suggesting that a broad array of lophotrochozoans may have innate immune systems similar to those observed in vertebrates and arthropods. Comparison to the SMART database confirmed the presence of both the LRR and the TIR protein motifs characteristic of TLR genes. Because we looked at only one transcriptome per species, discovery of TLR genes was limited for most taxa. However, several TLR-like genes that vary in the number and placement of LRR domains were found in phoronids. Additionally, several contigs contained LRR domains but lacked TIR domains, suggesting they were not TLRs. Many of these LRR-containing contigs had other domains (e.g., immunoglobulin) and are likely involved in innate immunity.
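
    The domain logic in this abstract (a TLR carries both an LRR and a TIR domain; a contig with LRR but no TIR is not counted as a TLR) can be sketched as a small classifier. This is only an illustration: the function name, the category labels, and the contig names and annotations are hypothetical and do not come from the study, which relied on comparison to the SMART database.

```python
def classify_contig(domains):
    """Label a contig from its set of annotated protein domains.

    Assumption: `domains` holds short domain labels such as "LRR", "TIR",
    or "IG" that were produced beforehand by some domain-annotation step.
    """
    has_lrr = "LRR" in domains
    has_tir = "TIR" in domains
    if has_lrr and has_tir:
        return "TLR candidate"
    if has_lrr:
        return "LRR-containing, not a TLR"
    if has_tir:
        return "TIR-only"
    return "unrelated"


# Hypothetical annotations, for illustration only.
contigs = {
    "contig_1": {"LRR", "TM", "TIR"},
    "contig_2": {"LRR", "IG"},
    "contig_3": {"TIR"},
    "contig_4": {"KINASE"},
}

labels = {name: classify_contig(doms) for name, doms in contigs.items()}
for name, label in labels.items():
    print(name, "->", label)
```

    A real screen would also need to handle partial contigs, where a missing TIR domain may reflect incomplete assembly rather than true absence.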

  16. High-throughput platform for the discovery of elicitors of silent bacterial gene clusters.

    PubMed

    Seyedsayamdost, Mohammad R

    2014-05-20

    Over the past decade, bacterial genome sequences have revealed an immense reservoir of biosynthetic gene clusters, sets of contiguous genes that have the potential to produce drugs or drug-like molecules. However, the majority of these gene clusters appear to be inactive for unknown reasons prompting terms such as "cryptic" or "silent" to describe them. Because natural products have been a major source of therapeutic molecules, methods that rationally activate these silent clusters would have a profound impact on drug discovery. Herein, a new strategy is outlined for awakening silent gene clusters using small molecule elicitors. In this method, a genetic reporter construct affords a facile read-out for activation of the silent cluster of interest, while high-throughput screening of small molecule libraries provides potential inducers. This approach was applied to two cryptic gene clusters in the pathogenic model Burkholderia thailandensis. The results not only demonstrate a prominent activation of these two clusters, but also reveal that the majority of elicitors are themselves antibiotics, most in common clinical use. Antibiotics, which kill B. thailandensis at high concentrations, act as inducers of secondary metabolism at low concentrations. One of these antibiotics, trimethoprim, served as a global activator of secondary metabolism by inducing at least five biosynthetic pathways. Further application of this strategy promises to uncover the regulatory networks that activate silent gene clusters while at the same time providing access to the vast array of cryptic molecules found in bacteria.

  17. Abiotic stress tolerance: from gene discovery in model organisms to crop improvement.

    PubMed

    Bressan, Ray; Bohnert, Hans; Zhu, Jian-Kang

    2009-01-01

    Productive and sustainable agriculture necessitates growing plants in sub-optimal environments with less input of precious resources such as fresh water. For a better understanding and rapid improvement of abiotic stress tolerance, it is important to link physiological and biochemical work to molecular studies in genetically tractable model organisms. With the use of several technologies for the discovery of stress tolerance genes and their appropriate alleles, transgenic approaches to improving stress tolerance in crops remarkably parallel breeding principles with a greatly expanded germplasm base and will succeed eventually.

  18. Gene Overexpression Resources in Cereals for Functional Genomics and Discovery of Useful Genes

    PubMed Central

    Abe, Kiyomi; Ichikawa, Hiroaki

    2016-01-01

    Identification and elucidation of functions of plant genes is valuable for both basic and applied research. In addition to natural variation in model plants, numerous loss-of-function resources have been produced by mutagenesis with chemicals, irradiation, or insertions of transposable elements or T-DNA. However, we may be unable to observe loss-of-function phenotypes for genes with functionally redundant homologs and for those essential for growth and development. To offset such disadvantages, gain-of-function transgenic resources have been exploited. Activation-tagged lines have been generated using obligatory overexpression of endogenous genes by random insertion of an enhancer. Recent progress in DNA sequencing technology and bioinformatics has enabled the preparation of genomewide collections of full-length cDNAs (fl-cDNAs) in some model species. Using the fl-cDNA clones, a novel gain-of-function strategy, Fl-cDNA OvereXpressor gene (FOX)-hunting system, has been developed. A mutant phenotype in a FOX line can be directly attributed to the overexpressed fl-cDNA. Investigating a large population of FOX lines could reveal important genes conferring favorable phenotypes for crop breeding. Alternatively, a unique loss-of-function approach Chimeric REpressor gene Silencing Technology (CRES-T) has been developed. In CRES-T, overexpression of a chimeric repressor, composed of the coding sequence of a transcription factor (TF) and short peptide designated as the repression domain, could interfere with the action of endogenous TF in plants. Although plant TFs usually consist of gene families, CRES-T is effective, in principle, even for the TFs with functional redundancy. In this review, we focus on the current status of the gene-overexpression strategies and resources for identifying and elucidating novel functions of cereal genes. We discuss the potential of these research tools for identifying useful genes and phenotypes for application in crop breeding. PMID

  19. Genomics-driven discovery of the pneumocandin biosynthetic gene cluster in the fungus Glarea lozoyensis

    PubMed Central

    2013-01-01

    Background The antifungal therapy caspofungin is a semi-synthetic derivative of pneumocandin B0, a lipohexapeptide produced by the fungus Glarea lozoyensis, and was the first member of the echinocandin class approved for human therapy. The nonribosomal peptide synthetase (NRPS)-polyketide synthases (PKS) gene cluster responsible for pneumocandin biosynthesis from G. lozoyensis has not been elucidated to date. In this study, we report the elucidation of the pneumocandin biosynthetic gene cluster by whole genome sequencing of the G. lozoyensis wild-type strain ATCC 20868. Results The pneumocandin biosynthetic gene cluster contains a NRPS (GLNRPS4) and a PKS (GLPKS4) arranged in tandem, two cytochrome P450 monooxygenases, seven other modifying enzymes, and genes for L-homotyrosine biosynthesis, a component of the peptide core. Thus, the pneumocandin biosynthetic gene cluster is significantly more autonomous and organized than that of the recently characterized echinocandin B gene cluster. Disruption mutants of GLNRPS4 and GLPKS4 no longer produced the pneumocandins (A0 and B0), and the Δglnrps4 and Δglpks4 mutants lost antifungal activity against the human pathogenic fungus Candida albicans. In addition to pneumocandins, the G. lozoyensis genome encodes a rich repertoire of natural product-encoding genes including 24 PKSs, six NRPSs, five PKS-NRPS hybrids, two dimethylallyl tryptophan synthases, and 14 terpene synthases. Conclusions Characterization of the gene cluster provides a blueprint for engineering new pneumocandin derivatives with improved pharmacological properties. Whole genome estimation of the secondary metabolite-encoding genes from G. lozoyensis provides yet another example of the huge potential for drug discovery from natural products from the fungal kingdom. PMID:23688303

  20. Evaluation of Gene Association Methods for Coexpression Network Construction and Biological Knowledge Discovery

    PubMed Central

    Kumari, Sapna; Nie, Jeff; Chen, Huann-Sheng; Ma, Hao; Stewart, Ron; Li, Xiang; Lu, Meng-Zhu; Taylor, William M.; Wei, Hairong

    2012-01-01

    Background Constructing coexpression networks and performing network analysis using large-scale gene expression data sets is an effective way to uncover new biological knowledge; however, the methods used for gene association in constructing these coexpression networks have not been thoroughly evaluated. Since different methods lead to structurally different coexpression networks and provide different information, selecting the optimal gene association method is critical. Methods and Results In this study, we compared eight gene association methods – Spearman rank correlation, Weighted Rank Correlation, Kendall, Hoeffding's D measure, Theil-Sen, Rank Theil-Sen, Distance Covariance, and Pearson – and focused on their true knowledge discovery rates in associating pathway genes and constructing coordination networks of regulatory genes. We also examined the behaviors of different methods on microarray data with different properties, and whether the biological processes affect the efficiency of different methods. Conclusions We found that the Spearman, Hoeffding and Kendall methods are effective in identifying coexpressed pathway genes, whereas the Theil-Sen, Rank Theil-Sen, Spearman, and Weighted Rank methods perform well in identifying coordinated transcription factors that control the same biological processes and traits. Surprisingly, the widely used Pearson method is generally less efficient, and so is the Distance Covariance method that can find gene pairs of multiple relationships. Some of our analyses clearly show that the Pearson and Distance Covariance methods behave differently from the other six methods. The efficiencies of different methods vary with the data properties to some degree and are largely contingent upon the biological processes, which necessitates a pre-analysis to identify the best performing method for gene association and coexpression network construction. PMID:23226279
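
    To make the contrast between rank-based and linear association measures concrete, the sketch below computes three of the eight measures named above (Pearson, Spearman, Kendall) for a single gene pair. It is a minimal pure-Python illustration, not the authors' pipeline: the expression values are made up, the function names are assumptions, and Kendall's tau is computed in its simple tau-a form without tie correction.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (len(x) * (len(x) - 1) // 2)


# Made-up expression profiles: gene_b rises monotonically but not linearly.
gene_a = [1, 2, 3, 4, 5]
gene_b = [2, 4, 8, 16, 32]

print("Pearson :", round(pearson(gene_a, gene_b), 3))
print("Spearman:", round(spearman(gene_a, gene_b), 3))
print("Kendall :", round(kendall_tau_a(gene_a, gene_b), 3))
```

    On this toy pair the two rank-based measures report a perfect association, while Pearson falls short of 1 because the relationship is monotone but not linear. This shows only how the measures differ in what they detect; it says nothing about which one performs best on real data.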

  1. Analysis of Brassica rapa ESTs: gene discovery and expression patterns of AP2/ERF family genes.

    PubMed

    Zhuang, Jing; Xiong, Ai-Sheng; Peng, Ri-He; Gao, Feng; Zhu, Bo; Zhang, Jian; Fu, Xiao-Yan; Jin, Xiao-Feng; Chen, Jian-Min; Zhang, Zhen; Qiao, Yu-Shan; Yao, Quan-Hong

    2010-06-01

    Chinese cabbage (Brassica rapa subsp. pekinensis) is among the most important vegetables and is widely cultivated in the world. Genes in the AP2/ERF family encode transcriptional regulators that serve a variety of functions in plants. Expressed sequence tags (ESTs) are created by partially sequencing randomly isolated gene transcripts and have proved valuable in molecular biology. Starting from the database with 142 947 ESTs of B. rapa, 62 putative AP2/ERF family genes were identified by in silico cloning using the conserved AP2/ERF domain amino acid sequence of Arabidopsis thaliana as a probe. Based on the number of AP2/ERF domains and functions of the genes, the AP2/ERF transcription factors from B. rapa were classified into four subfamilies (DREB, ERF, AP2 and RAV). Using large-scale available EST information as a source of expression data for digital expression profiling, differentially detected genes were identified among diverse plant tissues. Roots contained the largest number of transcripts of the AP2/ERF family genes, followed by leaves and seeds. Only a few of the 62 AP2/ERF family genes were detected in all tissues: most were detected only in some tissues but not in others. The maximum detected was that of BraERF-B2-5, and it was recorded from seed tissue.

  2. MoD Tools: regulatory motif discovery in nucleotide sequences from co-regulated or homologous genes.

    PubMed

    Pavesi, Giulio; Mereghetti, Paolo; Zambelli, Federico; Stefani, Marco; Mauri, Giancarlo; Pesole, Graziano

    2006-07-01

    Understanding the complex mechanisms regulating gene expression at the transcriptional and post-transcriptional levels is one of the greatest challenges of the post-genomic era. The MoD (MOtif Discovery) Tools web server comprises a set of tools for the discovery of novel conserved sequence and structure motifs in nucleotide sequences, motifs that in turn are good candidates for regulatory activity. The server includes the following programs: Weeder, for the discovery of conserved transcription factor binding sites (TFBSs) in nucleotide sequences from co-regulated genes; WeederH, for the discovery of conserved TFBSs and distal regulatory modules in sequences from homologous genes; RNAProfile, for the discovery of conserved secondary structure motifs in unaligned RNA sequences whose secondary structure is not known. In this way, a given gene can be compared with other co-regulated genes or with its homologs, or its mRNA can be analyzed for conserved motifs regulating its post-transcriptional fate. The web server thus provides researchers with different strategies and methods to investigate the regulation of gene expression, at both the transcriptional and post-transcriptional levels. Available at http://www.pesolelab.it/modtools/ and http://www.beacon.unimi.it/modtools/.

  3. A Genomics Based Discovery of Secondary Metabolite Biosynthetic Gene Clusters in Aspergillus ustus

    PubMed Central

    Pi, Borui; Yu, Dongliang; Dai, Fangwei; Song, Xiaoming; Zhu, Congyi; Li, Hongye; Yu, Yunsong

    2015-01-01

    Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, and little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that was commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters that were possibly acquired by horizontal gene transfer were identified in Aspergillus for the first time, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but far fewer with other Aspergilli like A. niger and A. oryzae. These findings would help to understand the diversity and evolution of SM biosynthesis pathways in genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in the clinic. PMID:25706180

  4. Discovery of midgut genes for the RNA interference control of corn rootworm.

    PubMed

    Hu, Xu; Richtman, Nina M; Zhao, Jian-Zhou; Duncan, Keith E; Niu, Xiping; Procyk, Lisa A; Oneal, Meghan A; Kernodle, Bliss M; Steimel, Joseph P; Crane, Virginia C; Sandahl, Gary; Ritland, Julie L; Howard, Richard J; Presnail, James K; Lu, Albert L; Wu, Gusui

    2016-01-01

    RNA interference (RNAi) is a promising new technology for corn rootworm control. This paper presents the discovery of new gene targets - dvssj1 and dvssj2, in western corn rootworm (WCR). Dvssj1 and dvssj2 are orthologs of the Drosophila genes snakeskin (ssk) and mesh, respectively. These genes encode membrane proteins associated with smooth septate junctions (SSJ) which are required for intestinal barrier function. Based on bioinformatics analysis, dvssj1 appears to be an arthropod-specific gene. Diet based insect feeding assays using double-stranded RNA (dsRNA) targeting dvssj1 and dvssj2 demonstrate targeted mRNA suppression, larval growth inhibition, and mortality. In RNAi treated WCR, injury to the midgut was manifested by "blebbing" of the midgut epithelium into the gut lumen. Ultrastructural examination of midgut epithelial cells revealed apoptosis and regenerative activities. Transgenic plants expressing dsRNA targeting dvssj1 show insecticidal activity and significant plant protection from WCR damage. The data indicate that dvssj1 and dvssj2 are effective gene targets for the control of WCR using RNAi technology, by apparent suppression of production of their respective smooth septate junction membrane proteins located within the intestinal lining, leading to growth inhibition and mortality. PMID:27464714

  5. Gene discovery using next-generation pyrosequencing to develop ESTs for Phalaenopsis orchids

    PubMed Central

    2011-01-01

    Background Orchids are among the most diversified angiosperms, but few genomic resources are available for these non-model plants. In addition to its ecological significance, Phalaenopsis is economically important to the floriculture industry worldwide. We aimed to use massively parallel 454 pyrosequencing for a global characterization of the Phalaenopsis transcriptome. Results To maximize sequence diversity, we pooled RNA from 10 samples of different tissues, various developmental stages, and biotic- or abiotic-stressed plants. We obtained 206,960 expressed sequence tags (ESTs) with an average read length of 228 bp. These reads were assembled into 8,233 contigs and 34,630 singletons. The unigenes were searched against the NCBI non-redundant (NR) protein database. Based on sequence similarity with known proteins, these analyses identified 22,234 different genes (E-value cutoff, e-7). Assembled sequences were annotated with Gene Ontology, Gene Family and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Among these annotations, over 780 unigenes encoding putative transcription factors were identified. Conclusion Pyrosequencing was effective in identifying a large set of unigenes from Phalaenopsis. The informative EST dataset we developed constitutes a much-needed resource for discovery of genes involved in various biological processes in Phalaenopsis and other orchid species. These transcribed sequences will narrow the gap between study of model organisms with many genomic resources and species that are important for ecological and evolutionary studies. PMID:21749684

  6. Discovery of midgut genes for the RNA interference control of corn rootworm

    PubMed Central

    Hu, Xu; Richtman, Nina M.; Zhao, Jian-Zhou; Duncan, Keith E.; Niu, Xiping; Procyk, Lisa A.; Oneal, Meghan A.; Kernodle, Bliss M.; Steimel, Joseph P.; Crane, Virginia C.; Sandahl, Gary; Ritland, Julie L.; Howard, Richard J.; Presnail, James K.; Lu, Albert L.; Wu, Gusui

    2016-01-01

    RNA interference (RNAi) is a promising new technology for corn rootworm control. This paper presents the discovery of new gene targets - dvssj1 and dvssj2, in western corn rootworm (WCR). Dvssj1 and dvssj2 are orthologs of the Drosophila genes snakeskin (ssk) and mesh, respectively. These genes encode membrane proteins associated with smooth septate junctions (SSJ) which are required for intestinal barrier function. Based on bioinformatics analysis, dvssj1 appears to be an arthropod-specific gene. Diet based insect feeding assays using double-stranded RNA (dsRNA) targeting dvssj1 and dvssj2 demonstrate targeted mRNA suppression, larval growth inhibition, and mortality. In RNAi treated WCR, injury to the midgut was manifested by “blebbing” of the midgut epithelium into the gut lumen. Ultrastructural examination of midgut epithelial cells revealed apoptosis and regenerative activities. Transgenic plants expressing dsRNA targeting dvssj1 show insecticidal activity and significant plant protection from WCR damage. The data indicate that dvssj1 and dvssj2 are effective gene targets for the control of WCR using RNAi technology, by apparent suppression of production of their respective smooth septate junction membrane proteins located within the intestinal lining, leading to growth inhibition and mortality. PMID:27464714

  7. Next-generation diagnostics and disease-gene discovery with the Exomiser.

    PubMed

    Smedley, Damian; Jacobsen, Julius O B; Jäger, Marten; Köhler, Sebastian; Holtgrewe, Manuel; Schubach, Max; Siragusa, Enrico; Zemojtel, Tomasz; Buske, Orion J; Washington, Nicole L; Bone, William P; Haendel, Melissa A; Robinson, Peter N

    2015-12-01

    Exomiser is an application that prioritizes genes and variants in next-generation sequencing (NGS) projects for novel disease-gene discovery or differential diagnostics of Mendelian disease. Exomiser comprises a suite of algorithms for prioritizing exome sequences using random-walk analysis of protein interaction networks, clinical relevance and cross-species phenotype comparisons, as well as a wide range of other computational filters for variant frequency, predicted pathogenicity and pedigree analysis. In this protocol, we provide a detailed explanation of how to install Exomiser and use it to prioritize exome sequences in a number of scenarios. Exomiser requires ∼3 GB of RAM and roughly 15-90 s of computing time on a standard desktop computer to analyze a variant call format (VCF) file. Exomiser is freely available for academic use from http://www.sanger.ac.uk/science/tools/exomiser.

  8. Discovery of nucleotide polymorphisms in the Musa gene pool by Ecotilling

    PubMed Central

    Jankowicz-Cieslak, Joanna; Sági, László; Huynh, Owen A.; Utsushi, Hiroe; Swennen, Rony; Terauchi, Ryohei; Mba, Chikelu

    2010-01-01

    Musa (banana and plantain) is an important genus for the global export market and in local markets where it provides staple food for approximately 400 million people. Hybridization and polyploidization of several (sub)species, combined with vegetative propagation and human selection, have produced a complex genetic history. We describe the application of the Ecotilling method for the discovery and characterization of nucleotide polymorphisms in diploid and polyploid accessions of Musa. We discovered over 800 novel alleles in 80 accessions. Sequencing and band evaluation show Ecotilling to be a robust and accurate platform for the discovery of polymorphisms in homologous and homeologous gene targets. In the process of validating the method, we identified two single nucleotide polymorphisms that may be deleterious for the function of a gene putatively important for phototropism. Evaluation of heterozygous polymorphism and haplotype blocks revealed a high level of nucleotide diversity in Musa accessions. We further applied a strategy for the simultaneous discovery of heterozygous and homozygous polymorphisms in diploid accessions to rapidly evaluate nucleotide diversity in accessions of the same genome type. This strategy can be used to develop hypotheses for inheritance patterns of nucleotide polymorphisms within and between genome types. We conclude that Ecotilling is suitable for diversity studies in Musa, and that it can be considered for functional genomics studies and as a tool in selecting germplasm for traditional and mutation breeding approaches. Electronic supplementary material The online version of this article (doi:10.1007/s00122-010-1395-5) contains supplementary material, which is available to authorized users. PMID:20589365

  9. Long Serial Analysis of Gene Expression for Gene Discovery and Transcriptome Profiling in the Widespread Marine Coccolithophore Emiliania huxleyi

    PubMed Central

    Dyhrman, Sonya T.; Haley, Sheean T.; Birkeland, Shanda R.; Wurch, Louie L.; Cipriano, Michael J.; McArthur, Andrew G.

    2006-01-01

    The abundant and widespread coccolithophore Emiliania huxleyi plays an important role in mediating CO2 exchange between the ocean and the atmosphere through its impact on marine photosynthesis and calcification. Here, we use long serial analysis of gene expression (SAGE) to identify E. huxleyi genes responsive to nitrogen (N) or phosphorus (P) starvation. Long SAGE is an elegant approach for examining quantitative and comprehensive gene expression patterns without a priori knowledge of gene sequences via the detection of 21-bp nucleotide sequence tags. E. huxleyi appears to have a robust transcriptional-level response to macronutrient deficiency, with 42 tags uniquely present or up-regulated twofold or greater in the N-starved library and 128 tags uniquely present or up-regulated twofold or greater in the P-starved library. The expression patterns of several tags were validated with reverse transcriptase PCR. Roughly 48% of these differentially expressed tags could be mapped to publicly available genomic or expressed sequence tag (EST) sequence data. For example, in the P-starved library a number of the tags mapped to genes with a role in P scavenging, including a putative phosphate-repressible permease and a putative polyphosphate synthetase. In short, the long SAGE analyses have (i) identified many new differentially regulated gene sequences, (ii) assigned regulation data to EST sequences with no database homology and unknown function, and (iii) highlighted previously uncharacterized aspects of E. huxleyi N and P physiology. To this end, our long SAGE libraries provide a new public resource for gene discovery and transcriptional analysis in this biogeochemically important marine organism. PMID:16391051
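
    The criterion quoted above ("uniquely present or up-regulated twofold or greater") can be sketched as a comparison of tag counts between two libraries. This is a hedged illustration rather than the authors' analysis: the function name, the per-library scaling, and the tag names and counts are all assumptions, and the sketch applies no statistical test.

```python
def upregulated_tags(control, treated, min_fold=2.0):
    """Find tags unique to `treated` or at least `min_fold` more abundant.

    Counts are scaled by each library's total tag count so that libraries
    of different sizes are comparable (an assumed normalization). Tags
    absent from the control library get a fold change of infinity.
    """
    control_total = sum(control.values())
    treated_total = sum(treated.values())
    hits = {}
    for tag, count in treated.items():
        treated_freq = count / treated_total
        control_count = control.get(tag, 0)
        if control_count == 0:
            hits[tag] = float("inf")  # uniquely present in treated library
            continue
        fold = treated_freq / (control_count / control_total)
        if fold >= min_fold:
            hits[tag] = fold
    return hits


# Hypothetical tag counts; real long SAGE tags are 21-bp sequences.
replete = {"TAG_A": 50, "TAG_B": 30, "TAG_C": 20}
p_starved = {"TAG_A": 40, "TAG_B": 150, "TAG_D": 10}

hits = upregulated_tags(replete, p_starved)
for tag, fold in sorted(hits.items()):
    print(tag, fold)
```

    Here TAG_B passes the twofold threshold and TAG_D is reported as unique to the starved library, while TAG_A, which is relatively less abundant after scaling, is excluded.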

  11. Controlling false discoveries in multidimensional directional decisions, with applications to gene expression data on ordered categories.

    PubMed

    Guo, Wenge; Sarkar, Sanat K; Peddada, Shyamal D

    2010-06-01

    Microarray gene expression studies over ordered categories are routinely conducted to gain insights into biological functions of genes and the underlying biological processes. Some common experiments are time-course/dose-response experiments where a tissue or cell line is exposed to different doses and/or durations of time to a chemical. A goal of such studies is to identify gene expression patterns/profiles over the ordered categories. This problem can be formulated as a multiple testing problem where for each gene the null hypothesis of no difference between the successive mean gene expressions is tested and further directional decisions are made if it is rejected. Many of the existing multiple testing procedures are devised for controlling the usual false discovery rate (FDR) rather than the mixed directional FDR (mdFDR), the expected proportion of Type I and directional errors among all rejections. Benjamini and Yekutieli (2005, Journal of the American Statistical Association 100, 71-93) proved that an augmentation of the usual Benjamini-Hochberg (BH) procedure can control the mdFDR while testing simple null hypotheses against two-sided alternatives in terms of one-dimensional parameters. In this article, we consider the problem of controlling the mdFDR involving multidimensional parameters. To deal with this problem, we develop a procedure extending that of Benjamini and Yekutieli based on the Bonferroni test for each gene. A proof is given for its mdFDR control when the underlying test statistics are independent across the genes. The results of a simulation study evaluating its performance under independence as well as under dependence of the underlying test statistics across the genes relative to other relevant procedures are reported. Finally, the proposed methodology is applied to time-course microarray data obtained by Lobenhofer et al. (2002, Molecular Endocrinology 16, 1215-1229). We identified several important cell-cycle genes, such as DNA
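
    As a rough illustration of the one-dimensional setting that this abstract builds on, the sketch below applies the standard Benjamini-Hochberg step-up rule to two-sided p-values and then assigns a direction to each rejected hypothesis from the sign of its test statistic. It is not the multidimensional procedure proposed in the article; the function name and the example values are assumptions.

```python
def bh_directional(stats, pvals, q=0.05):
    """BH step-up on two-sided p-values, then sign-based directional calls.

    Returns one entry per hypothesis: +1 (rejected, positive direction),
    -1 (rejected, negative direction), or 0 (not rejected).
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    k = 0
    for rank, i in enumerate(order, start=1):
        # Step-up rule: keep the largest rank whose p-value passes rank*q/m.
        if pvals[i] <= rank * q / m:
            k = rank
    decisions = [0] * m
    for i in order[:k]:
        decisions[i] = 1 if stats[i] > 0 else -1
    return decisions


# Hypothetical test statistics and two-sided p-values for four genes.
print(bh_directional([3.1, -2.9, 0.2, 1.0], [0.002, 0.004, 0.84, 0.32]))
```

    With q = 0.05 and four hypotheses, the BH thresholds are 0.0125, 0.025, 0.0375 and 0.05, so only the two smallest p-values are rejected here.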

  12. Electrical stimulation accelerates and enhances expression of regeneration-associated genes in regenerating rat femoral motoneurons.

    PubMed

    Al-Majed, Abdulhakeem A; Tam, Siu Lin; Gordon, Tessa

    2004-06-01

    1. In this study we investigated whether electrical stimulation accelerates the upregulation of Talpha1-tubulin and GAP-43 (regeneration-associated genes; RAGs) and the downregulation of the medium-molecular-weight neurofilament (NFM), in concert with stimulation-induced acceleration of BDNF and trkB gene expression and axonal regeneration. 2. Two weeks prior to unilateral femoral nerve transection and suture, fluorogold (Fluorochrome Inc., Denver) or fluororuby (Dextran tetramethylrhodamine, Mol. Probes, D-1817, Eugene, OR) was injected into quadriceps muscles of the left and right hindlimbs to label the femoral motoneuron pools as previously described. Over a period of 7 days, fresh spinal cords were processed for semiquantitation of mRNA by using in situ hybridization. 3. There was an increase in Talpha1-tubulin and GAP-43 mRNA and a decline in the NFM mRNA at 7 days after nerve suture and sham stimulation but not in intact nerves. In contrast, 1-h stimulation of sutured but not intact nerves dramatically accelerated the changes in gene expression: mRNA levels of Talpha1-tubulin and GAP-43 were significantly elevated above control levels by 2 days while NFM mRNA was significantly reduced by 2 days in the sutured nerves. Thereby, the neurofilament/tubulin expression ratio was reduced at 2 days after suture and stimulation, possibly allowing more tubulin to be transported faster into the growing axons to accelerate the elongation rate following stimulation. Importantly, the changes in RAGs and NFM gene expression were delayed relative to the accelerated upregulation of BDNF and trkB mRNA by electrical stimulation. 4. The temporal sequence of upregulation of BDNF and trkB, altered gene expression of RAGs and NFM, and accelerated axonal outgrowth from the proximal nerve stump are consistent with a key role of BDNF and trkB in mediating the altered expression of RAGs and, in turn, the promotion of axonal outgrowth after electrical stimulation.

  13. Gene discovery by chemical mutagenesis and whole-genome sequencing in Dictyostelium.

    PubMed

    Li, Cheng-Lin Frank; Santhanam, Balaji; Webb, Amanda Nicole; Zupan, Blaž; Shaulsky, Gad

    2016-09-01

    Whole-genome sequencing is a useful approach for identification of chemical-induced lesions, but previous applications involved tedious genetic mapping to pinpoint the causative mutations. We propose that saturation mutagenesis under low mutagenic loads, followed by whole-genome sequencing, should allow direct implication of genes by identifying multiple independent alleles of each relevant gene. We tested the hypothesis by performing three genetic screens with chemical mutagenesis in the social soil amoeba Dictyostelium discoideum. Through genome sequencing, we successfully identified mutant genes with multiple alleles in near-saturation screens, including resistance to intense illumination and strong suppressors of defects in an allorecognition pathway. We tested the causality of the mutations by comparison to published data and by direct complementation tests, finding both dominant and recessive causative mutations. Therefore, our strategy provides a cost- and time-efficient approach to gene discovery by integrating chemical mutagenesis and whole-genome sequencing. The method should be applicable to many microbial systems, and it is expected to revolutionize the field of functional genomics in Dictyostelium by greatly expanding the mutation spectrum relative to other common mutagenesis methods. PMID:27307293

  14. Exploiting Pre-rRNA Processing in Diamond Blackfan Anemia Gene Discovery and Diagnosis

    PubMed Central

    Farrar, Jason E.; Quarello, Paola; Fisher, Ross; O’Brien, Kelly A.; Aspesi, Anna; Parrella, Sara; Henson, Adrianna L.; Seidel, Nancy E.; Atsidaftos, Eva; Prakash, Supraja; Bari, Shahla; Garelli, Emanuela; Arceci, Robert J.; Dianzani, Irma; Ramenghi, Ugo; Vlachos, Adrianna; Lipton, Jeffrey M.; Bodine, David M.; Ellis, Steven R.

    2014-01-01

    Diamond Blackfan anemia (DBA), a syndrome primarily characterized by anemia and physical abnormalities, is one among a group of related inherited bone marrow failure syndromes (IBMFS) which share overlapping clinical features. Heterozygous mutations or single-copy deletions have been identified in 12 ribosomal protein genes in approximately 60% of DBA cases, with the genetic etiology unexplained in most remaining patients. Unlike many IBMFS, for which functional screening assays complement clinical and genetic findings, suspected DBA in the absence of typical alterations of the known genes must frequently be diagnosed after exclusion of other IBMFS. We report here a novel deletion in a child that presented such a diagnostic challenge and prompted development of a novel functional assay that can assist in the diagnosis of a significant fraction of patients with DBA. The ribosomal proteins affected in DBA are required for pre-rRNA processing, a process which can be interrogated to monitor steps in the maturation of 40S and 60S ribosomal subunits. In contrast to prior methods used to assess pre-rRNA processing, the assay reported here, based on capillary electrophoresis measurement of the maturation of rRNA in pre-60S ribosomal subunits, would be readily amenable to use in diagnostic laboratories. In addition to utility as a diagnostic tool, we applied this technique to gene discovery in DBA, resulting in the identification of RPL31 as a novel DBA gene. PMID:25042156

  15. Discovery of genes related to formothion resistance in oriental fruit fly (Bactrocera dorsalis) by a constrained functional genomics analysis.

    PubMed

    Kuo, T C-Y; Hu, C-C; Chien, T-Y; Chen, M J M; Feng, H-T; Chen, L-F O; Chen, C-Y; Hsu, J-C

    2015-06-01

    Artificial selection can provide insights into how insecticide resistance mechanisms evolve in populations. The underlying basis of such phenomena can involve complex interactions of multiple genes, and the resolution of this complexity first necessitates confirmation that specific genes are involved in resistance mechanisms. Here, we used a novel approach invoking a constrained RNA sequencing analysis to refine the discovery of specific genes involved in insecticide resistance. Specifically, for gene discovery, an additional constraint was added to the traditional comparisons of susceptible vs. resistant flies by the incorporation of a line in which insecticide susceptibility was 'recovered' within a resistant line by the removal of insecticide stress. In our analysis, the criterion for the classification of any gene as related to insecticide resistance was based on evidence for differential expression in the resistant line as compared with both the susceptible and recovered lines. The incorporation of this additional constraint reduced the number of differentially expressed genes putatively involved in resistance to 464, compared with more than 1000 that had been identified previously using this same species. In addition, our analysis identified several key genes involved in metabolic detoxification processes that showed up-regulated expression. Furthermore, the involvement of acetylcholinesterase, a known target for modification in insecticide resistance, was associated with three key nonsynonymous amino acid substitutions within our data. In conclusion, the incorporation of an additional constraint using a 'recovered' line for gene discovery provides a higher degree of confidence in genes identified to be involved in insecticide resistance phenomena.
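
    The classification criterion described in this abstract amounts to intersecting two differential-expression result sets. The sketch below shows that idea only; the gene identifiers are hypothetical and the function is not the authors' pipeline.

```python
def constrained_candidates(de_vs_susceptible, de_vs_recovered):
    """Keep only genes differentially expressed in the resistant line
    relative to BOTH the susceptible line and the 'recovered' line."""
    return set(de_vs_susceptible) & set(de_vs_recovered)


# Hypothetical gene IDs called differentially expressed in each comparison.
resistant_vs_susceptible = {"geneA", "geneB", "geneC"}
resistant_vs_recovered = {"geneB", "geneC", "geneD"}

print(sorted(constrained_candidates(resistant_vs_susceptible,
                                    resistant_vs_recovered)))
```

    Requiring both comparisons to agree can only shrink the candidate list, which is consistent with the reduction to 464 genes reported above.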

  16. SNP discovery in candidate adaptive genes using exon capture in a free-ranging alpine ungulate.

    PubMed

    Roffler, Gretchen H; Amish, Stephen J; Smith, Seth; Cosart, Ted; Kardos, Marty; Schwartz, Michael K; Luikart, Gordon

    2016-09-01

    Identification of genes underlying genomic signatures of natural selection is key to understanding adaptation to local conditions. We used targeted resequencing to identify SNP markers in 5321 candidate adaptive genes associated with known immunological, metabolic and growth functions in ovids and other ungulates. We selectively targeted 8161 exons in protein-coding and nearby 5' and 3' untranslated regions of chosen candidate genes. Targeted sequences were taken from bighorn sheep (Ovis canadensis) exon capture data and directly from the domestic sheep genome (Ovis aries v. 3; oviAri3). The bighorn sheep sequences used in the Dall's sheep (Ovis dalli dalli) exon capture aligned to 2350 genes on the oviAri3 genome with an average of 2 exons each. We developed a microfluidic qPCR-based SNP chip to genotype 476 Dall's sheep from locations across their range and test for patterns of selection. Using multiple corroborating approaches (lositan and bayescan), we detected 28 SNP loci potentially under selection. We additionally identified candidate loci significantly associated with latitude, longitude, precipitation and temperature, suggesting local environmental adaptation. The three methods demonstrated consistent support for natural selection on nine genes with immune and disease-regulating functions (e.g. Ovar-DRA, APC, BATF2, MAGEB18), cell regulation signalling pathways (e.g. KRIT1, PI3K, ORRC3), and respiratory health (CYSLTR1). Characterizing adaptive allele distributions from novel genetic techniques will facilitate investigation of the influence of environmental variation on local adaptation of a northern alpine ungulate throughout its range. This research demonstrated the utility of exon capture for gene-targeted SNP discovery and subsequent SNP chip genotyping using low-quality samples in a nonmodel species. PMID:27327375

  18. Leveraging gene-environment interactions and endotypes for asthma gene discovery.

    PubMed

    Bønnelykke, Klaus; Ober, Carole

    2016-03-01

    Asthma is a heterogeneous clinical syndrome that includes subtypes of disease with different underlying causes and disease mechanisms. Asthma is caused by a complex interaction between genes and environmental exposures; early-life exposures in particular play an important role. Asthma is also heritable, and a number of susceptibility variants have been discovered in genome-wide association studies, although the known risk alleles explain only a small proportion of the heritability. In this review, we present evidence supporting the hypothesis that focusing on more specific asthma phenotypes, such as childhood asthma with severe exacerbations, and on relevant exposures that are involved in gene-environment interactions (GEIs), such as rhinovirus infections, will improve detection of asthma genes and our understanding of the underlying mechanisms. We will discuss the challenges of considering GEIs and the advantages of studying responses to asthma-associated exposures in clinical birth cohorts, as well as in cell models of GEIs, to dissect the context-specific nature of genotypic risks, to prioritize variants in genome-wide association studies, and to identify pathways involved in pathogenesis in subgroups of patients. We propose that such approaches, in spite of their many challenges, present great opportunities for better understanding of asthma pathogenesis and heterogeneity and, ultimately, for improving prevention and treatment of disease. PMID:26947980

  20. 141 Gene Discovery and Data Sharing in Genome Wide Association Analyses: lessons from AIDS genetic restriction genes

    PubMed Central

    O'Brien, Stephen J; Svitin, Anton; Malov, Sergey; Cherkazov, Nikolay; Dobrynin, Pavel; Geerts, Paul; Troyer, Jennifer; Hendrickson-Lambert, Sher; Sezgin, Efe; Hutcheson, Holli

    2014-01-01

    As genome wide association studies plus whole genome sequence analyses for complex human disease determinants are expanding, it seems useful to develop strategies to facilitate large data sharing, rapid replication and validation of provocative statistical associations that straddle the threshold for genome wide significance. At this conference, we shall announce GWATCH (Genome Wide Association Tracks Chromosome Highway), a web-based data release platform that can freely display and inspect unabridged genome-tracked association data without compromising privacy or informed consent constraints, allowing for rapid discovery and replication opportunities. We illustrate the utility with HIV-AIDS resistance genes screened in GWAS of combined large multicenter cohort studies (MACS, HGDS, MHGS, ALLIVE, LSOCA HOMER) developed and studied over the last decades.

  1. Adaptive radiation of venomous marine snail lineages and the accelerated evolution of venom peptide genes.

    PubMed

    Olivera, Baldomero M; Watkins, Maren; Bandyopadhyay, Pradip; Imperial, Julita S; de la Cotera, Edgar P Heimer; Aguilar, Manuel B; Vera, Estuardo López; Concepcion, Gisela P; Lluisma, Arturo

    2012-09-01

    An impressive biodiversity (>10,000 species) of marine snails (suborder Toxoglossa or superfamily Conoidea) have complex venoms, each containing approximately 100 biologically active, disulfide-rich peptides. In the genus Conus, the most intensively investigated toxoglossan lineage (∼500 species), a small set of venom gene superfamilies undergo rapid sequence hyperdiversification within their mature toxin regions. Each major lineage of Toxoglossa has its own distinct set of venom gene superfamilies. Two recently identified venom gene superfamilies are expressed in the large Turridae clade, but not in Conus. Thus, as major venomous molluscan clades expand, a small set of lineage-specific venom gene superfamilies undergo accelerated evolution. The juxtaposition of extremely conserved signal sequences with hypervariable mature peptide regions is unprecedented and raises the possibility that in these gene superfamilies, the signal sequences are conserved as a result of an essential role they play in enabling rapid sequence evolution of the region of the gene that encodes the active toxin.

  2. Characterization of ESTs from black locust for gene discovery and marker development.

    PubMed

    Wang, J X; Lu, C; Yuan, C Q; Cui, B B; Qiu, Q D; Sun, P; Hu, R Y; Wu, D C; Sun, Y H; Li, Y

    2015-01-01

    Black locust (Robinia pseudoacacia L.) is an ecologically and economically important species. However, it has relatively underdeveloped genomic resources, and this limits gene discovery and marker-assisted selective breeding. In the present study, we obtained large-scale transcriptome data using a next-generation sequencing platform to compensate for the lack of black locust genomic information. Increasing the amount of transcriptome data for black locust will provide a valuable resource for multi-gene phylogenetic analyses and will facilitate research on the mechanisms whereby conserved genes and functions are maintained in the face of species divergence. We sequenced the black locust transcriptome from a cDNA library of multiple tissues and individuals on an Illumina platform, and this produced 108,229,352 clean sequence reads. The high-quality overlapping expressed sequence tags (ESTs) were assembled into 36,533 unigenes, and 4781 simple sequence repeats were characterized. A large collection of high-quality ESTs was obtained, de novo assembled, and characterized. Our results markedly expand the previous transcript catalogues of black locust and can gradually be applied to black locust breeding programs. Furthermore, our data will facilitate future research on the comparative genomics of black locust and related species. PMID:26505419

  3. Sleeping Beauty transposon insertional mutagenesis based mouse models for cancer gene discovery

    PubMed Central

    Moriarity, Branden S; Largaespada, David A

    2016-01-01

    Large-scale genomic efforts to study human cancer, such as The Cancer Genome Atlas (TCGA), have identified numerous cancer drivers in a wide variety of tumor types. However, there are limitations to this approach: the mutations and expression or copy number changes that are identified are not always clearly functionally relevant, and only annotated genes and genetic elements are thoroughly queried. The use of complementary, nonbiased, functional approaches to identify drivers of cancer development and progression is ideal to maximize the rate at which cancer discoveries are achieved. One such approach that has been successful is the use of the Sleeping Beauty (SB) transposon-based mutagenesis system in mice. This system uses a conditionally expressed transposase and mutagenic transposon allele to target mutagenesis to somatic cells of a given tissue in mice to cause random mutations leading to tumor development. Analysis of tumors for transposon common insertion sites (CIS) identifies candidate cancer genes specific to that tumor type. While similar screens have been performed in mice with the PiggyBac (PB) transposon and viral approaches, we limit extensive discussion to SB. Here we discuss the basic structure of these screens, screens that have been performed, and methods used to identify CIS. PMID:26051241

  4. De novo transcriptome sequencing and discovery of genes related to copper tolerance in Paeonia ostii.

    PubMed

    Wang, Yanjie; Dong, Chunlan; Xue, Zeyun; Jin, Qijiang; Xu, Yingchun

    2016-01-15

    Paeonia ostii, an important ornamental and medicinal plant, grows normally on copper (Cu) mines with widespread Cu contamination of soils, and it has the ability to lower Cu contents in the Cu-contaminated soils. However, very little molecular information concerning Cu resistance of P. ostii is available. In this study, high-throughput de novo transcriptome sequencing was carried out for P. ostii with and without Cu treatment using the Illumina HiSeq 2000 platform. A total of 77,704 All-unigenes were obtained with a mean length of 710 bp. Of these unigenes, 47,461 were annotated with public databases based on sequence similarities. Comparative transcript profiling allowed the discovery of 4324 differentially expressed genes (DEGs), with 2207 up-regulated and 2117 down-regulated unigenes in the Cu-treated library as compared to the control counterpart. Based on these DEGs, Gene Ontology (GO) enrichment analysis indicated Cu stress-relevant terms, such as 'membrane' and 'antioxidant activity'. Meanwhile, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis uncovered some important pathways, including 'biosynthesis of secondary metabolites' and 'metabolic pathways'. In addition, expression patterns of 12 selected DEGs derived from quantitative real-time polymerase chain reaction (qRT-PCR) were consistent with their transcript abundance changes obtained by transcriptomic analyses, suggesting that all the 12 genes were authentically involved in Cu tolerance in P. ostii. This is the first report to identify genes related to Cu stress responses in P. ostii, which could offer valuable information on the molecular mechanisms of Cu resistance, and provide a basis for further genomics research on this and related ornamental species for phytoremediation.

  5. Discovery of MicroRNA169 Gene Copies in Genomes of Flowering Plants through Positional Information

    PubMed Central

    Calviño, Martín; Messing, Joachim

    2013-01-01

    Expansion and contraction of microRNA (miRNA) families can be studied in sequenced plant genomes through sequence alignments. Here, we focused on miR169 in sorghum because of its implications in drought tolerance and stem-sugar content. We were able to discover many miR169 copies that have escaped standard genome annotation methods. A new miR169 cluster was found on sorghum chromosome 1. This cluster is composed of the previously annotated sbi-MIR169o together with two newly found MIR169 copies, named sbi-MIR169t and sbi-MIR169u. We also found that a miR169 cluster on sorghum chr7 consisting of sbi-MIR169l, sbi-MIR169m, and sbi-MIR169n is contained within a chromosomal inversion of at least 500 kb that occurred in sorghum relative to Brachypodium, rice, foxtail millet, and maize. Surprisingly, synteny of chromosomal segments containing MIR169 copies with linked bHLH and CONSTANS-LIKE genes extended from Brachypodium to dicotyledonous species such as grapevine, soybean, and cassava, indicating a strong conservation of linkages of certain flowering and/or plant height genes and microRNAs, which may explain linkage drag of drought and flowering traits and would have consequences for breeding new varieties. Furthermore, alignment of rice and sorghum orthologous regions revealed the presence of two additional miR169 gene copies (miR169r and miR169s) on sorghum chr7 that formed an antisense miRNA gene pair. Both copies are expressed and target different sets of genes. Synteny-based analysis of microRNAs among different plant species should lead to the discovery of new microRNAs in general and contribute to our understanding of their evolution. PMID:23348041

  6. Gene Discovery for Synthetic Biology: Exploring the Novel Natural Product Biosynthetic Capacity of Eukaryotic Microalgae.

    PubMed

    O'Neill, E C; Saalbach, G; Field, R A

    2016-01-01

    Eukaryotic microalgae are an incredibly diverse group of organisms whose sole unifying feature is their ability to photosynthesize. They are known for producing a range of potent toxins, which can build up during harmful algal blooms causing damage to ecosystems and fisheries. Genome sequencing is lagging behind in these organisms because of their genetic complexity, but transcriptome sequencing is beginning to make up for this deficit. As more sequence data becomes available, it is apparent that eukaryotic microalgae possess a range of complex natural product biosynthesis capabilities. Some of the genes concerned are responsible for the biosynthesis of known toxins, but there are many more for which we do not know the products. Bioinformatic and analytical techniques have been developed for natural product discovery in bacteria and these approaches can be used to extract information about the products synthesized by algae. Recent analyses suggest that eukaryotic microalgae produce many complex natural products that remain to be discovered. PMID:27480684

  7. A Sorghum Mutant Resource as an Efficient Platform for Gene Discovery in Grasses

    PubMed Central

    Burke, John; Chen, Junping; Wang, Bo; Hayes, Chad; Emendack, Yves

    2016-01-01

    Sorghum (Sorghum bicolor) is a versatile C4 crop and a model for research in family Poaceae. High-quality genome sequence is available for the elite inbred line BTx623, but functional validation of genes remains challenging due to the limited genomic and germplasm resources available for comprehensive analysis of induced mutations. In this study, we generated 6400 pedigreed M4 mutant pools from EMS-mutagenized BTx623 seeds through single-seed descent. Whole-genome sequencing of 256 phenotyped mutant lines revealed >1.8 million canonical EMS-induced mutations, affecting >95% of genes in the sorghum genome. The vast majority (97.5%) of the induced mutations were distinct from natural variations. To demonstrate the utility of the sequenced sorghum mutant resource, we performed reverse genetics to identify eight genes potentially affecting drought tolerance, three of which had allelic mutations and two of which exhibited exact cosegregation with the phenotype of interest. Our results establish that a large-scale resource of sequenced pedigreed mutants provides an efficient platform for functional validation of genes in sorghum, thereby accelerating sorghum breeding. Moreover, findings made in sorghum could be readily translated to other members of the Poaceae via integrated genomics approaches. PMID:27354556

  8. Grouped False-Discovery Rate for Removing the Gene-Set-Level Bias of RNA-seq.

    PubMed

    Yang, Tae Young; Jeong, Seongmun

    2013-01-01

    In recent years, RNA-seq has become a very competitive alternative to microarrays. In RNA-seq experiments, the expected read count for a gene is proportional to its expression level multiplied by its transcript length. Even when two genes are expressed at the same level, differences in length will yield differing numbers of total reads. The characteristics of these RNA-seq experiments create a gene-level bias such that the proportion of significantly differentially expressed genes increases with the transcript length, whereas such bias is not present in microarray data. Gene-set analysis seeks to identify the gene sets that are enriched in the list of the identified significant genes. In the gene-set analysis of RNA-seq, the gene-level bias subsequently yields the gene-set-level bias that a gene set with genes of long length will be more likely to show up as enriched than will a gene set with genes of shorter length. Because gene expression is not related to its transcript length, any gene set containing long genes is not of biologically greater interest than gene sets with shorter genes. Accordingly the gene-set-level bias should be removed to accurately calculate the statistical significance of each gene-set enrichment in the RNA-seq. We present a new gene set analysis method of RNA-seq, called FDRseq, which can accurately calculate the statistical significance of a gene-set enrichment score by the grouped false-discovery rate. Numerical examples indicated that FDRseq is appropriate for controlling the transcript length bias in the gene-set analysis of RNA-seq data. To implement FDRseq, we developed the R program, which can be downloaded at no cost from http://home.mju.ac.kr/home/index.action?siteId=tyang.
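
    The length dependence stated in this abstract (expected read count proportional to expression level times transcript length) can be made concrete with a small sketch. The function names, the proportionality constant, and the example values are assumptions for illustration; this is not the FDRseq method.

```python
def expected_reads(expression, length_bp, depth_scale=1.0):
    """Expected read count, taken as proportional to expression x length.

    `depth_scale` is an assumed proportionality constant standing in for
    sequencing depth.
    """
    return depth_scale * expression * length_bp


def reads_per_kb(reads, length_bp):
    """Length-normalized read count (reads per kilobase of transcript)."""
    return reads / (length_bp / 1000.0)


# Two hypothetical genes expressed at the same level but of different length.
short_gene = expected_reads(expression=2.0, length_bp=1000)
long_gene = expected_reads(expression=2.0, length_bp=3000)

print(short_gene, long_gene)            # the longer gene yields more reads
print(reads_per_kb(short_gene, 1000),   # per-kb values are equal
      reads_per_kb(long_gene, 3000))
```

    Because the longer gene contributes more reads at the same expression level, a count-based test has more power to call it differentially expressed, which is the gene-level bias the abstract describes.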

  9. Seed traits and genes important for translational biology--highlights from recent discoveries.

    PubMed

    Martínez-Andújar, Cristina; Martin, Ruth C; Nonogaki, Hiroyuki

    2012-01-01

    Seeds provide food, feed, fiber and fuel. They are also an important delivery system of genetic information, which is essential for the survival of wild species in ecosystems and the production of agricultural crops. In this review, seed traits and genes that are potentially important for agricultural applications are discussed. Over the long period of crop domestication, seed traits have been modified through intentional or unintentional selections. While most selections have led to seed traits favorable for agricultural consumption, such as larger seeds with higher nutritional value than the wild type, other manipulations in modern breeding sometimes led to negative traits, such as vivipary, precocious germination on the maternal plant or reduced seed vigor, as a side effect during the improvement of other characteristics. Greater effort is needed to overcome these problems that have emerged as a consequence of crop improvement. Seed biology researchers have characterized the function of many genes in the last decade, including those associated with seed domestication, which may be useful in addressing critical issues in modern agriculture, such as the prevention of vivipary and seed shattering or the enhancement of yields. Recent discoveries in seed biology research are highlighted in this review, with an emphasis on their potential for translational biology.

  10. Adeno-Associated Virus at 50: A Golden Anniversary of Discovery, Research, and Gene Therapy Success—A Personal Perspective

    PubMed Central

    Hastie, Eric

    2015-01-01

    Fifty years after the discovery of adeno-associated virus (AAV) and more than 30 years after the first gene transfer experiment was conducted, dozens of gene therapy clinical trials are in progress, one vector is approved for use in Europe, and breakthroughs in virus modification and disease modeling are paving the way for a revolution in the treatment of rare diseases, cancer, as well as HIV. This review will provide a historical perspective on the progression of AAV for gene therapy from discovery to the clinic, focusing on contributions from the Samulski lab regarding basic science and cloning of AAV, optimized large-scale production of vectors, preclinical large animal studies and safety data, vector modifications for improved efficacy, and successful clinical applications. PMID:25807962

  11. Adeno-associated virus at 50: a golden anniversary of discovery, research, and gene therapy success--a personal perspective.

    PubMed

    Hastie, Eric; Samulski, R Jude

    2015-05-01

    Fifty years after the discovery of adeno-associated virus (AAV) and more than 30 years after the first gene transfer experiment was conducted, dozens of gene therapy clinical trials are in progress, one vector is approved for use in Europe, and breakthroughs in virus modification and disease modeling are paving the way for a revolution in the treatment of rare diseases, cancer, as well as HIV. This review will provide a historical perspective on the progression of AAV for gene therapy from discovery to the clinic, focusing on contributions from the Samulski lab regarding basic science and cloning of AAV, optimized large-scale production of vectors, preclinical large animal studies and safety data, vector modifications for improved efficacy, and successful clinical applications.

  12. Gene invasion in distant eukaryotic lineages: discovery of mutually exclusive genetic elements reveals marine biodiversity.

    PubMed

    Monier, Adam; Sudek, Sebastian; Fast, Naomi M; Worden, Alexandra Z

    2013-09-01

    Inteins are rare, translated genetic parasites mainly found in bacteria and archaea, while spliceosomal introns are distinctly eukaryotic features abundant in most nuclear genomes. Using targeted metagenomics, we discovered an intein in an Atlantic population of the photosynthetic eukaryote, Bathycoccus, harbored by the essential spliceosomal protein PRP8 (processing factor 8 protein). Although previously thought exclusive to fungi, we also identified PRP8 inteins in parasitic (Capsaspora) and predatory (Salpingoeca) protists. Most new PRP8 inteins were at novel insertion sites that, surprisingly, were not in the most conserved regions of the gene. Evolutionarily, Dikarya fungal inteins at PRP8 insertion site a appeared more related to the Bathycoccus intein at a unique insertion site, than to other fungal and opisthokont inteins. Strikingly, independent analyses of Pacific and Atlantic samples revealed an intron at the same codon as the Bathycoccus PRP8 intein. The two elements are mutually exclusive and neither was found in cultured Bathycoccus or other picoprasinophyte genomes. Thus, wild Bathycoccus contain one of few non-fungal eukaryotic inteins known and a rare polymorphic intron. Our data indicate at least two Bathycoccus ecotypes exist, associated respectively with oceanic or mesotrophic environments. We hypothesize that intein propagation is facilitated by marine viruses; and, while intron gain is still poorly understood, presence of a spliceosomal intron where a locus lacks an intein raises the possibility of new, intein-primed mechanisms for intron gain. The discovery of nucleus-encoded inteins and associated sequence polymorphisms in uncultivated marine eukaryotes highlights their diversity and reveals potential sexual boundaries between populations indistinguishable by common marker genes.

  13. Using Phenomic Analysis of Photosynthetic Function for Abiotic Stress Response Gene Discovery

    PubMed Central

    Rungrat, Tepsuda; Awlia, Mariam; Brown, Tim; Cheng, Riyan; Sirault, Xavier; Fajkus, Jiri; Trtilek, Martin; Furbank, Bob; Badger, Murray; Tester, Mark; Pogson, Barry J; Borevitz, Justin O; Wilson, Pip

    2016-01-01

    Monitoring the photosynthetic performance of plants is a major key to understanding how plants adapt to their growth conditions. Stress tolerance traits have a high genetic complexity as plants are constantly, and unavoidably, exposed to numerous stress factors, which limits their growth rates in the natural environment. Arabidopsis thaliana, with its broad genetic diversity and wide climatic range, has been shown to successfully adapt to stressful conditions to ensure the completion of its life cycle. As a result, A. thaliana has become a robust and renowned plant model system for studying natural variation and conducting gene discovery studies. Genome-wide association studies (GWAS) in restructured populations combining natural and recombinant lines are a particularly effective way to identify the genetic basis of complex traits. As most abiotic stresses affect photosynthetic activity, chlorophyll fluorescence measurements are a potential phenotyping technique for monitoring plant performance under stress conditions. This review focuses on the use of chlorophyll fluorescence as a tool to study genetic variation underlying the stress tolerance responses to abiotic stress in A. thaliana. PMID:27695390

  14. Using Phenomic Analysis of Photosynthetic Function for Abiotic Stress Response Gene Discovery

    PubMed Central

    Rungrat, Tepsuda; Awlia, Mariam; Brown, Tim; Cheng, Riyan; Sirault, Xavier; Fajkus, Jiri; Trtilek, Martin; Furbank, Bob; Badger, Murray; Tester, Mark; Pogson, Barry J; Borevitz, Justin O; Wilson, Pip

    2016-01-01

    Monitoring the photosynthetic performance of plants is a major key to understanding how plants adapt to their growth conditions. Stress tolerance traits have a high genetic complexity as plants are constantly, and unavoidably, exposed to numerous stress factors, which limits their growth rates in the natural environment. Arabidopsis thaliana, with its broad genetic diversity and wide climatic range, has been shown to successfully adapt to stressful conditions to ensure the completion of its life cycle. As a result, A. thaliana has become a robust and renowned plant model system for studying natural variation and conducting gene discovery studies. Genome-wide association studies (GWAS) in restructured populations combining natural and recombinant lines are a particularly effective way to identify the genetic basis of complex traits. As most abiotic stresses affect photosynthetic activity, chlorophyll fluorescence measurements are a potential phenotyping technique for monitoring plant performance under stress conditions. This review focuses on the use of chlorophyll fluorescence as a tool to study genetic variation underlying the stress tolerance responses to abiotic stress in A. thaliana.

  15. Paradigm of Tunable Clustering Using Binarization of Consensus Partition Matrices (Bi-CoPaM) for Gene Discovery

    PubMed Central

    Abu-Jamous, Basel; Fa, Rui; Roberts, David J.; Nandi, Asoke K.

    2013-01-01

    Clustering analysis has a growing role in the study of co-expressed genes for gene discovery. Conventional binary and fuzzy clustering do not embrace the biological reality that some genes may be irrelevant for a problem and not be assigned to a cluster, while other genes may participate in several biological functions and should simultaneously belong to multiple clusters. Also, these algorithms cannot generate tight clusters that focus on their cores or wide clusters that overlap and contain all possibly relevant genes. In this paper, a new clustering paradigm is proposed. In this paradigm, all three eventualities of a gene being exclusively assigned to a single cluster, being assigned to multiple clusters, and being not assigned to any cluster are possible. These possibilities are realised through the primary novelty of the introduction of tunable binarization techniques. Results from multiple clustering experiments are aggregated to generate one fuzzy consensus partition matrix (CoPaM), which is then binarized to obtain the final binary partitions. This is referred to as Binarization of Consensus Partition Matrices (Bi-CoPaM). The method has been tested with a set of synthetic datasets and a set of five real yeast cell-cycle datasets. The results demonstrate its validity in generating relevant tight, wide, and complementary clusters that can meet requirements of different gene discovery studies. PMID:23409186
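
    The aggregation and binarization steps described in this abstract can be illustrated with a small sketch. It assumes that the cluster labels of the individual clustering results have already been aligned, that the consensus is a plain element-wise average, and that binarization uses a single membership threshold; the abstract mentions several tunable binarization techniques without detailing them, so the function names and the threshold rule here are illustrative assumptions only.

```python
from typing import List

Matrix = List[List[float]]  # rows = clusters, columns = genes


def consensus_partition_matrix(partitions: List[Matrix]) -> Matrix:
    """Average aligned partition matrices element-wise into a fuzzy CoPaM."""
    n = len(partitions)
    n_clusters = len(partitions[0])
    n_genes = len(partitions[0][0])
    return [
        [sum(p[c][g] for p in partitions) / n for g in range(n_genes)]
        for c in range(n_clusters)
    ]


def binarize_by_threshold(copam: Matrix, threshold: float) -> List[List[int]]:
    """Assign a gene to every cluster where its membership >= threshold.

    Assumed rule for illustration: a low threshold gives wide, overlapping
    clusters; a high threshold gives tight clusters and may leave some
    genes unassigned.
    """
    return [[1 if value >= threshold else 0 for value in row] for row in copam]


if __name__ == "__main__":
    # Two toy clusterings of three genes into two clusters (made-up data).
    run_1 = [[1, 1, 0], [0, 0, 1]]
    run_2 = [[1, 0, 0], [0, 1, 1]]
    copam = consensus_partition_matrix([run_1, run_2])
    print(copam)                               # fuzzy memberships
    print(binarize_by_threshold(copam, 0.5))   # wide: gene 1 in both clusters
    print(binarize_by_threshold(copam, 0.75))  # tight: gene 1 unassigned
```

    In this toy case the second gene is placed in different clusters by the two runs, so it ends up in both clusters under the wide setting and in neither under the tight setting, which mirrors the three eventualities listed in the abstract.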

  16. A combination of gene expression ranking and co-expression network analysis increases discovery rate in large-scale mutant screens for novel Arabidopsis thaliana abiotic stress genes.

    PubMed

    Ransbotyn, Vanessa; Yeger-Lotem, Esti; Basha, Omer; Acuna, Tania; Verduyn, Christoph; Gordon, Michal; Chalifa-Caspi, Vered; Hannah, Matthew A; Barak, Simon

    2015-05-01

    As challenges to food security increase, the demand for lead genes for improving crop production is growing. However, genetic screens of plant mutants typically yield very low frequencies of desired phenotypes. Here, we present a powerful computational approach for selecting candidate genes for screening insertion mutants. We combined ranking of Arabidopsis thaliana regulatory genes according to their expression in response to multiple abiotic stresses (Multiple Stress [MST] score), with stress-responsive RNA co-expression network analysis to select candidate multiple stress regulatory (MSTR) genes. Screening of 62 T-DNA insertion mutants defective in candidate MSTR genes, for abiotic stress germination phenotypes yielded a remarkable hit rate of up to 62%; this gene discovery rate is 48-fold greater than that of other large-scale insertional mutant screens. Moreover, the MST score of these genes could be used to prioritize them for screening. To evaluate the contribution of the co-expression analysis, we screened 64 additional mutant lines of MST-scored genes that did not appear in the RNA co-expression network. The screening of these MST-scored genes yielded a gene discovery rate of 36%, which is much higher than that of classic mutant screens but not as high as when picking candidate genes from the co-expression network. The MSTR co-expression network that we created, AraSTressRegNet is publicly available at http://netbio.bgu.ac.il/arnet. This systems biology-based screening approach combining gene ranking and network analysis could be generally applicable to enhancing identification of genes regulating additional processes in plants and other organisms provided that suitable transcriptome data are available. PMID:25370817

  17. A combination of gene expression ranking and co-expression network analysis increases discovery rate in large-scale mutant screens for novel Arabidopsis thaliana abiotic stress genes.

    PubMed

    Ransbotyn, Vanessa; Yeger-Lotem, Esti; Basha, Omer; Acuna, Tania; Verduyn, Christoph; Gordon, Michal; Chalifa-Caspi, Vered; Hannah, Matthew A; Barak, Simon

    2015-05-01

    As challenges to food security increase, the demand for lead genes for improving crop production is growing. However, genetic screens of plant mutants typically yield very low frequencies of desired phenotypes. Here, we present a powerful computational approach for selecting candidate genes for screening insertion mutants. We combined ranking of Arabidopsis thaliana regulatory genes according to their expression in response to multiple abiotic stresses (Multiple Stress [MST] score), with stress-responsive RNA co-expression network analysis to select candidate multiple stress regulatory (MSTR) genes. Screening of 62 T-DNA insertion mutants defective in candidate MSTR genes, for abiotic stress germination phenotypes yielded a remarkable hit rate of up to 62%; this gene discovery rate is 48-fold greater than that of other large-scale insertional mutant screens. Moreover, the MST score of these genes could be used to prioritize them for screening. To evaluate the contribution of the co-expression analysis, we screened 64 additional mutant lines of MST-scored genes that did not appear in the RNA co-expression network. The screening of these MST-scored genes yielded a gene discovery rate of 36%, which is much higher than that of classic mutant screens but not as high as when picking candidate genes from the co-expression network. The MSTR co-expression network that we created, AraSTressRegNet is publicly available at http://netbio.bgu.ac.il/arnet. This systems biology-based screening approach combining gene ranking and network analysis could be generally applicable to enhancing identification of genes regulating additional processes in plants and other organisms provided that suitable transcriptome data are available.

  18. An Evaluation of Active Learning Causal Discovery Methods for Reverse-Engineering Local Causal Pathways of Gene Regulation.

    PubMed

    Ma, Sisi; Kemmeren, Patrick; Aliferis, Constantin F; Statnikov, Alexander

    2016-01-01

    Reverse-engineering of causal pathways that implicate diseases and vital cellular functions is a fundamental problem in biomedicine. Discovery of the local causal pathway of a target variable (that consists of its direct causes and direct effects) is essential for effective intervention and can facilitate accurate diagnosis and prognosis. Recent research has provided several active learning methods that can leverage passively observed high-throughput data to draft causal pathways and then refine the inferred relations with a limited number of experiments. The current study provides a comprehensive evaluation of the performance of active learning methods for local causal pathway discovery in real biological data. Specifically, 54 active learning methods/variants from 3 families of algorithms were applied for local causal pathways reconstruction of gene regulation for 5 transcription factors in S. cerevisiae. Four aspects of the methods' performance were assessed, including adjacency discovery quality, edge orientation accuracy, complete pathway discovery quality, and experimental cost. The results of this study show that some methods provide significant performance benefits over others and therefore should be routinely used for local causal pathway discovery tasks. This study also demonstrates the feasibility of local causal pathway reconstruction in real biological systems with significant quality and low experimental cost.
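
    The definition of a local causal pathway used here (the direct causes and direct effects of a target variable) can be made concrete with a short sketch. It only reads the pathway off a causal graph that is already known; it does not implement any of the 54 active learning methods evaluated in the study. The graph encoding and the gene names are hypothetical.

```python
from typing import Dict, Set, Tuple


def local_causal_pathway(parents: Dict[str, Set[str]],
                         target: str) -> Tuple[Set[str], Set[str]]:
    """Return (direct causes, direct effects) of `target`.

    `parents` maps each variable to the set of its direct causes,
    i.e. an edge X -> Y is stored as X in parents[Y].
    """
    causes = set(parents.get(target, set()))
    effects = {child for child, ps in parents.items() if target in ps}
    return causes, effects


if __name__ == "__main__":
    # Hypothetical regulatory graph: A -> TF, TF -> G1, TF -> G2, B -> G2, G1 -> G3.
    graph = {
        "TF": {"A"},
        "G1": {"TF"},
        "G2": {"TF", "B"},
        "G3": {"G1"},
    }
    causes, effects = local_causal_pathway(graph, "TF")
    print(sorted(causes), sorted(effects))
```

    Note that G3 and B are excluded: G3 is only an indirect effect of TF, and B is a co-cause of G2 rather than a cause or effect of TF itself.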

  19. An Evaluation of Active Learning Causal Discovery Methods for Reverse-Engineering Local Causal Pathways of Gene Regulation

    PubMed Central

    Ma, Sisi; Kemmeren, Patrick; Aliferis, Constantin F.; Statnikov, Alexander

    2016-01-01

    Reverse-engineering of causal pathways that implicate diseases and vital cellular functions is a fundamental problem in biomedicine. Discovery of the local causal pathway of a target variable (that consists of its direct causes and direct effects) is essential for effective intervention and can facilitate accurate diagnosis and prognosis. Recent research has provided several active learning methods that can leverage passively observed high-throughput data to draft causal pathways and then refine the inferred relations with a limited number of experiments. The current study provides a comprehensive evaluation of the performance of active learning methods for local causal pathway discovery in real biological data. Specifically, 54 active learning methods/variants from 3 families of algorithms were applied for local causal pathways reconstruction of gene regulation for 5 transcription factors in S. cerevisiae. Four aspects of the methods’ performance were assessed, including adjacency discovery quality, edge orientation accuracy, complete pathway discovery quality, and experimental cost. The results of this study show that some methods provide significant performance benefits over others and therefore should be routinely used for local causal pathway discovery tasks. This study also demonstrates the feasibility of local causal pathway reconstruction in real biological systems with significant quality and low experimental cost. PMID:26939894

  20. Functional Linkage between Genes That Regulate Osmotic Stress Responses and Multidrug Resistance Transporters: Challenges and Opportunities for Antibiotic Discovery

    PubMed Central

    2014-01-01

    All cells need to protect themselves against the osmotic challenges of their environment by maintaining low permeability to ions across their cell membranes. This is a basic principle of cellular function, which is reflected in the interactions among ion transport and drug efflux genes that have arisen during cellular evolution. Thus, upon exposure to pore-forming antibiotics such as amphotericin B (AmB) or daptomycin (Dap), sensitive cells overexpress common resistance genes to protect themselves from added osmotic challenges. These genes share pathway interactions with the various types of multidrug resistance (MDR) transporter genes, which both preserve the native lipid membrane composition and at the same time eliminate disruptive hydrophobic molecules that partition excessively within the lipid bilayer. An increased understanding of the relationships between the genes (and their products) that regulate osmotic stress responses and MDR transporters will help to identify novel strategies and targets to overcome the current stalemate in drug discovery. PMID:24295980

  1. Topoisomerase inhibition accelerates gene expression after adeno-associated virus-mediated gene transfer to the mammalian heart.

    PubMed

    Prasad, Konkal-Matt R; Xu, Yaqin; Yang, Zequan; Toufektsian, Marie-Claire; Berr, Stuart S; French, Brent A

    2007-04-01

    Utility of adeno-associated virus 2 (AAV2) vectors for cardiac gene therapy is limited by the prolonged lag phase before maximal gene expression. Topoisomerase inhibition can induce AAV2-mediated gene expression in vivo, but with variable success in different tissues. In this study, we demonstrate that topoisomerase inhibition can accelerate AAV2-mediated gene expression in the mouse heart. We used an AAV2 vector expressing firefly luciferase and monitored expression kinetics using non-invasive bioluminescence imaging. In the group receiving vector alone, cardiac luciferase activity was evident from week 2 onward and increased progressively to reach a steady plateau by 9 weeks postinjection. In the group receiving vector and camptothecine (CPT), luciferase expression was evident from days 2 to 4 onward and increased rapidly to reach a steady plateau by 3-4 weeks postinjection, nearly three times faster than in the absence of CPT (P<0.05). Southern blot analysis of AAV2 genomes in cardiac tissue showed rapid conversion of the AAV2 genome from its single-stranded to double-stranded form in CPT-treated mice. Non-invasive determinations of luciferase expression correlated well with in vitro luciferase assays. Direct injection of the AAV2 vector and long-term luciferase gene expression had no detectable effects on normal cardiac function as assessed by magnetic resonance imaging.

  2. De novo Assembly and Characterization of the Transcriptome of Broomcorn Millet (Panicum miliaceum L.) for Gene Discovery and Marker Development.

    PubMed

    Yue, Hong; Wang, Le; Liu, Hui; Yue, Wenjie; Du, Xianghong; Song, Weining; Nie, Xiaojun

    2016-01-01

    Broomcorn millet (Panicum miliaceum L.) is one of the world's oldest cultivated cereals, which is well-adapted to extreme environments such as drought, heat, and salinity with an efficient C4 carbon fixation. Discovery and identification of genes involved in these processes will provide valuable information to improve the crop for meeting the challenge of global climate change. However, the lack of genetic resources and genomic information makes gene discovery and molecular mechanism studies very difficult. Here, we sequenced and assembled the transcriptome of broomcorn millet using Illumina sequencing technology. After sequencing, a total of 45,406,730 and 51,160,820 clean paired-end reads were obtained for two genotypes Yumi No. 2 and Yumi No. 3. These reads were mixed and then assembled into 113,643 unigenes, with the length ranging from 351 to 15,691 bp, of which 62,543 contigs could be assigned to 315 gene ontology (GO) categories. Kyoto Encyclopedia of Genes and Genomes (KEGG) and Cluster of Orthologous Groups (COG) analyses mapped 15,514 unigenes to 202 KEGG pathways and 51,020 unigenes to 25 COG categories, respectively. Furthermore, 35,216 simple sequence repeats (SSRs) were identified in 27,055 unigene sequences, of which trinucleotides were the most abundant repeat unit, accounting for 66.72% of SSRs. In addition, 292 differentially expressed genes were identified between the two genotypes, which were significantly enriched in 88 GO terms and 12 KEGG pathways. Finally, the expression patterns of four selected transcripts were validated through quantitative reverse transcription polymerase chain reaction analysis. Our study for the first time sequenced and assembled the transcriptome of broomcorn millet, which not only provided a rich sequence resource for gene discovery and marker development in this important crop, but will also facilitate the further investigation of the molecular mechanism of its favored agronomic traits and beyond. PMID
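
    As a rough illustration of how SSRs can be located in unigene sequences and tallied by repeat-unit length, the sketch below scans a sequence with regular expressions. The abstract does not name the tool or the thresholds that were used, so the minimum repeat counts and the function names are assumptions chosen for the example.

```python
import re
from collections import Counter
from typing import Dict, List, Tuple

# Minimum number of tandem repeats per motif length (assumed values,
# not taken from the abstract).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(sequence: str) -> List[Tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for each perfect SSR found."""
    seq = sequence.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT").
            if any(unit_len % d == 0 and motif == motif[:d] * (unit_len // d)
                   for d in range(1, unit_len)):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return hits


def motif_length_share(hits: List[Tuple[int, str, int]]) -> Dict[int, float]:
    """Percentage of SSRs for each repeat-unit length."""
    counts = Counter(len(motif) for _, motif, _ in hits)
    total = sum(counts.values())
    return {length: 100.0 * n / total for length, n in counts.items()}


if __name__ == "__main__":
    toy = "TT" + "AG" * 6 + "C" + "CTG" * 5 + "A"  # made-up sequence
    ssrs = find_ssrs(toy)
    print(ssrs)
    print(motif_length_share(ssrs))
```

    Applied to every unigene, a tally like `motif_length_share` is the kind of summary behind a statement such as trinucleotides accounting for a given percentage of all SSRs.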

  3. De novo Assembly and Characterization of the Transcriptome of Broomcorn Millet (Panicum miliaceum L.) for Gene Discovery and Marker Development

    PubMed Central

    Yue, Hong; Wang, Le; Liu, Hui; Yue, Wenjie; Du, Xianghong; Song, Weining; Nie, Xiaojun

    2016-01-01

    Broomcorn millet (Panicum miliaceum L.) is one of the world’s oldest cultivated cereals, which is well-adapted to extreme environments such as drought, heat, and salinity with an efficient C4 carbon fixation. Discovery and identification of genes involved in these processes will provide valuable information to improve the crop for meeting the challenge of global climate change. However, the lack of genetic resources and genomic information makes gene discovery and molecular mechanism studies very difficult. Here, we sequenced and assembled the transcriptome of broomcorn millet using Illumina sequencing technology. After sequencing, a total of 45,406,730 and 51,160,820 clean paired-end reads were obtained for two genotypes Yumi No. 2 and Yumi No. 3. These reads were mixed and then assembled into 113,643 unigenes, with the length ranging from 351 to 15,691 bp, of which 62,543 contigs could be assigned to 315 gene ontology (GO) categories. Kyoto Encyclopedia of Genes and Genomes (KEGG) and Cluster of Orthologous Groups (COG) analyses mapped 15,514 unigenes to 202 KEGG pathways and 51,020 unigenes to 25 COG categories, respectively. Furthermore, 35,216 simple sequence repeats (SSRs) were identified in 27,055 unigene sequences, of which trinucleotides were the most abundant repeat unit, accounting for 66.72% of SSRs. In addition, 292 differentially expressed genes were identified between the two genotypes, which were significantly enriched in 88 GO terms and 12 KEGG pathways. Finally, the expression patterns of four selected transcripts were validated through quantitative reverse transcription polymerase chain reaction analysis. Our study for the first time sequenced and assembled the transcriptome of broomcorn millet, which not only provided a rich sequence resource for gene discovery and marker development in this important crop, but will also facilitate the further investigation of the molecular mechanism of its favored agronomic traits and beyond. PMID

  4. An integrated approach to gene discovery and marker development in Atlantic cod (Gadus morhua).

    PubMed

    Bowman, Sharen; Hubert, Sophie; Higgins, Brent; Stone, Cynthia; Kimball, Jennifer; Borza, Tudor; Bussey, Jillian Tarrant; Simpson, Gary; Kozera, Catherine; Curtis, Bruce A; Hall, Jennifer R; Hori, Tiago S; Feng, Charles Y; Rise, Marlies; Booman, Marije; Gamperl, A Kurt; Trippel, Edward; Symonds, Jane; Johnson, Stewart C; Rise, Matthew L

    2011-04-01

    Atlantic cod is a species that has been overexploited by the capture fishery. Programs to domesticate this species are underway in several countries, including Canada, to provide an alternative route for production. Selective breeding programs have been successfully applied in the domestication of other species, with genomics-based approaches used to augment conventional methods of animal production in recent years. Genomics tools, such as gene sequences and sets of variable markers, also have the potential to enhance and accelerate selective breeding programs in aquaculture, and to provide better monitoring tools to ensure that wild cod populations are well managed. We describe the generation of significant genomics resources for Atlantic cod through an integrated genomics/selective breeding approach. These include 158,877 expressed sequence tags (ESTs), a set of annotated putative transcripts and several thousand single nucleotide polymorphism markers that were developed from, and have been shown to be highly variable in, fish enrolled in two selective breeding programs. Our EST collection was generated from various tissues and life cycle stages. In some cases, tissues from which libraries were generated were isolated from fish exposed to stressors, including elevated temperature, or antigen stimulation (bacterial and viral) to enrich for transcripts that are involved in these response pathways. The genomics resources described here support the developing aquaculture industry, enabling the application of molecular markers within selective breeding programs. Marker sets should also find widespread application in fisheries management.

  5. De Novo Transcriptomic Analysis of Peripheral Blood Lymphocytes from the Chinese Goose: Gene Discovery and Immune System Pathway Description

    PubMed Central

    Tariq, Mansoor; Chen, Rong; Yuan, Hongyu; Liu, Yanjie; Wu, Yanan; Wang, Junya; Xia, Chun

    2015-01-01

    Background: The Chinese goose is one of the most economically important poultry birds and is a natural reservoir for many avian viruses. However, the nature and regulation of the innate and adaptive immune systems of this waterfowl species are not completely understood due to limited information on the goose genome. Recently, transcriptome sequencing technology was applied in the genomic studies focused on novel gene discovery. Thus, this study described the transcriptome of the goose peripheral blood lymphocytes to identify immunity relevant genes. Principal Findings: The transcriptome of goose peripheral blood lymphocytes was sequenced by Illumina-Solexa technology and assembled de novo. In total, 211,198 unigenes were assembled from the 69.36 million cleaned reads. The average length, N50 size and the maximum length of the assembled unigenes were 687 bp, 1,298 bp and 18,992 bp, respectively. A total of 36,854 unigenes showed similarity by BLAST search against the NCBI non-redundant (Nr) protein database. For functional classification, 163,161 unigenes were assigned to three Gene Ontology (GO) categories and 67 subcategories. A total of 15,334 unigenes were annotated into 25 eukaryotic orthologous groups (KOGs) categories. Kyoto Encyclopedia of Genes and Genomes (KEGG) database annotated 39,585 unigenes into six biological functional groups and 308 pathways. Among the 2,757 unigenes that participated in the 15 immune system KEGG pathways, 125 of the most important immune relevant genes were summarized and analyzed by STRING analysis to identify gene interactions and relationships. Moreover, 10 genes were confirmed by PCR and analyzed. Of these 125 unigenes, 109 unigenes, approximately 87%, were not previously identified in the goose. Conclusion: This de novo transcriptome analysis could provide important Chinese goose sequence information and highlights the value of new gene discovery, pathways investigation and immune system gene identification, and comparison with
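
    The N50 size quoted among the assembly statistics is the length L such that sequences of length L or longer contain at least half of all assembled bases. A minimal sketch of that standard definition follows; the function name and the toy lengths are illustrative and are not data from the study.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Accumulate from the longest sequence until half the bases are covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must contain at least one sequence")


if __name__ == "__main__":
    toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]  # made-up unigene lengths
    print(n50(toy_lengths))
```

    Because N50 weights sequences by their length, it is typically larger than the average length when an assembly contains many short unigenes, as with the 1,298 bp N50 and 687 bp average reported above.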

  6. Genome-Scale Discovery of Cell Wall Biosynthesis Genes in Populus (JGI Seventh Annual User Meeting 2012: Genomics of Energy and Environment)

    SciTech Connect

    Muchero, Wellington

    2012-03-22

    Wellington Muchero from Oak Ridge National Laboratory gives a talk titled "Discovery of Cell Wall Biosynthesis Genes in Populus" at the JGI 7th Annual Users Meeting: Genomics of Energy & Environment Meeting on March 22, 2012 in Walnut Creek, California.

  7. Genome-Scale Discovery of Cell Wall Biosynthesis Genes in Populus (JGI Seventh Annual User Meeting 2012: Genomics of Energy and Environment)

    ScienceCinema

    Muchero, Wellington [Oak Ridge National Laboratory]

    2016-07-12

    Wellington Muchero from Oak Ridge National Laboratory gives a talk titled "Discovery of Cell Wall Biosynthesis Genes in Populus" at the JGI 7th Annual Users Meeting: Genomics of Energy & Environment Meeting on March 22, 2012 in Walnut Creek, California.

  8. IMG-ABC: A Knowledge Base To Fuel Discovery of Biosynthetic Gene Clusters and Novel Secondary Metabolites

    PubMed Central

    Hadjithomas, Michalis; Chen, I-Min Amy; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Szeto, Ernest; Huang, Jinghua; Reddy, T. B. K.; Cimermančič, Peter; Fischbach, Michael A.; Ivanova, Natalia N.; Markowitz, Victor M.

    2015-01-01

    In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of “big” genomic data for discovering small molecules. IMG-ABC relies on IMG’s comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC’s focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. PMID:26173699

  9. Metabologenomics: Correlation of Microbial Gene Clusters with Metabolites Drives Discovery of a Nonribosomal Peptide with an Unusual Amino Acid Monomer

    PubMed Central

    2016-01-01

    For more than half a century the pharmaceutical industry has sifted through natural products produced by microbes, uncovering new scaffolds and fashioning them into a broad range of vital drugs. We sought a strategy to reinvigorate the discovery of natural products with distinctive structures using bacterial genome sequencing combined with metabolomics. By correlating genetic content from 178 actinomycete genomes with mass spectrometry-enabled analyses of their exported metabolomes, we paired new secondary metabolites with their biosynthetic gene clusters. We report the use of this new approach to isolate and characterize tambromycin, a new chlorinated natural product, composed of several nonstandard amino acid monomeric units, including a unique pyrrolidine-containing amino acid we name tambroline. Tambromycin shows antiproliferative activity against cancerous human B- and T-cell lines. The discovery of tambromycin via large-scale correlation of gene clusters with metabolites (a.k.a. metabologenomics) illuminates a path for structure-based discovery of natural products at a sharply increased rate. PMID:27163034

  10. Molecular networking and pattern-based genome mining improves discovery of biosynthetic gene clusters and their products from Salinispora species.

    PubMed

    Duncan, Katherine R; Crüsemann, Max; Lechner, Anna; Sarkar, Anindita; Li, Jie; Ziemert, Nadine; Wang, Mingxun; Bandeira, Nuno; Moore, Bradley S; Dorrestein, Pieter C; Jensen, Paul R

    2015-04-23

    Genome sequencing has revealed that bacteria contain many more biosynthetic gene clusters than predicted based on the number of secondary metabolites discovered to date. While this biosynthetic reservoir has fostered interest in new tools for natural product discovery, there remains a gap between gene cluster detection and compound discovery. Here we apply molecular networking and the new concept of pattern-based genome mining to 35 Salinispora strains, including 30 for which draft genome sequences were either available or obtained for this study. The results provide a method to simultaneously compare large numbers of complex microbial extracts, which facilitated the identification of media components, known compounds and their derivatives, and new compounds that could be prioritized for structure elucidation. These efforts revealed considerable metabolite diversity and led to several molecular family-gene cluster pairings, of which the quinomycin-type depsipeptide retimycin A was characterized and linked to gene cluster NRPS40 using pattern-based bioinformatic approaches.

  11. The High Time Resolution Universe Pulsar Survey - XII. Galactic plane acceleration search and the discovery of 60 pulsars

    NASA Astrophysics Data System (ADS)

    Ng, C.; Champion, D. J.; Bailes, M.; Barr, E. D.; Bates, S. D.; Bhat, N. D. R.; Burgay, M.; Burke-Spolaor, S.; Flynn, C. M. L.; Jameson, A.; Johnston, S.; Keith, M. J.; Kramer, M.; Levin, L.; Petroff, E.; Possenti, A.; Stappers, B. W.; van Straten, W.; Tiburzi, C.; Eatough, R. P.; Lyne, A. G.

    2015-07-01

    We present initial results from the low-latitude Galactic plane region of the High Time Resolution Universe pulsar survey conducted at the Parkes 64-m radio telescope. We discuss the computational challenges arising from the processing of the terabyte-sized survey data. Two new radio interference mitigation techniques are introduced, as well as a partially coherent segmented acceleration search algorithm which aims to increase our chances of discovering highly relativistic short-orbit binary systems, covering a parameter space including potential pulsar-black hole binaries. We show that under a constant acceleration approximation, a ratio of data length over orbital period of ≈0.1 results in the highest effectiveness for this search algorithm. From the 50 per cent of data processed thus far, we have redetected 435 previously known pulsars and discovered a further 60 pulsars, two of which are fast-spinning pulsars with periods less than 30 ms. PSR J1101-6424 is a millisecond pulsar whose heavy white dwarf (WD) companion and short spin period of 5.1 ms indicate a rare example of full-recycling via Case A Roche lobe overflow. PSR J1757-27 appears to be an isolated recycled pulsar with a relatively long spin period of 17 ms. In addition, PSR J1244-6359 is a mildly recycled binary system with a heavy WD companion, PSR J1755-25 has a significant orbital eccentricity of 0.09 and PSR J1759-24 is likely to be a long-orbit eclipsing binary with orbital period of the order of tens of years. Comparison of our newly discovered pulsar sample to the known population suggests that they belong to an older population. Furthermore, we demonstrate that our current pulsar detection yield is as expected from population synthesis.

  12. IMG-ABC: An Atlas of Biosynthetic Gene Clusters to Fuel the Discovery of Novel Secondary Metabolites

    SciTech Connect

    Chen, I-Min; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Huang, Jinghua; Reddy, T. B.K.; Cimermancic, Peter; Fischbach, Michael; Ivanova, Natalia; Markowitz, Victor; Kyrpides, Nikos; Pati, Amrita

    2014-10-28

    In the discovery of secondary metabolites (SMs), large-scale analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of relevant computational resources. We present IMG-ABC (https://img.jgi.doe.gov/abc/), an Atlas of Biosynthetic gene Clusters within the Integrated Microbial Genomes (IMG) system. IMG-ABC is a rich repository of both validated and predicted biosynthetic clusters (BCs) in cultured isolates, single cells and metagenomes linked with the SM chemicals they produce and enhanced with focused analysis tools within IMG. The underlying scalable framework enables traversal of phylogenetic dark matter and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules.

  13. A novel peptide-modified and gene-activated biomimetic bone matrix accelerating bone regeneration.

    PubMed

    Pan, Haitao; Zheng, Qixin; Yang, Shuhua; Guo, Xiaodong; Wu, Bin; Zou, Zhenwei; Duan, Zhixia

    2014-08-01

    The osteogenic differentiation of bone marrow stromal cells (BMSCs) can be regulated by systemic or local growth factors, especially transforming growth factor beta 1 (TGF-β1). However, maintaining the bioactivity of exogenous TGF-β1 is a great challenge because of its short half-life. The most promising solution is to transfer the TGF-β1 gene into seed cells through transgenic technology so that the transgenic cells continuously secrete endogenous TGF-β1 protein via gene expression. In this study, a novel non-viral vector (K)16GRGDSPC was chemically linked to bioactive bone matrices PLGA-[ASP-PEG]n using a cross-linker to construct a novel non-viral gene transfer system. The TGF-β1 gene was incubated with this system, and rabbit-derived BMSCs were subsequently co-cultured with this gene-activated PLGA-[ASP-PEG]n; BMSCs co-cultured with PLGA-[ASP-PEG]n modified with (K)16GRGDSPC only and with original PLGA-[ASP-PEG]n served as controls. Thus we fabricated three kinds of composites: Group A (BMSCs-TGF-β1DNA-(K)16GRGDSPC-PLGA-[ASP-PEG]n composite); Group B (BMSCs-(K)16GRGDSPC-PLGA-[ASP-PEG]n composite); and Group C (BMSCs-PLGA-[ASP-PEG]n composite). TGF-β1 and the other osteogenic phenotype markers (alkaline phosphatase, osteocalcin, osteopontin and type I collagen) in Group A were all significantly higher than in the other two groups ex vivo. In vivo, 15-mm long segmental rabbit bone defects were created, randomly implanted with one of the aforementioned composites, and then fixed with plate-screws. The results demonstrated that the implants in Group A significantly accelerated bone regeneration compared with the other implants based on X-rays, histological and biomechanical examinations. Therefore, we conclude that this novel peptide-modified and gene-activated biomimetic bone matrix of TGF-β1DNA-(K)16GRGDSPC-PLGA-[ASP-PEG]n is a very promising scaffold biomaterial for accelerating bone regeneration. PMID:24115366

  14. Discovery of germline-related genes in Cephalochordate amphioxus: A genome wide survey using genome annotation and transcriptome data.

    PubMed

    Yue, Jia-Xing; Li, Kun-Lung; Yu, Jr-Kai

    2015-12-01

    The generation of germline cells is a critical process in the reproduction of multicellular organisms. Studies in animal models have identified a common repertoire of genes that play essential roles in primordial germ cell (PGC) formation. However, comparative studies also indicate that the timing and regulation of this core genetic program vary considerably in different animals, raising intriguing questions regarding the evolution of PGC developmental mechanisms in metazoans. Cephalochordates (commonly called amphioxus or lancelets) represent one of the invertebrate chordate groups and can provide important information about the evolution of developmental mechanisms in the chordate lineage. In this study, we used genome and transcriptome data to identify germline-related genes in two distantly related cephalochordate species, Branchiostoma floridae and Asymmetron lucayanum. Branchiostoma and Asymmetron diverged more than 120 MYA, and the most conspicuous difference between them is their gonadal morphology. We used important germline developmental genes in several model animals to search the amphioxus genome and transcriptome dataset for conserved homologs. We also annotated the assembled transcriptome data using Gene Ontology (GO) terms to facilitate the discovery of putative genes associated with germ cell development and reproductive functions in amphioxus. We further confirmed the expression of 14 genes in developing oocytes or mature eggs using whole mount in situ hybridization, suggesting their potential functions in amphioxus germ cell development. The results of this global survey provide a useful resource for testing potential functions of candidate germline-related genes in cephalochordates and for investigating differences in gonad developmental mechanisms between Branchiostoma and Asymmetron species.

  15. Plastid–Nuclear Interaction and Accelerated Coevolution in Plastid Ribosomal Genes in Geraniaceae

    PubMed Central

    Weng, Mao-Lun; Ruhlman, Tracey A.; Jansen, Robert K.

    2016-01-01

    Plastids and mitochondria have many protein complexes that include subunits encoded by organelle and nuclear genomes. In animal cells, compensatory evolution between mitochondrial and nuclear-encoded subunits was identified and the high mitochondrial mutation rates were hypothesized to drive compensatory evolution in nuclear genomes. In plant cells, compensatory evolution between plastid and nucleus has rarely been investigated in a phylogenetic framework. To investigate plastid–nuclear coevolution, we focused on plastid ribosomal protein genes that are encoded by plastid and nuclear genomes from 27 Geraniales species. Substitution rates were compared for five sets of genes representing plastid- and nuclear-encoded ribosomal subunit proteins targeted to the cytosol or the plastid as well as nonribosomal protein controls. We found that nonsynonymous substitution rates (dN) and the ratios of nonsynonymous to synonymous substitution rates (ω) were accelerated in both plastid- (CpRP) and nuclear-encoded subunits (NuCpRP) of the plastid ribosome relative to control sequences. Our analyses revealed strong signals of cytonuclear coevolution between plastid- and nuclear-encoded subunits, in which nonsynonymous substitutions in CpRP and NuCpRP tend to occur along the same branches in the Geraniaceae phylogeny. This coevolution pattern cannot be explained by physical interaction between amino acid residues. The forces driving accelerated coevolution varied with cellular compartment of the sequence. Increased ω in CpRP was mainly due to intensified positive selection whereas increased ω in NuCpRP was caused by relaxed purifying selection. In addition, the many indels identified in plastid rRNA genes in Geraniaceae may have contributed to changes in plastid subunits. PMID:27190001

  16. Plastid-Nuclear Interaction and Accelerated Coevolution in Plastid Ribosomal Genes in Geraniaceae.

    PubMed

    Weng, Mao-Lun; Ruhlman, Tracey A; Jansen, Robert K

    2016-01-01

    Plastids and mitochondria have many protein complexes that include subunits encoded by organelle and nuclear genomes. In animal cells, compensatory evolution between mitochondrial and nuclear-encoded subunits was identified and the high mitochondrial mutation rates were hypothesized to drive compensatory evolution in nuclear genomes. In plant cells, compensatory evolution between plastid and nucleus has rarely been investigated in a phylogenetic framework. To investigate plastid-nuclear coevolution, we focused on plastid ribosomal protein genes that are encoded by plastid and nuclear genomes from 27 Geraniales species. Substitution rates were compared for five sets of genes representing plastid- and nuclear-encoded ribosomal subunit proteins targeted to the cytosol or the plastid as well as nonribosomal protein controls. We found that nonsynonymous substitution rates (dN) and the ratios of nonsynonymous to synonymous substitution rates (ω) were accelerated in both plastid- (CpRP) and nuclear-encoded subunits (NuCpRP) of the plastid ribosome relative to control sequences. Our analyses revealed strong signals of cytonuclear coevolution between plastid- and nuclear-encoded subunits, in which nonsynonymous substitutions in CpRP and NuCpRP tend to occur along the same branches in the Geraniaceae phylogeny. This coevolution pattern cannot be explained by physical interaction between amino acid residues. The forces driving accelerated coevolution varied with cellular compartment of the sequence. Increased ω in CpRP was mainly due to intensified positive selection whereas increased ω in NuCpRP was caused by relaxed purifying selection. In addition, the many indels identified in plastid rRNA genes in Geraniaceae may have contributed to changes in plastid subunits. PMID:27190001

  17. DISCOVERY OF CANDIDATE H₂O DISK MASERS IN ACTIVE GALACTIC NUCLEI AND ESTIMATIONS OF CENTRIPETAL ACCELERATIONS

    SciTech Connect

    Greenhill, Lincoln J.; Moran, James M.; Tilak, Avanti; Kondratko, Paul T.

    2009-12-10

    Based on spectroscopic signatures, about one-third of known H₂O maser sources in active galactic nuclei (AGNs) are believed to arise in highly inclined accretion disks around central engines. These 'disk maser candidates' are of interest primarily because angular structure and rotation curves can be resolved with interferometers, enabling dynamical study. We identify five new disk maser candidates in studies with the Green Bank Telescope, bringing the total number published to 30. We discovered two (NGC 1320, NGC 17) in a survey of 40 inclined active galaxies (v_sys < 20,000 km s⁻¹). The remaining three disk maser candidates were identified in monitoring of known sources: NGC 449, NGC 2979, and NGC 3735. We also confirm a previously marginal case in UGC 4203. For the disk maser candidates reported here, inferred rotation speeds are 130-500 km s⁻¹. Monitoring of three more rapidly rotating candidate disks (CG 211, NGC 6264, VV 340A) has enabled measurement of likely orbital centripetal acceleration, and estimation of central masses ((2-7) × 10⁷ M☉) and mean disk radii (0.2-0.4 pc). Accelerations may ultimately permit estimation of distances when combined with interferometer data. This is notable because the three AGNs are relatively distant (10,000 km s⁻¹

  18. Accelerated molecular dynamics simulations of the octopamine receptor using GPUs: discovery of an alternate agonist-binding position.

    PubMed

    Kastner, Kevin W; Izaguirre, Jesús A

    2016-10-01

    Octopamine receptors (OARs) perform key biological functions in invertebrates, making this class of G-protein coupled receptors (GPCRs) worth considering for insecticide development. However, no crystal structures and very little research exist for OARs. Furthermore, GPCRs are large proteins, are suspended in a lipid bilayer, and are activated on the millisecond timescale, all of which make conventional molecular dynamics (MD) simulations infeasible, even if run on large supercomputers. However, accelerated Molecular Dynamics (aMD) simulations can reduce this timescale to as little as hundreds of nanoseconds, while running the simulations on graphics processing units (GPUs) would enable even small clusters of GPUs to have processing power equivalent to hundreds of CPUs. Our results show that aMD simulations run on GPUs can successfully obtain the active and inactive state conformations of a GPCR on this reduced timescale. Furthermore, we discovered a potential alternate active-state agonist-binding position in the octopamine receptor which has yet to be observed and may be a novel GPCR agonist-binding position. These results demonstrate that a complex biological system with an activation process on the millisecond timescale can be successfully simulated on the nanosecond timescale using a simple computing system consisting of a small number of GPUs. Proteins 2016; 84:1480-1489. © 2016 Wiley Periodicals, Inc. PMID:27318014

  20. Declining expression of a single epithelial cell-autonomous gene accelerates age-related thymic involution

    PubMed Central

    Sun, Liguang; Guo, Jianfei; Brown, Robert; Amagai, Takashi; Zhao, Yong; Su, Dong-Ming

    2010-01-01

    Age-related thymic involution may be triggered by gene expression changes in lymphohematopoietic and/or non-hematopoietic thymic epithelial cells (TECs). The epithelial cell-autonomous gene FoxN1 may be involved in the process, but its role is still a puzzle due to a shortage of evidence from gradual loss-of-function and exogenous gain-of-function studies. Using our recently generated loxP-floxed-FoxN1(fx) mouse carrying the ubiquitous CreERT (uCreERT) transgene with a low dose of spontaneous activation, which causes gradual FoxN1 deletion with age, we found that the uCreERT-fx/fx mice showed an accelerated age-related thymic involution due to progressive loss of FoxN1+ TECs. The thymic aging phenotypes were clearly observable as early as 3–6 months of age, resembling the naturally aged (18–22-month-old) murine thymus. By intrathymically supplying aged wild-type mice with exogenous FoxN1-cDNA, thymic involution and defective peripheral CD4+ T-cell function could be partially rescued. The results support the notion that an age-related decline in the levels of a single epithelial cell-autonomous gene, FoxN1, causes primary deterioration in TECs followed by impairment of the total postnatal thymic microenvironment, and potentially triggers age-related thymic involution in mice. PMID:20156205

  1. Ataxin1L is a regulator of HSC function highlighting the utility of cross-tissue comparisons for gene discovery.

    PubMed

    Kahle, Juliette J; Souroullas, George P; Yu, Peng; Zohren, Fabian; Lee, Yoontae; Shaw, Chad A; Zoghbi, Huda Y; Goodell, Margaret A

    2013-03-01

    Hematopoietic stem cells (HSCs) are rare quiescent cells that continuously replenish the cellular components of the peripheral blood. Observing that the ataxia-associated gene Ataxin-1-like (Atxn1L) was highly expressed in HSCs, we examined its role in HSC function through in vitro and in vivo assays. Mice lacking Atxn1L had greater numbers of HSCs that regenerated the blood more quickly than their wild-type counterparts. Molecular analyses indicated Atxn1L null HSCs had gene expression changes that regulate a program consistent with their higher level of proliferation, suggesting that Atxn1L is a novel regulator of HSC quiescence. To determine if additional brain-associated genes were candidates for hematologic regulation, we examined genes encoding proteins from autism- and ataxia-associated protein-protein interaction networks for their representation in hematopoietic cell populations. The interactomes were found to be highly enriched for proteins encoded by genes specifically expressed in HSCs relative to their differentiated progeny. Our data suggest a heretofore unappreciated similarity between regulatory modules in the brain and HSCs, offering a new strategy for novel gene discovery in both systems.

  2. SSHscreen and SSHdb, generic software for microarray based gene discovery: application to the stress response in cowpea

    PubMed Central

    2010-01-01

    Background: Suppression subtractive hybridization is a popular technique for gene discovery from non-model organisms without an annotated genome sequence, such as cowpea (Vigna unguiculata (L.) Walp). We aimed to use this method to enrich for genes expressed during drought stress in a drought tolerant cowpea line. However, current methods were inefficient in screening libraries and management of the sequence data, and thus there was a need to develop software tools to facilitate the process. Results: Forward and reverse cDNA libraries enriched for cowpea drought response genes were screened on microarrays, and the R software package SSHscreen 2.0.1 was developed (i) to normalize the data effectively using spike-in control spot normalization, and (ii) to select clones for sequencing based on the calculation of enrichment ratios with associated statistics. Enrichment ratio 3 values for each clone showed that 62% of the forward library and 34% of the reverse library clones were significantly differentially expressed by drought stress (adjusted p value < 0.05). Enrichment ratio 2 calculations showed that > 88% of the clones in both libraries were derived from rare transcripts in the original tester samples, thus supporting the notion that suppression subtractive hybridization enriches for rare transcripts. A set of 118 clones were chosen for sequencing, and drought-induced cowpea genes were identified, the most interesting encoding a late embryogenesis abundant Lea5 protein, a glutathione S-transferase, a thaumatin, a universal stress protein, and a wound induced protein. A lipid transfer protein and several components of photosynthesis were down-regulated by the drought stress. Reverse transcriptase quantitative PCR confirmed the enrichment ratio values for the selected cowpea genes. SSHdb, a web-accessible database, was developed to manage the clone sequences and combine the SSHscreen data with sequence annotations derived from BLAST and Blast2GO. The self

  3. Discovery of Possible Gene Relationships through the Application of Self-Organizing Maps to DNA Microarray Databases

    PubMed Central

    Chavez-Alvarez, Rocio; Chavoya, Arturo; Mendez-Vazquez, Andres

    2014-01-01

    DNA microarrays and cell cycle synchronization experiments have made possible the study of the mechanisms of cell cycle regulation of Saccharomyces cerevisiae by simultaneously monitoring the expression levels of thousands of genes at specific time points. Pattern recognition techniques, in turn, can contribute to the analysis of such massive measurements, providing a model of how gene expression levels evolve through the cell cycle. In this paper, we propose the use of one such technique, an unsupervised artificial neural network called a Self-Organizing Map (SOM), which has been successfully applied to processes involving very noisy signals, classifying and organizing them, and assisting in the discovery of behavior patterns without requiring prior knowledge about the process under analysis. As a test bed for the use of SOMs in finding possible relationships among genes and their possible contribution to some biological processes, we selected 282 S. cerevisiae genes that have been shown through biological experiments to be active during the cell cycle. The expression level of these genes was analyzed in five of the most cited time series DNA microarray databases used in the study of the cell cycle of this organism. With the use of SOMs, it was possible to find clusters of genes with similar behavior in the five databases along two cell cycles. This result suggested that some of these genes might be biologically related or might have a regulatory relationship, as was corroborated by comparing some of the clusters obtained with SOMs against a previously reported regulatory network that was generated using biological knowledge, such as protein-protein interactions, gene expression levels, metabolism dynamics, promoter binding, and modification, regulation and transport of proteins. The methodology described in this paper could be applied to the study of gene relationships of other biological processes in different organisms. PMID:24699245
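
    As a rough illustration of how a SOM groups expression profiles with similar behavior, the sketch below trains a tiny one-dimensional map in plain Python. It is not the authors' implementation: the function names (nearest_unit, train_som), the learning-rate and neighborhood schedules, the deterministic initialization and the four toy profiles are all assumptions made for this example.

```python
import math
import random


def nearest_unit(units, x):
    """Index of the unit whose weight vector is closest to x (squared Euclidean)."""
    return min(range(len(units)),
               key=lambda j: sum((w - v) ** 2 for w, v in zip(units[j], x)))


def train_som(profiles, n_units=2, epochs=50, lr0=0.5, radius0=1.0, seed=0):
    """Train a one-dimensional SOM (a chain of units) on expression profiles."""
    rng = random.Random(seed)
    # Assumed initialization: copy evenly spaced input profiles (needs n_units >= 2).
    last = len(profiles) - 1
    units = [list(profiles[i * last // (n_units - 1)]) for i in range(n_units)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs          # linear decay, an assumed schedule
        lr = lr0 * decay
        radius = max(radius0 * decay, 1e-3)
        # Present the profiles in a fresh random order each epoch.
        for x in rng.sample(profiles, len(profiles)):
            bmu = nearest_unit(units, x)      # best-matching unit for this profile
            for j, w in enumerate(units):
                # Gaussian neighborhood: units near the BMU on the chain move more.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * radius ** 2))
                for k, xk in enumerate(x):
                    w[k] += lr * h * (xk - w[k])
    return units


# Hypothetical expression profiles (four time points each), two obvious groups.
profiles = [
    [1.0, 0.0, 1.0, 0.0],
    [0.9, 0.1, 1.0, 0.1],
    [0.0, 1.0, 0.0, 1.0],
    [0.1, 0.9, 0.0, 1.0],
]
units = train_som(profiles)
labels = [nearest_unit(units, p) for p in profiles]
```

    Genes that end up mapped to the same unit form a candidate cluster; in a real analysis the map would have many more units and the profiles would come from the microarray time series.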

  4. Discovery of agents that eradicate leukemia stem cells using an in silico screen of public gene expression data

    PubMed Central

    Hassane, Duane C.; Guzman, Monica L.; Corbett, Cheryl; Li, Xiaojie; Abboud, Ramzi; Young, Fay; Liesveld, Jane L.; Carroll, Martin

    2008-01-01

    Increasing evidence indicates that malignant stem cells are important for the pathogenesis of acute myelogenous leukemia (AML) and represent a reservoir of cells that drive the development of AML and relapse. Therefore, new treatment regimens are necessary to prevent relapse and improve therapeutic outcomes. Previous studies have shown that the sesquiterpene lactone, parthenolide (PTL), ablates bulk, progenitor, and stem AML cells while causing no appreciable toxicity to normal hematopoietic cells. Thus, PTL must evoke cellular responses capable of mediating AML selective cell death. Given recent advances in chemical genomics such as gene expression-based high-throughput screening (GE-HTS) and the Connectivity Map, we hypothesized that the gene expression signature resulting from treatment of primary AML with PTL could be used to search for similar signatures in publicly available gene expression profiles deposited into the Gene Expression Omnibus (GEO). We therefore devised a broad in silico screen of the GEO database using the PTL gene expression signature as a template and discovered 2 new agents, celastrol and 4-hydroxy-2-nonenal, that effectively eradicate AML at the bulk, progenitor, and stem cell level. These findings suggest the use of multicenter collections of high-throughput data to facilitate discovery of leukemia drugs and drug targets. PMID:18305216

  5. Genome-wide SNP discovery and linkage analysis in barley based on genes responsive to abiotic stress.

    PubMed

    Rostoks, Nils; Mudie, Sharon; Cardle, Linda; Russell, Joanne; Ramsay, Luke; Booth, Allan; Svensson, Jan T; Wanamaker, Steve I; Walia, Harkamal; Rodriguez, Edmundo M; Hedley, Peter E; Liu, Hui; Morris, Jenny; Close, Timothy J; Marshall, David F; Waugh, Robbie

    2005-12-01

    More than 2,000 genome-wide barley single nucleotide polymorphisms (SNPs) were developed by resequencing unigene fragments from eight diverse accessions. The average genome-wide SNP frequency observed in 877 unigenes was 1 SNP per 200 bp. However, SNP frequency was highly variable, with the lowest number of SNPs and SNP haplotypes observed within European cultivated germplasm, reflecting the effects of breeding history on genetic diversity. More than 300 SNP loci were mapped genetically in three experimental mapping populations, which allowed the construction of an integrated SNP map incorporating a large number of RFLP, AFLP and SSR markers (1,237 loci in total). The genes used for SNP discovery were selected based on their transcriptional response to a variety of abiotic stresses. A set of known barley abiotic stress QTL was positioned on the linkage map, while the available sequence and gene expression information facilitated the identification of genes potentially associated with these traits. Comparison of the sequenced SNP loci to the rice genome sequence identified several regions of highly conserved gene order, providing a framework for marker saturation in barley genomic regions of interest. The integration of genome-wide SNP and expression data with available genetic and phenotypic information will facilitate the identification of gene function in barley and other non-model organisms. PMID:16244872

  6. Metabolic profiling of yeast culture using gas chromatography coupled with orthogonal acceleration accurate mass time-of-flight mass spectrometry: application to biomarker discovery.

    PubMed

    Kondo, Elsuida; Marriott, Philip J; Parker, Rhiannon M; Kouremenos, Konstantinos A; Morrison, Paul; Adams, Mike

    2014-01-01

    Yeast and yeast cultures are frequently used as additives in diets of dairy cows. Beneficial effects from the inclusion of yeast culture in diets for dairy mammals have been reported, and the aim of this study was to develop a comprehensive analytical method for the accurate mass identification of the 'global' metabolites in order to differentiate a variety of yeasts at varying growth stages (Diamond V XP, Yea-Sacc and Levucell). Microwave-assisted derivatization for metabolic profiling is demonstrated through the analysis of differing yeast samples developed for cattle feed, which include a wide range of metabolites of interest covering a large range of compound classes. Accurate identification of the components was undertaken using GC-oa-ToFMS (gas chromatography-orthogonal acceleration-time-of-flight mass spectrometry), followed by principal component analysis (PCA) and orthogonal partial least squares discriminant analysis (OPLS-DA) for data reduction and biomarker discovery. Semi-quantification (fold changes in relative peak areas) was reported for metabolites identified as possible discriminative biomarkers (p-value <0.05, fold change >2), including D-ribose (four-fold decrease), myo-inositol (five-fold increase), L-phenylalanine (three-fold increase), glucopyranoside (two-fold increase), fructose (three-fold increase) and threitol (three-fold increase). PMID:24356230
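
    The selection criterion quoted above (p-value < 0.05 and fold change > 2 in relative peak area) can be sketched as a simple filter. The abstract does not say how fold changes or p-values were computed, so the helper names (fold_change, select_biomarkers) and every number in the records list are hypothetical; the snippet only shows the thresholding step.

```python
def fold_change(mean_a, mean_b):
    """Return (magnitude, direction) of the change from condition A to condition B."""
    ratio = mean_b / mean_a
    if ratio >= 1.0:
        return ratio, "increase"
    # Report decreases as a magnitude above 1, e.g. 0.25 becomes a four-fold decrease.
    return 1.0 / ratio, "decrease"


def select_biomarkers(records, p_max=0.05, fc_min=2.0):
    """Keep metabolites that pass both the p-value and the fold-change threshold."""
    selected = []
    for name, mean_a, mean_b, p_value in records:
        magnitude, direction = fold_change(mean_a, mean_b)
        if p_value < p_max and magnitude > fc_min:
            selected.append((name, round(magnitude, 2), direction))
    return selected


# Hypothetical relative peak areas (condition A, condition B) and p-values.
records = [
    ("metabolite_1", 8.0, 2.0, 0.01),   # four-fold decrease, significant
    ("metabolite_2", 1.0, 5.0, 0.003),  # five-fold increase, significant
    ("metabolite_3", 1.0, 1.5, 0.001),  # significant, but the change is too small
    ("metabolite_4", 1.0, 6.0, 0.20),   # large change, but not significant
]
hits = select_biomarkers(records)
```

    Only the first two hypothetical metabolites pass both thresholds; the other two fail one criterion each.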

  7. A Hybrid Computational Method for the Discovery of Novel Reproduction-Related Genes

    PubMed Central

    Chen, Lei; Chu, Chen; Kong, Xiangyin; Huang, Guohua; Huang, Tao; Cai, Yu-Dong

    2015-01-01

    Uncovering the molecular mechanisms underlying reproduction is of great importance to infertility treatment and to the generation of healthy offspring. In this study, we discovered novel reproduction-related genes with a hybrid computational method, integrating three different types of method, which offered new clues for further reproduction research. This method was first executed on a weighted graph, constructed based on known protein-protein interactions, to search the shortest paths connecting any two known reproduction-related genes. Genes occurring in these paths were deemed to have a special relationship with reproduction. These newly discovered genes were filtered with a randomization test. Then, the remaining genes were further selected according to their associations with known reproduction-related genes measured by protein-protein interaction score and alignment score obtained by BLAST. The in-depth analysis of the high confidence novel reproduction genes revealed hidden mechanisms of reproduction and provided guidelines for further experimental validations. PMID:25768094
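    The shortest-path step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the graph layout, the edge costs (assumed here to be smaller for stronger interactions), and the function names shortest_path and candidate_genes are hypothetical, and the randomization test and BLAST filtering are omitted.

```python
import heapq
from itertools import combinations


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns the node list from source to target, or None."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for nbr, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if target not in dist:
        return None
    # Walk the predecessor links back from target to source.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def candidate_genes(graph, known):
    """Count how often each non-seed gene lies inside a shortest path between two seeds."""
    counts = {}
    for a, b in combinations(sorted(known), 2):
        path = shortest_path(graph, a, b)
        if path is None:
            continue
        for gene in path[1:-1]:
            if gene not in known:
                counts[gene] = counts.get(gene, 0) + 1
    return counts


# Toy undirected network (assumed data): A, B, C are known seed genes.
toy_graph = {
    "A": {"X": 1, "Y": 5},
    "X": {"A": 1, "B": 1},
    "Y": {"A": 5, "B": 5},
    "B": {"X": 1, "Y": 5, "C": 1},
    "C": {"B": 1},
}
toy_counts = candidate_genes(toy_graph, {"A", "B", "C"})
```

    In this toy network only gene X lies on shortest paths between seeds, so it is the sole candidate; a real analysis would then filter such candidates against randomized seed sets, as the abstract describes.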

  8. A hybrid computational method for the discovery of novel reproduction-related genes.

    PubMed

    Chen, Lei; Chu, Chen; Kong, Xiangyin; Huang, Guohua; Huang, Tao; Cai, Yu-Dong

    2015-01-01

    Uncovering the molecular mechanisms underlying reproduction is of great importance to infertility treatment and to the generation of healthy offspring. In this study, we discovered novel reproduction-related genes with a hybrid computational method, integrating three different types of method, which offered new clues for further reproduction research. This method was first executed on a weighted graph, constructed based on known protein-protein interactions, to search the shortest paths connecting any two known reproduction-related genes. Genes occurring in these paths were deemed to have a special relationship with reproduction. These newly discovered genes were filtered with a randomization test. Then, the remaining genes were further selected according to their associations with known reproduction-related genes measured by protein-protein interaction score and alignment score obtained by BLAST. The in-depth analysis of the high confidence novel reproduction genes revealed hidden mechanisms of reproduction and provided guidelines for further experimental validations.

  9. Co-clustering phenome–genome for phenotype classification and disease gene discovery

    PubMed Central

    Hwang, TaeHyun; Atluri, Gowtham; Xie, MaoQiang; Dey, Sanjoy; Hong, Changjin; Kumar, Vipin; Kuang, Rui

    2012-01-01

    Understanding the categorization of human diseases is critical for reliably identifying disease causal genes. Recently, genome-wide studies of abnormal chromosomal locations related to diseases have mapped >2000 phenotype–gene relations, which provide valuable information for classifying diseases and identifying candidate genes as drug targets. In this article, a regularized non-negative matrix tri-factorization (R-NMTF) algorithm is introduced to co-cluster phenotypes and genes, and simultaneously detect associations between the detected phenotype clusters and gene clusters. The R-NMTF algorithm factorizes the phenotype–gene association matrix under the prior knowledge from phenotype similarity network and protein–protein interaction network, supervised by the label information from known disease classes and biological pathways. In the experiments on disease phenotype–gene associations in OMIM and KEGG disease pathways, R-NMTF significantly improved the classification of disease phenotypes and disease pathway genes compared with support vector machines and Label Propagation in cross-validation on the annotated phenotypes and genes. The newly predicted phenotypes in each disease class are highly consistent with human phenotype ontology annotations. The roles of the new member genes in the disease pathways are examined and validated in the protein–protein interaction subnetworks. Extensive literature review also confirmed many new members of the disease classes and pathways as well as the predicted associations between disease phenotype classes and pathways. PMID:22735708

  10. An Endogenous Accelerator for Viral Gene Expression Confers a Fitness Advantage

    SciTech Connect

    Wong, Melissa; Bolovan-Fritts, Cynthia; Dar, Roy D.; Womack, Andrew; Simpson, Michael L; Shenk, Thomas; Weinberger, Leor S.

    2012-01-01

    Signal transduction circuits have long been known to differentiate between signals by amplifying inputs to different levels. Here, we describe a novel transcriptional circuitry that dynamically converts greater input levels into faster rates, without increasing the final equilibrium level (i.e. a rate amplifier). We utilize time-lapse microscopy to study human herpesvirus (cytomegalovirus) infection of live cells in real time. Strikingly, our results show that transcriptional activators accelerate viral gene expression in single cells without amplifying the steady-state levels of gene products in these cells. Experiment and modeling show that rate amplification operates by dynamically manipulating the traditional gain-bandwidth feedback relationship from electrical circuit theory to convert greater input levels into faster rates, and is driven by highly self-cooperative transcriptional feedback encoded by the virus's essential transactivator, IE2. This transcriptional rate-amplifier provides a significant fitness advantage for the virus and for minimal synthetic circuits. In general, rate-amplifiers may provide a mechanism for signal-transduction circuits to respond quickly to external signals without increasing steady-state levels of potentially cytotoxic molecules.

  11. Discovery of Core Biotic Stress Responsive Genes in Arabidopsis by Weighted Gene Co-Expression Network Analysis

    PubMed Central

    Amrine, Katherine C. H.; Blanco-Ulate, Barbara; Cantu, Dario

    2015-01-01

    Intricate signal networks and transcriptional regulators translate the recognition of pathogens into defense responses. In this study, we carried out a gene co-expression analysis of all currently publicly available microarray data, which were generated in experiments that studied the interaction of the model plant Arabidopsis thaliana with microbial pathogens. This work was conducted to identify (i) modules of functionally related co-expressed genes that are differentially expressed in response to multiple biotic stresses, and (ii) hub genes that may function as core regulators of disease responses. Using Weighted Gene Co-expression Network Analysis (WGCNA) we constructed an undirected network leveraging a rich curated expression dataset comprising 272 microarrays that involved microbial infections of Arabidopsis plants with a wide array of fungal and bacterial pathogens with biotrophic, hemibiotrophic, and necrotrophic lifestyles. WGCNA produced a network with scale-free and small-world properties composed of 205 distinct clusters of co-expressed genes. Modules of functionally related co-expressed genes that are differentially regulated in response to multiple pathogens were identified by integrating differential gene expression testing with functional enrichment analyses of gene ontology terms, known disease associated genes, transcriptional regulators, and cis-regulatory elements. The significance of functional enrichments was validated by comparisons with randomly generated networks. Network topology was then analyzed to identify intra- and inter-modular gene hubs. Based on high connectivity, and centrality in meta-modules that are clearly enriched in defense responses, we propose a list of 66 target genes for reverse genetic experiments to further dissect the Arabidopsis immune system. Our results show that statistical-based data trimming prior to network analysis allows the integration of expression datasets generated by different groups, under different

  12. Discovery of sequence motifs related to coexpression of genes using evolutionary computation

    PubMed Central

    Fogel, Gary B.; Weekes, Dana G.; Varga, Gabor; Dow, Ernst R.; Harlow, Harry B.; Onyia, Jude E.; Su, Chen

    2004-01-01

    Transcription factors are key regulatory elements that control gene expression. Recognition of transcription factor binding site (TFBS) motifs in the upstream region of coexpressed genes is therefore critical towards a true understanding of the regulations of gene expression. The task of discovering eukaryotic TFBSs remains a challenging problem. Here, we demonstrate that evolutionary computation can be used to search for TFBSs in upstream regions of genes known to be coexpressed. Evolutionary computation was used to search for TFBSs of genes regulated by octamer-binding factor and nuclear factor kappa B. The discovered binding sites included experimentally determined known binding motifs as well as lists of putative, previously unknown TFBSs. We believe that this method to search nucleotide sequence information efficiently for similar motifs will be useful for discovering TFBSs that affect gene regulation. PMID:15266008

  13. Discovery of functional genes for systemic acquired resistance in Arabidopsis thaliana through integrated data mining.

    PubMed

    Pan, Youlian; Pylatuik, Jeffrey D; Ouyang, Junjun; Famili, A Fazel; Fobert, Pierre R

    2004-12-01

    Various data mining techniques combined with sequence motif information in the promoter region of genes were applied to discover functional genes that are involved in the defense mechanism of systemic acquired resistance (SAR) in Arabidopsis thaliana. A series of K-Means clustering with difference-in-shape as distance measure was initially applied. A stability measure was used to validate this clustering process. A decision tree algorithm with the discover-and-mask technique was used to identify a group of most informative genes. Appearance and abundance of various transcription factor binding sites in the promoter region of the genes were studied. Through the combination of these techniques, we were able to identify 24 candidate genes involved in the SAR defense mechanism. The candidate genes fell into 2 highly resolved categories, each category showing significantly unique profiles of regulatory elements in their promoter regions. This study demonstrates the strength of such integration methods and suggests a broader application of this approach.

  14. Discovery of conserved motifs in promoters of orthologous genes in prokaryotes.

    PubMed

    Janky, Rekin's; van Helden, Jacques

    2007-01-01

    We present a method to predict cis-acting elements for a given gene by detecting over-represented motifs in promoters of a set of orthologous genes in prokaryotes (single-gene, multiple-genomes approach). The method has been used successfully to detect regulatory elements at various taxonomical levels in prokaryotes. A web interface is available at the Regulatory Sequence Analysis Tools site (http://rsat.scmbb.ulb.ac.be/rsat/).

  15. Discovery of germline-related genes in Cephalochordate amphioxus: A genome wide survey using genome annotation and transcriptome data.

    PubMed

    Yue, Jia-Xing; Li, Kun-Lung; Yu, Jr-Kai

    2015-12-01

    The generation of germline cells is a critical process in the reproduction of multicellular organisms. Studies in animal models have identified a common repertoire of genes that play essential roles in primordial germ cell (PGC) formation. However, comparative studies also indicate that the timing and regulation of this core genetic program vary considerably in different animals, raising the intriguing questions regarding the evolution of PGC developmental mechanisms in metazoans. Cephalochordates (commonly called amphioxus or lancelets) represent one of the invertebrate chordate groups and can provide important information about the evolution of developmental mechanisms in the chordate lineage. In this study, we used genome and transcriptome data to identify germline-related genes in two distantly related cephalochordate species, Branchiostoma floridae and Asymmetron lucayanum. Branchiostoma and Asymmetron diverged more than 120 MYA, and the most conspicuous difference between them is their gonadal morphology. We used important germline developmental genes in several model animals to search the amphioxus genome and transcriptome dataset for conserved homologs. We also annotated the assembled transcriptome data using Gene Ontology (GO) terms to facilitate the discovery of putative genes associated with germ cell development and reproductive functions in amphioxus. We further confirmed the expression of 14 genes in developing oocytes or mature eggs using whole mount in situ hybridization, suggesting their potential functions in amphioxus germ cell development. The results of this global survey provide a useful resource for testing potential functions of candidate germline-related genes in cephalochordates and for investigating differences in gonad developmental mechanisms between Branchiostoma and Asymmetron species. PMID:25847029

  16. MUFFINN: cancer gene discovery via network analysis of somatic mutation data.

    PubMed

    Cho, Ara; Shim, Jung Eun; Kim, Eiru; Supek, Fran; Lehner, Ben; Lee, Insuk

    2016-01-01

    A major challenge for distinguishing cancer-causing driver mutations from inconsequential passenger mutations is the long-tail of infrequently mutated genes in cancer genomes. Here, we present and evaluate a method for prioritizing cancer genes accounting not only for mutations in individual genes but also in their neighbors in functional networks, MUFFINN (MUtations For Functional Impact on Network Neighbors). This pathway-centric method shows high sensitivity compared with gene-centric analyses of mutation data. Notably, only a marginal decrease in performance is observed when using 10 % of TCGA patient samples, suggesting the method may potentiate cancer genome projects with small patient populations. PMID:27333808
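    The idea of scoring a gene by mutations in its network neighbors as well as its own can be sketched as follows. This is an illustrative assumption, not the published MUFFINN scoring: the function name neighbor_aware_scores, the "max" and "sum" modes, and the toy counts are all hypothetical, and no normalization is applied.

```python
def neighbor_aware_scores(mutations, network, mode="max"):
    """Score each gene by its own mutation count plus a summary of its neighbors' counts.

    mutations: dict gene -> mutation count across patient samples
    network:   dict gene -> iterable of neighboring genes in a functional network
    mode:      "max" uses the most-mutated neighbor, "sum" adds all neighbors
    """
    scores = {}
    for gene in set(mutations) | set(network):
        own = mutations.get(gene, 0)
        nbr_counts = [mutations.get(n, 0) for n in network.get(gene, ())]
        if not nbr_counts:
            nbr = 0
        elif mode == "max":
            nbr = max(nbr_counts)
        else:
            nbr = sum(nbr_counts)
        scores[gene] = own + nbr
    return scores


# Toy data (assumed): geneC is never mutated itself but neighbors a hot spot.
toy_mutations = {"geneA": 10, "geneB": 1, "geneC": 0}
toy_network = {
    "geneA": ["geneC"],
    "geneB": ["geneC"],
    "geneC": ["geneA", "geneB"],
}
max_scores = neighbor_aware_scores(toy_mutations, toy_network, mode="max")
sum_scores = neighbor_aware_scores(toy_mutations, toy_network, mode="sum")
```

    In the toy data, geneC carries no mutations of its own yet scores as high as geneA in "max" mode, which is the kind of long-tail gene a purely gene-centric count would miss.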

  17. IMG-ABC. A knowledge base to fuel discovery of biosynthetic gene clusters and novel secondary metabolites

    DOE PAGES

    Hadjithomas, Michalis; Chen, I-Min Amy; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Szeto, Ernest; Huang, Jinghua; Reddy, T. B. K.; Cimermančič, Peter; Fischbach, Michael A.; et al

    2015-07-14

    In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of “big” genomic data for discovering small molecules. IMG-ABC relies on IMG’s comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC’s focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. IMG-ABC is the largest publicly available database of predicted and experimental biosynthetic gene clusters and the secondary metabolites they produce. The system also includes powerful search and analysis tools that are integrated with IMG’s extensive genomic/metagenomic data and analysis tool kits. As new research on biosynthetic gene clusters and secondary metabolites is published and more genomes are sequenced, IMG

  18. IMG-ABC. A knowledge base to fuel discovery of biosynthetic gene clusters and novel secondary metabolites

    SciTech Connect

    Hadjithomas, Michalis; Chen, I-Min Amy; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Szeto, Ernest; Huang, Jinghua; Reddy, T. B. K.; Cimermančič, Peter; Fischbach, Michael A.; Ivanova, Natalia N.; Markowitz, Victor M.; Kyrpides, Nikos C.; Pati, Amrita

    2015-07-14

    In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of “big” genomic data for discovering small molecules. IMG-ABC relies on IMG’s comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC’s focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. IMG-ABC is the largest publicly available database of predicted and experimental biosynthetic gene clusters and the secondary metabolites they produce. The system also includes powerful search and analysis tools that are integrated with IMG’s extensive genomic/metagenomic data and analysis tool kits. As new research on biosynthetic gene clusters and secondary metabolites is published and more genomes are sequenced, IMG

  19. Discovery and Replication of Gene Influences on Brain Structure Using LASSO Regression

    PubMed Central

    Kohannim, Omid; Hibar, Derrek P.; Stein, Jason L.; Jahanshad, Neda; Hua, Xue; Rajagopalan, Priya; Toga, Arthur W.; Jack, Clifford R.; Weiner, Michael W.; de Zubicaray, Greig I.; McMahon, Katie L.; Hansell, Narelle K.; Martin, Nicholas G.; Wright, Margaret J.; Thompson, Paul M.

    2012-01-01

    We implemented least absolute shrinkage and selection operator (LASSO) regression to evaluate gene effects in genome-wide association studies (GWAS) of brain images, using an MRI-derived temporal lobe volume measure from 729 subjects scanned as part of the Alzheimer’s Disease Neuroimaging Initiative (ADNI). Sparse groups of SNPs in individual genes were selected by LASSO, which identifies efficient sets of variants influencing the data. These SNPs were considered jointly when assessing their association with neuroimaging measures. We discovered 22 genes that passed genome-wide significance for influencing temporal lobe volume. This was a substantially greater number of significant genes compared to those found with standard, univariate GWAS. These top genes are all expressed in the brain and include genes previously related to brain function or neuropsychiatric disorders such as MACROD2, SORCS2, GRIN2B, MAGI2, NPAS3, CLSTN2, GABRG3, NRXN3, PRKAG2, GAS7, RBFOX1, ADARB2, CHD4, and CDH13. The top genes we identified with this method also displayed significant and widespread post hoc effects on voxelwise, tensor-based morphometry (TBM) maps of the temporal lobes. The most significantly associated gene was an autism susceptibility gene known as MACROD2. We were able to successfully replicate the effect of the MACROD2 gene in an independent cohort of 564 young, Australian healthy adult twins and siblings scanned with MRI (mean age: 23.8 ± 2.2 SD years). Our approach powerfully complements univariate techniques in detecting influences of genes on the living brain. PMID:22888310
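    How LASSO selects a sparse subset of predictors can be shown with a minimal coordinate-descent sketch. It is a generic textbook formulation, not the authors' pipeline: the objective (1/(2n))·||y − Xb||² + lam·||b||₁, the function names, and the toy genotype-like matrix are assumptions made for illustration.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become exactly 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of rows (samples), y a list of responses.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            beta[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return beta


# Toy data (assumed): the response depends only on the first predictor.
toy_X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
toy_y = [2, 2, -2, -2]
toy_beta = lasso_coordinate_descent(toy_X, toy_y, lam=0.5)
```

    The L1 penalty shrinks the informative coefficient and sets the uninformative one exactly to zero, which is why LASSO can pick out a small group of SNPs within a gene rather than weighting all of them.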

  20. SNP discovery and marker development for disease resistance candidate genes in common carp (Cyprinus carpio)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Single nucleotide polymorphisms (SNPs) in immune response genes have been reported as markers of susceptibility to infectious diseases in human and livestock. A disease caused by cyprinid herpes virus 3 (CyHV-3) is highly contagious and virulent in common carp. With the aim to investigate the gene...

  1. From metagenomic gene discovery to enzymatic breakdown of crosslinks in agricultural fibers for functional products

    Technology Transfer Automated Retrieval System (TEKTRAN)

    From the rumen microflora, more than twenty novel genes involved in the hydrolysis of glucuronoarabinoxylans have been discovered and isolated. The specific genes functioning in the breakdown of crosslinkages have been cloned and expressed in E. coli, and the active enzymes purified and extensively ...

  2. Discovery of putative capsaicin biosynthetic genes by RNA-Seq and digital gene expression analysis of pepper

    PubMed Central

    Zhang, Zi-Xin; Zhao, Shu-Niu; Liu, Gao-Feng; Huang, Zu-Mei; Cao, Zhen-Mu; Cheng, Shan-Han; Lin, Shi-Sen

    2016-01-01

    The Indian pepper ‘Guijiangwang’ (Capsicum frutescens L.), one of the world’s hottest chili peppers, is rich in capsaicinoids. The accumulation of the alkaloid capsaicin and its analogs in the epidermal cells of the placenta contribute to the pungency of Capsicum fruits. To identify putative genes involved in capsaicin biosynthesis, RNA-Seq was used to analyze the pepper’s expression profiles over five developmental stages. Five cDNA libraries were constructed from the total RNA of placental tissue and sequenced using an Illumina HiSeq 2000. More than 19 million clean reads were obtained from each library, and greater than 50% of the reads were assignable to reference genes. Digital gene expression (DGE) profile analysis using Solexa sequencing was performed at five fruit developmental stages and resulted in the identification of 135 genes of known function; their expression patterns were compared to the capsaicin accumulation pattern. Ten genes of known function were identified as most likely to be involved in regulating capsaicin synthesis. Additionally, 20 new candidate genes were identified related to capsaicin synthesis. We use a combination of RNA-Seq and DGE analyses to contribute to the understanding of the biosynthetic regulatory mechanism(s) of secondary metabolites in a nonmodel plant and to identify candidate enzyme-encoding genes. PMID:27756914

  3. Novel Gene Discovery in the Human Malaria Parasite using Nucleosome Positioning Data

    PubMed Central

    Pokhriyal, N.; Ponts, N.; Harris, E. Y.; Le Roch, K. G.; Lonardi, S.

    2013-01-01

    Recent genome-wide studies on nucleosome positioning in model organisms have shown strong evidence that nucleosome landscapes in the proximity of protein-coding genes exhibit regular characteristic patterns. Here, we propose a computational framework to discover novel genes in the human malaria parasite genome P. falciparum using nucleosome positioning inferred from MAINE-seq data. We rely on a classifier trained on the nucleosome landscape profiles of experimentally verified genes, and then used to discover new genes (without considering the primary DNA sequence). Cross-validation experiments show that our classifier is very accurate. About two thirds of the locations reported by the classifier match experimentally determined expressed sequence tags in GenBank, for which no gene has been annotated in the human malaria parasite. PMID:25076982

  4. Discovery of clubroot-resistant genes in Brassica napus by transcriptome sequencing.

    PubMed

    Chen, S W; Liu, T; Gao, Y; Zhang, C; Peng, S D; Bai, M B; Li, S J; Xu, L; Zhou, X Y; Lin, L B

    2016-01-01

    Clubroot significantly affects plants of the Brassicaceae family and is one of the main diseases causing serious losses in B. napus yield. Few studies have investigated the clubroot-resistance mechanism in B. napus. Identification of clubroot-resistant genes may be used in clubroot-resistant breeding, as well as to elucidate the molecular mechanism behind B. napus clubroot-resistance. We used three B. napus transcriptome samples to construct a transcriptome sequencing library by using Illumina HiSeq™ 2000 sequencing and bioinformatic analysis. In total, 171 million high-quality reads were obtained, containing 96,149 unigenes of N50-value. We aligned the obtained unigenes with the Nr, Swiss-Prot, clusters of orthologous groups, and gene ontology databases and annotated their functions. In the Kyoto encyclopedia of genes and genomes database, 25,033 unigenes (26.04%) were assigned to 124 pathways. Many genes, including broad-spectrum disease-resistance genes, specific clubroot-resistant genes, and genes related to indole-3-acetic acid (IAA) signal transduction, cytokinin synthesis, and myrosinase synthesis in the Huashuang 3 variety of B. napus were found to be related to clubroot-resistance. The effective clubroot-resistance observed in this variety may be due to the induced increased expression of these disease-resistant genes and strong inhibition of the IAA signal transduction, cytokinin synthesis, and myrosinase synthesis. The homology observed between unigenes 0048482, 0061770 and the Crr1 gene shared 94% nucleotide similarity. Furthermore, unigene 0061770 could have originated from an inversion of the Crr1 5'-end sequence.

  5. Discovery of clubroot-resistant genes in Brassica napus by transcriptome sequencing.

    PubMed

    Chen, S W; Liu, T; Gao, Y; Zhang, C; Peng, S D; Bai, M B; Li, S J; Xu, L; Zhou, X Y; Lin, L B

    2016-01-01

    Clubroot significantly affects plants of the Brassicaceae family and is one of the main diseases causing serious losses in B. napus yield. Few studies have investigated the clubroot-resistance mechanism in B. napus. Identification of clubroot-resistant genes may be used in clubroot-resistant breeding, as well as to elucidate the molecular mechanism behind B. napus clubroot-resistance. We used three B. napus transcriptome samples to construct a transcriptome sequencing library by using Illumina HiSeq™ 2000 sequencing and bioinformatic analysis. In total, 171 million high-quality reads were obtained, containing 96,149 unigenes of N50-value. We aligned the obtained unigenes with the Nr, Swiss-Prot, clusters of orthologous groups, and gene ontology databases and annotated their functions. In the Kyoto encyclopedia of genes and genomes database, 25,033 unigenes (26.04%) were assigned to 124 pathways. Many genes, including broad-spectrum disease-resistance genes, specific clubroot-resistant genes, and genes related to indole-3-acetic acid (IAA) signal transduction, cytokinin synthesis, and myrosinase synthesis in the Huashuang 3 variety of B. napus were found to be related to clubroot-resistance. The effective clubroot-resistance observed in this variety may be due to the induced increased expression of these disease-resistant genes and strong inhibition of the IAA signal transduction, cytokinin synthesis, and myrosinase synthesis. The homology observed between unigenes 0048482, 0061770 and the Crr1 gene shared 94% nucleotide similarity. Furthermore, unigene 0061770 could have originated from an inversion of the Crr1 5'-end sequence. PMID:27525940

  6. Discovery and validation of gene classifiers for endocrine-disrupting chemicals in zebrafish (Danio rerio)

    EPA Science Inventory

    Development and application of transcriptomics-based gene classifiers for ecotoxicological applications lag far behind those of human biomedical science. Many such classifiers discovered thus far lack vigorous statistical and experimental validations, with their stability and rel...

  7. Biochemical genomics for gene discovery in benzylisoquinoline alkaloid biosynthesis in opium poppy and related species.

    PubMed

    Dang, Thu Thuy T; Onoyovwi, Akpevwe; Farrow, Scott C; Facchini, Peter J

    2012-01-01

    Benzylisoquinoline alkaloids (BIAs) are a large, diverse group of ∼2500 specialized plant metabolites. Many BIAs display potent pharmacological activities, including the narcotic analgesics codeine and morphine, the vasodilator papaverine, the cough suppressant and potential anticancer drug noscapine, the antimicrobial agents sanguinarine and berberine, and the muscle relaxant (+)-tubocurarine. Opium poppy remains the sole commercial source for codeine, morphine, and a variety of semisynthetic drugs, including oxycodone and buprenorphine, derived primarily from the biosynthetic pathway intermediate thebaine. Recent advances in transcriptomics, proteomics, and metabolomics have created unprecedented opportunities for isolating and characterizing novel BIA biosynthetic genes. Here, we describe the application of next-generation sequencing and cDNA microarrays for selecting gene candidates based on comparative transcriptome analysis. We outline the basic mass spectrometric techniques to perform deep proteome and targeted metabolite analyses on BIA-producing plant tissues and provide methodologies for functionally characterizing biosynthetic gene candidates through in vitro enzyme assays and transient gene silencing in planta. PMID:22999177

  8. G-SESAME: web tools for GO-term-based gene similarity analysis and knowledge discovery

    PubMed Central

    Du, Zhidian; Li, Lin; Chen, Chin-Fu; Yu, Philip S.; Wang, James Z.

    2009-01-01

    We have developed a set of online tools for measuring the semantic similarities of Gene Ontology (GO) terms and the functional similarities of gene products, and for further discovering biomedical knowledge from the GO database. The tools have been used about 6.9 million times by 417 institutions from 43 countries since October 2006. The online tools are available at: http://bioinformatics.clemson.edu/G-SESAME. PMID:19491312

  9. G-SESAME: web tools for GO-term-based gene similarity analysis and knowledge discovery.

    PubMed

    Du, Zhidian; Li, Lin; Chen, Chin-Fu; Yu, Philip S; Wang, James Z

    2009-07-01

    We have developed a set of online tools for measuring the semantic similarities of Gene Ontology (GO) terms and the functional similarities of gene products, and for further discovering biomedical knowledge from the GO database. The tools have been used about 6.9 million times by 417 institutions from 43 countries since October 2006. The online tools are available at: http://bioinformatics.clemson.edu/G-SESAME.

  10. Novel cell lines promote the discovery of genes involved in early heart development.

    PubMed

    Brunskill, E W; Witte, D P; Yutzey, K E; Potter, S S

    2001-07-15

    Clonal cell lines representing early cardiomyocytes would provide valuable reagents for the dissection of the genetic program of early cardiogenesis. Here we describe the establishment and characterization of cell lines from the hearts of transgenic mice and embryos with SV40 large T antigen expressed in the heart-forming region. Ultrastructure analysis by transmission electron microscopy showed the primitive, precontractile nature of the resulting cells, with the absence of myofilaments, Z lines, and intercalated disks. Immunohistochemistry, RT-PCR, Northern blots, and oligonucleotide microarrays were used to determine the expression levels of thousands of genes in the 1H and ECL-2 cell lines. The resulting gene-expression profiles showed the transcription of early cardiomyocyte genes such as Nkx2.5, GATA4, Tbx5, dHAND, cardiac troponin C, and SM22-alpha. Furthermore, many genes not previously implicated in early cardiac development were expressed. Two of these genes, Hic-5, a possible negative regulator of muscle differentiation, and the transcription enhancing factor TEF-5 were selected and shown by in situ hybridizations to be expressed in the early developing heart. The results show that the 1H and ECL-2 cell lines can be used to discover novel genes expressed in the early cardiomyocyte. PMID:11437454

  11. Hypothalamic leptin gene therapy reduces body weight without accelerating age-related bone loss.

    PubMed

    Turner, Russell T; Dube, Michael; Branscum, Adam J; Wong, Carmen P; Olson, Dawn A; Zhong, Xiaoying; Kweh, Mercedes F; Larkin, Iske V; Wronski, Thomas J; Rosen, Clifford J; Kalra, Satya P; Iwaniec, Urszula T

    2015-12-01

    Excessive weight gain in adults is associated with a variety of negative health outcomes. Unfortunately, dieting, exercise, and pharmacological interventions have had limited long-term success in weight control and can result in detrimental side effects, including accelerating age-related cancellous bone loss. We investigated the efficacy of using hypothalamic leptin gene therapy as an alternative method for reducing weight in skeletally-mature (9 months old) female rats and determined the impact of leptin-induced weight loss on bone mass, density, and microarchitecture, and serum biomarkers of bone turnover (CTx and osteocalcin). Rats were implanted with cannulae in the 3rd ventricle of the hypothalamus and injected with either recombinant adeno-associated virus encoding the gene for rat leptin (rAAV-Leptin, n=7) or a control vector encoding green fluorescent protein (rAAV-GFP, n=10) and sacrificed 18 weeks later. A baseline control group (n=7) was sacrificed at vector administration. rAAV-Leptin-treated rats lost weight (-4±2%) while rAAV-GFP-treated rats gained weight (14±2%) during the study. At study termination, rAAV-Leptin-treated rats weighed 17% less than rAAV-GFP-treated rats and had lower abdominal white adipose tissue weight (-80%), serum leptin (-77%), and serum IGF1 (-34%). Cancellous bone volume fraction in distal femur metaphysis and epiphysis, and in lumbar vertebra tended to be lower (P<0.1) in rAAV-GFP-treated rats (13.5 months old) compared to baseline control rats (9 months old). Significant differences in cancellous bone or biomarkers of bone turnover were not detected between rAAV-Leptin and rAAV-GFP rats. In summary, rAAV-Leptin-treated rats maintained a lower body weight compared to baseline and rAAV-GFP-treated rats with minimal effects on bone mass, density, microarchitecture, or biochemical markers of bone turnover.

  12. Discovery and mapping of ten novel G protein-coupled receptor genes.

    PubMed

    Lee, D K; Nguyen, T; Lynch, K R; Cheng, R; Vanti, W B; Arkhitko, O; Lewis, T; Evans, J F; George, S R; O'Dowd, B F

    2001-09-01

    We report the identification, cloning and tissue distributions of ten novel human genes encoding G protein-coupled receptors (GPCRs) GPR78, GPR80, GPR81, GPR82, GPR93, GPR94, GPR95, GPR101, GPR102, GPR103 and a pseudogene, psi GPR79. Each novel orphan GPCR (oGPCR) gene was discovered using customized searches of the GenBank high-throughput genomic sequences database with previously known GPCR-encoding sequences. The expressed genes can now be used in assays to determine endogenous and pharmacological ligands. GPR78 shared highest identity with the oGPCR gene GPR26 (56% identity in the transmembrane (TM) regions). psi GPR79 shared highest sequence identity with the P2Y(2) gene and contained a frame-shift truncating the encoded receptor in TM5, demonstrating a pseudogene. GPR80 shared highest identity with the P2Y(1) gene (45% in the TM regions), while GPR81, GPR82 and GPR93 shared TM identities with the oGPCR genes HM74 (70%), GPR17 (30%) and P2Y(5) (40%), respectively. Two other novel GPCR genes, GPR94 and GPR95, encoded a subfamily with the genes encoding the UDP-glucose and P2Y(12) receptors (sharing >50% identities in the TM regions). GPR101 demonstrated only distant identities with other GPCR genes and GPR102 shared identities with GPR57, GPR58 and PNR (35-42% in the TM regions). GPR103 shared identities with the neuropeptide FF 2, neuropeptide Y2 and galanin GalR1 receptors (34-38% in the TM regions). Northern analyses revealed GPR78 mRNA expression in the pituitary and placenta and GPR81 expression in the pituitary. A search of the GenBank databases with the GPR82 sequence retrieved an identical sequence in an expressed sequence tag (EST) partially encoding GPR82 from human colonic tissue. The GPR93 sequence retrieved an identical, human EST sequence from human primary tonsil B-cells and an EST partially encoding mouse GPR93 from small intestinal tissue. GPR94 was expressed in the frontal cortex, caudate putamen and thalamus of brain while GPR95 was expressed
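
    The percent identities quoted for the transmembrane regions can be illustrated with a small calculation. The sketch below is not the authors' procedure: it assumes two sequences that have already been aligned to equal length, treats "-" as the gap symbol, and divides by the number of ungapped columns (other denominators are also in common use). The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the ungapped columns of a pairwise alignment.

    Assumption: both strings come from the same alignment and therefore
    have equal length; columns with a gap in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned peptide fragments (made up for illustration only).
print(round(percent_identity("ACDEFG", "ACDQFG"), 1))
```

    Restricting the input to the aligned TM segments would give a TM-region identity of the kind reported in the abstract.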

  13. DisGeNET: a discovery platform for the dynamical exploration of human diseases and their genes.

    PubMed

    Piñero, Janet; Queralt-Rosinach, Núria; Bravo, Àlex; Deu-Pons, Jordi; Bauer-Mehren, Anna; Baron, Martin; Sanz, Ferran; Furlong, Laura I

    2015-01-01

    DisGeNET is a comprehensive discovery platform designed to address a variety of questions concerning the genetic underpinning of human diseases. DisGeNET contains over 380,000 associations between >16,000 genes and 13,000 diseases, which makes it one of the largest repositories currently available of its kind. DisGeNET integrates expert-curated databases with text-mined data, covers information on Mendelian and complex diseases, and includes data from animal disease models. It features a score based on the supporting evidence to prioritize gene-disease associations. It is an open access resource available through a web interface, a Cytoscape plugin and as a Semantic Web resource. The web interface supports user-friendly data exploration and navigation. DisGeNET data can also be analysed via the DisGeNET Cytoscape plugin, and enriched with the annotations of other plugins of this popular network analysis software suite. Finally, the information contained in DisGeNET can be expanded and complemented using Semantic Web technologies and linked to a variety of resources already present in the Linked Data cloud. Hence, DisGeNET offers one of the most comprehensive collections of human gene-disease associations and a valuable set of tools for investigating the molecular mechanisms underlying diseases of genetic origin, designed to fulfill the needs of different user profiles, including bioinformaticians, biologists and health-care practitioners. Database URL: http://www.disgenet.org/ PMID:25877637

  14. Use of model organism and disease databases to support matchmaking for human disease gene discovery.

    PubMed

    Mungall, Christopher J; Washington, Nicole L; Nguyen-Xuan, Jeremy; Condit, Christopher; Smedley, Damian; Köhler, Sebastian; Groza, Tudor; Shefchek, Kent; Hochheiser, Harry; Robinson, Peter N; Lewis, Suzanna E; Haendel, Melissa A

    2015-10-01

    The Matchmaker Exchange application programming interface (API) allows searching a patient's genotypic or phenotypic profiles across clinical sites, for the purposes of cohort discovery and variant disease causal validation. This API can be used not only to search for matching patients, but also to match against public disease and model organism data. These public disease data enable matching known diseases and variant-phenotype associations using phenotype semantic similarity algorithms developed by the Monarch Initiative. The model data can provide additional evidence to aid diagnosis, suggest relevant models for disease mechanism and treatment exploration, and identify collaborators across the translational divide. The Monarch Initiative provides an implementation of this API for searching multiple integrated sources of data that contextualize the knowledge about any given patient or patient family into the greater biomedical knowledge landscape. While this corpus of data can aid diagnosis, it is also the beginning of research to improve understanding of rare human diseases.

  15. Natural and man-made V-gene repertoires for antibody discovery

    PubMed Central

    Finlay, William J. J.; Almagro, Juan C.

    2012-01-01

    Antibodies are the fastest-growing segment of the biologics market. The success of antibody-based drugs resides in their exquisite specificity, high potency, stability, solubility, safety, and relatively inexpensive manufacturing process in comparison with other biologics. We outline here the structural studies and fundamental principles that define how antibodies interact with diverse targets. We also describe the antibody repertoires and affinity maturation mechanisms of humans, mice, and chickens, plus the use of novel single-domain antibodies in camelids and sharks. These species all utilize diverse evolutionary solutions to generate specific and high affinity antibodies and illustrate the plasticity of natural antibody repertoires. In addition, we discuss the multiple variations of man-made antibody repertoires designed and validated in the last two decades, which have served as tools to explore how the size, diversity, and composition of a repertoire impact the antibody discovery process. PMID:23162556

  16. Discovery of functional non-coding conserved regions in the α-synuclein gene locus

    PubMed Central

    Sterling, Lori; Walter, Michael; Ting, Dennis; Schüle, Birgitt

    2014-01-01

    Several single nucleotide polymorphisms (SNPs) and the Rep-1 microsatellite marker of the α-synuclein (SNCA) gene have consistently been shown to be associated with Parkinson’s disease, but the functional relevance is unclear. Based on these findings we hypothesized that conserved cis-regulatory elements in the SNCA genomic region regulate expression of SNCA, and that SNPs in these regions could be functionally modulating the expression of SNCA, thus contributing to neuronal demise and predisposing to Parkinson’s disease. In a pair-wise comparison of a 206kb genomic region encompassing the SNCA gene, we revealed 34 evolutionarily conserved DNA sequences between human and mouse. All elements were cloned into reporter vectors and assessed for expression modulation in dual luciferase reporter assays. We found that 12 out of 34 elements exhibited either an enhancement or reduction of the expression of the reporter gene. Three elements upstream of the SNCA gene displayed an approximately 1.5 fold (p<0.009) increase in expression. Of the intronic regions, three showed a 1.5 fold increase and two others indicated a 2 and 2.5 fold increase in expression (p<0.002). Three elements downstream of the SNCA gene showed 1.5 fold and 2.5 fold increase (p<0.0009). One element downstream of SNCA had a reduced expression of the reporter gene of 0.35 fold (p<0.0009) of normal activity. Our results demonstrate that the SNCA gene contains cis-regulatory regions that might regulate the transcription and expression of SNCA. Further studies in disease-relevant tissue types will be important to understand the functional impact of regulatory regions and specific Parkinson’s disease-associated SNPs and their function in the disease process. PMID:25566351

  17. Discovery of functional non-coding conserved regions in the α-synuclein gene locus.

    PubMed

    Sterling, Lori; Walter, Michael; Ting, Dennis; Schüle, Birgitt

    2014-01-01

    Several single nucleotide polymorphisms (SNPs) and the Rep-1 microsatellite marker of the α-synuclein (SNCA) gene have consistently been shown to be associated with Parkinson's disease, but the functional relevance is unclear. Based on these findings we hypothesized that conserved cis-regulatory elements in the SNCA genomic region regulate expression of SNCA, and that SNPs in these regions could be functionally modulating the expression of SNCA, thus contributing to neuronal demise and predisposing to Parkinson's disease. In a pair-wise comparison of a 206kb genomic region encompassing the SNCA gene, we revealed 34 evolutionarily conserved DNA sequences between human and mouse. All elements were cloned into reporter vectors and assessed for expression modulation in dual luciferase reporter assays. We found that 12 out of 34 elements exhibited either an enhancement or reduction of the expression of the reporter gene. Three elements upstream of the SNCA gene displayed an approximately 1.5 fold (p<0.009) increase in expression. Of the intronic regions, three showed a 1.5 fold increase and two others indicated a 2 and 2.5 fold increase in expression (p<0.002). Three elements downstream of the SNCA gene showed 1.5 fold and 2.5 fold increase (p<0.0009). One element downstream of SNCA had a reduced expression of the reporter gene of 0.35 fold (p<0.0009) of normal activity. Our results demonstrate that the SNCA gene contains cis-regulatory regions that might regulate the transcription and expression of SNCA. Further studies in disease-relevant tissue types will be important to understand the functional impact of regulatory regions and specific Parkinson's disease-associated SNPs and their function in the disease process.

  18. Drosophila and Caenorhabditis elegans as Discovery Platforms for Genes Involved in Human Alcohol Use Disorder

    PubMed Central

    Grotewiel, Mike; Bettinger, Jill C.

    2015-01-01

    Background: Despite the profound clinical significance and strong heritability of alcohol use disorder (AUD), we do not yet have a comprehensive understanding of the naturally occurring genetic variance within the human genome that drives its development. This lack of understanding is likely to be due in part to the large phenotypic and genetic heterogeneities that underlie human AUD. As a complement to genetic studies in humans, many laboratories are using the invertebrate model organisms (iMOs) Drosophila melanogaster (fruit fly) and Caenorhabditis elegans (nematode worm) to identify genetic mechanisms that influence the effects of alcohol (ethanol) on behavior. While these extremely powerful models have identified many genes that influence the behavioral responses to alcohol, in most cases it has remained unclear whether results from behavioral–genetic studies in iMOs are directly applicable to understanding the genetic basis of human AUD. Methods: In this review, we critically evaluate the utility of the fly and worm models for identifying genes that influence AUD in humans. Results: Based on results published through early 2015, studies in flies and worms have identified 91 and 50 genes, respectively, that influence 1 or more aspects of behavioral responses to alcohol. Collectively, these fly and worm genes correspond to 293 orthologous genes in humans. Intriguingly, 51 of these 293 human genes have been implicated in AUD by at least 1 study in human populations. Conclusions: Our analyses strongly suggest that the Drosophila and C. elegans models have considerable utility for identifying orthologs of genes that influence human AUD. PMID:26173477

  19. De Novo Assembly of the Common Bean Transcriptome Using Short Reads for the Discovery of Drought-Responsive Genes

    PubMed Central

    Wu, Jing; Wang, Lanfen; Li, Long; Wang, Shumin

    2014-01-01

    The common bean (Phaseolus vulgaris L.) is one of the most important food legumes, far ahead of other legumes. The average grain yield of the common bean worldwide is much lower than its potential yields, primarily due to drought in the field. However, the gene network that mediates plant responses to drought stress remains largely unknown in this species. The major goal of our study is to identify a large number of genes involved in drought stress using RNA-seq. First, we assembled 270 million high-quality trimmed reads into a non-redundant set of 62,828 unigenes, representing approximately 49 Mb of unique transcriptome sequences. Of these unigenes, 26,501 (42.2%) common bean unigenes had significant similarity with unigenes/predicted proteins from other legumes or sequenced plants. All unigenes were functionally annotated within the GO, COG and KEGG pathways. The strategy for de novo assembly of transcriptome data generated here will be useful in other legume plant transcriptome studies. Second, we identified 10,482 SSRs and 4,099 SNPs in transcripts. The large number of genetic markers provides a resource for gene discovery and development of functional molecular markers. Finally, we found differentially expressed genes (DEGs) between terminal drought and optimal irrigation treatments and between the two different genotypes Long 22-0579 (drought tolerant) and Naihua (drought sensitive). DEGs were confirmed by quantitative real-time PCR assays, which indicated that these genes are functionally associated with the drought-stress response. These resources will be helpful for basic and applied research for genome analysis and crop drought resistance improvement in the common bean. PMID:25275443
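
    The abstract does not say how the SSRs were called. As a rough illustration of what detecting a simple sequence repeat involves, the sketch below scans a sequence for perfect tandem repeats with a regular expression. The thresholds (motifs of 2 to 6 bases, at least 5 copies) and the function name are assumptions chosen for the example, not values taken from the study.

```python
import re

# Assumed thresholds: motif of 2-6 bases repeated at least 5 times in tandem.
# The lazy quantifier makes the regex try the shortest motif first, so
# ATATATATATAT is reported as (AT)6 rather than (ATAT)3.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{4,}")


def find_ssrs(sequence, pattern=SSR_PATTERN):
    """Return (start, motif, copies) for each perfect SSR in the sequence."""
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # a run of one base is a homopolymer, not counted here
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


# Toy transcript fragment containing a single (AT)6 repeat.
toy = "GGC" + "AT" * 6 + "CCGTA"
for start, motif, copies in find_ssrs(toy):
    print(start, motif, copies)
```

    Dedicated SSR finders also handle imperfect and compound repeats, which this sketch ignores.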

  20. De Novo Assembly of Auricularia polytricha Transcriptome Using Illumina Sequencing for Gene Discovery and SSR Marker Identification

    PubMed Central

    Zhou, Yan; Chen, Lianfu; Fan, Xiuzhi; Bian, Yinbing

    2014-01-01

    Auricularia polytricha (Mont.) Sacc., a type of edible black-brown mushroom with a gelatinous and modality-specific fruiting body, is in high demand in Asia due to its nutritional and medicinal properties. Illumina Solexa sequencing technology was used to generate very large transcript sequences from the mycelium and the mature fruiting body of A. polytricha for gene discovery and molecular marker development. De novo assembly generated 36,483 ESTs with an N50 length of 636 bp. A total of 28,108 ESTs demonstrated significant hits with known proteins in the nr database, and 94.03% of the annotated ESTs showed the greatest similarity to A. delicata, a related species of A. polytricha. Functional categorization of the Gene Ontology (GO), Clusters of Orthologous Groups (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) metabolic pathways revealed the conservation of genes involved in various biological processes in A. polytricha. Gene expression profile analysis indicated that a total of 2,057 ESTs were differentially expressed, including 1,020 ESTs that were up-regulated in the mycelium and 1,037 up-regulated in the fruiting body. Functional enrichment showed that the ESTs associated with biosynthesis, metabolism and assembly of proteins were more active in fruiting body development. The expression patterns of homologous transcription factors indicated that the molecular mechanisms of fruiting body formation and development were not exactly the same as for other agarics. Interestingly, an EST encoding tyrosinase was significantly up-regulated in the fruiting body, indicating that melanins accumulated during the processes of the formation of the black-brown color of the fruiting body in A. polytricha development. In addition, a total of 1,715 potential SSRs were detected in this transcriptome. The transcriptome analysis of A. polytricha provides valuable sequence resources and numerous molecular markers to facilitate further functional genomics studies and
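
    The N50 length quoted above is a standard assembly summary statistic: the length L such that sequences of length L or longer contain at least half of the total assembled bases. A minimal sketch of the calculation follows; the function name and the toy lengths are made up for illustration and are unrelated to the A. polytricha data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sort lengths from longest to shortest and walk down the list until the
    running total reaches half of the summed length; the length at which
    that happens is the N50.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example: total is 1500, and 500 + 400 = 900 first exceeds 750.
print(n50([100, 200, 300, 400, 500]))  # prints 400
```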

  1. Essential Gene Discovery in the Basidiomycete Cryptococcus neoformans for Antifungal Drug Target Prioritization

    PubMed Central

    Ianiri, Giuseppe

    2015-01-01

    Fungal diseases represent a major burden to health care globally. As with other pathogenic microbes, there is a limited number of agents suitable for use in treating fungal diseases, and resistance to these agents can develop rapidly. Cryptococcus neoformans is a basidiomycete fungus that causes cryptococcosis worldwide in both immunocompromised and healthy individuals. As a basidiomycete, it diverged from other common pathogenic or model ascomycete fungi more than 500 million years ago. Here, we report C. neoformans genes that are essential for viability as identified through forward and reverse genetic approaches, using an engineered diploid strain and genetic segregation after meiosis. The forward genetic approach involved generating random insertional mutants in the diploid strain, inducing meiosis and sporulation, and selecting for haploid cells with counterselection of the insertion event. More than 2,500 mutants were analyzed, and transfer DNA (T-DNA) insertions in several genes required for viability were identified. The genes include those encoding the thioredoxin reductase (Trr1), a ribosome assembly factor (Rsa4), an mRNA-capping component (Cet1), and others. For targeted gene replacement, the C. neoformans homologs of 35 genes required for viability in ascomycete fungi were disrupted, meiosis and sporulation were induced, and haploid progeny were evaluated for their ability to grow on selective media. Twenty-one (60%) were found to be required for viability in C. neoformans. These genes are involved in mitochondrial translation, ergosterol biosynthesis, and RNA-related functions. The heterozygous diploid mutants were evaluated for haploinsufficiency on a number of perturbing agents and drugs, revealing phenotypes due to the loss of one copy of an essential gene in C. neoformans. This study expands the knowledge of the essential genes in fungi using a basidiomycete as a model organism. Genes that have no mammalian homologs and are essential

  2. Discovery of Phytophthora infestans Genes Expressed in Planta through Mining of cDNA Libraries

    PubMed Central

    Chaves, Diego; Pinzón, Andrés; Grajales, Alejandro; Rojas, Alejandro; Mutis, Gabriel; Cárdenas, Martha; Burbano, Daniel; Jiménez, Pedro; Bernal, Adriana; Restrepo, Silvia

    2010-01-01

    Background: Phytophthora infestans (Mont.) de Bary causes late blight of potato and tomato, and has a broad host range within the Solanaceae family. Most studies of the Phytophthora – Solanum pathosystem have focused on gene expression in the host and have not analyzed pathogen gene expression in planta. Methodology/Principal Findings: We describe in detail an in silico approach to mine ESTs from inoculated host plants deposited in a database in order to identify particular pathogen sequences associated with disease. We identified candidate effector genes through mining of 22,795 ESTs corresponding to P. infestans cDNA libraries in compatible and incompatible interactions with hosts from the Solanaceae family. Conclusions/Significance: We annotated genes of P. infestans expressed in planta associated with late blight using different approaches and assigned putative functions to 373 out of the 501 sequences found in the P. infestans genome draft, including putative secreted proteins, domains associated with pathogenicity and poorly characterized proteins ideal for further experimental studies. Our study provides a methodology for analyzing cDNA libraries and provides an understanding of the plant – oomycete pathosystems that is independent of the host, condition, or type of sample by identifying genes of the pathogen expressed in planta. PMID:20352100

  3. Phylogenomic Analysis of Natural Products Biosynthetic Gene Clusters Allows Discovery of Arseno-Organic Metabolites in Model Streptomycetes.

    PubMed

    Cruz-Morales, Pablo; Kopp, Johannes Florian; Martínez-Guerrero, Christian; Yáñez-Guerra, Luis Alfonso; Selem-Mojica, Nelly; Ramos-Aboites, Hilda; Feldmann, Jörg; Barona-Gómez, Francisco

    2016-07-02

    Natural products from microbes have provided humans with beneficial antibiotics for millennia. However, a decline in the pace of antibiotic discovery exerts pressure on human health as antibiotic resistance spreads, a challenge that may be better faced by unveiling chemical diversity produced by microbes. Current microbial genome mining approaches have revitalized research into antibiotics, but the empirical nature of these methods limits the chemical space that is explored. Here, we address the problem of finding novel pathways by incorporating evolutionary principles into genome mining. We recapitulated the evolutionary history of twenty-three enzyme families previously uninvestigated in the context of natural product biosynthesis in Actinobacteria, the most proficient producers of natural products. Our genome evolutionary analyses were based on the assumption that expanded and repurposed enzyme families from central metabolism occur frequently and thus have the potential to catalyze new conversions in the context of natural products biosynthesis. Our analyses led to the discovery of biosynthetic gene clusters coding for hidden chemical diversity, as validated by comparing our predictions with those from state-of-the-art genome mining tools, as well as by experimentally demonstrating the existence of a biosynthetic pathway for arseno-organic metabolites in Streptomyces coelicolor and Streptomyces lividans, using a combined gene knockout and metabolite profile strategy. As our approach does not rely solely on sequence similarity searches of previously identified biosynthetic enzymes, these results establish the basis for the development of an evolutionary-driven genome mining tool termed EvoMining that complements current platforms. We anticipate that by doing so real 'chemical dark matter' will be unveiled.

  4. Phylogenomic Analysis of Natural Products Biosynthetic Gene Clusters Allows Discovery of Arseno-Organic Metabolites in Model Streptomycetes

    PubMed Central

    Cruz-Morales, Pablo; Kopp, Johannes Florian; Martínez-Guerrero, Christian; Yáñez-Guerra, Luis Alfonso; Selem-Mojica, Nelly; Ramos-Aboites, Hilda; Feldmann, Jörg; Barona-Gómez, Francisco

    2016-01-01

    Natural products from microbes have provided humans with beneficial antibiotics for millennia. However, a decline in the pace of antibiotic discovery exerts pressure on human health as antibiotic resistance spreads, a challenge that may be better faced by unveiling chemical diversity produced by microbes. Current microbial genome mining approaches have revitalized research into antibiotics, but the empirical nature of these methods limits the chemical space that is explored. Here, we address the problem of finding novel pathways by incorporating evolutionary principles into genome mining. We recapitulated the evolutionary history of twenty-three enzyme families previously uninvestigated in the context of natural product biosynthesis in Actinobacteria, the most proficient producers of natural products. Our genome evolutionary analyses were based on the assumption that expanded and repurposed enzyme families from central metabolism occur frequently and thus have the potential to catalyze new conversions in the context of natural products biosynthesis. Our analyses led to the discovery of biosynthetic gene clusters coding for hidden chemical diversity, as validated by comparing our predictions with those from state-of-the-art genome mining tools, as well as by experimentally demonstrating the existence of a biosynthetic pathway for arseno-organic metabolites in Streptomyces coelicolor and Streptomyces lividans, using a combined gene knockout and metabolite profile strategy. As our approach does not rely solely on sequence similarity searches of previously identified biosynthetic enzymes, these results establish the basis for the development of an evolutionary-driven genome mining tool termed EvoMining that complements current platforms. We anticipate that by doing so real ‘chemical dark matter’ will be unveiled. PMID:27289100

  5. Phylogenomic Analysis of Natural Products Biosynthetic Gene Clusters Allows Discovery of Arseno-Organic Metabolites in Model Streptomycetes.

    PubMed

    Cruz-Morales, Pablo; Kopp, Johannes Florian; Martínez-Guerrero, Christian; Yáñez-Guerra, Luis Alfonso; Selem-Mojica, Nelly; Ramos-Aboites, Hilda; Feldmann, Jörg; Barona-Gómez, Francisco

    2016-01-01

    Natural products from microbes have provided humans with beneficial antibiotics for millennia. However, a decline in the pace of antibiotic discovery exerts pressure on human health as antibiotic resistance spreads, a challenge that may be better faced by unveiling chemical diversity produced by microbes. Current microbial genome mining approaches have revitalized research into antibiotics, but the empirical nature of these methods limits the chemical space that is explored. Here, we address the problem of finding novel pathways by incorporating evolutionary principles into genome mining. We recapitulated the evolutionary history of twenty-three enzyme families previously uninvestigated in the context of natural product biosynthesis in Actinobacteria, the most proficient producers of natural products. Our genome evolutionary analyses were based on the assumption that expanded and repurposed enzyme families from central metabolism occur frequently and thus have the potential to catalyze new conversions in the context of natural products biosynthesis. Our analyses led to the discovery of biosynthetic gene clusters coding for hidden chemical diversity, as validated by comparing our predictions with those from state-of-the-art genome mining tools, as well as by experimentally demonstrating the existence of a biosynthetic pathway for arseno-organic metabolites in Streptomyces coelicolor and Streptomyces lividans, using a combined gene knockout and metabolite profile strategy. As our approach does not rely solely on sequence similarity searches of previously identified biosynthetic enzymes, these results establish the basis for the development of an evolutionary-driven genome mining tool termed EvoMining that complements current platforms. We anticipate that by doing so real 'chemical dark matter' will be unveiled. PMID:27289100

  6. Discovery of molecular mechanisms of traditional Chinese medicinal formula Si-Wu-Tang using gene expression microarray and connectivity map.

    PubMed

    Wen, Zhining; Wang, Zhijun; Wang, Steven; Ravula, Ranadheer; Yang, Lun; Xu, Jun; Wang, Charles; Zuo, Zhong; Chow, Moses S S; Shi, Leming; Huang, Ying

    2011-03-28

    To pursue a systematic approach to discovery of mechanisms of action of traditional Chinese medicine (TCM), we used microarrays, bioinformatics and the "Connectivity Map" (CMAP) to examine TCM-induced changes in gene expression. We demonstrated that this approach can be used to elucidate new molecular targets using a model TCM herbal formula Si-Wu-Tang (SWT) which is widely used for women's health. The human breast cancer MCF-7 cells treated with 0.1 µM estradiol or 2.56 mg/ml of SWT showed dramatic gene expression changes, while no significant change was detected for ferulic acid, a known bioactive compound of SWT. Pathway analysis using differentially expressed genes related to the treatment effect identified that expression of genes in the nuclear factor erythroid 2-related factor 2 (Nrf2) cytoprotective pathway was most significantly affected by SWT, but not by estradiol or ferulic acid. The Nrf2-regulated genes HMOX1, GCLC, GCLM, SLC7A11 and NQO1 were upregulated by SWT in a dose-dependent manner, which was validated by real-time RT-PCR. Consistently, treatment with SWT and its four herbal ingredients resulted in an increased antioxidant response element (ARE)-luciferase reporter activity in MCF-7 and HEK293 cells. Furthermore, the gene expression profile of differentially expressed genes related to SWT treatment was used to compare with those of 1,309 compounds in the CMAP database. The CMAP profiles of estradiol-treated MCF-7 cells showed an excellent match with SWT treatment, consistent with SWT's widely claimed use for women's diseases and indicating a phytoestrogenic effect. The CMAP profiles of chemopreventive agents withaferin A and resveratrol also showed high similarity to the profiles of SWT. This study identified SWT as an Nrf2 activator and phytoestrogen, suggesting its use as a nontoxic chemopreventive agent, and demonstrated the feasibility of combining microarray gene expression profiling with CMAP mining to discover mechanisms of actions and

  7. Discovery and identification of candidate genes from the chitinase gene family for Verticillium dahliae resistance in cotton

    PubMed Central

    Xu, Jun; Xu, Xiaoyang; Tian, Liangliang; Wang, Guilin; Zhang, Xueying; Wang, Xinyu; Guo, Wangzhen

    2016-01-01

    Verticillium dahliae, a destructive and soil-borne fungal pathogen, causes massive losses in cotton yields. However, the resistance mechanism to V. dahliae in cotton is still poorly understood. Accumulating evidence indicates that chitinases are crucial hydrolytic enzymes, which attack fungal pathogens by catalyzing the fungal cell wall degradation. As a large gene family, to date, the chitinase genes (Chis) have not been systematically analyzed and effectively utilized in cotton. Here, we identified 47, 49, 92, and 116 Chis from four sequenced cotton species, diploid Gossypium raimondii (D5), G. arboreum (A2), tetraploid G. hirsutum acc. TM-1 (AD1), and G. barbadense acc. 3–79 (AD2), respectively. The orthologous genes did not show one-to-one correspondence in the diploid and tetraploid cotton species, implying changes in the number of Chis in different cotton species during the evolution of Gossypium. Phylogenetic classification indicated that these Chis could be classified into six groups, with distinguishable structural characteristics. The expression patterns of Chis indicated their various expressions in different organs and tissues, and in the V. dahliae response. Silencing of Chi23, Chi32, or Chi47 in cotton significantly impaired the resistance to V. dahliae, suggesting these genes might act as positive regulators in disease resistance to V. dahliae. PMID:27354165

  8. Discovery and identification of candidate genes from the chitinase gene family for Verticillium dahliae resistance in cotton.

    PubMed

    Xu, Jun; Xu, Xiaoyang; Tian, Liangliang; Wang, Guilin; Zhang, Xueying; Wang, Xinyu; Guo, Wangzhen

    2016-06-29

    Verticillium dahliae, a destructive and soil-borne fungal pathogen, causes massive losses in cotton yields. However, the resistance mechanism to V. dahliae in cotton is still poorly understood. Accumulating evidence indicates that chitinases are crucial hydrolytic enzymes, which attack fungal pathogens by catalyzing the fungal cell wall degradation. As a large gene family, the chitinase genes (Chis) have to date not been systematically analyzed or effectively utilized in cotton. Here, we identified 47, 49, 92, and 116 Chis from four sequenced cotton species, diploid Gossypium raimondii (D5), G. arboreum (A2), tetraploid G. hirsutum acc. TM-1 (AD1), and G. barbadense acc. 3-79 (AD2), respectively. The orthologous genes did not show one-to-one correspondence between the diploid and tetraploid cotton species, implying changes in the number of Chis in different cotton species during the evolution of Gossypium. Phylogenetic classification indicated that these Chis could be classified into six groups, with distinguishable structural characteristics. The expression patterns of Chis indicated their various expressions in different organs and tissues, and in the V. dahliae response. Silencing of Chi23, Chi32, or Chi47 in cotton significantly impaired the resistance to V. dahliae, suggesting these genes might act as positive regulators in disease resistance to V. dahliae.

  9. Prior knowledge driven Granger causality analysis on gene regulatory network discovery

    DOE PAGES

    Yao, Shun; Yoo, Shinjae; Yu, Dantong

    2015-08-28

    Our study focuses on discovering gene regulatory networks from time series gene expression data using the Granger causality (GC) model. However, the number of available time points (T) usually is much smaller than the number of target genes (n) in biological datasets. The widely applied pairwise GC model (PGC) and other regularization strategies can lead to a significant number of false identifications when n>>T. In this study, we proposed a new method, viz., CGC-2SPR (CGC using two-step prior Ridge regularization) to resolve the problem by incorporating prior biological knowledge about a target gene data set. In our simulation experiments, the proposed new methodology CGC-2SPR showed significant performance improvement in terms of accuracy over other widely used GC modeling (PGC, Ridge and Lasso) and MI-based (MRNET and ARACNE) methods. In addition, we applied CGC-2SPR to a real biological dataset, i.e., the yeast metabolic cycle, and discovered more true positive edges with CGC-2SPR than with the other existing methods. In our research, we noticed a “1+1>2” effect when we combined prior knowledge and gene expression data to discover regulatory networks. Based on causality networks, we made a functional prediction that the Abm1 gene (its functions previously were unknown) might be related to the yeast’s responses to different levels of glucose. In conclusion, our research improves causality modeling by combining heterogeneous knowledge, which is well aligned with the future direction in systems biology. Furthermore, we proposed a method of Monte Carlo significance estimation (MCSE) to calculate the edge significances which provide statistical meanings to the discovered causality networks. All of our data and source codes will be available under the link https://bitbucket.org/dtyu/granger-causality/wiki/Home.

  10. Prior knowledge driven Granger causality analysis on gene regulatory network discovery

    SciTech Connect

    Yao, Shun; Yoo, Shinjae; Yu, Dantong

    2015-08-28

    Our study focuses on discovering gene regulatory networks from time series gene expression data using the Granger causality (GC) model. However, the number of available time points (T) usually is much smaller than the number of target genes (n) in biological datasets. The widely applied pairwise GC model (PGC) and other regularization strategies can lead to a significant number of false identifications when n>>T. In this study, we proposed a new method, viz., CGC-2SPR (CGC using two-step prior Ridge regularization) to resolve the problem by incorporating prior biological knowledge about a target gene data set. In our simulation experiments, the proposed new methodology CGC-2SPR showed significant performance improvement in terms of accuracy over other widely used GC modeling (PGC, Ridge and Lasso) and MI-based (MRNET and ARACNE) methods. In addition, we applied CGC-2SPR to a real biological dataset, i.e., the yeast metabolic cycle, and discovered more true positive edges with CGC-2SPR than with the other existing methods. In our research, we noticed a “1+1>2” effect when we combined prior knowledge and gene expression data to discover regulatory networks. Based on causality networks, we made a functional prediction that the Abm1 gene (its functions previously were unknown) might be related to the yeast’s responses to different levels of glucose. In conclusion, our research improves causality modeling by combining heterogeneous knowledge, which is well aligned with the future direction in systems biology. Furthermore, we proposed a method of Monte Carlo significance estimation (MCSE) to calculate the edge significances which provide statistical meanings to the discovered causality networks. All of our data and source codes will be available under the link https://bitbucket.org/dtyu/granger-causality/wiki/Home.
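
    The abstract above combines Granger-style regression, Ridge regularization and prior knowledge for the n>>T setting. The following is a minimal sketch of that general idea only, not the authors' CGC-2SPR: it assumes a single lag, and it assumes (hypothetically) that prior knowledge enters as a smaller ridge penalty on edges believed to exist. The function name, the penalty values and the toy data are all illustrative.

```python
import numpy as np


def ridge_granger_lag1(expr, penalty):
    """Lag-1 Granger-style regression with a per-edge ridge penalty.

    expr    : (T, n) array, T time points for n genes.
    penalty : (n, n) array of positive ridge weights; penalty[j, i] shrinks
              the coefficient of gene j (at t-1) when predicting gene i (at t).
    Returns an (n, n) array coef with coef[j, i] = estimated influence j -> i.
    """
    past, future = expr[:-1], expr[1:]
    gram = past.T @ past
    n_genes = expr.shape[1]
    coef = np.zeros((n_genes, n_genes))
    for i in range(n_genes):
        # Generalised ridge solution (X'X + D_i)^-1 X'y with diagonal D_i.
        # The positive diagonal keeps the system solvable even when n > T.
        coef[:, i] = np.linalg.solve(gram + np.diag(penalty[:, i]),
                                     past.T @ future[:, i])
    return coef


# Toy data with more genes than time points, as described in the abstract.
rng = np.random.default_rng(0)
T, n = 8, 12
expr = rng.normal(size=(T, n))

# Hypothetical prior: the edge 0 -> 1 is believed to exist, so it is
# penalised less than all other candidate edges.
penalty = np.full((n, n), 10.0)
penalty[0, 1] = 0.1

coef = ridge_granger_lag1(expr, penalty)
print(coef.shape)
```

    In this sketch, edges would then be ranked by the magnitude of coef[j, i]; assessing their significance (for example by permutation, in the spirit of the Monte Carlo estimation mentioned in the abstract) is left out.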

  11. RNA-Seq Based De Novo Transcriptome Assembly and Gene Discovery of Cistanche deserticola Fleshy Stem

    PubMed Central

    Yao, Fuwen; Li, Cuiping; Tang, Qingli; Sun, Min; Sun, Gaoyuan; Hu, Songnian; Yu, Jun; Song, Shuhui

    2015-01-01

    Background Cistanche deserticola is a completely non-photosynthetic parasitic plant with great medicinal value that is mainly distributed in the deserts of Northwest China. Its dried fleshy stem is a crucial tonic in traditional Chinese medicine with roles of mainly improving male sexual function and strengthening immunity, but few mechanistic studies have been conducted, partly due to the lack of genomic and transcriptomic resources. Results In this study, we performed deep transcriptome sequencing of the fleshy stem of C. deserticola, and about 80 million reads were generated using Illumina paired-end sequencing on the HiSeq2000 platform. Using the Trinity assembler, we obtained 95,787 transcript sequences with transcript lengths ranging from 200bp to 15,698bp, having an average length of 950 bases and an N50 length of 1,519 bases. A total of 63,957 transcripts were identified as actively expressed with FPKM ≥ 0.5, of which 30,098 transcripts were annotated with gene descriptions or gene ontology terms by sequence similarity analyses against several public databases (Uniprot, NR and Nt at NCBI, and KEGG). Furthermore, we identified key enzyme genes involved in biosynthesis of lignin and phenylethanoid glycosides (PhGs), which are known to be the primary active ingredients. Four genes for phenylalanine ammonia-lyase (PAL), the first key enzyme in lignin and PhG biosynthesis, were identified based on sequence comparison and phylogenetic analysis. Two biosynthesis pathways of PhGs were also proposed for the first time. Conclusions In all, we completed a global analysis of the C. deserticola fleshy stem transcriptome using RNA-seq technology. A collection of enzyme genes related to biosynthesis of lignin and phenylethanoid glycosides was identified from the assembled and annotated transcripts, and the gene family of PAL was also predicted. The sequence data from this study will provide a valuable resource for conducting future phenylethanoid glycoside biosynthesis research and functional genomic studies
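
    The abstract above reports both an average transcript length and an N50 length for the assembly. As a hedged illustration of how the two statistics differ, the sketch below computes N50 under its usual definition (the length L such that sequences of length L or longer contain at least half of all assembled bases). The lengths used are invented toy values, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running total reaches half of
        # all bases is the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy transcript lengths (illustrative only).
toy_lengths = [200, 400, 600, 800, 1000, 1500]
print(sum(toy_lengths) / len(toy_lengths))  # mean length
print(n50(toy_lengths))                     # N50 length
```

    Because long sequences contribute more bases, N50 is typically larger than the mean, which is consistent with the 950-base average and 1,519-base N50 quoted above.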

  12. Discovery and Expansion of Gene Modules by Seeking Isolated Groups in a Random Graph Process

    PubMed Central

    Brumm, Jochen; Conibear, Elizabeth; Wasserman, Wyeth W.; Bryan, Jennifer

    2008-01-01

    Background A central problem in systems biology research is the identification and extension of biological modules–groups of genes or proteins participating in a common cellular process or physical complex. As a result, there is a persistent need for practical, principled methods to infer the modular organization of genes from genome-scale data. Results We introduce a novel approach for the identification of modules based on the persistence of isolated gene groups within an evolving graph process. First, the underlying genomic data is summarized in the form of ranked gene–gene relationships, thereby accommodating studies that quantify the relevant biological relationship directly or indirectly. Then, the observed gene–gene relationship ranks are viewed as the outcome of a random graph process and candidate modules are given by the identifiable subgraphs that arise during this process. An isolation index is computed for each module, which quantifies the statistical significance of its survival time. Conclusions The Miso (module isolation) method predicts gene modules from genomic data and the associated isolation index provides a module-specific measure of confidence. Improving on existing alternatives, such as graph clustering and the global pruning of dendrograms, this index offers two intuitively appealing features: (1) the score is module-specific; and (2) different choices of threshold correlate logically with the resulting performance, i.e. a stringent cutoff yields high quality predictions, but low sensitivity. Through the analysis of yeast phenotype data, the Miso method is shown to outperform existing alternatives, in terms of the specificity and sensitivity of its predictions. PMID:18843375
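
    The abstract above describes adding ranked gene–gene relationships one at a time and watching how long isolated groups survive. The sketch below illustrates that kind of graph process only; it is not the Miso implementation. It assumes (hypothetically) that an "isolated group" is a connected component and that its survival time is the number of edge-addition steps before it merges with another group. The statistical significance captured by the isolation index is not computed here.

```python
def component_lifetimes(n_genes, ranked_pairs):
    """Track groups formed while edges are added in rank order.

    ranked_pairs : list of (gene_a, gene_b), strongest relationship first.
    Returns a list of (members, birth_step, death_step) for every group of
    two or more genes that was later absorbed into a larger group.
    """
    parent = list(range(n_genes))
    members = {g: frozenset([g]) for g in range(n_genes)}
    born = {g: 0 for g in range(n_genes)}
    lifetimes = []

    def find(g):
        # Union-find root lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for step, (a, b) in enumerate(ranked_pairs, start=1):
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            continue  # edge falls inside an existing group
        for root in (root_a, root_b):
            if len(members[root]) > 1:
                lifetimes.append((members[root], born[root], step))
        # Merge the two groups; the merged group is born at this step.
        parent[root_b] = root_a
        members[root_a] = members[root_a] | members.pop(root_b)
        born.pop(root_b)
        born[root_a] = step
    return lifetimes


# Toy ranking over five genes (illustrative only).
pairs = [(0, 1), (2, 3), (1, 2), (3, 4)]
for group, birth, death in component_lifetimes(5, pairs):
    print(sorted(group), birth, death)
```

    A group that appears early and merges late has a long survival time in this toy process; the method in the abstract goes further by attaching a significance measure to that survival time.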

  13. Plant gravitropic signal transduction: A network analysis leads to gene discovery

    NASA Astrophysics Data System (ADS)

    Wyatt, Sarah

    Gravity plays a fundamental role in plant growth and development. Although a significant body of research has helped define the events of gravity perception, the role of the plant growth regulator auxin, and the mechanisms resulting in the gravity response, the events of signal transduction, those that link the biophysical action of perception to a biochemical signal that results in auxin redistribution, those that regulate the gravitropic effects on plant growth, remain, for the most part, a “black box.” Using a cold effect, dubbed the gravity persistent signal (GPS) response, we developed a mutant screen to specifically identify components of the signal transduction pathway. Cloning of the GPS genes has identified new proteins involved in gravitropic signaling. We have further exploited the GPS response using a multi-faceted approach including gene expression microarrays, proteomics analysis, bioinformatics analysis and continued mutant analysis to identify additional genes and physiological and biochemical processes. Gene expression data provided the foundation of a regulatory network for gravitropic signaling. Based on these gene expression data and related data sets/information from the literature/repositories, we constructed a gravitropic signaling network for Arabidopsis inflorescence stems. To generate the network, both a dynamic Bayesian network approach and a time-lagged correlation coefficient approach were used. The dynamic Bayesian network added existing information of protein-protein interaction while the time-lagged correlation coefficient allowed incorporation of temporal regulation and thus could incorporate the time-course metric from the data set. Thus the methods complemented each other and provided us with a more comprehensive evaluation of connections. Each method generated a list of possible interactions associated with a statistical significance value. The two networks were then overlaid to generate a more rigorous, intersected

  14. Discovery of candidate genes for muscle traits based on GWAS supported by eQTL-analysis.

    PubMed

    Ponsuksili, Siriluck; Murani, Eduard; Trakooljul, Nares; Schwerin, Manfred; Wimmers, Klaus

    2014-01-01

    Biochemical and biophysical processes that take place in muscle under relaxed and stressed conditions depend on the abundance and activity of gene products of metabolic and structural pathways. In livestock at post-mortem, these muscle properties determine aspects of meat quality and are measurable. The conversion of muscle to meat mimics pathological processes associated with muscle ischemia, injury or damage in humans and it is an economic factor in pork production. Linkage, association, and expression analyses independently contributed to the identification of trait-associated molecular pathways and genes. We aim to provide multiple lines of evidence for the role of specific genes in meat quality by integrating a genome-wide association study (GWAS) for meat quality traits and the detection of eQTL based on trait-correlated expressed genes and trait-associated markers. The GWAS revealed 51 and 200 SNPs significantly associated with meat quality in a crossbred Pietrain×(German Landrace×Large White) (Pi×(GL×LW)) and a purebred German Landrace (GL) population, respectively. The most significant SNPs in Pi×(GL×LW) were located on chromosomes (SSC) 4 and 6. The data of 47,836 eQTLs at a significance level of p<10^-5 were used to narrow down the number of candidate genes located in these regions. These SNPs on SSC4 showed association with expression levels of ZNF704, IMPA1, and OXSR1; SSC6 SNPs were associated with expression of SIGLEC10 and PIH1D1. The most significant SNPs in GL were located on SSC6 and associated with expression levels of PIH1D1, SIGLEC10, TBCB, LOC100518735, KIF1B, LOC100514845, and two unknown genes. The abundance of transcripts of these genes in muscle, in turn, is significantly correlated with meat quality traits. We identified several genes with evidence for their candidacy for meat quality arising from the integrative approach of a genome-wide association study and eQTL analysis. PMID:24643240

  15. Integrating microRNA target predictions for the discovery of gene regulatory networks: a semi-supervised ensemble learning approach

    PubMed Central

    2014-01-01

    Background MicroRNAs (miRNAs) are small non-coding RNAs which play a key role in the post-transcriptional regulation of many genes. Elucidating miRNA-regulated gene networks is crucial for the understanding of mechanisms and functions of miRNAs in many biological processes, such as cell proliferation, development, differentiation and cell homeostasis, as well as in many types of human tumors. To this aim, we have recently presented the biclustering method HOCCLUS2, for the discovery of miRNA regulatory networks. Experiments on predicted interactions revealed that the statistical and biological consistency of the obtained networks is negatively affected by the poor reliability of the output of miRNA target prediction algorithms. Recently, some learning approaches have been proposed to learn to combine the outputs of distinct prediction algorithms and improve their accuracy. However, the application of classical supervised learning algorithms presents two challenges: i) the presence of only positive examples in datasets of experimentally verified interactions and ii) unbalanced number of labeled and unlabeled examples. Results We present a learning algorithm that learns to combine the score returned by several prediction algorithms, by exploiting information conveyed by validated (positively labeled only) and unlabeled examples of interactions. To face the two related challenges, we resort to a semi-supervised ensemble learning setting. Results obtained using miRTarBase as the set of labeled (positive) interactions and mirDIP as the set of unlabeled interactions show a significant improvement, over competitive approaches, in the quality of the predictions. This solution also improves the effectiveness of HOCCLUS2 in discovering biologically realistic miRNA:mRNA regulatory networks from large-scale prediction data. Using the miR-17-92 gene cluster family as a reference system and comparing results with previous experiments, we find a large increase in the number of

  16. Molecular Mechanisms and Evolutionary Processes Contributing to Accelerated Divergence of Gene Expression on the Drosophila X Chromosome.

    PubMed

    Coolon, Joseph D; Stevenson, Kraig R; McManus, C Joel; Yang, Bing; Graveley, Brenton R; Wittkopp, Patricia J

    2015-10-01

    In species with a heterogametic sex, population genetics theory predicts that DNA sequences on the X chromosome can evolve faster than comparable sequences on autosomes. Both neutral and nonneutral evolutionary processes can generate this pattern. Complex traits like gene expression are not predicted to have accelerated evolution by these theories, yet a "faster-X" pattern of gene expression divergence has recently been reported for both Drosophila and mammals. Here, we test the hypothesis that accelerated adaptive evolution of cis-regulatory sequences on the X chromosome is responsible for this pattern by comparing the relative contributions of cis- and trans-regulatory changes to patterns of faster-X expression divergence observed between strains and species of Drosophila with a range of divergence times. We find support for this hypothesis, especially among male-biased genes, when comparing different species. However, we also find evidence that trans-regulatory differences contribute to a faster-X pattern of expression divergence both within and between species. This contribution is surprising because trans-acting regulators of X-linked genes are generally assumed to be randomly distributed throughout the genome. We found, however, that X-linked transcription factors appear to preferentially regulate expression of X-linked genes, providing a potential mechanistic explanation for this result. The contribution of trans-regulatory variation to faster-X expression divergence was larger within than between species, suggesting that it is more likely to result from neutral processes than positive selection. These data show how accelerated evolution of both coding and noncoding sequences on the X chromosome can lead to accelerated expression divergence on the X chromosome relative to autosomes.

  17. Molecular Mechanisms and Evolutionary Processes Contributing to Accelerated Divergence of Gene Expression on the Drosophila X Chromosome

    PubMed Central

    Coolon, Joseph D.; Stevenson, Kraig R.; McManus, C. Joel; Yang, Bing; Graveley, Brenton R.; Wittkopp, Patricia J.

    2015-01-01

    In species with a heterogametic sex, population genetics theory predicts that DNA sequences on the X chromosome can evolve faster than comparable sequences on autosomes. Both neutral and nonneutral evolutionary processes can generate this pattern. Complex traits like gene expression are not predicted to have accelerated evolution by these theories, yet a “faster-X” pattern of gene expression divergence has recently been reported for both Drosophila and mammals. Here, we test the hypothesis that accelerated adaptive evolution of cis-regulatory sequences on the X chromosome is responsible for this pattern by comparing the relative contributions of cis- and trans-regulatory changes to patterns of faster-X expression divergence observed between strains and species of Drosophila with a range of divergence times. We find support for this hypothesis, especially among male-biased genes, when comparing different species. However, we also find evidence that trans-regulatory differences contribute to a faster-X pattern of expression divergence both within and between species. This contribution is surprising because trans-acting regulators of X-linked genes are generally assumed to be randomly distributed throughout the genome. We found, however, that X-linked transcription factors appear to preferentially regulate expression of X-linked genes, providing a potential mechanistic explanation for this result. The contribution of trans-regulatory variation to faster-X expression divergence was larger within than between species, suggesting that it is more likely to result from neutral processes than positive selection. These data show how accelerated evolution of both coding and noncoding sequences on the X chromosome can lead to accelerated expression divergence on the X chromosome relative to autosomes. PMID:26041937

  18. Large-Scale Discovery of Disease-Disease and Disease-Gene Associations

    PubMed Central

    Gligorijevic, Djordje; Stojanovic, Jelena; Djuric, Nemanja; Radosavljevic, Vladan; Grbovic, Mihajlo; Kulathinal, Rob J.; Obradovic, Zoran

    2016-01-01

    Data-driven phenotype analyses on Electronic Health Record (EHR) data have recently drawn benefits across many areas of clinical practice, uncovering new links in the medical sciences that can potentially affect the well-being of millions of patients. In this paper, EHR data is used to discover novel relationships between diseases by studying their comorbidities (co-occurrences in patients). A novel embedding model is designed to extract knowledge from disease comorbidities by learning from a large-scale EHR database comprising more than 35 million inpatient cases spanning nearly a decade, revealing significant improvements on disease phenotyping over current computational approaches. In addition, the use of the proposed methodology is extended to discover novel disease-gene associations by including valuable domain knowledge from genome-wide association studies. To evaluate our approach, its effectiveness is compared against a held-out set where, again, it revealed very compelling results. For selected diseases, we further identify candidate gene lists for which disease-gene associations were not studied previously. Thus, our approach provides biomedical researchers with new tools to filter genes of interest, thus, reducing costly lab studies. PMID:27578529

  19. Biomarker discovery and gene expression responses in Lycopersicon esculentum root exposed to lead.

    PubMed

    Hou, Jing; Bai, Lili; Xie, Yujia; Liu, Xinhui; Cui, Baoshan

    2015-12-15

    Gene expression analysis has shown particular promise for the identification of molecular biomarkers that can be used for further evaluation of potential toxicity of chemicals present in agricultural soil. In this study, we focused on the development of molecular markers to detect Pb toxicity in agricultural soil. Using the results obtained from microarray analysis, twelve Pb-responsive genes were selected and tested in different Pb concentrations to examine their concentration-response characteristics using real-time quantitative polymerase chain reaction (RT-qPCR). All the Pb treatments set in our study could generally induce the differential expression of the 12 genes, while the lowest observable adverse effect concentrations (LOAECs) of Pb for seed germination, root elongation, biomass and structural modification were 1,297, 177, 177, and 1,297 mg Pb/kg soil, respectively, suggesting that the transcriptional approach was more sensitive than the traditional end points of death, growth, and morphology for the evaluation of Pb toxicity. The relative expression of glycoalkaloid metabolism 1 (P=-0.790), ethylene-responsive transcription factor ERF017 (P=-0.686) and CASP-like protein 4C2 (P=-0.652) demonstrates a dose-dependent response with Pb content in roots, implying that the three genes can be used as sensitive bioindicators of Pb stress in Lycopersicon esculentum.

  20. Human transporter database: comprehensive knowledge and discovery tools in the human transporter genes.

    PubMed

    Ye, Adam Y; Liu, Qing-Rong; Li, Chuan-Yun; Zhao, Min; Qu, Hong

    2014-01-01

    Transporters are essential in homeostatic exchange of endogenous and exogenous substances at the systematic, organic, cellular, and subcellular levels. Gene mutations of transporters are often related to pharmacogenetics traits. Recent developments in high-throughput technologies in genomics, transcriptomics and proteomics allow in-depth studies of transporter genes in normal cellular processes and diverse disease conditions. The flood of high-throughput data has resulted in an urgent need for an updated knowledgebase with curated, organized, and annotated human transporters in an easily accessible way. Using a pipeline with the combination of automated keywords query, sequence similarity search and manual curation on transporters, we collected 1,555 human non-redundant transporter genes to develop the Human Transporter Database (HTD) (http://htd.cbi.pku.edu.cn). Based on the extensive annotations, global properties of the transporter genes were illustrated, such as expression patterns and polymorphisms in relationships with their ligands. We noted that the human transporters were enriched in many fundamental biological processes such as oxidative phosphorylation and cardiac muscle contraction, and significantly associated with Mendelian and complex diseases such as epilepsy and sudden infant death syndrome. Overall, HTD provides a well-organized interface to facilitate research communities to search detailed molecular and genetic information of transporters for development of personalized medicine.

  1. Human Transporter Database: Comprehensive Knowledge and Discovery Tools in the Human Transporter Genes

    PubMed Central

    Ye, Adam Y.; Liu, Qing-Rong; Li, Chuan-Yun; Zhao, Min; Qu, Hong

    2014-01-01

    Transporters are essential in homeostatic exchange of endogenous and exogenous substances at the systematic, organic, cellular, and subcellular levels. Gene mutations of transporters are often related to pharmacogenetics traits. Recent developments in high-throughput technologies in genomics, transcriptomics and proteomics allow in-depth studies of transporter genes in normal cellular processes and diverse disease conditions. The flood of high-throughput data has resulted in an urgent need for an updated knowledgebase with curated, organized, and annotated human transporters in an easily accessible way. Using a pipeline with the combination of automated keywords query, sequence similarity search and manual curation on transporters, we collected 1,555 human non-redundant transporter genes to develop the Human Transporter Database (HTD) (http://htd.cbi.pku.edu.cn). Based on the extensive annotations, global properties of the transporter genes were illustrated, such as expression patterns and polymorphisms in relationships with their ligands. We noted that the human transporters were enriched in many fundamental biological processes such as oxidative phosphorylation and cardiac muscle contraction, and significantly associated with Mendelian and complex diseases such as epilepsy and sudden infant death syndrome. Overall, HTD provides a well-organized interface to facilitate research communities to search detailed molecular and genetic information of transporters for development of personalized medicine. PMID:24558441

  2. Transcriptome Analysis of Catharanthus roseus for Gene Discovery and Expression Profiling

    PubMed Central

    Sharma, Raghvendra; Sinha, Alok K.; Jain, Mukesh

    2014-01-01

    The medicinal plant, Catharanthus roseus, accumulates a wide range of terpenoid indole alkaloids, which are well documented therapeutic agents. In this study, deep transcriptome sequencing of C. roseus was carried out to identify the pathways and enzymes (genes) involved in biosynthesis of these compounds. About 343 million reads were generated from different tissues (leaf, flower and root) of C. roseus using the Illumina platform. Optimization of de novo assembly involving a two-step process resulted in a total of 59,220 unique transcripts with an average length of 1284 bp. Comprehensive functional annotation and gene ontology (GO) analysis revealed the representation of many genes involved in different biological processes and molecular functions. In total, 65% of C. roseus transcripts showed homology with sequences available in various public repositories, while the remaining 35% of unigenes may be considered C. roseus specific. In silico analysis revealed the presence of 11,620 genic simple sequence repeats (excluding mono-nucleotide repeats) and 1820 transcription factor encoding genes in the C. roseus transcriptome. Expression analysis showed roots and leaves to be actively participating in bisindole alkaloid production, with a clear indication that enzymes involved in the pathway of vindoline and vinblastine biosynthesis are restricted to aerial tissues. Such a large-scale transcriptome study provides a rich source for understanding plant-specialized metabolism, and is expected to promote research towards production of plant-derived pharmaceuticals. PMID:25072156

  3. Discovery of Chemosensory Genes in the Oriental Fruit Fly, Bactrocera dorsalis.

    PubMed

    Wu, Zhongzhen; Zhang, He; Wang, Zhengbing; Bin, Shuying; He, Hualiang; Lin, Jintian

    2015-01-01

    The oriental fruit fly, Bactrocera dorsalis, is a devastating fruit fly pest in tropical and sub-tropical countries. Like other insects, this fly uses its chemosensory system to efficiently interact with its environment. However, our understanding of the molecular components comprising B. dorsalis chemosensory system is limited. Using next generation sequencing technologies, we sequenced the transcriptome of four B. dorsalis developmental stages: egg, larva, pupa and adult chemosensory tissues. A total of 31 candidate odorant binding proteins (OBPs), 4 candidate chemosensory proteins (CSPs), 23 candidate odorant receptors (ORs), 11 candidate ionotropic receptors (IRs), 6 candidate gustatory receptors (GRs) and 3 candidate sensory neuron membrane proteins (SNMPs) were identified. The tissue distributions of the OBP and CSP transcripts were determined by RT-PCR and a subset of nine genes were further characterized. The predicted proteins from these genes shared high sequence similarity to Drosophila melanogaster pheromone binding protein related proteins (PBPRPs). Interestingly, one OBP (BdorOBP19c) was exclusively expressed in the sex pheromone glands of mature females. RT-PCR was also used to compare the expression of the candidate genes in the antennae of male and female B. dorsalis adults. These antennae-enriched OBPs, CSPs, ORs, IRs and SNMPs could play a role in the detection of pheromones and general odorants and thus could be useful target genes for the integrated pest management of B. dorsalis and other agricultural pests. PMID:26070069

  4. The discovery of integrated gene networks for autism and related disorders.

    PubMed

    Hormozdiari, Fereydoun; Penn, Osnat; Borenstein, Elhanan; Eichler, Evan E

    2015-01-01

    Despite considerable genetic heterogeneity underlying neurodevelopmental diseases, there is compelling evidence that many disease genes will map to a much smaller number of biological subnetworks. We developed a computational method, termed MAGI (merging affected genes into integrated networks), that simultaneously integrates protein-protein interactions and RNA-seq expression profiles during brain development to discover "modules" enriched for de novo mutations in probands. We applied this method to recent exome sequencing of 1116 patients with autism and intellectual disability, discovering two distinct modules that differ in their properties and associated phenotypes. The first module consists of 80 genes associated with Wnt, Notch, SWI/SNF, and NCOR complexes and shows the highest expression early during embryonic development (8-16 post-conception weeks [pcw]). The second module consists of 24 genes associated with synaptic function, including long-term potentiation and calcium signaling with higher levels of postnatal expression. Patients with de novo mutations in these modules are more significantly intellectually impaired and carry more severe missense mutations when compared to probands with de novo mutations outside of these modules. We used our approach to define subsets of the network associated with higher functioning autism as well as greater severity with respect to IQ. Finally, we applied MAGI independently to epilepsy and schizophrenia exome sequencing cohorts and found significant overlap as well as expansion of these modules, suggesting a core set of integrated neurodevelopmental networks common to seemingly diverse human diseases. PMID:25378250

  5. Using Osteoclast Differentiation as a Model for Gene Discovery in an Undergraduate Cell Biology Laboratory

    ERIC Educational Resources Information Center

    Birnbaum, Mark J.; Picco, Jenna; Clements, Meghan; Witwicka, Hanna; Yang, Meiheng; Hoey, Margaret T.; Odgren, Paul R.

    2010-01-01

    A key goal of molecular/cell biology/biotechnology is to identify essential genes in virtually every physiological process to uncover basic mechanisms of cell function and to establish potential targets of drug therapy combating human disease. This article describes a semester-long, project-oriented molecular/cellular/biotechnology laboratory…

  6. Large-Scale Discovery of Disease-Disease and Disease-Gene Associations.

    PubMed

    Gligorijevic, Djordje; Stojanovic, Jelena; Djuric, Nemanja; Radosavljevic, Vladan; Grbovic, Mihajlo; Kulathinal, Rob J; Obradovic, Zoran

    2016-01-01

    Data-driven phenotype analyses on Electronic Health Record (EHR) data have recently drawn benefits across many areas of clinical practice, uncovering new links in the medical sciences that can potentially affect the well-being of millions of patients. In this paper, EHR data is used to discover novel relationships between diseases by studying their comorbidities (co-occurrences in patients). A novel embedding model is designed to extract knowledge from disease comorbidities by learning from a large-scale EHR database comprising more than 35 million inpatient cases spanning nearly a decade, revealing significant improvements on disease phenotyping over current computational approaches. In addition, the use of the proposed methodology is extended to discover novel disease-gene associations by including valuable domain knowledge from genome-wide association studies. To evaluate our approach, its effectiveness is compared against a held-out set where, again, it revealed very compelling results. For selected diseases, we further identify candidate gene lists for which disease-gene associations were not studied previously. Our approach thus provides biomedical researchers with new tools to filter genes of interest, reducing costly lab studies. PMID:27578529

  8. Discovery of Chemosensory Genes in the Oriental Fruit Fly, Bactrocera dorsalis

    PubMed Central

    Wu, Zhongzhen; Zhang, He; Wang, Zhengbing; Bin, Shuying; He, Hualiang; Lin, Jintian

    2015-01-01

    The oriental fruit fly, Bactrocera dorsalis, is a devastating fruit fly pest in tropical and sub-tropical countries. Like other insects, this fly uses its chemosensory system to efficiently interact with its environment. However, our understanding of the molecular components comprising B. dorsalis chemosensory system is limited. Using next generation sequencing technologies, we sequenced the transcriptome of four B. dorsalis developmental stages: egg, larva, pupa and adult chemosensory tissues. A total of 31 candidate odorant binding proteins (OBPs), 4 candidate chemosensory proteins (CSPs), 23 candidate odorant receptors (ORs), 11 candidate ionotropic receptors (IRs), 6 candidate gustatory receptors (GRs) and 3 candidate sensory neuron membrane proteins (SNMPs) were identified. The tissue distributions of the OBP and CSP transcripts were determined by RT-PCR and a subset of nine genes were further characterized. The predicted proteins from these genes shared high sequence similarity to Drosophila melanogaster pheromone binding protein related proteins (PBPRPs). Interestingly, one OBP (BdorOBP19c) was exclusively expressed in the sex pheromone glands of mature females. RT-PCR was also used to compare the expression of the candidate genes in the antennae of male and female B. dorsalis adults. These antennae-enriched OBPs, CSPs, ORs, IRs and SNMPs could play a role in the detection of pheromones and general odorants and thus could be useful target genes for the integrated pest management of B. dorsalis and other agricultural pests. PMID:26070069

  9. Transcriptome Analysis and Discovery of Genes Relevant to Development in Bradysia odoriphaga at Three Developmental Stages.

    PubMed

    Gao, Huanhuan; Zhai, Yifan; Wang, Wenbo; Chen, Hao; Zhou, Xianhong; Zhuang, Qianying; Yu, Yi; Li, Rumei

    2016-01-01

    Bradysia odoriphaga (Diptera: Sciaridae) is the most important pest of Chinese chive (Allium tuberosum) in Asia; however, the molecular genetics are poorly understood. To explore the molecular biological mechanism of development, Illumina sequencing and de novo assembly were performed in the third-instar, fourth-instar, and pupal B. odoriphaga. The study resulted in 16.2 Gb of clean data and 47,578 unigenes (≥125 bp) contained in 7,632,430 contigs, 46.21% of which were annotated from non-redundant protein (NR), Gene Ontology (GO), Clusters of Orthologous Groups (COG), Eukaryotic Orthologous Groups (KOG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. It was found that 19.67% of unigenes matched the homologous species mainly, including Aedes aegypti, Culex quinquefasciatus, Ceratitis capitata, and Anopheles gambiae. According to differentially expressed gene (DEG) analysis, 143, 490, and 309 DEGs were annotated as involved in the developmental process in the GO database respectively, in the comparisons of third-instar and fourth-instar larvae, third-instar larvae and pupae, and fourth-instar larvae and pupae. Twenty-five genes were closely related to these processes, including developmental process, reproduction process, and reproductive organs development and programmed cell death (PCD). The information of unigenes assembled in B. odoriphaga through transcriptome and DEG analyses could provide a detailed genetic basis and regulated information for elaborating the developmental mechanism from the larval, pre-pupal to pupal stages of B. odoriphaga. PMID:26891450

  10. Transcriptome Analysis and Discovery of Genes Relevant to Development in Bradysia odoriphaga at Three Developmental Stages

    PubMed Central

    Wang, Wenbo; Chen, Hao; Zhou, Xianhong; Zhuang, Qianying; Yu, Yi; Li, Rumei

    2016-01-01

    Bradysia odoriphaga (Diptera: Sciaridae) is the most important pest of Chinese chive (Allium tuberosum) in Asia; however, the molecular genetics are poorly understood. To explore the molecular biological mechanism of development, Illumina sequencing and de novo assembly were performed in the third-instar, fourth-instar, and pupal B. odoriphaga. The study resulted in 16.2 Gb of clean data and 47,578 unigenes (≥125 bp) contained in 7,632,430 contigs, 46.21% of which were annotated from non-redundant protein (NR), Gene Ontology (GO), Clusters of Orthologous Groups (COG), Eukaryotic Orthologous Groups (KOG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. It was found that 19.67% of unigenes matched the homologous species mainly, including Aedes aegypti, Culex quinquefasciatus, Ceratitis capitata, and Anopheles gambiae. According to differentially expressed gene (DEG) analysis, 143, 490, and 309 DEGs were annotated as involved in the developmental process in the GO database respectively, in the comparisons of third-instar and fourth-instar larvae, third-instar larvae and pupae, and fourth-instar larvae and pupae. Twenty-five genes were closely related to these processes, including developmental process, reproduction process, and reproductive organs development and programmed cell death (PCD). The information of unigenes assembled in B. odoriphaga through transcriptome and DEG analyses could provide a detailed genetic basis and regulated information for elaborating the developmental mechanism from the larval, pre-pupal to pupal stages of B. odoriphaga. PMID:26891450

  11. Characterization of Greater Middle Eastern genetic variation for enhanced disease gene discovery.

    PubMed

    Scott, Eric M; Halees, Anason; Itan, Yuval; Spencer, Emily G; He, Yupeng; Azab, Mostafa Abdellateef; Gabriel, Stacey B; Belkadi, Aziz; Boisson, Bertrand; Abel, Laurent; Clark, Andrew G; Alkuraya, Fowzan S; Casanova, Jean-Laurent; Gleeson, Joseph G

    2016-09-01

    The Greater Middle East (GME) has been a central hub of human migration and population admixture. The tradition of consanguinity, variably practiced in the Persian Gulf region, North Africa, and Central Asia, has resulted in an elevated burden of recessive disease. Here we generated a whole-exome GME variome from 1,111 unrelated subjects. We detected substantial diversity and admixture in continental and subregional populations, corresponding to several ancient founder populations with little evidence of bottlenecks. Measured consanguinity rates were an order of magnitude above those in other sampled populations, and the GME population exhibited an increased burden of runs of homozygosity (ROHs) but showed no evidence for reduced burden of deleterious variation due to classically theorized 'genetic purging'. Applying this database to unsolved recessive conditions in the GME population reduced the number of potential disease-causing variants by four- to sevenfold. These results show variegated genetic architecture in GME populations and support future human genetic discoveries in Mendelian and population genetics. PMID:27428751

  12. Leveraging a Sturge-Weber Gene Discovery: An Agenda for Future Research.

    PubMed

    Comi, Anne M; Sahin, Mustafa; Hammill, Adrienne; Kaplan, Emma H; Juhász, Csaba; North, Paula; Ball, Karen L; Levin, Alex V; Cohen, Bernard; Morris, Jill; Lo, Warren; Roach, E Steve

    2016-05-01

    Sturge-Weber syndrome (SWS) is a vascular neurocutaneous disorder that results from a somatic mosaic mutation in GNAQ, which is also responsible for isolated port-wine birthmarks. Infants with SWS are born with a cutaneous capillary malformation (port-wine birthmark) of the forehead or upper eyelid which can signal an increased risk of brain and/or eye involvement prior to the onset of specific symptoms. This symptom-free interval represents a time when a targeted intervention could help to minimize the neurological and ophthalmologic manifestations of the disorder. This paper summarizes a 2015 SWS workshop in Bethesda, Maryland that was sponsored by the National Institutes of Health. Meeting attendees included a diverse group of clinical and translational researchers with a goal of establishing research priorities for the next few years. The initial portion of the meeting included a thorough review of the recent genetic discovery and what is known of the pathogenesis of SWS. Breakout sessions related to neurology, dermatology, and ophthalmology aimed to establish SWS research priorities in each field. Key priorities for future development include the need for clinical consensus guidelines, further work to develop a clinical trial network, improvement of tissue banking for research purposes, and the need for multiple animal and cell culture models of SWS. PMID:27268758

  15. miR2Gene: pattern discovery of single gene, multiple genes, and pathways by enrichment analysis of their microRNA regulators

    PubMed Central

    2011-01-01

    Background: In recent years, a number of tools have been developed to explore microRNAs (miRNAs) by analyzing their target genes. However, a reverse problem, that is, inferring patterns of protein-coding genes through their miRNA regulators, has not been explored. As various miRNA annotation data become available, exploring gene patterns by analyzing the prior knowledge of their miRNA regulators is becoming more feasible. Results: In this study, we developed a tool, miR2Gene, for this purpose. Various sets of miRNAs, according to prior rules such as function, associated disease, tissue specificity, family, and cluster, were integrated with miR2Gene. For given genes, miR2Gene evaluates the enrichment of the predicted miRNAs that regulate them in each miRNA set. This tool can be used for single genes, multiple genes, and KEGG pathways. For the KEGG pathway, genes with enriched miRNA sets are highlighted according to various rules. We confirmed the usefulness of miR2Gene through case studies. Conclusions: miR2Gene represents a novel and useful tool that integrates miRNA knowledge for protein-coding gene analysis. miR2Gene is freely available at http://cmbi.hsc.pku.edu.cn/mir2gene. PMID:22784580
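
    The enrichment step described in this abstract can be illustrated with a small sketch. The abstract does not say which statistical test miR2Gene uses; the one-sided hypergeometric test below is an assumption chosen because it is a common choice for set enrichment, and the function name and toy numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, set_size, drawn, hits):
    """One-sided hypergeometric p-value P(X >= hits).

    population: all miRNAs considered (hypothetical background size)
    set_size:   miRNAs belonging to one annotated set (e.g. one family)
    drawn:      predicted miRNA regulators of the query gene(s)
    hits:       regulators that fall inside the annotated set
    """
    max_hits = min(set_size, drawn)
    total = comb(population, drawn)
    tail = sum(
        comb(set_size, k) * comb(population - set_size, drawn - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total


# Toy numbers, not taken from the paper: 10 miRNAs in total, 3 in the set,
# 4 predicted regulators, 2 of which are in the set.
p_value = hypergeom_enrichment_p(population=10, set_size=3, drawn=4, hits=2)
print(p_value)
```

    A small p-value would suggest that the regulators of the query genes are over-represented in that miRNA set; in practice a correction for testing many sets would also be needed.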

  16. Case-only exome sequencing and complex disease susceptibility gene discovery: study design considerations.

    PubMed

    Wu, Lang; Schaid, Daniel J; Sicotte, Hugues; Wieben, Eric D; Li, Hu; Petersen, Gloria M

    2015-01-01

    Whole exome sequencing (WES) provides an unprecedented opportunity to identify the potential aetiological role of rare functional variants in human complex diseases. Large-scale collaborations have generated germline WES data on patients with a number of diseases, especially cancer, but less often on healthy controls under the same sequencing procedures. These data can be a valuable resource for identifying new disease susceptibility loci if study designs are appropriately applied. This review describes suggested strategies and technical considerations when focusing on case-only study designs that use WES data in complex disease scenarios. These include variant filtering based on frequency and functionality, gene prioritisation, interrogation of different data types and targeted sequencing validation. We propose that if case-only WES designs are applied in an appropriate manner, new susceptibility genes containing rare variants for human complex diseases can be detected.
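
    The first two strategies named in this abstract, variant filtering by frequency and functionality followed by gene prioritisation, might look roughly like the sketch below. Everything specific in it is an assumption for illustration: the record fields, the 1% allele-frequency cutoff, the list of consequence labels, and the ranking of genes by the number of distinct cases carrying a retained variant.

```python
# Hypothetical consequence labels treated as "functional" in this sketch.
FUNCTIONAL = {"missense", "nonsense", "frameshift", "splice_site"}


def filter_rare_functional(variants, max_af=0.01):
    """Keep variants that are rare in a reference population and functional."""
    return [
        v for v in variants
        if v["af"] <= max_af and v["consequence"] in FUNCTIONAL
    ]


def prioritise_genes(variants):
    """Rank genes by how many distinct cases carry a retained variant."""
    carriers = {}
    for v in variants:
        carriers.setdefault(v["gene"], set()).add(v["case_id"])
    counts = [(gene, len(cases)) for gene, cases in carriers.items()]
    return sorted(counts, key=lambda item: (-item[1], item[0]))


# Toy case-only data; gene names and frequencies are invented.
variants = [
    {"gene": "GENE_A", "case_id": "case1", "af": 0.001, "consequence": "missense"},
    {"gene": "GENE_A", "case_id": "case2", "af": 0.0005, "consequence": "nonsense"},
    {"gene": "GENE_B", "case_id": "case1", "af": 0.2, "consequence": "missense"},
    {"gene": "GENE_C", "case_id": "case3", "af": 0.002, "consequence": "synonymous"},
    {"gene": "GENE_D", "case_id": "case3", "af": 0.004, "consequence": "frameshift"},
]

kept = filter_rare_functional(variants)
ranked = prioritise_genes(kept)
print(ranked)
```

    Because no controls are sequenced in a case-only design, the reference allele frequencies would have to come from an external population database, which is why the cutoff is applied to a precomputed `af` field here.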

  17. Genome-wide target profiling of piggyBac and Tol2 in HEK 293: pros and cons for gene discovery and gene therapy

    PubMed Central

    2011-01-01

    Background: DNA transposons have emerged as indispensable tools for manipulating vertebrate genomes with applications ranging from insertional mutagenesis and transgenesis to gene therapy. To fully explore the potential of two highly active DNA transposons, piggyBac and Tol2, as mammalian genetic tools, we have conducted a side-by-side comparison of the two transposon systems in the same setting to evaluate their advantages and disadvantages for use in gene therapy and gene discovery. Results: We have observed that (1) the Tol2 transposase (but not piggyBac) is highly sensitive to molecular engineering; (2) the piggyBac donor with only the 40 bp 3'- and 67 bp 5'-terminal repeat domain is sufficient for effective transposition; and (3) a small amount of piggyBac transposases results in robust transposition, suggesting the piggyBac transposase is highly active. Performing genome-wide target profiling on data sets obtained by retrieving chromosomal targeting sequences from individual clones, we have identified several piggyBac and Tol2 hotspots and observed that (4) piggyBac and Tol2 display a clear difference in targeting preferences in the human genome. Finally, we have observed that (5) only sites with a particular sequence context can be targeted by either piggyBac or Tol2. Conclusions: The non-overlapping targeting preference of piggyBac and Tol2 makes them complementary research tools for manipulating mammalian genomes. PiggyBac is the most promising transposon-based vector system for achieving site-specific targeting of therapeutic genes due to the flexibility of its transposase for being molecularly engineered. Insights from this study will provide a basis for engineering piggyBac transposases to achieve site-specific therapeutic gene targeting. PMID:21447194

  18. Discovery of Nuclear-Encoded Genes for the Neurotoxin Saxitoxin in Dinoflagellates

    PubMed Central

    Stüken, Anke; Orr, Russell J. S.; Kellmann, Ralf; Murray, Shauna A.; Neilan, Brett A.; Jakobsen, Kjetill S.

    2011-01-01

    Saxitoxin is a potent neurotoxin that occurs in aquatic environments worldwide. Ingestion of vector species can lead to paralytic shellfish poisoning, a severe human illness that may lead to paralysis and death. In freshwaters, the toxin is produced by prokaryotic cyanobacteria; in marine waters, it is associated with eukaryotic dinoflagellates. However, several studies suggest that saxitoxin is not produced by dinoflagellates themselves, but by co-cultured bacteria. Here, we show that genes required for saxitoxin synthesis are encoded in the nuclear genomes of dinoflagellates. We sequenced >1.2×10^6 mRNA transcripts from the two saxitoxin-producing dinoflagellate strains Alexandrium fundyense CCMP1719 and A. minutum CCMP113 using high-throughput sequencing technology. In addition, we used in silico transcriptome analyses, RACE, qPCR and conventional PCR coupled with Sanger sequencing. These approaches successfully identified genes required for saxitoxin synthesis in the two transcriptomes. We focused on sxtA, the unique starting gene of saxitoxin synthesis, and show that the dinoflagellate transcripts of sxtA have the same domain structure as the cyanobacterial sxtA genes. But, in contrast to the bacterial homologs, the dinoflagellate transcripts are monocistronic, have a higher GC content, occur in multiple copies, contain typical dinoflagellate spliced-leader sequences and eukaryotic polyA-tails. Further, we investigated 28 saxitoxin-producing and non-producing dinoflagellate strains from six different genera for the presence of genomic sxtA homologs. Our results show very good agreement between the presence of sxtA and saxitoxin synthesis, except in three strains of A. tamarense, for which we amplified sxtA, but did not detect the toxin. Our work opens up possibilities for developing molecular tools to detect saxitoxin-producing dinoflagellates in the environment. PMID:21625593

  19. Gene discovery in an invasive tephritid model pest species, the Mediterranean fruit fly, Ceratitis capitata

    PubMed Central

    Gomulski, Ludvik M; Dimopoulos, George; Xi, Zhiyong; Soares, Marcelo B; Bonaldo, Maria F; Malacrida, Anna R; Gasperi, Giuliano

    2008-01-01

    Background: The medfly, Ceratitis capitata, is a highly invasive agricultural pest that has become a model insect for the development of biological control programs. Despite research into the behavior and classical and population genetics of this organism, the quantity of sequence data available is limited. We have utilized an expressed sequence tag (EST) approach to obtain detailed information on transcriptome signatures that relate to a variety of physiological systems in the medfly; this information emphasizes reproduction, sex determination, and chemosensory perception, since the study was based on normalized cDNA libraries from embryos and adult heads. Results: A total of 21,253 high-quality ESTs were obtained from the embryo and head libraries. Clustering analyses performed separately for each library resulted in 5201 embryo and 6684 head transcripts. Considering an estimated 19% overlap in the transcriptomes of the two libraries, they represent about 9614 unique transcripts involved in a wide range of biological processes and molecular functions. Of particular interest are the sequences that share homology with Drosophila genes involved in sex determination, olfaction, and reproductive behavior. The medfly transformer2 (tra2) homolog was identified among the embryonic sequences, and its genomic organization and expression were characterized. Conclusion: The sequences obtained in this study represent the first major dataset of expressed genes in a tephritid species of agricultural importance. This resource provides essential information to support the investigation of numerous questions regarding the biology of the medfly and other related species and also constitutes an invaluable tool for the annotation of complete genome sequences. Our study has revealed intriguing findings regarding the transcript regulation of tra2 and other sex determination genes, as well as insights into the comparative genomics of genes implicated in chemosensory reception and
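
    The transcript counts in this abstract can be checked with a short calculation. The sketch assumes that the "estimated 19% overlap" is expressed relative to the summed cluster counts of the two libraries; the abstract does not state the denominator, and the number of shared transcripts is inferred here rather than reported.

```python
# Cluster counts reported in the abstract.
embryo_transcripts = 5201
head_transcripts = 6684
reported_unique = 9614

# Sum of the two per-library cluster counts.
combined = embryo_transcripts + head_transcripts

# Transcripts counted in both libraries (inferred, not stated in the abstract).
shared = combined - reported_unique

# Overlap as a share of the summed counts (assumed denominator).
overlap_percent = round(100 * shared / combined, 1)

print(combined, shared, overlap_percent)
```

    Under that assumption the inferred overlap rounds to 19%, consistent with the figure quoted in the abstract.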

  20. Gene Discovery through Transcriptome Sequencing for the Invasive Mussel Limnoperna fortunei

    PubMed Central

    Uliano-Silva, Marcela; Americo, Juliana Alves; Brindeiro, Rodrigo; Dondero, Francesco; Prosdocimi, Francisco; de Freitas Rebelo, Mauro

    2014-01-01

    The success of the Asian bivalve Limnoperna fortunei as an invader in South America is related to its high acclimation capability. It can inhabit waters with a wide range of temperatures and salinity and handle long-term periods of air exposure. We describe the transcriptome of L. fortunei aiming to give a first insight into the phenotypic plasticity that allows non-native taxa to become established and widespread. We sequenced 95,219 reads from five main tissues of the mussel L. fortunei using Roche’s 454 and assembled them to form a set of 84,063 unigenes (contigs and singletons) representing partial or complete gene sequences. We annotated 24,816 unigenes using a BLAST sequence similarity search against a NCBI nr database. Unigenes were divided into 20 eggNOG functional categories and 292 KEGG metabolic pathways. From the total unigenes, 1,351 represented putative full-length genes of which 73.2% were functionally annotated. We described the first partial and complete gene sequences in order to start understanding bivalve invasiveness. An expansion of the hsp70 gene family, seen also in other bivalves, is present in L. fortunei and could be involved in its adaptation to extreme environments, e.g. during intertidal periods. The presence of toll-like receptors gives a first insight into an immune system that could be more complex than previously assumed and may be involved in the prevention of disease and extinction when population densities are high. Finally, the apparent lack of special adaptations to extremely low O2 levels is a target worth pursuing for the development of a molecular control approach. PMID:25047650

  1. Genome-wide discovery of cis-elements in promoter sequences using gene expression.

    PubMed

    Troukhan, Maxim; Tatarinova, Tatiana; Bouck, John; Flavell, Richard B; Alexandrov, Nickolai N

    2009-04-01

    The availability of complete or nearly complete genome sequences, a large number of 5' expressed sequence tags, and significant public expression data allow for a more accurate identification of cis-elements regulating gene expression. We have implemented a global approach that takes advantage of available expression data, genomic sequences, and transcript information to predict cis-elements associated with specific expression patterns. The key components of our approach are: (1) precise identification of transcription start sites, (2) specific locations of cis-elements relative to the transcription start site, and (3) assessment of statistical significance for all sequence motifs. By applying our method to promoters of Arabidopsis thaliana and Mus musculus, we have identified motifs that affect gene expression under specific environmental conditions or in certain tissues. We also found that the presence of the TATA box is associated with increased variability of gene expression. Strong correlation between our results and experimentally determined motifs shows that the method is capable of predicting new functionally important cis-elements in promoter sequences. PMID:19231992
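
    Component (2) of the approach, locating cis-elements relative to the transcription start site, can be sketched as follows. This is an illustrative exact-match search only; the abstract does not describe the authors' motif model or their significance test, and the function name, sequence, and TSS index are hypothetical.

```python
def motif_offsets(promoter, motif, tss_index):
    """Return start positions of `motif` relative to the TSS.

    Negative offsets are upstream of the transcription start site.
    Overlapping occurrences are reported.
    """
    offsets = []
    start = promoter.find(motif)
    while start != -1:
        offsets.append(start - tss_index)
        start = promoter.find(motif, start + 1)
    return offsets


# Toy promoter: a TATA-like hexamer placed 30 bp upstream of an assumed TSS.
promoter = "CC" + "TATAAA" + "G" * 24 + "ACGT"
tss_index = 32  # index of the first transcribed base in this toy sequence

print(motif_offsets(promoter, "TATAAA", tss_index))
```

    Collecting such offsets across many promoters gives a positional distribution per motif, which is the kind of input a significance assessment like component (3) would need.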

  2. An Isogenic Human ESC Platform for Functional Evaluation of Genome-wide-Association-Study-Identified Diabetes Genes and Drug Discovery.

    PubMed

    Zeng, Hui; Guo, Min; Zhou, Ting; Tan, Lei; Chong, Chi Nok; Zhang, Tuo; Dong, Xue; Xiang, Jenny Zhaoying; Yu, Albert S; Yue, Lixia; Qi, Qibin; Evans, Todd; Graumann, Johannes; Chen, Shuibing

    2016-09-01

    Genome-wide association studies (GWASs) have increased our knowledge of loci associated with a range of human diseases. However, applying such findings to elucidate pathophysiology and promote drug discovery remains challenging. Here, we created isogenic human ESCs (hESCs) with mutations in GWAS-identified susceptibility genes for type 2 diabetes. In pancreatic beta-like cells differentiated from these lines, we found that mutations in CDKAL1, KCNQ1, and KCNJ11 led to impaired glucose secretion in vitro and in vivo, coinciding with defective glucose homeostasis. CDKAL1 mutant insulin+ cells were also hypersensitive to glucolipotoxicity. A high-content chemical screen identified a candidate drug that rescued CDKAL1-specific defects in vitro and in vivo by inhibiting the FOS/JUN pathway. Our approach of a proof-of-principle platform, which uses isogenic hESCs for functional evaluation of GWAS-identified loci and identification of a drug candidate that rescues gene-specific defects, paves the way for precision therapy of metabolic diseases. PMID:27524441

  4. A proposed essential gene discovery pipeline: a Campylobacter jejuni case study.

    PubMed

    Reuter, Mark; Gaskin, Duncan J H; Metris, Aline

    2015-01-01

    Genes required for an organism's growth and survival are termed essential and represent potential intervention targets. Following in the footsteps of the genomics era, the "next-gen" genomic era provides vast amounts of genetic information. Sequencing of a representative bacterial pathogen genome has been superseded by sequencing of whole strain collections, whether from environmental or clinical sources (Harris et al., Science 327:469-474, 2010; Lewis et al., J Hosp Infect 75:37-41, 2010; Beres et al., Proc Natl Acad Sci U S A 107:4371-4376, 2010; Qi et al., PLoS Pathog 5:e1000580, 2009; He et al., Proc Natl Acad Sci U S A 107:7527-7532, 2010; Barrick et al., Nature 461:1243-1247, 2009; Sheppard et al., Mol Ecol 22:1051-1064, 2013). However, the challenge of using this information to gain biological insight remains. Nonetheless, this information, in combination with experimental data from the literature, can serve as the framework for gaining a better understanding of an organism's biology. Generic metabolic pathways have long been known, and a number of websites (e.g., KEGG and BioCyc) attempt to map information from genome annotation to metabolic pathways (Kanehisa et al., Nucleic Acids Res 40:D109-D114, 2010; Karp et al., Nucleic Acids Res 33:6083-6089, 2005). Extending this analysis to incorporate metabolic flux models further allows in silico prediction of potential essential genes. Such efforts are of value, either to highlight novel generic antimicrobials or to seek novel treatments for non-paradigm organisms. Such in silico approaches are attractive as they can highlight pathways and genes that would otherwise only be identified by costly and time-consuming laboratory methods. PMID:25636619
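
    The idea of predicting essential genes in silico from a metabolic network can be illustrated with a toy single-gene knockout screen. Note that this is not a metabolic flux model: it swaps in a much simpler reachability check (can the target metabolite still be produced from the seed nutrients?), and the genes, metabolites, and reactions are invented for the example.

```python
def can_produce(reactions, seeds, target, knocked_out=None):
    """Check whether `target` is reachable from `seeds`.

    reactions: list of (gene, substrates, products) tuples.
    A reaction fires once all of its substrates are available.
    """
    available = set(seeds)
    changed = True
    while changed:
        changed = False
        for gene, substrates, products in reactions:
            if gene == knocked_out:
                continue  # reactions of the deleted gene are disabled
            if set(substrates) <= available and not set(products) <= available:
                available |= set(products)
                changed = True
    return target in available


def essential_genes(reactions, seeds, target):
    """Genes whose single deletion blocks production of `target`."""
    genes = sorted({gene for gene, _, _ in reactions})
    return [
        g for g in genes
        if not can_produce(reactions, seeds, target, knocked_out=g)
    ]


# Toy network: geneB and geneC are redundant isozymes.
reactions = [
    ("geneA", ["glucose"], ["pyruvate"]),
    ("geneB", ["pyruvate"], ["acetyl_coa"]),
    ("geneC", ["pyruvate"], ["acetyl_coa"]),
    ("geneD", ["acetyl_coa"], ["biomass"]),
]

print(essential_genes(reactions, seeds={"glucose"}, target="biomass"))
```

    A flux-based model would additionally account for stoichiometry and growth rate, so genes that merely reduce flux without abolishing it would be treated differently than in this reachability sketch.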

  5. Functional Analysis and Discovery of Microbial Genes Transforming Metallic and Organic Pollutants: Database and Experimental Tools

    SciTech Connect

    Lawrence P. Wackett; Lynda B.M. Ellis

    2004-12-09

    Microbial functional genomics is faced with a burgeoning list of genes which are denoted as unknown or hypothetical for lack of any knowledge about their function. The majority of microbial genes encode enzymes. Enzymes are the catalysts of metabolism: catabolism, anabolism, stress responses, and many other cell functions. A major problem facing microbial functional genomics is proposed here to derive from the breadth of microbial metabolism, much of which remains undiscovered. The breadth of microbial metabolism has been surveyed by the PIs and represented according to reaction types on the University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD; http://umbbd.ahc.umn.edu/search/FuncGrps.html). The database depicts metabolism of 49 chemical functional groups, representing most of current knowledge. Twice that number of chemical groups are proposed here to be metabolized by microbes. Thus, at least 50% of the unique biochemical reactions catalyzed by microbes remain undiscovered. This further suggests that many unknown and hypothetical genes encode functions yet undiscovered. This gap will be partly filled by the current proposal. The UM-BBD will be greatly expanded as a resource for microbial functional genomics. Computational methods will be developed to predict microbial metabolism which is not yet discovered. Moreover, a concentrated effort to discover new microbial metabolism will be conducted. The research will focus on metabolism of direct interest to DOE, dealing with the transformation of metals, metalloids, organometallics and toxic organics. This is precisely the type of metabolism which has been characterized most poorly to date. Moreover, these studies will directly impact functional genomic analysis of DOE-relevant genomes.

  6. Transcriptomics Analysis of Crassostrea hongkongensis for the Discovery of Reproduction-Related Genes

    PubMed Central

    Tong, Ying; Zhang, Yang; Huang, Jiaomei; Xiao, Shu; Zhang, Yuehuan; Li, Jun; Chen, Jinhui; Yu, Ziniu

    2015-01-01

    Background The reproductive mechanisms of mollusk species have been interesting targets in biological research because of the diverse reproductive strategies observed in this phylum. These species have also been studied for the development of fishery technologies in molluscan aquaculture. Although the molecular mechanisms underlying the reproductive process have been well studied in animal models, the relevant information from mollusks remains limited, particularly in species of great commercial interest. Crassostrea hongkongensis is the dominant oyster species that is distributed along the coast of the South China Sea and little genomic information on this species is available. Currently, high-throughput sequencing techniques have been widely used for investigating the basis of physiological processes and facilitating the establishment of adequate genetic selection programs. Results The C. hongkongensis transcriptome included a total of 1,595,855 reads, which were generated by 454 sequencing and were assembled into 41,472 contigs using de novo methods. Contigs were clustered into 33,920 isotigs and further grouped into 22,829 isogroups. Approximately 77.6% of the isogroups were successfully annotated by the Nr database. More than 1,910 genes were identified as being related to reproduction. Some key genes involved in germline development, sex determination and differentiation were identified for the first time in C. hongkongensis (nanos, piwi, ATRX, FoxL2, β-catenin, etc.). Gene expression analysis indicated that vasa, nanos, piwi, ATRX, FoxL2, β-catenin and SRD5A1 were highly or specifically expressed in C. hongkongensis gonads. Additionally, 94,056 single nucleotide polymorphisms (SNPs) and 1,699 simple sequence repeats (SSRs) were compiled. Conclusions Our study significantly increased C. hongkongensis genomic information based on transcriptomics analysis. The group of reproduction-related genes identified in the present study constitutes a new tool for research

  7. Age-related trends in gene expression in the chemosensory-nasal mucosae of senescence-accelerated mice.

    PubMed

    Getchell, Thomas V; Peng, Xuejun; Stromberg, Arnold J; Chen, Kuey-Chu; Paul Green, C; Subhedar, Nishikant K; Shah, Dharmen S; Mattson, Mark P; Getchell, Marilyn L

    2003-04-01

    We have utilized high-density GeneChip oligonucleotide arrays to investigate the use of the senescence-accelerated mouse (SAM) as a biogerontological resource to identify patterns of gene expression in the chemosensory-nasal mucosa. Gene profiling in chronologically young and old mice of the senescence-resistant (SAMR) and senescence-prone (SAMP) strains revealed 133 known genes that were modulated by a three-fold or greater change either in one strain or the other or in both strains during aging. We also identified known genes in our study which based on their encoded proteins were identified as aging-related genes in the aging neocortex and cerebellum of mice as reported by Lee et al. (2000) [Nat. Genet. 25 (2000) 294]. Changes in gene profiles for chemosensory-related genes including olfactory and vomeronasal receptors, sensory transduction-associated proteins, and odor and pheromone transport molecules in the young SAMR and SAMP were compared with age-matched C57BL/6J mice. An analysis of known gene expression profiles suggests that changes in the expression of immune factor genes and genes associated with cell cycle progression and cell death were particularly prominent in the old SAM strains. A preliminary cellular validation study supported the dysregulation of cell cycle-related genes in the old SAM strains. The results of our initial study indicated that the use of the SAM models of aging could provide substantive information leading to a more fundamental understanding of the aging process in the chemosensory-nasal mucosa at the genomic, molecular, and cellular levels. PMID:12605961
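
    The three-fold selection criterion described in this abstract can be illustrated with a minimal sketch. It assumes each gene has one young and one old expression value per strain; the gene names, intensity values, and function names below are hypothetical and are not taken from the study.

```python
def fold_change(young: float, old: float) -> float:
    """Signed fold change: positive if expression rises with age, negative if it falls."""
    if young <= 0 or old <= 0:
        raise ValueError("expression values must be positive")
    ratio = old / young
    return ratio if ratio >= 1 else -1 / ratio


def modulated_genes(expr, threshold=3.0):
    """Keep genes whose absolute fold change meets the threshold in at least one strain.

    expr maps gene -> {strain: (young_value, old_value)}.
    """
    hits = {}
    for gene, by_strain in expr.items():
        changes = {s: fold_change(y, o) for s, (y, o) in by_strain.items()}
        if any(abs(fc) >= threshold for fc in changes.values()):
            hits[gene] = changes
    return hits


# Hypothetical intensities for illustration only (not data from the study).
example = {
    "geneA": {"SAMR": (100.0, 350.0), "SAMP": (100.0, 120.0)},
    "geneB": {"SAMR": (200.0, 210.0), "SAMP": (200.0, 150.0)},
    "geneC": {"SAMR": (90.0, 80.0), "SAMP": (300.0, 60.0)},
}
selected = modulated_genes(example)
print(sorted(selected))
```

    A gene passes if it changes at least three-fold in either strain, mirroring the "in one strain or the other or in both" wording of the abstract; a real microarray analysis would normally add replicate handling and statistical testing.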

  8. Gene Discovery and Molecular Marker Development, Based on High-Throughput Transcript Sequencing of Paspalum dilatatum Poir

    PubMed Central

    Giordano, Andrea; Cogan, Noel O. I.; Kaur, Sukhjiwan; Drayton, Michelle; Mouradov, Aidyn; Panter, Stephen; Schrauf, Gustavo E.; Mason, John G.; Spangenberg, German C.

    2014-01-01

    Background Paspalum dilatatum Poir. (common name dallisgrass) is a native grass species of South America, with special relevance to dairy and red meat production. P. dilatatum exhibits higher forage quality than other C4 forage grasses and is tolerant to frost and water stress. This species is predominantly cultivated in an apomictic monoculture, with an inherent high risk that biotic and abiotic stresses could potentially devastate productivity. Therefore, advanced breeding strategies that characterise and use available genetic diversity, or assess germplasm collections effectively are required to deliver advanced cultivars for production systems. However, there are limited genomic resources available for this forage grass species. Results Transcriptome sequencing using second-generation sequencing platforms has been employed using pooled RNA from different tissues (stems, roots, leaves and inflorescences) at the final reproductive stage of P. dilatatum cultivar Primo. A total of 324,695 sequence reads were obtained, corresponding to c. 102 Mbp. The sequences were assembled, generating 20,169 contigs of a combined length of 9,336,138 nucleotides. The contigs were BLAST analysed against the fully sequenced grass species of Oryza sativa subsp. japonica, Brachypodium distachyon, the closely related Sorghum bicolor and foxtail millet (Setaria italica) genomes as well as against the UniRef 90 protein database allowing a comprehensive gene ontology analysis to be performed. The contigs generated from the transcript sequencing were also analysed for the presence of simple sequence repeats (SSRs). A total of 2,339 SSR motifs were identified within 1,989 contigs and corresponding primer pairs were designed. Empirical validation of a cohort of 96 SSRs was performed, with 34% being polymorphic between sexual and apomictic biotypes. Conclusions The development of genetic and genomic resources for P. dilatatum will contribute to gene discovery and expression studies
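
    Scanning contigs for simple sequence repeats, as this abstract describes, can be sketched with a regular expression. This is an illustrative approach only: the abstract does not say which SSR-detection tool or thresholds were used, and the function name, the minimum repeat count, the motif length range, and the example sequence are all assumptions.

```python
import re


def find_ssrs(seq, min_repeats=4, min_motif=2, max_motif=6):
    """Return (start, motif, repeat_count) for perfect tandem repeats in seq.

    Thresholds are illustrative assumptions, not the study's settings.
    """
    # Lazy motif match so the shortest repeating unit is tried first.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as TTTTTTTT
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits


# Hypothetical contig built from known pieces so the repeats are easy to see.
contig = "GGC" + "TA" * 5 + "TCCGTT" + "AGC" * 4 + "AG" + "T" * 10 + "A"
print(find_ssrs(contig))
```

    The sketch reports only perfect repeats and ignores bases other than A, C, G and T; primer design and polymorphism testing, which the study also performed, are outside its scope.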

  9. High-Throughput, Motility-Based Sorter for Microswimmers and Gene Discovery Platform

    NASA Astrophysics Data System (ADS)

    Yuan, Jinzhou; Raizen, David; Bau, Haim

    2015-11-01

    Animal motility varies with genotype, disease progression, aging, and environmental conditions. In many studies, it is desirable to carry out high throughput motility-based sorting to isolate rare animals for, among other things, forward genetic screens to identify genetic pathways that regulate phenotypes of interest. Many commonly used screening processes are labor-intensive, lack sensitivity, and require extensive investigator training. Here, we describe a sensitive, high throughput, automated, motility-based method for sorting nematodes. Our method was implemented in a simple microfluidic device capable of sorting many thousands of animals per hour per module, and is amenable to parallelism. The device successfully enriched for known C. elegans motility mutants. Furthermore, using this device, we isolated low-abundance mutants capable of suppressing the somnogenic effects of the flp-13 gene, which regulates sleep-like quiescence in C. elegans. Subsequent genomic sequencing led to the identification of a flp-13-suppressor gene. This research was supported, in part, by NIH NIA Grant 5R03AG042690-02.

  10. Hexosamine template. A platform for modulating gene expression and for sugar-based drug discovery.

    PubMed

    Elmouelhi, Noha; Aich, Udayanath; Paruchuri, Venkata D P; Meledeo, M Adam; Campbell, Christopher T; Wang, Jean J; Srinivas, Raja; Khanna, Hargun S; Yarema, Kevin J

    2009-04-23

    This study investigates the breadth of cellular responses engendered by short chain fatty acid (SCFA)-hexosamine hybrid molecules, a class of compounds long used in "metabolic glycoengineering" that are now emerging as drug candidates. First, a "mix and match" strategy showed that different SCFA (n-butyrate and acetate) appended to the same core sugar altered biological activity, complementing previous results [Campbell et al. J. Med. Chem. 2008, 51, 8135-8147] where a single type of SCFA elicited distinct responses. Microarray profiling then compared transcriptional responses engendered by regioisomerically modified ManNAc, GlcNAc, and GalNAc analogues in MDA-MB-231 cells. These data, which were validated by qRT-PCR or Western analysis for ID1, TP53, HPSE, NQO1, EGR1, and VEGFA, showed a two-pronged response where a core set of genes was coordinately regulated by all analogues while each analogue simultaneously uniquely regulated a larger number of genes. Finally, AutoDock modeling supported a mechanism where the analogues directly interact with elements of the NF-kappaB pathway. Together, these results establish the SCFA-hexosamine template as a versatile platform for modulating biological activity and developing new therapeutics. PMID:19326913

  11. 60 YEARS OF POMC: The proopiomelanocortin gene: discovery, deletion and disease.

    PubMed

    Clark, Adrian J L

    2016-05-01

    The cloning of the bovine proopiomelanocortin (POMC) cDNA in 1978 by Nakanishi and colleagues was the result of a remarkable series of exacting and ingenious experiments. With this work, they instantly confirmed the single precursor hypothesis for adrenocorticotrophic hormone-β-lipotropin, as it was then known, and in so doing revealed the existence of additional, largely unpredicted, N-terminal peptides that together formed the POMC precursor peptide. This work paved the way for a host of additional studies into the physiology of these peptides and their regulation. Furthermore, the cloning of the murine Pomc gene was essential for subsequent studies in which Pomc was intentionally deleted in the mouse, illuminating its substantial role in body weight regulation and adrenal function. Contemporaneously with this work, naturally occurring mutations in human POMC came to light, underlining the vital role of this gene in appetite regulation. This article reviews each of these aspects of POMC with the benefit of several decades of hindsight and informed by more recent genomic and transcriptomic data.

  12. Gene discovery in the Antarctic fur seal (Arctocephalus gazella) skin transcriptome.

    PubMed

    Hoffman, Joseph I

    2011-07-01

    Next-generation sequencing provides a powerful new approach for developing functional genomic tools for nonmodel species, helping to narrow the gap between studies of model organisms and those of natural populations. Consequently, massively parallel 454 sequencing was used to characterize a normalized cDNA library derived from skin biopsy samples of twelve Antarctic fur seal (Arctocephalus gazella) individuals. Over 412 Mb of sequence data were generated, comprising 1.4 million reads of average length 286 bp. De novo assembly using Newbler 2.3 yielded 156 contigs plus 22,869 isotigs, which in turn clustered into 18,576 isogroups. Almost half of the assembled transcript sequences showed significant similarity to the nr database, revealing a functionally diverse array of genes. Moreover, 97.9% of these mapped to the dog (Canis lupus familiaris) genome, with a strong positive relationship between the number of sequences locating to a given chromosome and the length of that chromosome in the dog, indicating a broad genomic distribution. Average depth of coverage was also almost 20-fold, sufficient to detect several thousand putative microsatellite loci and single nucleotide polymorphisms. This study constitutes an important step towards developing genomic resources with which to address consequential questions in pinniped ecology and evolution. It also supports an earlier but smaller study showing that skin tissue can be a rich source of expressed genes, with important implications for studying the genomics not only of marine mammals, but also more generally of species that cannot be destructively sampled.

  13. Transcriptome Analysis of the Portunus trituberculatus: De Novo Assembly, Growth-Related Gene Identification and Marker Discovery

    PubMed Central

    Lv, Jianjian; Liu, Ping; Gao, Baoquan; Wang, Yu; Wang, Zheng; Chen, Ping; Li, Jian

    2014-01-01

    Background The swimming crab, Portunus trituberculatus, is an important farmed species in China and has attracted extensive study, which requires more and more genomic background knowledge. To date, a whole-genome sequence is unavailable and transcriptomic information is also scarce for this species. In the present study, we performed de novo transcriptome sequencing to produce a comprehensive transcript dataset for major tissues of Portunus trituberculatus by the Illumina paired-end sequencing technology. Results Total RNA was isolated from eyestalk, gill, heart, hepatopancreas and muscle. Equal quantities of RNA from each tissue were pooled to construct a cDNA library. Using the Illumina paired-end sequencing technology, we generated a total of 120,137 transcripts with an average length of 1037 bp. Further assembly analysis showed that all contigs contributed to 87,100 unigenes; of these, 16,029 unigenes (18.40% of the total) can be matched in the GenBank non-redundant database. Potential genes and their functions were predicted by GO, KEGG pathway mapping and COG analysis. Based on our sequence analysis and published literature, many putative genes with fundamental roles in growth and muscle development, including actin, myosin, tropomyosin, troponin and other potentially important candidate genes, were identified for the first time in this species. Furthermore, 22,673 SSRs and 66,191 high-confidence SNPs were identified in this EST dataset. Conclusion The transcriptome provides invaluable new data as a functional genomics resource for future biological research in Portunus trituberculatus. The data will also guide future functional studies to manipulate or select for genes influencing growth that should find practical applications in aquaculture breeding programs. The molecular markers identified in this study will provide a material basis for future genetic linkage and quantitative trait loci analyses, and will be essential for accelerating

  14. Gene discovery using massively parallel pyrosequencing to develop ESTs for the flesh fly Sarcophaga crassipalpis

    PubMed Central

    Hahn, Daniel A; Ragland, Gregory J; Shoemaker, D DeWayne; Denlinger, David L

    2009-01-01

    Background Flesh flies in the genus Sarcophaga are important models for investigating endocrinology, diapause, cold hardiness, reproduction, and immunity. Despite the prominence of Sarcophaga flesh flies as models for insect physiology and biochemistry, and in forensic studies, little genomic or transcriptomic data are available for members of this genus. We used massively parallel pyrosequencing on the Roche 454-FLX platform to produce a substantial EST dataset for the flesh fly Sarcophaga crassipalpis. To maximize sequence diversity, we pooled RNA extracted from whole bodies of all life stages and normalized the cDNA pool after reverse transcription. Results We obtained 207,110 ESTs with an average read length of 241 bp. These reads assembled into 20,995 contigs and 31,056 singletons. Using BLAST searches of the NR and NT databases we were able to identify 11,757 unique gene elements (E<0.0001) representing approximately 9,000 independent transcripts. Comparison of the distribution of S. crassipalpis unigenes among GO Biological Process functional groups with that of the Drosophila melanogaster transcriptome suggests that our ESTs are broadly representative of the flesh fly transcriptome. Insertion and deletion errors in 454 sequencing present a serious hurdle to comparative transcriptome analysis. Aided by a new approach to correcting for these errors, we performed a comparative analysis of genetic divergence across GO categories among S. crassipalpis, D. melanogaster, and Anopheles gambiae. The results suggest that non-synonymous substitutions occur at similar rates across categories, although genes related to response to stimuli may evolve slightly faster. In addition, we identified over 500 potential microsatellite loci and more than 12,000 SNPs among our ESTs. Conclusion Our data provides the first large-scale EST-project for flesh flies, a much-needed resource for exploring this model species. In addition, we identified a large number of potential

  15. Microfluidic droplet-based PCR instrumentation for high-throughput gene expression profiling and biomarker discovery.

    PubMed

    Hayes, Christopher J; Dalton, Tara M

    2015-06-01

    PCR is a common and often indispensable technique used in medical and biological research labs for a variety of applications. Real-time quantitative PCR (RT-qPCR) has become a definitive technique for quantitating differences in gene expression levels between samples. Yet, in spite of this importance, reliable methods to quantitate nucleic acid amounts in a higher throughput remain elusive. In the following paper, a unique design to quantify gene expression levels at the nanoscale in a continuous flow system is presented. Fully automated, high-throughput, low volume amplification of deoxynucleotides (DNA) in a droplet-based microfluidic system is described. Unlike some conventional qPCR instrumentation that uses integrated fluidic circuits or plate arrays, the instrument performs qPCR in a continuous, micro-droplet flowing process with droplet generation, distinctive reagent mixing, thermal cycling and optical detection platforms all combined on one complete instrument. Detailed experimental profiling of reactions of less than 300 nl total volume is achieved using the platform, demonstrating a dynamic range of 4 log orders and consistent instrument sensitivity. Furthermore, a reduction in pipetting steps of as much as 90% and a unique degree of hands-free automation make the analytical possibilities for this instrumentation far-reaching. In conclusion, a discussion of the first demonstrations of this approach to perform novel, continuous high-throughput biological screens is presented. The results generated from the instrument, when compared with commercial instrumentation, demonstrate the instrument reliability and robustness to carry out further studies of clinical significance with added throughput and economic benefits. PMID:27077035

  16. Discovery of Gene Sources for Economic Traits in Hanwoo by Whole-genome Resequencing

    PubMed Central

    Shin, Younhee; Jung, Ho-jin; Jung, Myunghee; Yoo, Seungil; Subramaniyam, Sathiyamoorthy; Markkandan, Kesavan; Kang, Jun-Mo; Rai, Rajani; Park, Junhyung; Kim, Jong-Joo

    2016-01-01

    Hanwoo, a Korean native cattle (Bos taurus coreana), has great economic value due to high meat quality. Also, the breed has genetic variations that are associated with production traits such as health, disease resistance, reproduction, growth as well as carcass quality. In this study, next generation sequencing technologies and the availability of an appropriate reference genome were applied to discover a large amount of single nucleotide polymorphisms (SNPs) in ten Hanwoo bulls. Analysis of whole-genome resequencing generated a total of 26.5 Gb data, of which 594,716,859 and 592,990,750 reads covered 98.73% and 93.79% of the bovine reference genomes of UMD 3.1 and Btau 4.6.1, respectively. In total, 2,473,884 and 2,402,997 putative SNPs were discovered, of which 1,095,922 (44.3%) and 982,674 (40.9%) novel SNPs were discovered against UMD3.1 and Btau 4.6.1, respectively. Among the SNPs, the 46,301 (UMD 3.1) and 28,613 SNPs (Btau 4.6.1) that were identified as Hanwoo-specific SNPs were included in the functional genes that may be involved in the mechanisms of milk production, tenderness, juiciness, marbling of Hanwoo beef and yellow hair. Most of the Hanwoo-specific SNPs were identified in the promoter region, suggesting that the SNPs influence differential expression of the regulated genes relative to the relevant traits. In particular, the non-synonymous (ns) SNPs found in CORIN, which is a negative regulator of Agouti, might be a causal variant to determine yellow hair of Hanwoo. Our results will provide abundant genetic sources of variation to characterize Hanwoo genetics and for subsequent breeding. PMID:26954201
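
    The novel-SNP percentages quoted in this abstract follow directly from the reported counts, as the short check below shows. The counts are copied from the abstract; the dictionary layout and function name are illustrative assumptions.

```python
# SNP counts as reported in the abstract, keyed by reference assembly.
counts = {
    "UMD 3.1": {"putative": 2_473_884, "novel": 1_095_922},
    "Btau 4.6.1": {"putative": 2_402_997, "novel": 982_674},
}


def novel_percent(putative: int, novel: int) -> float:
    """Share of putative SNPs that are novel, as a percentage rounded to one decimal."""
    return round(100.0 * novel / putative, 1)


percentages = {
    assembly: novel_percent(c["putative"], c["novel"])
    for assembly, c in counts.items()
}
for assembly, pct in percentages.items():
    print(f"{assembly}: {pct}% novel")
```

    Both values agree with the 44.3% and 40.9% stated in the abstract.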

  17. Discovery and characterization of the first genuine avian leptin gene in the rock dove (Columba livia).

    PubMed

    Friedman-Einat, Miriam; Cogburn, Larry A; Yosefi, Sara; Hen, Gideon; Shinder, Dmitry; Shirak, Andrey; Seroussi, Eyal

    2014-09-01

    Leptin, the key regulator of mammalian energy balance, has been at the center of a great controversy in avian biology for the last 15 years since initial reports of a putative leptin gene (LEP) in chickens. Here, we characterize a novel LEP in rock dove (Columba livia) with low similarity of the predicted protein sequence (30% identity, 47% similarity) to the human ortholog. Searching the Sequence-Read-Archive database revealed leptin transcripts, in the dove's liver, with 2 noncoding exons preceding 2 coding exons. This unusual 4-exon structure was validated by sequencing of a GC-rich product (76% GC, 721 bp) amplified from liver RNA by RT-PCR. Sequence alignment of the dove leptin with orthologous leptins indicated that it consists of a leader peptide (21 amino acids; aa) followed by the mature protein (160 aa), which has a putative structure typical of 4-helical-bundle cytokines except that it is 12 aa longer than human leptin. Extra residues (10 aa) were located within the loop between 2 5'-helices, interrupting the amino acid motif that is conserved in tetrapods and considered essential for activation of leptin receptor (LEPR) but not for receptor binding per se. Quantitative RT-PCR of 11 tissues showed highest (P < .05) expression of LEP in the dove's liver, whereas the dove LEPR peaked (P < .01) in the pituitary. Both genes were prominently expressed in the gonads and at lower levels in tissues involved in mammalian leptin signaling (adipose; hypothalamus). A bioassay based on activation of the chicken LEPR in vitro showed leptin activity in the dove's circulation, suggesting that dove LEP encodes an active protein, despite the interrupted loop motif. Providing tools to study energy-balance control at an evolutionary perspective, our original demonstration of leptin signaling in dove predicts a more ancient role of leptin in growth and reproduction in birds, rather than appetite control.

  18. Microfluidic droplet-based PCR instrumentation for high-throughput gene expression profiling and biomarker discovery

    PubMed Central

    Hayes, Christopher J.; Dalton, Tara M.

    2015-01-01

    PCR is a common and often indispensable technique used in medical and biological research labs for a variety of applications. Real-time quantitative PCR (RT-qPCR) has become a definitive technique for quantitating differences in gene expression levels between samples. Yet, in spite of this importance, reliable methods to quantitate nucleic acid amounts in a higher throughput remain elusive. In the following paper, a unique design to quantify gene expression levels at the nanoscale in a continuous flow system is presented. Fully automated, high-throughput, low volume amplification of deoxynucleotides (DNA) in a droplet-based microfluidic system is described. Unlike some conventional qPCR instrumentation that uses integrated fluidic circuits or plate arrays, the instrument performs qPCR in a continuous, micro-droplet flowing process with droplet generation, distinctive reagent mixing, thermal cycling and optical detection platforms all combined on one complete instrument. Detailed experimental profiling of reactions of less than 300 nl total volume is achieved using the platform, demonstrating a dynamic range of 4 log orders and consistent instrument sensitivity. Furthermore, a reduction in pipetting steps of as much as 90% and a unique degree of hands-free automation make the analytical possibilities for this instrumentation far-reaching. In conclusion, a discussion of the first demonstrations of this approach to perform novel, continuous high-throughput biological screens is presented. The results generated from the instrument, when compared with commercial instrumentation, demonstrate the instrument reliability and robustness to carry out further studies of clinical significance with added throughput and economic benefits. PMID:27077035

  19. Search for major genes with progeny test data to accelerate the development of genetically superior loblolly pine

    SciTech Connect

    NCSU

    2003-12-30

    This research project is to develop a novel approach that fully utilizes the current breeding materials and genetic test information available from the NCSU-Industry Cooperative Tree Improvement Program to identify major genes that are segregating for growth and disease resistance in loblolly pine. If major genes can be identified in the existing breeding population, they can be utilized directly in the conventional loblolly pine breeding program. With the putative genotypes of parents identified, tree breeders can make effective decisions on management of breeding populations and operational deployment of genetically superior trees. Forest productivity will be significantly enhanced if genetically superior genotypes with major genes for economically important traits can be deployed in an operational plantation program. The overall objective of the project is to develop genetic models and analytical methods for major gene detection with progeny test data and to accelerate the development of genetically superior loblolly pine. Specifically, there are three main tasks: (1) Develop genetic models for major gene detection, implement statistical methods and develop computer software for screening progeny test data; (2) Confirm major gene segregation with molecular markers; and (3) Develop strategies for using major genes for tree breeding.

  20. Computational discovery of Epstein-Barr virus targeted human genes and signalling pathways

    PubMed Central

    Mei, Suyu; Zhang, Kun

    2016-01-01

    Epstein-Barr virus (EBV) plays important roles in the origin and the progression of human carcinomas, e.g. diffuse large B cell tumors, T cell lymphomas, etc. Discovering EBV-targeted human genes and signaling pathways is vital to understanding EBV tumorigenesis. In this study we propose a noise-tolerant homolog knowledge transfer method to reconstruct functional protein-protein interaction (PPI) networks between Epstein-Barr virus and Homo sapiens. The training set is augmented via homolog instances and the homolog noise is counteracted by a support vector machine (SVM). Additionally, we propose two methods to define subcellular co-localization (i.e. stringent and relaxed), based on which we further derive physical PPI networks. Computational results show that the proposed method performs soundly in cross-validation and independent testing. In the space of 648,672 EBV-human protein pairs, we obtain 51,485 functional interactions (7.94%), 869 stringent physical PPIs and 46,050 relaxed physical PPIs. Fifty-eight pieces of evidence were found in the latest database and recent literature to validate the model. This study reveals that Epstein-Barr virus interferes with normal human cell life, such as cholesterol homeostasis, blood coagulation, EGFR binding, p53 binding, Notch signaling, Hedgehog signaling, etc. The proteome-wide predictions are provided in the supplementary file for further biomedical research. PMID:27470517

  1. Discovery of new glomerular disease-relevant genes by translational profiling of podocytes in vivo

    PubMed Central

    Grgic, Ivica; Hofmeister, Andreas F.; Genovese, Giulio; Bernhardy, Andrea J.; Sun, Hua; Maarouf, Omar H.; Bijol, Vanesa; Pollak, Martin R.; Humphreys, Benjamin D.

    2014-01-01

    Identifying new biomarkers and therapeutic targets for podocytopathies such as focal segmental glomerulosclerosis (FSGS) requires a detailed analysis of transcriptional changes in podocytes over the course of disease. Here we used translating ribosome affinity purification (TRAP) to isolate and profile podocyte-specific mRNA in two different models of FSGS. Expressed eGFP-tagged ribosomal protein L10a in podocytes under the control of the Collagen-1α1 promoter enabled podocyte-specific mRNA isolation in a one-step process over the course of disease. This TRAP protocol robustly enriched known podocyte-specific mRNAs. We crossed col1α1-L10a mice with the actn4−/− and actn4+/K256E models of FSGS and analyzed podocyte transcriptional profiles at 2, 6 and 44 weeks of age. Two upregulated podocyte genes in murine FSGS (CXCL1 and DMPK) were found to be upregulated at the protein level in biopsies from patients with FSGS, validating this approach. There was no dilution of podocyte-specific transcripts during disease. These are the first podocyte-specific RNA expression datasets during aging and in two models of FSGS. This approach identified new podocyte proteins that are upregulated in FSGS and help define novel biomarkers and therapeutic targets for human glomerular disease. PMID:24940801

  2. Discovery of PPi-type Phosphoenolpyruvate Carboxykinase Genes in Eukaryotes and Bacteria*

    PubMed Central

    Chiba, Yoko; Kamikawa, Ryoma; Nakada-Tsukui, Kumiko; Saito-Nakano, Yumiko; Nozaki, Tomoyoshi

    2015-01-01

    Phosphoenolpyruvate carboxykinase (PEPCK) is one of the pivotal enzymes that regulates the carbon flow of the central metabolism by fixing CO2 to phosphoenolpyruvate (PEP) to produce oxaloacetate or vice versa. Whereas ATP- and GTP-type PEPCKs have been well studied, and their protein identities are established, inorganic pyrophosphate (PPi)-type PEPCK (PPi-PEPCK) is poorly characterized. Despite extensive enzymological studies, its protein identity and encoding gene remain unknown. In this study, PPi-PEPCK has been identified for the first time from a eukaryotic human parasite, Entamoeba histolytica, by conventional purification and mass spectrometric identification of the native enzyme, followed by demonstration of its enzymatic activity. A homolog of the amebic PPi-PEPCK from an anaerobic bacterium Propionibacterium freudenreichii subsp. shermanii also exhibited PPi-PEPCK activity. The primary structure of PPi-PEPCK has no similarity to the functional homologs ATP/GTP-PEPCKs and PEP carboxylase, strongly suggesting that PPi-PEPCK arose independently from the other functional homologues and very likely has unique catalytic sites. PPi-PEPCK homologs were found in a variety of bacteria and some eukaryotes but not in archaea. The molecular identification of this long forgotten enzyme shows us the diversity and functional redundancy of enzymes involved in the central metabolism and can help us to understand the central metabolism more deeply. PMID:26269598

  3. Gene mutation discovery research of non-smoking lung cancer patients due to indoor radon exposure.

    PubMed

    Choi, Jung Ran; Park, Seong Yong; Noh, O Kyu; Koh, Young Wha; Kang, Dae Ryong

    2016-01-01

    Although the incidence and mortality of most cancers, such as lung and colon cancer, are decreasing in several countries, they are increasing in several developed countries because of unhealthy western lifestyles, including smoking, physical inactivity and consumption of calorie-dense food. The incidences of lung and colon cancers in a few of these countries have already exceeded those in the United States and other western countries. Among these cancers, lung cancer is the main cause of cancer death worldwide. The cumulative survival rate at five years ranges between 13 and 21% in several countries. Although smoking is the most important risk factor for lung cancer, the increased incidence of lung cancer in never smokers (LCINS) makes it necessary to improve knowledge concerning other risk factors. Environmental factors and genetic susceptibility are also thought to contribute to lung cancer risk. Patients with lung adenocarcinoma who have never smoked frequently carry mutations within the tyrosine kinase domain of the epidermal growth factor receptor (EGFR) gene. Also, K-ras mutations are more common in individuals with a history of smoking and are associated with resistance to EGFR-tyrosine kinase inhibitors. Recently, radon (Rn), a natural noble gas, has been recognized as the second most common cause of lung cancer. In this review, we aim to determine whether residential radon is associated with an increased risk of developing lung cancer and whether that risk is influenced by several genetic polymorphisms.

  4. Discovery of molecular associations among aging, stem cells, and cancer based on gene expression profiling.

    PubMed

    Wang, Xiaosheng

    2013-04-01

    The emergence of a huge volume of "omics" data enables a computational approach to the investigation of the biology of cancer. The cancer informatics approach is a useful supplement to the traditional experimental approach. I reviewed several reports that used a bioinformatics approach to analyze the associations among aging, stem cells, and cancer by microarray gene expression profiling. The high expression of aging- or human embryonic stem cell-related molecules in cancer suggests that certain important mechanisms commonly underlie aging, stem cells, and cancer. These mechanisms are involved in cell cycle regulation, metabolic process, DNA damage response, apoptosis, p53 signaling pathway, immune/inflammatory response, and other processes, suggesting that cancer is a developmental and evolutionary disease that is strongly related to aging. Moreover, these mechanisms demonstrate that the initiation, proliferation, and metastasis of cancer are associated with the deregulation of stem cells. These findings provide insights into the biology of cancer. Findings obtained by the informatics approach should, of course, be confirmed by experimental validation. This review also noted that next-generation sequencing data provide enriched sources for cancer informatics study.

  5. Identification and Validation of HCC-specific Gene Transcriptional Signature for Tumor Antigen Discovery.

    PubMed

    Petrizzo, Annacarmen; Caruso, Francesca Pia; Tagliamonte, Maria; Tornesello, Maria Lina; Ceccarelli, Michele; Costa, Valerio; Aprile, Marianna; Esposito, Roberta; Ciliberto, Gennaro; Buonaguro, Franco M; Buonaguro, Luigi

    2016-07-08

    A novel two-step bioinformatics strategy was applied for identification of signatures with therapeutic implications in hepatitis-associated HCC. Transcriptional profiles from HBV- and HCV-associated HCC samples were compared with non-tumor liver controls. The resulting HCC-modulated genes were subsequently compared with different non-tumor tissue samples. Two related signatures were identified, namely "HCC-associated" and "HCC-specific". Expression data were validated by RNA-Seq analysis carried out on unrelated HCC samples, and protein expression was confirmed according to The Human Protein Atlas (http://proteinatlas.org/), a public repository of immunohistochemistry data. Among these genes, aldo-keto reductase family 1 member B10 and IGF2 mRNA-binding protein 3 were found to be strictly HCC-specific, with no expression in 18/20 normal tissues. Target peptides for vaccine design were predicted for both proteins associated with the most prevalent HLA class I and II alleles. The described strategy proved feasible for identification of HCC-specific proteins as high-potential targets for HCC immunotherapy.

  6. Discovery of PPi-type Phosphoenolpyruvate Carboxykinase Genes in Eukaryotes and Bacteria.

    PubMed

    Chiba, Yoko; Kamikawa, Ryoma; Nakada-Tsukui, Kumiko; Saito-Nakano, Yumiko; Nozaki, Tomoyoshi

    2015-09-25

    Phosphoenolpyruvate carboxykinase (PEPCK) is one of the pivotal enzymes that regulates the carbon flow of the central metabolism by fixing CO2 to phosphoenolpyruvate (PEP) to produce oxaloacetate or vice versa. Whereas ATP- and GTP-type PEPCKs have been well studied, and their protein identities are established, inorganic pyrophosphate (PPi)-type PEPCK (PPi-PEPCK) is poorly characterized. Despite extensive enzymological studies, its protein identity and encoding gene remain unknown. In this study, PPi-PEPCK has been identified for the first time from a eukaryotic human parasite, Entamoeba histolytica, by conventional purification and mass spectrometric identification of the native enzyme, followed by demonstration of its enzymatic activity. A homolog of the amebic PPi-PEPCK from an anaerobic bacterium Propionibacterium freudenreichii subsp. shermanii also exhibited PPi-PEPCK activity. The primary structure of PPi-PEPCK has no similarity to the functional homologs ATP/GTP-PEPCKs and PEP carboxylase, strongly suggesting that PPi-PEPCK arose independently from the other functional homologues and very likely has unique catalytic sites. PPi-PEPCK homologs were found in a variety of bacteria and some eukaryotes but not in archaea. The molecular identification of this long forgotten enzyme shows us the diversity and functional redundancy of enzymes involved in the central metabolism and can help us to understand the central metabolism more deeply. PMID:26269598

  7. Identification and Validation of HCC-specific Gene Transcriptional Signature for Tumor Antigen Discovery

    PubMed Central

    Petrizzo, Annacarmen; Caruso, Francesca Pia; Tagliamonte, Maria; Tornesello, Maria Lina; Ceccarelli, Michele; Costa, Valerio; Aprile, Marianna; Esposito, Roberta; Ciliberto, Gennaro; Buonaguro, Franco M.; Buonaguro, Luigi

    2016-01-01

    A novel two-step bioinformatics strategy was applied for identification of signatures with therapeutic implications in hepatitis-associated HCC. Transcriptional profiles from HBV- and HCV-associated HCC samples were compared with non-tumor liver controls. The resulting HCC-modulated genes were subsequently compared with different non-tumor tissue samples. Two related signatures were identified, namely “HCC-associated” and “HCC-specific”. Expression data were validated by RNA-Seq analysis carried out on unrelated HCC samples, and protein expression was confirmed according to The Human Protein Atlas (http://proteinatlas.org/), a public repository of immunohistochemistry data. Among these genes, aldo-keto reductase family 1 member B10 and IGF2 mRNA-binding protein 3 were found to be strictly HCC-specific, with no expression in 18/20 normal tissues. Target peptides for vaccine design were predicted for both proteins associated with the most prevalent HLA class I and II alleles. The described strategy proved feasible for identification of HCC-specific proteins as high-potential targets for HCC immunotherapy. PMID:27387388

  8. Gene mutation discovery research of non-smoking lung cancer patients due to indoor radon exposure.

    PubMed

    Choi, Jung Ran; Park, Seong Yong; Noh, O Kyu; Koh, Young Wha; Kang, Dae Ryong

    2016-01-01

    Although the incidence and mortality of most cancers, such as lung and colon cancer, are decreasing in several countries, they are increasing in several developed countries because of unhealthy western lifestyles, including smoking, physical inactivity and consumption of calorie-dense food. The incidences of lung and colon cancers in a few of these countries have already exceeded those in the United States and other western countries. Among these cancers, lung cancer is the main cause of cancer death worldwide. The cumulative survival rate at five years ranges between 13 and 21% in several countries. Although smoking is the most important risk factor for lung cancer, the increased incidence of lung cancer in never smokers (LCINS) makes it necessary to improve knowledge concerning other risk factors. Environmental factors and genetic susceptibility are also thought to contribute to lung cancer risk. Patients with lung adenocarcinoma who have never smoked frequently carry mutations within the tyrosine kinase domain of the epidermal growth factor receptor (EGFR) gene. Also, K-ras mutations are more common in individuals with a history of smoking and are associated with resistance to EGFR-tyrosine kinase inhibitors. Recently, radon (Rn), a natural noble gas, has been recognized as the second most common cause of lung cancer. In this review, we aim to determine whether residential radon is associated with an increased risk of developing lung cancer and whether that risk is influenced by several genetic polymorphisms. PMID:26985396

  9. Computational discovery of Epstein-Barr virus targeted human genes and signalling pathways.

    PubMed

    Mei, Suyu; Zhang, Kun

    2016-01-01

    Epstein-Barr virus (EBV) plays important roles in the origin and the progression of human carcinomas, e.g. diffuse large B cell tumors, T cell lymphomas, etc. Discovering EBV-targeted human genes and signaling pathways is vital to understanding EBV tumorigenesis. In this study we propose a noise-tolerant homolog knowledge transfer method to reconstruct functional protein-protein interaction (PPI) networks between Epstein-Barr virus and Homo sapiens. The training set is augmented with homolog instances, and the homolog noise is counteracted by a support vector machine (SVM). Additionally, we propose two definitions of subcellular co-localization (stringent and relaxed), from which we further derive physical PPI networks. Computational results show that the proposed method performs soundly in cross-validation and independent testing. In the space of 648,672 EBV-human protein pairs, we obtain 51,485 functional interactions (7.94%), 869 stringent physical PPIs and 46,050 relaxed physical PPIs. Fifty-eight pieces of evidence from the latest database and recent literature validate the model. This study reveals that Epstein-Barr virus interferes with normal human cell processes, such as cholesterol homeostasis, blood coagulation, EGFR binding, p53 binding, Notch signaling, Hedgehog signaling, etc. The proteome-wide predictions are provided in the supplementary file for further biomedical research. PMID:27470517
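
    The proportions quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the counts are copied from the abstract, the helper name `percent` is hypothetical, and expressing the physical PPIs as a share of the functional interactions is an assumption based on the statement that the physical networks are derived from the functional ones.

```python
# Counts as reported in the abstract above.
TOTAL_PAIRS = 648_672   # all EBV-human protein pairs considered
FUNCTIONAL = 51_485     # predicted functional interactions
STRINGENT = 869         # physical PPIs, stringent co-localization
RELAXED = 46_050        # physical PPIs, relaxed co-localization


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Share of all candidate pairs predicted to interact functionally.
functional_pct = percent(FUNCTIONAL, TOTAL_PAIRS)

# Assumption: physical PPIs are a subset of the functional interactions.
stringent_pct = percent(STRINGENT, FUNCTIONAL)
relaxed_pct = percent(RELAXED, FUNCTIONAL)

print(f"functional: {functional_pct:.2f}% of all pairs")
print(f"stringent physical: {stringent_pct:.2f}% of functional")
print(f"relaxed physical: {relaxed_pct:.2f}% of functional")
```

    The first figure reproduces the 7.94% given in the abstract; the other two are derived here and are not stated in the source.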

  10. EST-SNP discovery and dense genetic mapping in lentil (Lens culinaris Medik.) enable candidate gene selection for boron tolerance.

    PubMed

    Kaur, Sukhjiwan; Cogan, Noel O I; Stephens, Amber; Noy, Dianne; Butsch, Mirella; Forster, John W; Materne, Michael

    2014-03-01

    Large-scale SNP discovery and dense genetic mapping in a lentil intraspecific cross permitted identification of a single chromosomal region controlling tolerance to boron toxicity, an important breeding objective. Lentil (Lens culinaris Medik.) is a highly nutritious food legume crop that is cultivated world-wide. Until recently, lentil has been considered a genomic 'orphan' crop, limiting the feasibility of marker-assisted selection strategies in breeding programs. The present study reports on the identification of single-nucleotide polymorphisms (SNPs) from transcriptome sequencing data, utilisation of expressed sequence tag (EST)-derived simple sequence repeat (SSR) and SNP markers for construction of a gene-based genetic linkage map, and identification of markers in close linkage to major QTLs for tolerance to boron (B) toxicity. A total of 2,956 high-quality SNP markers were identified from a lentil EST database. Sub-sets of 546 SSRs and 768 SNPs were further used for genetic mapping of an intraspecific mapping population (Cassab × ILL2024) that exhibits segregation for B tolerance. Comparative analysis of the lentil linkage map with the sequenced genomes of Medicago truncatula Gaertn., soybean (Glycine max [L.] Merr.) and Lotus japonicus L. indicated blocks of conserved macrosynteny, as well as a number of rearrangements. A single genomic region was found to be associated with variation for B tolerance in lentil, based on evaluation performed over 2 years. Comparison of flanking markers to genome sequences of model species (M. truncatula, soybean and Arabidopsis thaliana) identified candidate genes that are functionally associated with B tolerance, and could potentially be used for diagnostic marker development in lentil.

  11. De Novo Regulatory Motif Discovery Identifies Significant Motifs in Promoters of Five Classes of Plant Dehydrin Genes

    PubMed Central

    Zolotarov, Yevgen; Strömvik, Martina

    2015-01-01

    Plants accumulate dehydrins in response to osmotic stresses. Dehydrins are divided into five different classes, which are thought to be regulated in different manners. To better understand differences in transcriptional regulation of the five dehydrin classes, de novo motif discovery was performed on 350 dehydrin promoter sequences from a total of 51 plant genomes. Overrepresented motifs were identified in the promoters of five dehydrin classes. The Kn dehydrin promoters contain motifs linked with meristem specific expression, as well as motifs linked with cold/dehydration and abscisic acid response. KS dehydrin promoters contain a motif with a GATA core. SKn and YnSKn dehydrin promoters contain motifs that match elements connected with cold/dehydration, abscisic acid and light response. YnKn dehydrin promoters contain motifs that match abscisic acid and light response elements, but not cold/dehydration response elements. Conserved promoter motifs are present in the dehydrin classes and across different plant lineages, indicating that dehydrin gene regulation is likely also conserved. PMID:26114291

  12. The fragile X mental retardation syndrome 20 years after the FMR1 gene discovery: an expanding universe of knowledge.

    PubMed

    Rousseau, François; Labelle, Yves; Bussières, Johanne; Lindsay, Carmen

    2011-08-01

    The fragile X mental retardation (FXMR) syndrome is one of the most frequent causes of mental retardation. Affected individuals display a wide range of additional characteristic features including behavioural and physical phenotypes, and the extent to which individuals are affected is highly variable. For these reasons, elucidation of the pathophysiology of this disease has been an important challenge to the scientific community. 1991 marks the year of the discovery of both the FMR1 gene mutations involved in this disease, and of their dynamic nature. Although a mouse model for the disease has been available for 16 years and extensive research has been performed on the FMR1 protein (FMRP), we still understand little about how the disease develops, and no treatment has yet been shown to be effective. In this review, we summarise current knowledge on FXMR with an emphasis on the technical challenges of molecular diagnostics, on its prevalence and dynamics among populations, and on the potential of screening for FMR1 mutations.

  13. The Fragile X Mental Retardation Syndrome 20 Years After the FMR1 Gene Discovery: an Expanding Universe of Knowledge

    PubMed Central

    Rousseau, François; Labelle, Yves; Bussières, Johanne; Lindsay, Carmen

    2011-01-01

    The fragile X mental retardation (FXMR) syndrome is one of the most frequent causes of mental retardation. Affected individuals display a wide range of additional characteristic features including behavioural and physical phenotypes, and the extent to which individuals are affected is highly variable. For these reasons, elucidation of the pathophysiology of this disease has been an important challenge to the scientific community. 1991 marks the year of the discovery of both the FMR1 gene mutations involved in this disease, and of their dynamic nature. Although a mouse model for the disease has been available for 16 years and extensive research has been performed on the FMR1 protein (FMRP), we still understand little about how the disease develops, and no treatment has yet been shown to be effective. In this review, we summarise current knowledge on FXMR with an emphasis on the technical challenges of molecular diagnostics, on its prevalence and dynamics among populations, and on the potential of screening for FMR1 mutations. PMID:21912443

  14. A joint modeling approach for uncovering associations between gene expression, bioactivity and chemical structure in early drug discovery to guide lead selection and genomic biomarker development.

    PubMed

    Perualila-Tan, Nolen; Kasim, Adetayo; Talloen, Willem; Verbist, Bie; Göhlmann, Hinrich W H; Shkedy, Ziv

    2016-08-01

    The modern drug discovery process involves multiple sources of high-dimensional data. This imposes the challenge of data integration. A typical example is the integration of chemical structure (fingerprint features), phenotypic bioactivity (bioassay read-outs) data for targets of interest, and transcriptomic (gene expression) data in early drug discovery to better understand the chemical and biological mechanisms of candidate drugs, and to facilitate early detection of safety issues prior to later and expensive phases of drug development cycles. In this paper, we discuss a joint model for the transcriptomic and the phenotypic variables conditioned on the chemical structure. This modeling approach can be used to uncover, for a given set of compounds, the association between gene expression and biological activity, taking into account the influence of the chemical structure of the compound on both variables. The model allows detection of genes that are associated with the bioactivity data, facilitating the identification of potential genomic biomarkers for compound efficacy. In addition, the effect of every structural feature on both genes and pIC50, and their associations, can be simultaneously investigated. Two oncology projects are used to illustrate the applicability and usefulness of the joint model to integrate multi-source high-dimensional information to aid drug discovery. PMID:27269248

  15. Comprehensive analysis of animal TALE homeobox genes: new conserved motifs and cases of accelerated evolution.

    PubMed

    Mukherjee, Krishanu; Bürglin, Thomas R

    2007-08-01

    TALE homeodomain proteins are an ancient subgroup within the group of homeodomain transcription factors that play important roles in animal, plant, and fungal development. We have extracted the full complement of TALE superclass homeobox genes from the genome projects of seven protostomes, seven deuterostomes, and Nematostella. This was supplemented with TALE homeobox genes from additional species and phylogenetic analyses were carried out with 276 sequences. We found 20 homeobox genes and 4 pseudogenes in humans, 21 genes in mouse, 8 genes in Drosophila, and 5 genes plus one truncated gene in Caenorhabditis elegans. Apart from the previously identified TALE classes MEIS, PBC, IRO, and TGIF, a novel class is identified, termed MOHAWK (MKX). Further, we show that the MEIS class can be divided into two families, PREP and MEIS. Prep genes have previously only been described in vertebrates but are lacking in Drosophila. Here we identify orthologues in other insect taxa as well as in the cnidarian Nematostella. In C. elegans, a divergent Prep protein has lost the homeodomain. Full-length multiple sequence alignment of the protostome and deuterostome sequences allowed us to identify several novel conserved motifs within the MKX, TGIF, and MEIS classes. Phylogenetic analyses revealed fast-evolving PBC class genes; in particular, some X-linked PBC genes in nematodes are subject to rapid evolution. In addition, several instances of gene loss were identified. In conclusion, our comprehensive analysis provides a defining framework for the classification of animal TALE homeobox genes and the understanding of their evolution.

  16. Accelerated Recruitment of New Brain Development Genes into the Human Genome

    PubMed Central

    Zhang, Yong E.; Landback, Patrick; Vibranovski, Maria D.; Long, Manyuan

    2011-01-01

    How the human brain evolved has attracted tremendous interest for decades. Motivated by case studies of primate-specific genes implicated in brain function, we examined whether or not the young genes, those emerging genome-wide in the lineages specific to the primates or rodents, showed distinct spatial and temporal patterns of transcription compared to old genes, which had existed before the primate and rodent split. We found consistent patterns across different sources of expression data: there is a significantly larger proportion of young genes expressed in the fetal or infant brain of humans than in mouse, and more young genes in humans have expression biased toward early developing brains than old genes. Most of these young genes are expressed in the evolutionarily newest part of the human brain, the neocortex. Remarkably, we also identified a number of human-specific genes which are expressed in the prefrontal cortex, which is implicated in complex cognitive behaviors. The young genes upregulated in the early developing human brain play diverse functional roles, with a significant enrichment of transcription factors. Genes originating from different mechanisms show a similar expression bias in the developing brain. Moreover, we found that the young genes upregulated in early brain development showed rapid protein evolution compared to old genes also expressed in the fetal brain. Strikingly, genes expressed in the neocortex arose soon after its morphological origin. These four lines of evidence suggest that positive selection for brain function may have contributed to the origination of young genes expressed in the developing brain. These data demonstrate a striking recruitment of new genes into the early development of the human brain. PMID:22028629

  17. Analysis of Gene Expression Profiles in Leaf Tissues of Cultivated Peanuts and Development of EST-SSR Markers and Gene Discovery.

    PubMed

    Guo, Baozhu; Chen, Xiaoping; Hong, Yanbin; Liang, Xuanqiang; Dang, Phat; Brenneman, Tim; Holbrook, Corley; Culbreath, Albert

    2009-01-01

    Peanut is vulnerable to a range of foliar diseases such as spotted wilt caused by Tomato spotted wilt virus (TSWV), early (Cercospora arachidicola) and late (Cercosporidium personatum) leaf spots, southern stem rot (Sclerotium rolfsii), and sclerotinia blight (Sclerotinia minor). In this study, we report the generation of 17,376 peanut expressed sequence tags (ESTs) from leaf tissues of a peanut cultivar (Tifrunner, resistant to TSWV and leaf spots) and a breeding line (GT-C20, susceptible to TSWV and leaf spots). After trimming vector sequence and discarding low-quality sequences, a total of 14,432 high-quality ESTs were selected for further analysis and deposition to GenBank. Sequence clustering resulted in 6,888 unique ESTs composed of 1,703 tentative consensus (TC) sequences and 5,185 singletons. A large number of ESTs (5,717) representing genes of unknown functions were also identified. Among the unique sequences, there were 856 EST-SSRs identified. A total of 290 new EST-based SSR markers were developed and examined for amplification and polymorphism in cultivated peanut and wild species. Resequencing information of selected amplified alleles revealed that allelic diversity could be attributed mainly to differences in repeat type and length in the SSR regions. In addition, a few INDEL mutations and substitutions were observed in the regions flanking the microsatellite regions. Some defense-related transcripts were also identified, such as putative oxalate oxidase (EU024476) and NBS-LRR domains. EST data in this study have provided a new source of information for gene discovery and development of SSR markers in cultivated peanut. A total of 16,931 ESTs have been deposited to the NCBI GenBank database with accession numbers ES751523 to ES768453. PMID:19584933
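
    The counts in this abstract lend themselves to a quick consistency check. The following sketch uses only figures quoted above; the helper `accession_span` is hypothetical, and it assumes the deposited accessions form one contiguous numeric range with a shared letter prefix, which the abstract implies but does not state.

```python
import re


def accession_span(first: str, last: str) -> int:
    """Count accessions in an inclusive range such as ES751523..ES768453.

    Assumes a shared letter prefix and a contiguous numeric suffix.
    """
    pattern = re.compile(r"([A-Z]+)(\d+)")
    start = pattern.fullmatch(first)
    end = pattern.fullmatch(last)
    if start is None or end is None or start.group(1) != end.group(1):
        raise ValueError("accessions must share a letter prefix")
    return int(end.group(2)) - int(start.group(2)) + 1


# Unique ESTs = tentative consensus sequences + singletons.
unique_ests = 1_703 + 5_185

# Size of the deposited accession range quoted in the abstract.
deposited = accession_span("ES751523", "ES768453")

# Fraction of raw ESTs retained after vector trimming and quality filtering.
high_quality_fraction = 14_432 / 17_376

print(unique_ests, deposited, f"{high_quality_fraction:.1%}")
```

    Both derived totals agree with the abstract: the clusters sum to the 6,888 unique ESTs, and the accession range spans the stated number of deposited sequences.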

  18. Accelerated adhesion of grafted skin by laser-induced stress wave-based gene transfer of hepatocyte growth factor

    NASA Astrophysics Data System (ADS)

    Aizawa, Kazuya; Sato, Shunichi; Terakawa, Mitsuhiro; Saitoh, Daizoh; Tsuda, Hitoshi; Ashida, Hiroshi; Obara, Minoru

    2009-11-01

    Gene therapy using wound healing-associated growth factor gene has received much attention as a new strategy for improving the outcome of tissue transplantation. We delivered plasmid DNA coding for human hepatocyte growth factor (hHGF) to rat free skin grafts by the use of laser-induced stress waves (LISWs); autografting was performed with the grafts. Systematic analysis was conducted to evaluate the adhesion properties of the grafted tissue; angiogenesis, cell proliferation, and reepithelialization were assessed by immunohistochemistry, and reperfusion was measured by laser Doppler imaging as a function of time after grafting. Both the level of angiogenesis on day 3 after grafting and the increased ratio of blood flow on day 4 to that on day 3 were significantly higher than those in five control groups: grafting with hHGF gene injection alone, grafting with control plasmid vector injection alone, grafting with LISW application alone, grafting with LISW application after control plasmid vector injection, and normal grafting. Reepithelialization was almost completed on day 7 even at the center of the graft with LISW application after hHGF gene injection, while it was not for the grafts of the five control groups. These findings demonstrate the validity of our LISW-based HGF gene transfection to accelerate the adhesion of grafted skins.

  19. Integration of Cot Analysis, DNA Cloning, and High-Throughput Sequencing Facilitates Genome Characterization and Gene Discovery

    PubMed Central

    Peterson, Daniel G.; Schulze, Stefan R.; Sciara, Erica B.; Lee, Scott A.; Bowers, John E.; Nagel, Alexander; Jiang, Ning; Tibbitts, Deanne C.; Wessler, Susan R.; Paterson, Andrew H.

    2002-01-01

    Cot-based sequence discovery represents a powerful means by which both low-copy and repetitive sequences can be selectively and efficiently fractionated, cloned, and characterized. Based upon the results of a Cot analysis, hydroxyapatite chromatography was used to fractionate sorghum (Sorghum bicolor) genomic DNA into highly repetitive (HR), moderately repetitive (MR), and single/low-copy (SL) sequence components that were consequently cloned to produce HRCot, MRCot, and SLCot genomic libraries. Filter hybridization (blotting) and sequence analysis both show that the HRCot library is enriched in sequences traditionally found in high-copy number (e.g., retroelements, rDNA, centromeric repeats), the SLCot library is enriched in low-copy sequences (e.g., genes and “nonrepetitive ESTs”), and the MRCot library contains sequences of moderate redundancy. The Cot analysis suggests that the sorghum genome is approximately 700 Mb (in agreement with previous estimates) and that HR, MR, and SL components comprise 15%, 41%, and 24% of sorghum DNA, respectively. Unlike previously described techniques to sequence the low-copy components of genomes, sequencing of Cot components is independent of expression and methylation patterns that vary widely among DNA elements, developmental stages, and taxa. High-throughput sequencing of Cot clones may be a means of “capturing” the sequence complexity of eukaryotic genomes at unprecedented efficiency. [Online supplementary material is available at www.genome.org. The sequence data described in this paper have been submitted to the GenBank under accession nos. AZ921847-AZ923007. Reagents, samples, and unpublished information freely provided by H. Ma and J. Messing.] PMID:11997346
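
    To put the reported kinetic components on an absolute scale, the percentages can be converted to approximate sizes in megabases. This is a rough sketch based only on the figures in the abstract (a genome of about 700 Mb and components of 15%, 41%, and 24%); the variable names are illustrative, and the abstract does not say what accounts for the remaining share of the DNA.

```python
# Figures as quoted in the abstract above.
GENOME_MB = 700  # approximate sorghum genome size in megabases
FRACTIONS_PCT = {
    "HR": 15,  # highly repetitive
    "MR": 41,  # moderately repetitive
    "SL": 24,  # single/low-copy
}

# Approximate size of each kinetic component in megabases.
component_mb = {name: GENOME_MB * pct / 100 for name, pct in FRACTIONS_PCT.items()}

# Share of the DNA not assigned to any of the three listed components.
unassigned_pct = 100 - sum(FRACTIONS_PCT.values())

for name, size in component_mb.items():
    print(f"{name}: ~{size:.0f} Mb")
print(f"not assigned to a listed component: {unassigned_pct}%")
```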

  20. Accelerated alcoholic fermentation caused by defective gene expression related to glucose derepression in Saccharomyces cerevisiae.

    PubMed

    Watanabe, Daisuke; Hashimoto, Naoya; Mizuno, Megumi; Zhou, Yan; Akao, Takeshi; Shimoi, Hitoshi

    2013-01-01

    Sake yeast strains maintain high fermentation rates, even after the stationary growth phase begins. To determine the molecular mechanisms underlying this advantageous brewing property, we compared the gene expression profiles of sake and laboratory yeast strains of Saccharomyces cerevisiae during the stationary growth phase. DNA microarray analysis revealed that the sake yeast strain examined had defects in expression of the genes related to glucose derepression mediated by transcription factors Adr1p and Cat8p. Furthermore, deletion of the ADR1 and CAT8 genes slightly but statistically significantly improved the fermentation rate of a laboratory yeast strain. We also identified two loss-of-function mutations in the ADR1 gene of existing sake yeast strains. Taken together, these results indicate that the gene expression program associated with glucose derepression for yeast acts as an impediment to effective alcoholic fermentation under glucose-rich fermentative conditions.

  1. Accelerated Evolution of Schistosome Genes Coding for Proteins Located at the Host–Parasite Interface

    PubMed Central

    Philippsen, Gisele S.; Wilson, R. Alan; DeMarco, Ricardo

    2015-01-01

    Study of proteins located at the host–parasite interface in schistosomes might provide clues about the mechanisms utilized by the parasite to escape attack by the host immune system. Micro-exon gene (MEG) protein products and venom allergen-like (VAL) proteins have been shown to be present in schistosome secretions or associated with glands, which led to the hypothesis that they are important components in the molecular interaction of the parasite with the host. Phylogenetic and structural analysis of genes and their transcripts in these two classes shows that recent species-specific expansion of gene number for these families occurred separately in three different species of schistosomes. Enrichment of transposable elements in MEG and VAL genes in Schistosoma mansoni provides a credible mechanism for preferential expansion of gene numbers for these families. Analysis of the ratio of nonsynonymous to synonymous substitution rates (dN/dS) in the comparison between schistosome orthologs for the two classes of genes reveals significantly higher values when compared with a set of control genes coding for secreted proteins, and for proteins previously localized in the tegument. Additional analyses of paralog genes indicate that exposure of the protein to the definitive host immune system is a determining factor leading to the higher than usual dN/dS values in those genes. The observation that two genes encoding S. mansoni vaccine candidate proteins, known to be exposed at the parasite surface, also display similar evolutionary dynamics suggests a broad response of the parasite to evolutionary pressure imposed by the definitive host immune system. PMID:25567667
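
    The dN/dS comparison described in this abstract can be illustrated with a minimal sketch. Everything below is hypothetical: the function names and the (dN, dS) pairs are invented for illustration and are not data from the study, and comparing medians stands in for whatever statistical test the authors actually used. Here dN is the nonsynonymous rate and dS the synonymous rate.

```python
from statistics import median
from typing import List, Optional, Tuple


def omega(dn: float, ds: float) -> Optional[float]:
    """Return dN/dS, or None when dS is zero and the ratio is undefined."""
    if dn < 0 or ds < 0:
        raise ValueError("substitution rates cannot be negative")
    if ds == 0:
        return None
    return dn / ds


def median_omega(pairs: List[Tuple[float, float]]) -> float:
    """Median dN/dS over ortholog pairs, skipping undefined ratios."""
    ratios = [omega(dn, ds) for dn, ds in pairs]
    return median(r for r in ratios if r is not None)


# Hypothetical (dN, dS) estimates, NOT values from the paper.
interface_genes = [(0.12, 0.10), (0.30, 0.20), (0.08, 0.16)]
control_genes = [(0.02, 0.20), (0.05, 0.25), (0.01, 0.10)]

print(f"interface median dN/dS: {median_omega(interface_genes):.2f}")
print(f"control median dN/dS:   {median_omega(control_genes):.2f}")
```

    A ratio above 1 is conventionally read as a sign of positive selection and a ratio well below 1 as purifying selection; the abstract reports only that the interface-related genes show significantly higher values than controls.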

  2. Accelerated adhesion of grafted skins by laser-induced stress wave-based gene transfer of hepatocyte growth factor

    NASA Astrophysics Data System (ADS)

    Aizawa, Kazuya; Sato, Shunichi; Saitoh, Daizoh; Tsuda, Hitoshi; Ashida, Hiroshi; Obara, Minoru

    2009-02-01

    In our previous study, we delivered plasmid DNA coding for human hepatocyte growth factor (hHGF) to rat skin grafts based on laser-induced stress wave (LISW), by which production of CD31-positive cells in the grafted skins was found to be enhanced, suggesting improved angiogenesis. In this study, we validated the efficacy of this method to accelerate adhesion of grafted skins; reperfusion and reepithelialization in the grafted skins were examined. As a graft, dorsal skin of a rat was exsected and its subcutaneous fat was removed. Plasmid DNA expression vector for hHGF was injected into the graft; on its back surface a laser target with a transparent sheet for plasma confinement was placed, and irradiated with three nanosecond laser pulses at a laser fluence of 1.2 J/cm² (532 nm; spot diameter, 3 mm) to generate LISWs. After the application of LISWs, the graft was transplanted onto its donor site. We evaluated blood flow by laser Doppler imaging and analyzed reepithelialization based on immunohistochemistry as a function of postgrafting time. It was found that both reperfusion and reepithelialization were significantly enhanced in the transfected grafts compared with normal grafts; reepithelialization was completed within 7 days after transplantation with the transfected grafts. These findings demonstrate that adhesion of grafted skins can be accelerated by delivering HGF gene to the grafts based on LISWs.

  3. A Revolution in Plant Metabolism: Genome-Enabled Pathway Discovery

    PubMed Central

    Kim, Jeongwoon; Buell, C. Robin

    2015-01-01

    Genome-enabled discoveries are the hallmark of 21st century biology, including major discoveries in the biosynthesis and regulation of plant metabolic pathways. Access to next generation sequencing technologies has enabled research on the biosynthesis of diverse plant metabolites, especially secondary metabolites, resulting in a broader understanding of not only the structural and regulatory genes involved in metabolite biosynthesis but also the evolution of chemical diversity in the plant kingdom. Several paradigms that govern secondary metabolism have emerged, including that (1) gene family expansion and diversification contribute to the chemical diversity found in the plant kingdom, (2) genes encoding biochemical pathway components are frequently transcriptionally coregulated, and (3) physical clustering of nonhomologous genes that encode components of secondary metabolic pathways can occur. With an increasing knowledge base that is coupled with user-friendly and inexpensive technologies, biochemists are poised to accelerate the annotation of biochemical pathways relevant to human health, agriculture, and the environment. PMID:26224805

  4. Analysis of an Inactive Cyanobactin Biosynthetic Gene Cluster Leads to Discovery of New Natural Products from Strains of the Genus Microcystis

    PubMed Central

    Leikoski, Niina; Fewer, David P.; Jokela, Jouni; Alakoski, Pirita; Wahlsten, Matti; Sivonen, Kaarina

    2012-01-01

    Cyanobactins are cyclic peptides assembled through the cleavage and modification of short precursor proteins. An inactive cyanobactin gene cluster has been described from the genome of Microcystis aeruginosa NIES843. Here we report the discovery of active counterparts in strains of the genus Microcystis guided by this silent cyanobactin gene cluster. The end products of the gene clusters were structurally diverse cyclic peptides, which we named piricyclamides. Some of the piricyclamides consisted solely of proteinogenic amino acids while others contained disulfide bridges and some were prenylated or geranylated. The piricyclamide gene clusters encoded between 1 and 4 precursor genes. They encoded highly diverse core peptides ranging in length from 7–17 amino acids with just a single conserved amino acid. Heterologous expression of the pir gene cluster from Microcystis aeruginosa PCC7005 in Escherichia coli confirmed that this gene cluster is responsible for the biosynthesis of piricyclamides. Chemical analysis demonstrated that Microcystis strains could produce an array of piricyclamides, some of which are geranylated or prenylated. The genetic diversity of piricyclamides in a bloom sample was explored and 19 different piricyclamide precursor genes were found. This study provides evidence for a stunning array of piricyclamides in Microcystis, a bloom-forming cyanobacterium that occurs worldwide. PMID:22952627

  5. DISCOVERY OF SUB- TO SUPERLUMINAL MOTIONS IN THE M87 JET: AN IMPLICATION OF ACCELERATION FROM SUB-RELATIVISTIC TO RELATIVISTIC SPEEDS

    SciTech Connect

    Asada, Keiichi; Nakamura, Masanori; Inoue, Makoto; Doi, Akihiro; Nagai, Hiroshi

    2014-01-20

    The velocity field of the M87 jet from milli-arcsecond (mas) to arcsecond scales is extensively investigated together with new radio images taken from European VLBI Network (EVN) observations. We detected proper motions of components located between 160 mas from the core and the HST-1 complex for the first time. Newly derived velocity fields exhibit a systematic increase from sub- to superluminal speeds upstream of HST-1. If we assume that the observed velocities reflect the bulk flow, we suggest that the M87 jet may be gradually accelerated through a distance of 10^6 times the Schwarzschild radius of the supermassive black hole. The acceleration zone is co-spatial with the jet parabolic region, which is interpreted as the collimation zone of the jet. The acceleration and collimation take place simultaneously, which we suggest is characteristic of magnetohydrodynamic flows. The distribution of the velocity field has a peak at HST-1, which is considered as the site of over-collimation, and shows a deceleration downstream of HST-1 where the jet is conical. Our interpretation of the velocity map in the M87 jet provides a hypothesis for active galactic nuclei which suggests that the acceleration and collimation zone of relativistic jets extends over the whole scale within the sphere of influence of the supermassive black hole.

  6. Exome sequencing of senescence-accelerated mice (SAM) reveals deleterious mutations in degenerative disease-causing genes

    PubMed Central

    2013-01-01

    Background Senescence-accelerated mice (SAM) are a series of mouse strains originally derived from unexpected crosses between AKR/J and unknown mice, from which phenotypically distinct senescence-prone (SAMP) and -resistant (SAMR) inbred strains were subsequently established. Although SAMP strains have been widely used for aging research focusing on their short life spans and various age-related phenotypes, such as immune dysfunction, osteoporosis, and brain atrophy, the responsible gene mutations have not yet been fully elucidated. Results To identify mutations specific to SAMP strains, we performed whole exome sequencing of 6 SAMP and 3 SAMR strains. This analysis revealed 32,019 to 38,925 single-nucleotide variants in the coding region of each SAM strain. We detected Ogg1 p.R304W and Mbd4 p.D129N deleterious mutations in all 6 of the SAMP strains but not in the SAMR or AKR/J strains. Moreover, we extracted 31 SAMP-specific novel deleterious mutations. In all SAMP strains except SAMP8, we detected a p.R473W missense mutation in the Ldb3 gene, which has been associated with myofibrillar myopathy. In 3 SAMP strains (SAMP3, SAMP10, and SAMP11), we identified a p.R167C missense mutation in the Prx gene, in which mutations causing hereditary motor and sensory neuropathy (Dejerine-Sottas syndrome) have been identified. In SAMP6 we detected a p.S540fs frame-shift mutation in the Il4ra gene, a mutation potentially causative of ulcerative colitis and osteoporosis. Conclusions Our data indicate that different combinations of mutations in disease-causing genes may be responsible for the various phenotypes of SAMP strains. PMID:23586671

  7. MR1 discovery.

    PubMed

    Hashimoto, Keiichiro

    2016-08-01

    The moment of MR1 discovery is described. The MR1 gene is the first and the last reported human MHC-related gene intentionally isolated from the human genome composed of three billion base pairs. Evolutionary considerations formed the basis of its isolation. Some details surrounding the moment and some retrospective descriptions with various kinds of encounters are also included.

  8. Novel tools for accelerated materials discovery in the AFLOWLIB.ORG repository: breakthroughs and challenges in the mapping of the materials genome

    NASA Astrophysics Data System (ADS)

    Buongiorno Nardelli, Marco

    2015-03-01

    High-Throughput Quantum-Mechanics computation of materials properties by ab initio methods has become the foundation of an effective approach to materials design, discovery and characterization. This data driven approach to materials science currently presents the most promising path to the development of advanced technological materials that could solve or mitigate important social and economic challenges of the 21st century. In particular, the rapid proliferation of computational data on materials properties presents the possibility to complement and extend materials property databases where the experimental data is lacking and difficult to obtain. Enhanced repositories such as AFLOWLIB open novel opportunities for structure discovery and optimization, including uncovering of unsuspected compounds, metastable structures and correlations between various properties. The practical realization of these opportunities depends on the design of efficient algorithms for electronic structure simulations of realistic material systems, the systematic compilation and classification of the generated data, and its presentation in easily accessed form to the materials science community, the primary mission of the AFLOW consortium. This work was supported by ONR-MURI under Contract N00014-13-1-0635 and the Duke University Center for Materials Genomics.

  9. Microwave-Accelerated Metal-Enhanced Fluorescence (MAMEF) with silver colloids in 96-well plates: Application to ultra fast and sensitive immunoassays, High Throughput Screening and drug discovery.

    PubMed

    Aslan, Kadir; Holley, Patrick; Geddes, Chris D

    2006-05-30

    Fluorescence detection is the basis of most assays used in drug discovery and High Throughput Screening (HTS) today. In all of these assays, assay rapidity and sensitivity are primary concerns, the sensitivity determined by both the quantum yield of the fluorophores and efficiency of the detection system, while rapidity is determined by the physical and biophysical parameters of temperature, concentration, assay bioaffinity, etc. In this paper we describe a platform technology that promises to fundamentally address these two physical constraints of sensitivity and rapidity. By combining the use of Metal-Enhanced Fluorescence (MEF), a near-field effect that can significantly enhance fluorescence signatures, with low power microwave heating, we can significantly increase the sensitivity of surface assays as well as bring the assay to >95% kinetic completion within a few seconds. In addition, the metallic nanostructures used to facilitate MEF appear to be preferentially heated as compared to the surface assay fluid, advantageously localizing the MEF and heating around the nanostructures. To demonstrate proof of principle, a 96-well plate has been functionalized with silver nanostructures, and a model protein avidin-biotin assay studied. In our findings, a greater than 5-fold fluorescence enhancement coupled with an approximately 90-fold increase in assay kinetics was observed, but with no assay washing steps needed due to the silver-enhanced evanescent field mode of excitation. These findings promise to strongly facilitate high throughput fluorescence-based processes, such as in biology, drug discovery and general compound screening.

  10. Guided Discoveries.

    ERIC Educational Resources Information Center

    Ehrlich, Amos

    1991-01-01

    Presented are four mathematical discoveries made by students on an arithmetical function using the Fibonacci sequence. Discussed is the nature of the role of the teacher in directing the students' discovery activities. (KR)

  11. An integrative data analysis platform for gene set analysis and knowledge discovery in a data warehouse framework.

    PubMed

    Chen, Yi-An; Tripathi, Lokesh P; Mizuguchi, Kenji

    2016-01-01

    Data analysis is one of the most critical and challenging steps in drug discovery and disease biology. A user-friendly resource to visualize and analyse high-throughput data provides a powerful medium for both experimental and computational biologists to understand vastly different biological data types and obtain a concise, simplified and meaningful output for better knowledge discovery. We have previously developed TargetMine, an integrated data warehouse optimized for target prioritization. Here we describe how upgraded and newly modelled data types in TargetMine can now survey the wider biological and chemical data space, relevant to drug discovery and development. To enhance the scope of TargetMine from target prioritization to broad-based knowledge discovery, we have also developed a new auxiliary toolkit to assist with data analysis and visualization in TargetMine. This toolkit features interactive data analysis tools to query and analyse the biological data compiled within the TargetMine data warehouse. The enhanced system enables users to discover new hypotheses interactively by performing complicated searches with no programming and obtaining the results in an easy to comprehend output format. Database URL: http://targetmine.mizuguchilab.org.

  12. The Genetics of Obsessive-Compulsive Disorder and Tourette Syndrome: An Epidemiological and Pathway-Based Approach for Gene Discovery

    ERIC Educational Resources Information Center

    Grados, Marco A.

    2010-01-01

    Objective: To provide a contemporary perspective on genetic discovery methods applied to obsessive-compulsive disorder (OCD) and Tourette syndrome (TS). Method: A review of research trends in genetics research in OCD and TS is conducted, with emphasis on novel approaches. Results: Genome-wide association studies (GWAS) are now in progress in OCD…

  14. An integrative data analysis platform for gene set analysis and knowledge discovery in a data warehouse framework

    PubMed Central

    Chen, Yi-An; Tripathi, Lokesh P.; Mizuguchi, Kenji

    2016-01-01

    Data analysis is one of the most critical and challenging steps in drug discovery and disease biology. A user-friendly resource to visualize and analyse high-throughput data provides a powerful medium for both experimental and computational biologists to understand vastly different biological data types and obtain a concise, simplified and meaningful output for better knowledge discovery. We have previously developed TargetMine, an integrated data warehouse optimized for target prioritization. Here we describe how upgraded and newly modelled data types in TargetMine can now survey the wider biological and chemical data space, relevant to drug discovery and development. To enhance the scope of TargetMine from target prioritization to broad-based knowledge discovery, we have also developed a new auxiliary toolkit to assist with data analysis and visualization in TargetMine. This toolkit features interactive data analysis tools to query and analyse the biological data compiled within the TargetMine data warehouse. The enhanced system enables users to discover new hypotheses interactively by performing complicated searches with no programming and obtaining the results in an easy to comprehend output format. Database URL: http://targetmine.mizuguchilab.org PMID:26989145

  15. Discovery and identification of candidate sex-related genes based on transcriptome sequencing of Russian sturgeon (Acipenser gueldenstaedtii) gonads.

    PubMed

    Chen, Yadong; Xia, Yongtao; Shao, Changwei; Han, Lei; Chen, Xuejie; Yu, Mengjun; Sha, Zhenxia

    2016-07-01

    As the Russian sturgeon (Acipenser gueldenstaedtii) is an important food and is the main source of caviar, it is necessary to discover the genes associated with its sex differentiation. However, the complicated life and maturity cycles of the Russian sturgeon restrict the accurate identification of sex in early development. To generate a first look at specific sex-related genes, we sequenced the transcriptome of gonads in different development stages (1, 2, and 5 yr old stages) with next-generation RNA sequencing. We generated >60 million raw reads, and the filtered reads were assembled into 263,341 contigs, which produced 38,505 unigenes. Genes involved in signal transduction mechanisms were the most abundant, suggesting that development of sturgeon gonads is under control of signal transduction mechanisms. Differentially expressed gene analysis suggests that more genes for protein synthesis, cytochrome c oxidase subunits, and ribosomal proteins were expressed in female gonads than in male. Meanwhile, male gonads expressed more transposable element transposase, reverse transcriptase, and transposase-related genes than female. In total, 342, 782, and 7,845 genes were detected in intersex, male, and female transcriptomes, respectively. The female gonad expressed more genes than the male gonad, and more genes were involved in female gonadal development. Genes (sox9, foxl2) are differentially expressed in different sexes and may be important sex-related genes in Russian sturgeon. Sox9 genes are responsible for the development of male gonads and foxl2 for female gonads.

  16. Gene discovery within the planctomycete division of the domain Bacteria using sequence tags from genomic DNA libraries

    PubMed Central

    Jenkins, Cheryl; Kedar, Vishram; Fuerst, John A

    2002-01-01

    Background The planctomycetes comprise a distinct group of the domain Bacteria, forming a separate division by phylogenetic analysis. The organization of their cells into membrane-defined compartments including membrane-bounded nucleoids, their budding reproduction and complete absence of peptidoglycan distinguish them from most other Bacteria. A random sequencing approach was applied to the genomes of two planctomycete species, Gemmata obscuriglobus and Pirellula marina, to discover genes relevant to their cell biology and physiology. Results Genes with a wide variety of functions were identified in G. obscuriglobus and Pi. marina, including those of metabolism and biosynthesis, transport, regulation, translation and DNA replication, consistent with established phenotypic characters for these species. The genes sequenced were predominantly homologous to those in members of other divisions of the Bacteria, but there were also matches with nuclear genomic genes of the domain Eukarya, genes that may have appeared in the planctomycetes via horizontal gene transfer events. Significant among these matches are those with two genes atypical for Bacteria and with significant cell-biology implications - integrin alpha-V and inter-alpha-trypsin inhibitor protein - with homologs in G. obscuriglobus and Pi. marina respectively. Conclusions The random-sequence-tag approach applied here to G. obscuriglobus and Pi. marina is the first report of gene recovery and analysis from members of the planctomycetes using genome-based methods. Gene homologs identified were predominantly similar to genes of Bacteria, but some significant best matches to genes from Eukarya suggest that lateral gene transfer events between domains may have involved this division at some time during its evolution. PMID:12093378

  17. Discovery and evaluation of candidate sex-determining genes and xenobiotics in the gonads of lake sturgeon (Acipenser fulvescens).

    PubMed

    Hale, Matthew C; Jackson, James R; Dewoody, J Andrew

    2010-07-01

    Modern pyrosequencing has the potential to uncover many interesting aspects of genome evolution, even in lineages where genomic resources are scarce. In particular, 454 pyrosequencing of nonmodel species has been used to characterize expressed sequence tags, xenobiotics, gene ontologies, and relative levels of gene expression. Herein, we use pyrosequencing to study the evolution of genes expressed in the gonads of a polyploid fish, the lake sturgeon (Acipenser fulvescens). Using 454 pyrosequencing of transcribed genes, we produced more than 125 MB of sequence data from 473,577 high-quality sequencing reads. Sequences that passed stringent quality control thresholds were assembled into 12,791 male contigs and 32,629 female contigs. Average depth of coverage was 4.2x for the male assembly and 5.5x for the female assembly. Analytical rarefaction indicates that our assemblies include most of the genes expressed in lake sturgeon gonads. Over 86,700 sequencing reads were assigned gene ontologies, many to general housekeeping genes like protein, RNA, and ion binding genes. We searched specifically for sex determining genes and documented significant sex differences in the expression of two genes involved in animal sex determination, DMRT1 and TRA-1. DMRT1 is the master sex determining gene in birds and in medaka (Oryzias latipes) whereas TRA-1 helps direct sexual differentiation in nematodes. We also searched the lake sturgeon assembly for evidence of xenobiotic organisms that may exist as endosymbionts. Our results suggest that exogenous parasites (trematodes) and pathogens (protozoans) apparently have infected lake sturgeon gonads, and the trematodes have horizontally transferred some genes to the lake sturgeon genome.

  18. Enrichment of bacteria possessing catechol dioxygenase genes in the rhizosphere of Spirodela polyrrhiza: a mechanism of accelerated biodegradation of phenol.

    PubMed

    Toyama, Tadashi; Sei, Kazunari; Yu, Ning; Kumada, Hirohide; Inoue, Daisuke; Hoang, Hai; Soda, Satoshi; Chang, Young-Cheol; Kikuchi, Shintaro; Fujita, Masanori; Ike, Michihiko

    2009-08-01

    The bacterial community structure in bulk water and in rhizosphere fractions of giant duckweed, Spirodela polyrrhiza, was quantitatively and qualitatively investigated by PCR-based methods using 6 environmental water samples to elucidate the mechanisms underlying selective accumulation of aromatic compound-degrading bacteria in the rhizosphere of S. polyrrhiza. S. polyrrhiza selectively accumulated a diverse range of aromatic compound-degrading bacteria in its rhizosphere, regardless of the origin of water samples, despite no exposure to phenol. The relative abundances of the catechol 1,2-dioxygenase (C12O) gene (C12O DNA) and catechol 2,3-dioxygenase (C23O) gene (C23O DNA) were calculated as the ratios of the copy numbers of these genes to the copy number of 16S rDNA and are referred to as the rhizosphere effect (RE) value. The RE values for C12O DNA and C23O DNA were 1.0 × 10^1 to 9.3 × 10^3 and 1.7 × 10^2 to 1.5 × 10^4 times as high, respectively, in rhizosphere fractions as in bulk water fractions, and these higher values were associated with a notably higher sequence diversity of C12O DNA and C23O DNA. The RE values during phenol degradation were 3.6 × 10^0 to 4.3 × 10^2 and 2.2 × 10^0 to 1.7 × 10^2, respectively, indicating the ability of S. polyrrhiza to selectively accumulate aromatic compound-degrading bacteria in its rhizosphere during phenol degradation. The bacterial communities in the rhizosphere fractions differed from those in the bulk water fractions, and those in the bulk water fractions were notably affected by the rhizosphere bacterial communities. S. polyrrhiza released more than 100 types of phenolic compound into its rhizosphere as root exudates at the considerably high specific release rate of 1520 mg TOC and 214 mg phenolic compounds/d/g root (wet weight). This ability of S. polyrrhiza might result in the selective recruitment and accumulation of a diverse range of bacteria harboring genes encoding C12O and C23O, and the subsequent accelerated

  19. Prenatal Arsenic Exposure Alters Gene Expression in the Adult Liver to a Proinflammatory State Contributing to Accelerated Atherosclerosis

    PubMed Central

    States, J. Christopher; Singh, Amar V.; Knudsen, Thomas B.; Rouchka, Eric C.; Ngalame, Ntube O.; Arteel, Gavin E.; Piao, Yulan; Ko, Minoru S. H.

    2012-01-01

    The mechanisms by which environmental toxicants alter developmental processes predisposing individuals to adult onset chronic disease are not well-understood. Transplacental arsenic exposure promotes atherogenesis in apolipoprotein E-knockout (ApoE−/−) mice. Because the liver plays a central role in atherosclerosis, diabetes and metabolic syndrome, we hypothesized that accelerated atherosclerosis may be linked to altered hepatic development. This hypothesis was tested in ApoE−/− mice exposed to 49 ppm arsenic in utero from gestational day (GD) 8 to term. GD18 hepatic arsenic was 1.2 µg/g in dams and 350 ng/g in fetuses. The hepatic transcriptome was evaluated by microarray analysis to assess mRNA and microRNA abundance in control and exposed pups at postnatal day (PND) 1 and PND70. Arsenic exposure altered postnatal developmental trajectory of mRNA and microRNA profiles. We identified an arsenic exposure related 51-gene signature at PND1 and PND70 with several hubs of interaction (Hspa8, IgM and Hnf4a). Gene ontology (GO) annotation analyses indicated that pathways for gluconeogenesis and glycolysis were suppressed in exposed pups at PND1, and pathways for protein export, ribosome, antigen processing and presentation, and complement and coagulation cascades were induced by PND70. Promoter analysis of differentially-expressed transcripts identified enriched transcription factor binding sites and clustering to common regulatory sites. SREBP1 binding sites were identified in about 16% of PND70 differentially-expressed genes. Western blot analysis confirmed changes in the liver at PND70 that included increases of heat shock protein 70 (Hspa8) and active SREBP1. Plasma AST and ALT levels were increased at PND70. These results suggest that transplacental arsenic exposure alters developmental programming in fetal liver, leading to an enduring stress and proinflammatory response postnatally that may contribute to early onset of atherosclerosis. Genes containing

  20. GAMUT: GPU accelerated microRNA analysis to uncover target genes through CUDA-miRanda

    PubMed Central

    2014-01-01

    Background Non-coding sequences such as microRNAs have important roles in disease processes. Computational microRNA target identification (CMTI) is becoming increasingly important since traditional experimental methods for target identification pose many difficulties. These methods are time-consuming, costly, and often need guidance from computational methods to narrow down candidate genes anyway. However, most CMTI methods are computationally demanding, since they need to handle not only several million query microRNA and reference RNA pairs, but also several million nucleotide comparisons within each given pair. Thus, the need to perform microRNA identification at such large scale has increased the demand for parallel computing. Methods Although most CMTI programs (e.g., the miRanda algorithm) are based on a modified Smith-Waterman (SW) algorithm, the existing parallel SW implementations (e.g., CUDASW++ 2.0/3.0, SWIPE) are unable to meet this demand in CMTI tasks. We present CUDA-miRanda, a fast microRNA target identification algorithm that takes advantage of massively parallel computing on Graphics Processing Units (GPU) using NVIDIA's Compute Unified Device Architecture (CUDA). CUDA-miRanda specifically focuses on the local alignment of short (i.e., ≤ 32 nucleotides) sequences against longer reference sequences (e.g., 20K nucleotides). Moreover, the proposed algorithm is able to report multiple alignments (up to 191 top scores) and the corresponding traceback sequences for any given (query sequence, reference sequence) pair. Results Speeds over 5.36 Giga Cell Updates Per Second (GCUPs) are achieved on a server with 4 NVIDIA Tesla M2090 GPUs. Compared to the original miRanda algorithm, which is evaluated on an Intel Xeon E5620@2.4 GHz CPU, the experimental results show up to 166 times performance gains in terms of execution time. In addition, we have verified that the exact same targets were predicted in both CUDA-miRanda and the original mi
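
    The Smith-Waterman local alignment that the record above builds on can be sketched in a few lines. This is a plain CPU version with identity scoring and a linear gap penalty, not the modified complementarity scoring of miRanda and not the CUDA-miRanda implementation; the function name and the match, mismatch and gap values are illustrative assumptions.

```python
def smith_waterman(query, reference, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_query, aligned_reference) for one local alignment.

    Scoring values are illustrative assumptions, not miRanda's parameters.
    """
    rows, cols = len(query) + 1, len(reference) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic-programming matrix; each cell update is one "cell update"
    # in the GCUPS sense used to report throughput.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if query[i - 1] == reference[j - 1] else mismatch
            cell = max(
                0,                          # local alignment may restart anywhere
                score[i - 1][j - 1] + sub,  # align the two characters
                score[i - 1][j] + gap,      # gap in the reference
                score[i][j - 1] + gap,      # gap in the query
            )
            score[i][j] = cell
            if cell > best:
                best, best_pos = cell, (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    aligned_q, aligned_r = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if query[i - 1] == reference[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            aligned_q.append(query[i - 1])
            aligned_r.append(reference[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            aligned_q.append(query[i - 1])
            aligned_r.append("-")
            i -= 1
        else:
            aligned_q.append("-")
            aligned_r.append(reference[j - 1])
            j -= 1

    return best, "".join(reversed(aligned_q)), "".join(reversed(aligned_r))


if __name__ == "__main__":
    print(smith_waterman("ACGT", "TTACGTTT"))
```

    The nested loop touches every (query, reference) cell once, which is why the cost grows with the product of the two sequence lengths and why the record reports throughput in cell updates per second.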

  1. The discovery of quarks.

    PubMed

    Riordan, M

    1992-05-29

    Quarks are widely recognized today as being among the elementary particles of which matter is composed. The key evidence for their existence came from a series of inelastic electron-nucleon scattering experiments conducted between 1967 and 1973 at the Stanford Linear Accelerator Center. Other theoretical and experimental advances of the 1970s confirmed this discovery, leading to the present standard model of elementary particle physics.

  2. Immune gene discovery by expressed sequence tag (EST) analysis of hemocytes in the ridgetail white prawn Exopalaemon carinicauda

    PubMed Central

    Duan, Yafei; Liu, Ping; Li, Jitao; Li, Jian; Chen, Ping

    2013-01-01

    The ridgetail white prawn Exopalaemon carinicauda is one of the most important commercial species in eastern China. However, little information on immune genes in E. carinicauda has been reported. To identify distinctive genes associated with immunity, an expressed sequence tag (EST) library was constructed from hemocytes of E. carinicauda. A total of 3411 clones were sequenced, yielding 2853 ESTs with an average sequence length of 436 bp. The cluster and assembly analysis yielded 1053 unique sequences including 329 contigs and 724 singletons. Blast analysis identified 593 (56.3%) of the unique sequences as orthologs of genes from other organisms (E-value < 1e-5). Based on the COG and Gene Ontology (GO), 593 unique sequences were classified. Through comparison with previous studies, 153 genes assembled from 367 ESTs have been identified as possibly involved in defense or immune functions. These genes are categorized into seven categories according to their putative functions in shrimp immune system: antimicrobial peptides, prophenoloxidase activating system, antioxidant defense systems, chaperone proteins, clottable proteins, pattern recognition receptors and other immune-related genes. According to EST abundance, the major immune-related genes were thioredoxin (141, 4.94% of all ESTs) and calmodulin (14, 0.49% of all ESTs). The EST sequences of E. carinicauda hemocytes provide important information on the immune system and lay the groundwork for development of molecular markers related to disease resistance in prawn species. PMID:23092732
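
    The percentages quoted in the record above follow directly from its counts, and a short check makes the denominators explicit. The variable names and the `percent` helper are illustrative assumptions; only the counts come from the abstract.

```python
# Counts quoted in the abstract.
ests = 2853              # ESTs obtained from the sequenced clones
contigs = 329
singletons = 724
unique_sequences = 1053  # contigs + singletons
annotated = 593          # unique sequences with a Blast ortholog
thioredoxin_ests = 141
calmodulin_ests = 14


def percent(part, whole, digits):
    """Percentage of `whole` represented by `part`, rounded to `digits` places."""
    return round(100.0 * part / whole, digits)


summary = {
    # Unique sequences are the contigs plus the singletons.
    "unique_check": contigs + singletons == unique_sequences,
    # The annotation rate uses unique sequences as the denominator.
    "annotated_pct": percent(annotated, unique_sequences, 1),
    # The abundance figures use all ESTs as the denominator.
    "thioredoxin_pct": percent(thioredoxin_ests, ests, 2),
    "calmodulin_pct": percent(calmodulin_ests, ests, 2),
}

if __name__ == "__main__":
    for name, value in summary.items():
        print(name, value)
```

    Note the two different denominators: the 56.3% figure is relative to the 1053 unique sequences, while the thioredoxin and calmodulin figures are relative to all 2853 ESTs.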

  3. Coupled Transcriptome and Proteome Analysis of Human Lymphotropic Tumor Viruses: Insights on the Detection and Discovery of Viral Genes

    SciTech Connect

    Dresang, Lindsay R.; Teuton, Jeremy R.; Feng, Huichen; Jacobs, Jon M.; Camp, David G.; Purvine, Samuel O.; Gritsenko, Marina A.; Li, Zhihua; Smith, Richard D.; Sugden, Bill; Moore, Patrick S.; Chang, Yuan

    2011-12-20

    Kaposi's sarcoma-associated herpesvirus (KSHV) and Epstein-Barr virus (EBV) are related human tumor viruses that cause primary effusion lymphomas (PEL) and Burkitt's lymphomas (BL), respectively. Viral genes expressed in naturally-infected cancer cells contribute to disease pathogenesis; knowing which viral genes are expressed is critical in understanding how these viruses cause cancer. To evaluate the expression of viral genes, we used high-resolution separation and mass spectrometry coupled with custom tiling arrays to align the viral proteomes and transcriptomes of three PEL and two BL cell lines under latent and lytic culture conditions. Results: The majority of viral genes were efficiently detected at the transcript and/or protein level on manipulating the viral life cycle. Overall the correlation of expressed viral proteins and transcripts was highly complementary in both validating and providing orthogonal data with latent/lytic viral gene expression. Our approach also identified novel viral genes in both KSHV and EBV, and extends viral genome annotation. Several previously uncharacterized genes were validated at both transcript and protein levels. Conclusions: This systems biology approach coupling proteome and transcriptome measurements provides a comprehensive view of viral gene expression that could not have been attained using each methodology independently. Detection of viral proteins in combination with viral transcripts is a potentially powerful method for establishing virus-disease relationships.

  4. The discovery of the two types of small subunit ribosomal RNA gene in Eimeria mitis contests the existence of E. mivati as an independent species.

    PubMed

    Vrba, Vladimir; Poplstein, Martin; Pakandl, Michal

    2011-12-29

    Although the validity of the coccidian species Eimeria mivati has been questioned by many researchers for a long time, there has not been any molecular analysis that would help resolve this issue. Here we report on the discovery of two types of small ribosomal subunit (18S) gene within the Eimeria mitis genome that correspond to the known 18S sequences of E. mitis and E. mivati, and this is in conflict with the existence of E. mivati as an independent species. We have carried out five single oocyst isolations to obtain five single-oocyst-derived strains of E. mitis, and these were analyzed by sequencing the 18S and mitochondrial cytochrome c oxidase subunit I genes. The two types of 18S gene were found to be present in each strain in roughly equal ratios. This indicates that if strains carrying only one or the other 18S type exist, they will likely cross-breed and still represent a single species. However, the more probable explanation is that all strains of E. mitis contain two types of 18S gene and that the occasional detection of only one or the other type by sequencing might be caused by insufficient sampling. This is also the first report of two types of 18S gene in Eimeria, a feature that has already been described in some other apicomplexan species, most notably Plasmodium. We also found that these two types of ribosomal RNA differ significantly in their secondary structure. The biological significance of the two 18S gene variants in E. mitis is not known; however, we hypothesize that these variants might be used in different stages of the parasite's life cycle, as they are in other apicomplexan species investigated so far.

  5. Normalized lmQCM: An Algorithm for Detecting Weak Quasi-Cliques in Weighted Graph with Applications in Gene Co-Expression Module Discovery in Cancers

    PubMed Central

    Zhang, Jie; Huang, Kun

    2014-01-01

    In this paper, we present a new approach for mining weighted networks to identify densely connected modules such as quasi-cliques. Quasi-cliques are densely connected subnetworks in a network. Detecting quasi-cliques is an important topic in data mining, with applications such as social network study and biomedicine. Our approach has two major improvements upon previous work. The first is the use of local maximum edges to initialize the search in order to avoid excessive overlaps among the modules, thereby greatly reducing the computing time. The second is the inclusion of a weight normalization procedure to enable discovery of “subtle” modules with more balanced sizes. We carried out careful tests on multiple parameters and settings using two large cancer datasets. This approach allowed us to identify a large number of gene modules enriched in both biological functions and chromosomal bands in cancer data, suggesting potential roles of copy number variations (CNVs) in cancer development. We then tested the genes in selected modules with enriched chromosomal bands using The Cancer Genome Atlas data, and the results strongly support our hypothesis that the coexpression in these modules is associated with CNVs. While gene coexpression network analyses have been widely adopted in disease studies, most of them focus on the functional relationships of coexpressed genes. The relationship between coexpression gene modules and CNVs has been much less investigated despite the potential advantage that we can infer from such a relationship without genotyping data. Our new approach thus provides a means to carry out deep mining of the gene coexpression network to obtain both functional and genetic information from the expression data. PMID:27486298
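
    The abstract does not define a "local maximum edge". The sketch below assumes one plausible reading: an edge is a local maximum when no other edge touching either of its endpoints is heavier. It illustrates only the seed-selection idea, not the authors' implementation, and it omits the weight normalization and module-growing steps entirely. The function name and the toy graph are hypothetical.

```python
from collections import defaultdict


def local_maximum_edges(edges):
    """Return the edges that are at least as heavy as every edge sharing an endpoint.

    `edges` is an iterable of (u, v, weight) tuples for an undirected graph.
    The definition of "local maximum" used here is an assumption (see lead-in).
    """
    edges = list(edges)
    # Heaviest incident edge weight seen so far for each node.
    heaviest = defaultdict(lambda: float("-inf"))
    for u, v, w in edges:
        heaviest[u] = max(heaviest[u], w)
        heaviest[v] = max(heaviest[v], w)
    # Keep an edge only if it is the heaviest one at both of its endpoints.
    return [(u, v, w) for u, v, w in edges if w >= heaviest[u] and w >= heaviest[v]]


# Toy weighted graph (hypothetical co-expression weights).
toy_edges = [
    ("A", "B", 0.9),
    ("B", "C", 0.4),
    ("C", "D", 0.8),
    ("D", "E", 0.3),
    ("A", "C", 0.5),
]
seeds = local_maximum_edges(toy_edges)
print(seeds)
```

    In the toy graph, only the A-B and C-D edges qualify as seeds, so a search started from them begins in two separate neighborhoods rather than from every heavy edge at once.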

  6. p53 Haploinsufficiency Profoundly Accelerates the Onset of Tongue Tumors in Mice Lacking the Xeroderma Pigmentosum Group A Gene

    PubMed Central

    Ide, Fumio; Kitada, Munenori; Sakashita, Hideaki; Kusama, Kaoru; Tanaka, Kiyoji; Ishikawa, Takatoshi

    2003-01-01

    Mice lacking the xeroderma pigmentosum group A gene (XPA−/− mice), which have a complete deficiency in nucleotide excision repair (NER), are highly predisposed to tongue squamous cell carcinoma (SCC) when exposed to 4-nitroquinoline 1-oxide (4NQO). To explore the effects of the interaction of the NER machinery with p53 in oral tumorigenesis, we generated an XPA−/− mouse strain carrying mutant alleles for p53. This mouse model of 4NQO carcinogenesis demonstrated that despite the same tumor frequency, XPA−/−p53+/− mice reached 100% SCC incidence at 25 weeks compared with 50 weeks for XPA−/−p53+/+ littermates. XPA−/−p53−/− mice succumbed to spontaneous thymic lymphomas before the development of tongue tumors (before 13 weeks of age). SCC originated in XPA−/−p53+/− mice maintained the p53+/− genotype and the retained wild-type p53 allele appeared to be structurally intact. Only one of 20 XPA−/−p53+/+ SCC showed a missense mutation of p53. Collectively, the accelerated tongue tumor growth may be a consequence of haploinsufficiency but not of mutation of p53 in the context of NER deficiency. PMID:14578172

  8. Discovery of genes related to witches broom disease in Paulownia tomentosa × Paulownia fortunei by a De Novo assembled transcriptome.

    PubMed

    Liu, Rongning; Dong, Yanpeng; Fan, Guoqiang; Zhao, Zhenli; Deng, Minjie; Cao, Xibing; Niu, Suyan

    2013-01-01

    In spite of its economic importance, very little molecular genetics and genomic research has been targeted at the family Paulownia spp. The scarcity of genetic information on this plant is a major obstacle to studying the mechanisms of its ability to resist Paulownia Witches' Broom (PaWB) disease. Analysis of the Paulownia transcriptome and its expression profile data is essential to extending the genetic resources on this species and will thus greatly improve our studies on Paulownia. In the current study, we performed the de novo assembly of a transcriptome of P. tomentosa × P. fortunei using short-read sequencing technology (Illumina). A total of 203,664 unigenes with a mean length of 1,328 bp were obtained. Of these unigenes, 32,976 (30% of all unigenes) containing complete structures were chosen. Eukaryotic clusters of orthologous groups, gene orthology, and Kyoto Encyclopedia of Genes and Genomes annotations were performed for these unigenes. Genes related to PaWB disease resistance were analyzed in detail. To our knowledge, this is the first study to elucidate the genetic makeup of Paulownia. This transcriptome provides a quick way to understand Paulownia, increases the number of gene sequences available for further functional genomics studies and provides clues to the identification of potential PaWB disease resistance genes. This study has provided a comprehensive insight into gene expression profiles at different states, which facilitates the study of each gene's roles in the developmental process and in PaWB disease resistance. PMID:24278262

  9. Pigmentation in sand pear (Pyrus pyrifolia) fruit: biochemical characterization, gene discovery and expression analysis with exocarp pigmentation mutant.

    PubMed

    Wang, Yue-zhi; Zhang, Shujun; Dai, Mei-song; Shi, Ze-bin

    2014-05-01

    Exocarp color of sand pear is an important trait for fruit production and has long been of concern to us. Our previous study explored the differentially expressed genes between two genotypes contrasting for exocarp color, which indicated differences in suberin, cutin, wax and lignin biosynthesis between the russet and green exocarp. In this study, we carried out microscopic observation and Fourier transform infrared spectroscopy analysis to detect the differences in tissue structure and biochemical composition between the russet and green exocarp of sand pear. The green exocarp was covered with epidermis and cuticle, which were replaced by a cork layer on the surface of the russet exocarp, and the chemicals of the russet exocarp were characterized by lignin, cellulose and hemicellulose. We explored differential gene expression between the russet exocarp of 'Niitaka' and its green exocarp mutant cv. 'Suisho' using Illumina RNA-sequencing. A total of 559 unigenes showed different expression between the two types of exocarp, and 123 of them were common to the previous study. The quantitative real time-PCR analysis supported the RNA-seq-derived genes with different expression between the two types of exocarp and revealed the preferential expression of these genes in exocarp rather than in mesocarp and fruit core. Gene ontology enrichment analysis revealed divorced expression of lipid metabolic process genes, transport genes, stress responsive genes and other biological process genes in the two types of exocarp. Expression changes in lignin metabolism-related genes were consistent with the different pigmentation of russet and green exocarp. Increased transcripts of putative genes involved in suberin, cutin and wax biosynthesis in 'Suisho' exocarp could facilitate deposition of the chemicals and play a role in the mutant trait responsible for the green exocarp. In addition, the divorced expression of ATP-binding cassette transporters involved in the trans

  10. GeneFizz: A web tool to compare genetic (coding/non-coding) and physical (helix/coil) segmentations of DNA sequences. Gene discovery and evolutionary perspectives.

    PubMed

    Yeramian, Edouard; Jones, Louis

    2003-07-01

    The GeneFizz (http://pbga.pasteur.fr/GeneFizz) web tool permits the direct comparison between two types of segmentations for DNA sequences (possibly annotated): the coding/non-coding segmentation associated with genomic annotations (simple genes or exons in split genes) and the physics-based structural segmentation between helix and coil domains (as provided by the classical helix-coil model). There appears to be a varying degree of coincidence for different genomes between the two types of segmentations, from almost perfect to non-relevant. Following these two extremes, GeneFizz can be used for two purposes: ab initio physics-based identification of new genes (as recently shown for Plasmodium falciparum) or the exploration of possible evolutionary signals revealed by the discrepancies observed between the two types of information.

  11. Biomarker discovery to improve prediction of breast cancer survival: using gene expression profiling, meta-analysis, and tissue validation

    PubMed Central

    Meng, Liwei; Xu, Yingchun; Xu, Chaoyang; Zhang, Wei

    2016-01-01

    Purpose: Breast cancer is the leading cause of cancer death worldwide in women. The molecular mechanism for human breast cancer is unknown. Gene microarray has been widely used in breast cancer research to identify clinically relevant molecular subtypes as well as to predict prognosis survival. So far, the valuable multigene signatures in clinical practice are unclear, and the biological importance of individual genes is difficult to detect, as the described signatures virtually do not overlap. Early prognosis of this disease, breast invasive ductal carcinoma (IDC) and breast ductal carcinoma in situ (DCIS), is vital in breast surgery. Methods: Thus, this study reports gene expression profiling in large breast cancer cohorts from Gene Expression Omnibus, including GSE29044 (N=138) and GSE10780 (N=185) test series and four independent validation series GSE21653 (N=266), GSE20685 (N=327), GSE26971 (N=276), and GSE12776 (N=204). Significantly differentially expressed genes in human breast IDC and breast DCIS were detected by transcriptome microarray analysis. Results: We created a set of three genes (MAMDC2, TSHZ2, and CLDN11) that were significantly correlated with disease-free survival of breast cancer patients using a univariate Cox regression model (significance level P<0.01) in a meta-analysis. Based on the risk score of the three genes, the test series patients could be separated into low-risk and high-risk groups with significantly different survival times. This signature was validated in the other three cohorts. The prognostic value of this three-gene signature was confirmed in the internal validation series and another four independent breast cancer data sets. The prognostic impact of one of the three genes, CLDN11, was confirmed by immunohistochemistry. CLDN11 was significantly overexpressed in human breast IDC as compared with normal breast tissues and breast DCIS. Conclusion: Using novel gene expression profiling together with a meta-analysis validation

  12. TLR and IMD signaling pathways from Caligus rogercresseyi (Crustacea: Copepoda): in silico gene expression and SNPs discovery.

    PubMed

    Valenzuela-Muñoz, V; Gallardo-Escárate, C

    2014-02-01

    The Toll and IMD signaling pathways represent one of the first lines of innate immune defense in invertebrates like Drosophila. However, for crustaceans like Caligus rogercresseyi, there is very little genomic information and, consequently, little understanding of immune mechanisms. Massive sequencing data obtained for three developmental stages of C. rogercresseyi were used to evaluate in silico the expression patterns and presence of SNP variants in genes involved in the Toll and IMD pathways. Through RNA-seq analysis, which used 20 contigs corresponding to relevant genes of the Toll and IMD pathways, overexpression of genes linked to the Toll pathway, such as toll3 and Dorsal, was observed in the copepod stage. For the chalimus and adult stages, overexpression of genes in both pathways, such as Akirin and Tollip and IAP and Toll9, respectively, was observed. On the other hand, PCA statistical analysis inferred that in the chalimus and adult stages, the immune response mechanism was more developed, as evidenced by a relation between these two stages and the genes of both pathways. Moreover, 136 SNPs were identified for 20 contigs in genes of the Toll and IMD pathways. This study provides transcriptomic information about the immune response mechanisms of Caligus, thus providing a foundation for the development of new control strategies through blocking the innate immune response. PMID:24389530

  13. Sleeping Beauty Transposon Mutagenesis as a Tool for Gene Discovery in the NOD Mouse Model of Type 1 Diabetes.

    PubMed

    Elso, Colleen M; Chu, Edward P F; Alsayb, May A; Mackin, Leanne; Ivory, Sean T; Ashton, Michelle P; Bröer, Stefan; Silveira, Pablo A; Brodnicki, Thomas C

    2015-10-04

    A number of different strategies have been used to identify genes for which genetic variation contributes to type 1 diabetes (T1D) pathogenesis. Genetic studies in humans have identified >40 loci that affect the risk for developing T1D, but the underlying causative alleles are often difficult to pinpoint or have subtle biological effects. A complementary strategy to identifying "natural" alleles in the human population is to engineer "artificial" alleles within inbred mouse strains and determine their effect on T1D incidence. We describe the use of the Sleeping Beauty (SB) transposon mutagenesis system in the nonobese diabetic (NOD) mouse strain, which harbors a genetic background predisposed to developing T1D. Mutagenesis in this system is random, but a green fluorescent protein (GFP)-polyA gene trap within the SB transposon enables early detection of mice harboring transposon-disrupted genes. The SB transposon also acts as a molecular tag to, without additional breeding, efficiently identify mutated genes and prioritize mutant mice for further characterization. We show here that the SB transposon is functional in NOD mice and can produce a null allele in a novel candidate gene that increases diabetes incidence. We propose that SB transposon mutagenesis could be used as a complementary strategy to traditional methods to help identify genes that, when disrupted, affect T1D pathogenesis.

  14. Sleeping Beauty Transposon Mutagenesis as a Tool for Gene Discovery in the NOD Mouse Model of Type 1 Diabetes

    PubMed Central

    Elso, Colleen M.; Chu, Edward P. F.; Alsayb, May A.; Mackin, Leanne; Ivory, Sean T.; Ashton, Michelle P.; Bröer, Stefan; Silveira, Pablo A.; Brodnicki, Thomas C.

    2015-01-01

    A number of different strategies have been used to identify genes for which genetic variation contributes to type 1 diabetes (T1D) pathogenesis. Genetic studies in humans have identified >40 loci that affect the risk for developing T1D, but the underlying causative alleles are often difficult to pinpoint or have subtle biological effects. A complementary strategy to identifying “natural” alleles in the human population is to engineer “artificial” alleles within inbred mouse strains and determine their effect on T1D incidence. We describe the use of the Sleeping Beauty (SB) transposon mutagenesis system in the nonobese diabetic (NOD) mouse strain, which harbors a genetic background predisposed to developing T1D. Mutagenesis in this system is random, but a green fluorescent protein (GFP)-polyA gene trap within the SB transposon enables early detection of mice harboring transposon-disrupted genes. The SB transposon also acts as a molecular tag to, without additional breeding, efficiently identify mutated genes and prioritize mutant mice for further characterization. We show here that the SB transposon is functional in NOD mice and can produce a null allele in a novel candidate gene that increases diabetes incidence. We propose that SB transposon mutagenesis could be used as a complementary strategy to traditional methods to help identify genes that, when disrupted, affect T1D pathogenesis. PMID:26438296

  15. Sleeping Beauty Transposon Mutagenesis as a Tool for Gene Discovery in the NOD Mouse Model of Type 1 Diabetes.

    PubMed

    Elso, Colleen M; Chu, Edward P F; Alsayb, May A; Mackin, Leanne; Ivory, Sean T; Ashton, Michelle P; Bröer, Stefan; Silveira, Pablo A; Brodnicki, Thomas C

    2015-12-01

    A number of different strategies have been used to identify genes for which genetic variation contributes to type 1 diabetes (T1D) pathogenesis. Genetic studies in humans have identified >40 loci that affect the risk for developing T1D, but the underlying causative alleles are often difficult to pinpoint or have subtle biological effects. A complementary strategy to identifying "natural" alleles in the human population is to engineer "artificial" alleles within inbred mouse strains and determine their effect on T1D incidence. We describe the use of the Sleeping Beauty (SB) transposon mutagenesis system in the nonobese diabetic (NOD) mouse strain, which harbors a genetic background predisposed to developing T1D. Mutagenesis in this system is random, but a green fluorescent protein (GFP)-polyA gene trap within the SB transposon enables early detection of mice harboring transposon-disrupted genes. The SB transposon also acts as a molecular tag to, without additional breeding, efficiently identify mutated genes and prioritize mutant mice for further characterization. We show here that the SB transposon is functional in NOD mice and can produce a null allele in a novel candidate gene that increases diabetes incidence. We propose that SB transposon mutagenesis could be used as a complementary strategy to traditional methods to help identify genes that, when disrupted, affect T1D pathogenesis. PMID:26438296

  16. Discovery and replication of dopamine-related gene effects on caudate volume in young and elderly populations (N=1198) using genome-wide search

    PubMed Central

    Stein, Jason L.; Hibar, Derrek P.; Madsen, Sarah K.; Khamis, Mathew; McMahon, Katie L.; de Zubicaray, Greig I.; Hansell, Narelle K.; Montgomery, Grant W.; Martin, Nicholas G.; Wright, Margaret J.; Saykin, Andrew J.; Jack, Clifford R.; Weiner, Michael W.; Toga, Arthur W.; Thompson, Paul M.

    2011-01-01

    The caudate is a subcortical brain structure implicated in many common neurological and psychiatric disorders. To identify specific genes associated with variations in caudate volume, structural MRI and genome-wide genotypes were acquired from two large cohorts, the Alzheimer’s Disease NeuroImaging Initiative (ADNI; N=734) and the Brisbane Adolescent/Young Adult Longitudinal Twin Study (BLTS; N=464). In a preliminary analysis of heritability, around 90% of the variation in caudate volume was due to genetic factors. We then conducted genome-wide association to find common variants that contribute to this relatively high heritability. Replicated genetic association was found for the right caudate volume at SNP rs163030 in the ADNI discovery sample (P=2.36×10⁻⁶) and in the BLTS replication sample (P=0.012). This genetic variation accounted for 2.79% and 1.61% of the trait variance, respectively. The peak of association was found in and around two genes, WDR41 and PDE8B, involved in dopamine signaling and development. In addition, a previously identified mutation in PDE8B causes a rare autosomal-dominant type of striatal degeneration. Searching across both samples offers a rigorous way to screen for genes consistently influencing brain structure at different stages of life. Variants identified here may be relevant to common disorders affecting the caudate. PMID:21502949

  17. Discovery and replication of dopamine-related gene effects on caudate volume in young and elderly populations (N=1198) using genome-wide search.

    PubMed

    Stein, J L; Hibar, D P; Madsen, S K; Khamis, M; McMahon, K L; de Zubicaray, G I; Hansell, N K; Montgomery, G W; Martin, N G; Wright, M J; Saykin, A J; Jack, C R; Weiner, M W; Toga, A W; Thompson, P M

    2011-09-01

    The caudate is a subcortical brain structure implicated in many common neurological and psychiatric disorders. To identify specific genes associated with variations in caudate volume, structural magnetic resonance imaging and genome-wide genotypes were acquired from two large cohorts, the Alzheimer's Disease NeuroImaging Initiative (ADNI; N=734) and the Brisbane Adolescent/Young Adult Longitudinal Twin Study (BLTS; N=464). In a preliminary analysis of heritability, around 90% of the variation in caudate volume was due to genetic factors. We then conducted genome-wide association to find common variants that contribute to this relatively high heritability. Replicated genetic association was found for the right caudate volume at single-nucleotide polymorphism rs163030 in the ADNI discovery sample (P=2.36 × 10⁻⁶) and in the BLTS replication sample (P=0.012). This genetic variation accounted for 2.79 and 1.61% of the trait variance, respectively. The peak of association was found in and around two genes, WDR41 and PDE8B, involved in dopamine signaling and development. In addition, a previously identified mutation in PDE8B causes a rare autosomal-dominant type of striatal degeneration. Searching across both samples offers a rigorous way to screen for genes consistently influencing brain structure at different stages of life. Variants identified here may be relevant to common disorders affecting the caudate. PMID:21502949

  18. The first set of expressed sequence tags (EST) from the medicinal mushroom Agaricus subrufescens delivers resource for gene discovery and marker development.

    PubMed

    Foulongne-Oriol, Marie; Lapalu, Nicolas; Férandon, Cyril; Spataro, Cathy; Ferrer, Nathalie; Amselem, Joelle; Savoie, Jean-Michel

    2014-09-01

    Agaricus subrufescens is one of the most important culinary-medicinal cultivable mushrooms with potentially high-added-value products and extended agronomical valorization. The development of A. subrufescens-related technologies is hampered by, among others, the lack of suitable molecular tools. Thus, this mushroom is considered as a genomic orphan species with a very limited number of available molecular markers or sequences. To fill this gap, this study reports the generation and analysis of the first set of expressed sequence tags (EST) for A. subrufescens. cDNA fragments obtained from young sporophores (SP) and vegetative mycelium in liquid culture (CL) were sequenced using 454 pyrosequencing technology. After assembly process, 4,989 and 5,125 sequences were obtained in SP and CL libraries, respectively. About 87% of the EST had significant similarity with Agaricus bisporus-predicted proteins, and 79% correspond to known proteins. Functional categorization according to Gene Ontology could be assigned to 49% of the sequences. Some gene families potentially involved in bioactive compound biosynthesis could be identified. A total of 232 simple sequence repeats (SSRs) were identified, and a set of 40 EST-SSR polymorphic markers were successfully developed. This EST dataset provides a new resource for gene discovery and molecular marker development. It constitutes a solid basis for further genetic and genomic studies in A. subrufescens.

  19. A specific group of genes respond to cold dehydration stress in cut Alstroemeria flowers whereas ambient dehydration stress accelerates developmental senescence expression patterns.

    PubMed

    Wagstaff, Carol; Bramke, Irene; Breeze, Emily; Thornber, Sarah; Harrison, Elizabeth; Thomas, Brian; Buchanan-Wollaston, Vicky; Stead, Tony; Rogers, Hilary

    2010-06-01

    Petal development and senescence entails a normally irreversible process. It starts with petal expansion and pigment production, and ends with nutrient remobilization and ultimately cell death. In many species this is accompanied by petal abscission. Post-harvest stress is an important factor in limiting petal longevity in cut flowers and accelerates some of the processes of senescence such as petal wilting and abscission. However, some of the effects of moderate stress in young flowers are reversible with appropriate treatments. Transcriptomic studies have shown that distinct gene sets are expressed during petal development and senescence. Despite this, the overlap in gene expression between developmental and stress-induced senescence in petals has not been fully investigated in any species. Here a custom-made cDNA microarray from Alstroemeria petals was used to investigate the overlap in gene expression between developmental changes (bud to first sign of senescence) and typical post-harvest stress treatments. Young flowers were stressed by cold or ambient temperatures without water followed by a recovery and rehydration period. Stressed flowers were still at the bud stage after stress treatments. Microarray analysis showed that ambient dehydration stress accelerates many of the changes in gene expression patterns that would normally occur during developmental senescence. However, a higher proportion of gene expression changes in response to cold stress were specific to this stimulus and not senescence related. The expression of 21 transcription factors was characterized, showing that overlapping sets of regulatory genes are activated during developmental senescence and by different stresses.

  20. Transcriptome-based discovery of pathways and genes related to resistance against Fusarium head blight in wheat landrace Wangshuibai

    PubMed Central

    2013-01-01

    Background: Fusarium head blight (FHB), caused mainly by Fusarium graminearum (Fg) Schwabe (teleomorph: Gibberella zeae Schwble), causes serious damage to wheat production. The Chinese wheat landrace Wangshuibai is one of the most important resistance sources in the world. Knowledge of the mechanism underlying its resistance to FHB is still limited. Results: To get an overview of the transcriptome characteristics of Wangshuibai during infection by Fg, high-throughput RNA sequencing based on next generation sequencing (NGS) technology (Illumina) was performed. In total, 165,499 unigenes were generated and assigned to known protein databases including the NCBI non-redundant protein database (nr) (82,721, 50.0%), Gene Ontology (GO) (38,184, 23.1%), Swiss-Prot (50,702, 30.6%), Clusters of orthologous groups (COG) (51,566, 31.2%) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) (30,657, 18.5%), as determined by Blastx search. With another NGS-based platform, a digital gene expression (DGE) system, gene expression in Wangshuibai and its FHB-susceptible mutant NAUH117 was profiled and compared at two infection stages, 24 and 48 hours after inoculation with Fg, with the aim of identifying genes involved in FHB resistance. Conclusion: Pathogen-related proteins such as PR5, PR14 and ABC transporter and the JA signaling pathway were crucial for FHB resistance, especially that mediated by Fhb1. The ET pathway and ROS/NO pathway were not activated in Wangshuibai and may not be pivotal in defense against FHB. Consistent with the fact that NAUH117 carries a chromosome fragment deletion, which led to its increased FHB susceptibility, in Wangshuibai, twenty out of eighty-nine genes showed changed expression patterns upon infection by Fg. The up-regulation of eight of them was confirmed by qRT-PCR, indicating that they may be candidate genes for Fhb1 and need further functional analysis to confirm their roles in FHB resistance. PMID:23514540

  1. Generation of ESTs in Vitis vinifera wine grape (Cabernet Sauvignon) and table grape (Muscat Hamburg) and discovery of new candidate genes with potential roles in berry development.

    PubMed

    Peng, Fred Y; Reid, Karen E; Liao, Nancy; Schlosser, James; Lijavetzky, Diego; Holt, Robert; Martínez Zapater, José M; Jones, Steven; Marra, Marco; Bohlmann, Jörg; Lund, Steven T

    2007-11-01

    We report the generation and analysis of a total of 77,583 expressed sequence tags (ESTs) from two grapevine (Vitis vinifera L.) cultivars, Cabernet Sauvignon (wine grape) and Muscat Hamburg (table grape) with a focus on EST sequence quality and assembly optimization. The majority of the ESTs were derived from normalized cDNA libraries representing berry pericarp and seed developmental series, pooled non-berry tissues including root, flower, and leaf in Cabernet Sauvignon, and pooled tissues of berry, seed, and flower in Muscat Hamburg. EST and unigene sequence quality were determined by computational filtering coupled with small-scale contig reassembly, manual review, and BLAST analyses. EST assembly was optimized to better discriminate among closely related paralogs using two independent grape sequence sets, a previously published set of Vitis spp. gene families and our EST dataset derived from pooled leaf, flower, and root tissues of Cabernet Sauvignon. Sequence assembly within individual libraries indicated that those prepared from pooled tissues contributed the most to gene discovery. Annotations based upon searches against multiple databases including tomato and strawberry sequences helped to identify putative functions of ESTs and unigenes, particularly with respect to fleshy fruit development. Sequence comparison among the three wine grape libraries identified a number of genes preferentially expressed in the pericarp tissue, including transcription factors, receptor-like protein kinases, and hexose transporters. Gene ontology (GO) classification in the biological process aspect showed that GO categories corresponding to 'transport' and 'cell organization and biogenesis', which are associated with metabolite movement and cell wall structural changes during berry ripening, were higher in pericarp than in other tissues in the wine grape studied. The sequence data were used to characterize potential roles of new genes in berry development and composition. 

  2. Discovery of Novel Leaf Rust Responsive microRNAs in Wheat and Prediction of Their Target Genes

    PubMed Central

    Kumar, Dhananjay; Singh, Dharmendra; Kanodia, Pulkit; Prabhu, Kumble Vinod; Kumar, Manish; Mukhopadhyay, Kunal

    2014-01-01

    MicroRNAs are endogenous small noncoding RNAs which play critical roles in gene regulation. Few wheat (Triticum aestivum L.) miRNA sequences are available in the miRBase repertoire, and knowledge of their biological functions related to biotic stress is limited. We identified 52 miRNAs, belonging to 19 families, from next-generation transcriptome sequence data based on homology search. One wheat-specific novel miRNA was identified but could not be assigned to any known miRNA family. Twenty-two differentially expressed miRNAs were found between susceptible and resistant wheat near-isogenic lines (NILs) inoculated with the leaf rust pathogen Puccinia triticina and compared with mock-inoculated controls. Most miRNAs were more upregulated in the susceptible NIL than in the resistant NIL. We identified 1306 potential target genes for these 52 miRNAs with vital roles in response to stimuli, signaling, and diverse metabolic and cellular processes. Gene ontology analysis showed 66, 20, and 35 target genes to be categorized into biological process, molecular function, and cellular component, respectively. A miRNA-mediated regulatory network revealed relationships among the components of the targetome. The present study provides insight into potential miRNAs with probable roles in leaf rust pathogenesis and their target genes in wheat, which establishes a foundation for future studies. PMID:25180085

  3. A Population of Deletion Mutants and an Integrated Mapping and Exome-seq Pipeline for Gene Discovery in Maize

    PubMed Central

    Jia, Shangang; Li, Aixia; Morton, Kyla; Avoles-Kianian, Penny; Kianian, Shahryar F.; Zhang, Chi; Holding, David

    2016-01-01

    To better understand maize endosperm filling and maturation, we used γ-irradiation of the B73 maize reference line to generate mutants with opaque endosperm and reduced kernel fill phenotypes, and created a population of 1788 lines including 39 Mo17 × F2s showing stable, segregating, and viable kernel phenotypes. For molecular characterization of the mutants, we developed a novel functional genomics platform that combined bulked segregant RNA and exome sequencing (BSREx-seq) to map causative mutations and identify candidate genes within mapping intervals. To exemplify the utility of the mutants and provide proof-of-concept for the bioinformatics platform, we present detailed characterization of line 937, an opaque mutant harboring a 6203 bp in-frame deletion covering six exons within the Opaque-1 gene. In addition, we describe mutant line 146, which contains a 4.8 kb intragenic deletion within the Sugary-1 gene, and line 916, in which an 8.6 kb deletion knocks out a Cyclin A2 gene. The publicly available algorithm developed in this work improves the identification of causative deletions and their corresponding gaps within mapping peaks. This study demonstrates the utility of γ-irradiation for forward genetics in large nondense genomes such as maize, since deletions often affect single genes. Furthermore, we show how this classical mutagenesis method becomes applicable for functional genomics when combined with state-of-the-art genomics tools. PMID:27261000

  4. Anti-cancer Parasporin Toxins are Associated with Different Environments: Discovery of Two Novel Parasporin 5-like Genes.

    PubMed

    Ammons, David R; Short, John D; Bailey, Jeffery; Hinojosa, Gabriela; Tavarez, Lourdes; Salazar, Martha; Rampersad, Joanne N

    2016-02-01

    Cry toxins are primarily a family of insecticidal toxins produced by the bacterium Bacillus thuringiensis (Bt). However, some Cry toxins, called parasporins (PSs), are non-insecticidal and have been shown to differentially kill human cancer cells. Based on amino acid homology, there are currently six different classes of parasporins (PS1-6). It is not known what role parasporins play in nature, nor whether certain PSs are associated with Bt found in particular environments. Herein, we present ten parasporin-containing isolates of Bt from the Caribbean island of Trinidad. Genes coding for PS1 and PS6 were found in isolates associated mainly with artificial aquatic environments (e.g., barrels with rain water), while Bt possessing two novel PS5-like genes (ps5-1 and ps5-2) were isolated from manure collected directly from the rectum of cattle. The amino acid sequences inferred from the two PS5-like genes were 51% homologous to each other, while being only 41 or 45% similar to PS5Aa1/Cry64Aa, the only reported member of the parasporin 5 class. The low level of amino acid homology between the two PS5-like genes and PS5Aa1 indicates that the two PS5-like genes may represent a new class of parasporins, or greatly expand the level of diversity within the current parasporin 5 class. PMID:26563301

  6. Breast Cancer Biomarker Discovery in the Functional Genomic Age: A Systematic Review of 42 Gene Expression Signatures

    PubMed Central

    Abba, M.C; Lacunza, E; Butti, M; Aldaz, C.M

    2010-01-01

    In this review we provide a systematic analysis of transcriptomic signatures derived from 42 breast cancer gene expression studies, in an effort to identify the most relevant breast cancer biomarkers using a meta-analysis method. Meta-data revealed a set of 117 genes that were the most commonly affected, ranging from 12% to 36% of overlap among breast cancer gene expression studies. Data mining analysis of transcripts and protein-protein interactions of these commonly modulated genes indicates three functional modules significantly affected among signatures: one module related to the response to steroid hormone stimulus, and two modules related to the cell cycle. Analysis of publicly available gene expression data showed that the obtained meta-signature is capable of predicting overall survival (P < 0.0001) and relapse-free survival (P < 0.0001) in patients with early-stage breast carcinomas. In addition, the identified meta-signature improves breast cancer patient stratification independently of traditional prognostic factors in a multivariate Cox proportional-hazards analysis. PMID:21082037

  7. Discovery of gene-gene interactions across multiple independent data sets of late onset Alzheimer disease from the Alzheimer Disease Genetics Consortium.

    PubMed

    Hohman, Timothy J; Bush, William S; Jiang, Lan; Brown-Gentry, Kristin D; Torstenson, Eric S; Dudek, Scott M; Mukherjee, Shubhabrata; Naj, Adam; Kunkle, Brian W; Ritchie, Marylyn D; Martin, Eden R; Schellenberg, Gerard D; Mayeux, Richard; Farrer, Lindsay A; Pericak-Vance, Margaret A; Haines, Jonathan L; Thornton-Wells, Tricia A

    2016-02-01

    Late-onset Alzheimer disease (AD) has a complex genetic etiology, involving locus heterogeneity, polygenic inheritance, and gene-gene interactions; however, the investigation of interactions in recent genome-wide association studies has been limited. We used a biological knowledge-driven approach to evaluate gene-gene interactions for consistency across 13 data sets from the Alzheimer Disease Genetics Consortium. Fifteen single nucleotide polymorphism (SNP)-SNP pairs within 3 gene-gene combinations were identified: SIRT1 × ABCB1, PSAP × PEBP4, and GRIN2B × ADRA1A. In addition, we extend a previously identified interaction from an endophenotype analysis between RYR3 × CACNA1C. Finally, post hoc gene expression analyses of the implicated SNPs further implicate SIRT1 and ABCB1, and implicate CDH23 which was most recently identified as an AD risk locus in an epigenetic analysis of AD. The observed interactions in this article highlight ways in which genotypic variation related to disease may depend on the genetic context in which it occurs. Further, our results highlight the utility of evaluating genetic interactions to explain additional variance in AD risk and identify novel molecular mechanisms of AD pathogenesis.

  8. Discovery of a Linear Peptide for Improving Tumor Targeting of Gene Products and Treatment of Distal Tumors by IL-12 Gene Therapy

    PubMed Central

    Cutrera, Jeffry; Dibra, Denada; Xia, Xueqing; Hasan, Azeem; Reed, Scott; Li, Shulin

    2011-01-01

    Like many effective therapeutics, interleukin-12 (IL-12) therapy often causes side effects. Tumor targeted delivery may improve the efficacy and decrease the toxicity of systemic IL-12 treatments. In this study, a novel targeting approach was investigated. A secreted alkaline phosphatase (SEAP) reporter gene-based screening process was used to identify a mini-peptide which can be produced in vivo to target gene products to tumors. The coding region for the best peptide was inserted into an IL-12 gene to determine the antitumor efficacy. Affinity chromatography, mass spectrometry analysis, and binding studies were used to identify a receptor for this peptide. We discovered that the linear peptide VNTANST increased the tumor accumulation of the reporter gene products in five independent tumor models including one human xenogeneic model. The product from VNTANST-IL-12 fusion gene therapy increased accumulation of IL-12 in the tumor environment, and in three tumor models, VNTANST-IL-12 gene therapy inhibited distal tumor growth. In a spontaneous lung metastasis model, inhibition of metastatic tumor growth was improved compared to wild-type IL-12 gene therapy, and in a squamous cell carcinoma model, toxic liver lesions were reduced. The receptor for VNTANST was identified as vimentin. These results show the promise of using VNTANST to improve IL-12 treatments. PMID:21386825

  9. A population of deletion mutants and an integrated mapping and Exome-seq pipeline for gene discovery in maize

    Technology Transfer Automated Retrieval System (TEKTRAN)

    To better understand maize endosperm filling and maturation, we developed a novel functional genomics platform that combined Bulked Segregant RNA and Exome sequencing (BSREx-seq) to map causative mutations and identify candidate genes within mapping intervals. Using gamma-irradiation of B73 maize to...

  10. Transcriptome Analysis of the White Body of the Squid Euprymna tasmanica with Emphasis on Immune and Hematopoietic Gene Discovery

    PubMed Central

    Salazar, Karla A.; Joffe, Nina R.; Dinguirard, Nathalie; Houde, Peter; Castillo, Maria G.

    2015-01-01

    In the mutualistic relationship between the squid Euprymna tasmanica and the bioluminescent bacterium Vibrio fischeri, several host factors, including immune-related proteins, are known to interact and respond specifically and exclusively to the presence of the symbiont. In squid and octopus, the white body is considered to be an immune organ mainly due to the fact that blood cells, or hemocytes, are known to be present in high numbers and in different developmental stages. Hence, the white body has been described as the site of hematopoiesis in cephalopods. However, to our knowledge, there are no studies showing any molecular evidence of such functions. In this study, we performed a transcriptomic analysis of white body tissue of the Southern dumpling squid, E. tasmanica. Our primary goal was to gain insights into the functions of this tissue and to test for the presence of gene transcripts associated with hematopoietic and immune processes. Several hematopoiesis genes including CPSF1, GATA 2, TFIID, and FGFR2 were found to be expressed in the white body. In addition, transcripts associated with immune-related signal transduction pathways, such as the toll-like receptor/NF-κB and MAPK pathways, were also found, as well as other immune genes previously identified in E. tasmanica’s sister species, E. scolopes. This study is the first to analyze an immune organ within cephalopods, and to provide gene expression data supporting the white body as a hematopoietic tissue. PMID:25775132

  11. Transcriptome analysis of the white body of the squid Euprymna tasmanica with emphasis on immune and hematopoietic gene discovery.

    PubMed

    Salazar, Karla A; Joffe, Nina R; Dinguirard, Nathalie; Houde, Peter; Castillo, Maria G

    2015-01-01

    In the mutualistic relationship between the squid Euprymna tasmanica and the bioluminescent bacterium Vibrio fischeri, several host factors, including immune-related proteins, are known to interact and respond specifically and exclusively to the presence of the symbiont. In squid and octopus, the white body is considered to be an immune organ mainly due to the fact that blood cells, or hemocytes, are known to be present in high numbers and in different developmental stages. Hence, the white body has been described as the site of hematopoiesis in cephalopods. However, to our knowledge, there are no studies showing any molecular evidence of such functions. In this study, we performed a transcriptomic analysis of white body tissue of the Southern dumpling squid, E. tasmanica. Our primary goal was to gain insights into the functions of this tissue and to test for the presence of gene transcripts associated with hematopoietic and immune processes. Several hematopoiesis genes including CPSF1, GATA 2, TFIID, and FGFR2 were found to be expressed in the white body. In addition, transcripts associated with immune-related signal transduction pathways, such as the toll-like receptor/NF-κB and MAPK pathways, were also found, as well as other immune genes previously identified in E. tasmanica's sister species, E. scolopes. This study is the first to analyze an immune organ within cephalopods, and to provide gene expression data supporting the white body as a hematopoietic tissue.

  12. SNP discovery and development of genetic markers for mapping immune response genes in common carp (Cyprinus carpio)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Single nucleotide polymorphisms (SNPs) in immune response genes have been reported as markers for susceptibility to infectious diseases in human and livestock. A disease caused by cyprinid herpesvirus 3 (CyHV-3) is highly contagious and virulent in common carp (Cyprinus carpio). With the aim to de...

  13. Discovery of single nucleotide polymorphisms in candidate genes associated with fertility and production traits in Holstein cattle

    PubMed Central

    2013-01-01

    Background Identification of single nucleotide polymorphisms (SNPs) for specific genes involved in reproduction might improve reliability of genomic estimates for these low-heritability traits. Semen from 550 Holstein bulls of high (≥ 1.7; n = 288) or low (≤ −2; n = 262) daughter pregnancy rate (DPR) was genotyped for 434 candidate SNPs using the Sequenom MassARRAY® system. Three types of SNPs were evaluated: SNPs previously reported to be associated with reproductive traits or physically close to genetic markers for reproduction, SNPs in genes that are well known to be involved in reproductive processes, and SNPs in genes that are differentially expressed between physiological conditions in a variety of tissues involved in reproductive function. Eleven reproduction and production traits were analyzed. Results A total of 40 SNPs were associated (P < 0.05) with DPR. Among these were genes involved in the endocrine system, cell signaling, immune function and inhibition of apoptosis. A total of 10 genes were regulated by estradiol. In addition, 22 SNPs were associated with heifer conception rate, 33 with cow conception rate, 36 with productive life, 34 with net merit, 23 with milk yield, 19 with fat yield, 13 with fat percent, 19 with protein yield, 22 with protein percent, and 13 with somatic cell score. The allele substitution effects for SNPs associated with heifer conception rate, cow conception rate, productive life and net merit were in the same direction as for DPR. Allele substitution effects for several SNPs associated with production traits were in the opposite direction to that for DPR. Nonetheless, there were 29 SNPs associated with DPR that were not negatively associated with production traits. Conclusion SNPs in a total of 40 genes associated with DPR were identified, as well as SNPs for other traits. It might be feasible to include these SNPs in genomic tests of reproduction and other traits. The genes associated with DPR are likely to be

  14. Gene expression profiling of coelomic cells and discovery of immune-related genes in the earthworm, Eisenia andrei, using expressed sequence tags.

    PubMed

    Tak, Eun Sik; Cho, Sung-Jin; Park, Soon Cheol

    2015-01-01

    The coelomic cells of the earthworm consist of leukocytes, chloragocytes, and coelomocytes, which play an important role in innate immunity reactions. To gain insight into the expression profiles of coelomic cells of the earthworm, Eisenia andrei, we analyzed 1151 expressed sequence tags (ESTs) derived from the cDNA library of the coelomic cells. Among the 1151 ESTs analyzed, 493 ESTs (42.8%) showed a significant similarity to known genes and represented 164 unique genes, of which 93 ESTs were singletons and 71 ESTs manifested as two or more ESTs. From the 164 unique genes sequenced, we found 24 immune-related and cell defense genes. Furthermore, real-time PCR analysis showed that mRNA levels of lysenin-related proteins in coelomic cells of E. andrei were upregulated after the injection of Bacillus subtilis bacteria. This EST dataset provides a valuable resource for future research on the earthworm immune system.

  15. The Complete Genome Sequence of Plodia Interpunctella Granulovirus: Evidence for Horizontal Gene Transfer and Discovery of an Unusual Inhibitor-of-Apoptosis Gene

    PubMed Central

    Harrison, Robert L.; Rowley, Daniel L.; Funk, C. Joel

    2016-01-01

    The Indianmeal moth, Plodia interpunctella (Lepidoptera: Pyralidae), is a common pest of stored goods with a worldwide distribution. The complete genome sequence for a larval pathogen of this moth, the baculovirus Plodia interpunctella granulovirus (PiGV), was determined by next-generation sequencing. The PiGV genome was found to be 112,536 bp in length with a 44.2% G+C nucleotide distribution. A total of 123 open reading frames (ORFs) and seven homologous regions (hrs) were identified and annotated. Phylogenetic inference using concatenated alignments of 36 baculovirus core genes placed PiGV in the “b” clade of viruses from genus Betabaculovirus with a branch length suggesting that PiGV represents a distinct betabaculovirus species. In addition to the baculovirus core genes and orthologues of other genes found in other betabaculovirus genomes, the PiGV genome sequence contained orthologues of the bidensovirus NS3 gene, as well as ORFs that occur in alphabaculoviruses but not betabaculoviruses. While PiGV contained an orthologue of inhibitor of apoptosis-5 (iap-5), an orthologue of inhibitor of apoptosis-3 (iap-3) was not present. Instead, the PiGV sequence contained an ORF (PiGV ORF81) encoding an IAP homologue with sequence similarity to insect cellular IAPs, but not to viral IAPs. Phylogenetic analysis of baculovirus and insect IAP amino acid sequences suggested that the baculovirus IAP-3 genes and the PiGV ORF81 IAP homologue represent different lineages arising from more than one acquisition event. The presence of genes from other sources in the PiGV genome highlights the extent to which baculovirus gene content is shaped by horizontal gene transfer. PMID:27472489

  17. The Complete Genome Sequence of Plodia Interpunctella Granulovirus: Evidence for Horizontal Gene Transfer and Discovery of an Unusual Inhibitor-of-Apoptosis Gene.

    PubMed

    Harrison, Robert L; Rowley, Daniel L; Funk, C Joel

    2016-01-01

    The Indianmeal moth, Plodia interpunctella (Lepidoptera: Pyralidae), is a common pest of stored goods with a worldwide distribution. The complete genome sequence for a larval pathogen of this moth, the baculovirus Plodia interpunctella granulovirus (PiGV), was determined by next-generation sequencing. The PiGV genome was found to be 112,536 bp in length with a 44.2% G+C nucleotide distribution. A total of 123 open reading frames (ORFs) and seven homologous regions (hrs) were identified and annotated. Phylogenetic inference using concatenated alignments of 36 baculovirus core genes placed PiGV in the "b" clade of viruses from genus Betabaculovirus with a branch length suggesting that PiGV represents a distinct betabaculovirus species. In addition to the baculovirus core genes and orthologues of other genes found in other betabaculovirus genomes, the PiGV genome sequence contained orthologues of the bidensovirus NS3 gene, as well as ORFs that occur in alphabaculoviruses but not betabaculoviruses. While PiGV contained an orthologue of inhibitor of apoptosis-5 (iap-5), an orthologue of inhibitor of apoptosis-3 (iap-3) was not present. Instead, the PiGV sequence contained an ORF (PiGV ORF81) encoding an IAP homologue with sequence similarity to insect cellular IAPs, but not to viral IAPs. Phylogenetic analysis of baculovirus and insect IAP amino acid sequences suggested that the baculovirus IAP-3 genes and the PiGV ORF81 IAP homologue represent different lineages arising from more than one acquisition event. The presence of genes from other sources in the PiGV genome highlights the extent to which baculovirus gene content is shaped by horizontal gene transfer. PMID:27472489

  18. De Novo Transcriptome Analysis of an Aerial Microalga Trentepohlia jolithus: Pathway Description and Gene Discovery for Carbon Fixation and Carotenoid Biosynthesis

    PubMed Central

    Li, Qianqian; Liu, Jianguo; Zhang, Litao; Liu, Qian

    2014-01-01

    Background Algae in the order Trentepohliales have a broad geographic distribution and are generally characterized by the presence of abundant β-carotene. The many monographs published to date have mainly focused on their morphology, taxonomy, phylogeny, distribution and reproduction; molecular studies of this order are still rare. High-throughput RNA sequencing (RNA-Seq) technology provides a powerful and efficient method for transcript analysis and gene discovery in Trentepohlia jolithus. Methods/Principal Findings Illumina HiSeq 2000 sequencing generated 55,007,830 Illumina PE raw reads, which were assembled into 41,328 assembled unigenes. Based on NR annotation, 53.28% of the unigenes (22,018) could be assigned to gene ontology classes with 54 subcategories and 161,451 functional terms. A total of 26,217 (63.44%) assembled unigenes were mapped to 128 KEGG pathways. Furthermore, a set of 5,798 SSRs in 5,206 unigenes and 131,478 putative SNPs were identified. Moreover, the fact that all of the C4 photosynthesis genes exist in T. jolithus suggests a complex carbon acquisition and fixation system. Similarities and differences between T. jolithus and other algae in carotenoid biosynthesis are also described in depth. Conclusions/Significance This is the first broad transcriptome survey for T. jolithus, increasing the amount of molecular data available for the class Ulvophyceae. As well as providing resources for functional genomics studies, the functional genes and putative pathways identified here will contribute to a better understanding of carbon fixation and fatty acid and carotenoid biosynthesis in T. jolithus. PMID:25254555

  19. Discovery and Classification in Astronomy

    NASA Astrophysics Data System (ADS)

    Dick, Steven J.

    2012-01-01

    Three decades after Martin Harwit's pioneering Cosmic Discovery (1981), and following on the recent IAU Symposium "Accelerating the Rate of Astronomical Discovery," we have revisited the problem of discovery in astronomy, emphasizing new classes of objects. 82 such classes have been identified and analyzed, including 22 in the realm of the planets, 36 in the realm of the stars, and 24 in the realm of the galaxies. We find an extended structure of discovery, consisting of detection, interpretation and understanding, each with its own nuances and a microstructure including conceptual, technological and social roles. This is true with a remarkable degree of consistency over the last 400 years of telescopic astronomy, ranging from Galileo's discovery of satellites, planetary rings and star clusters, to the discovery of quasars and pulsars. Telescopes have served as "engines of discovery" in several ways, ranging from telescope size and sensitivity (planetary nebulae and spiral galaxies), to specialized detectors (TNOs) and the opening of the electromagnetic spectrum for astronomy (pulsars, pulsar planets, and most active galaxies). A few classes (radiation belts, the solar wind and cosmic rays) were initially discovered without the telescope. Classification also plays an important role in discovery. While it might seem that classification marks the end of discovery, or a post-discovery phase, in fact it often marks the beginning, even a pre-discovery phase. Nowhere is this more clearly seen than in the classification of stellar spectra, long before dwarfs, giants and supergiants were known, or their evolutionary sequence recognized. Classification may also be part of a post-discovery phase, as in the MK system of stellar classification, constructed after the discovery of stellar luminosity classes. Some classes are declared rather than discovered, as in the case of gas and ice giant planets, and, infamously, Pluto as a dwarf planet.

  20. Discovery of [NiFe] hydrogenase genes in metagenomic DNA: cloning and heterologous expression in Thiocapsa roseopersicina.

    PubMed

    Maróti, Gergely; Tong, Yingkai; Yooseph, Shibu; Baden-Tillson, Holly; Smith, Hamilton O; Kovács, Kornél L; Frazier, Marvin; Venter, J Craig; Xu, Qing

    2009-09-01

    Using a metagenomics approach, we have cloned a piece of environmental DNA from the Sargasso Sea that encodes an [NiFe] hydrogenase showing 60% identity to the large subunit and 64% to the small subunit of a Thiocapsa roseopersicina O2-tolerant [NiFe] hydrogenase. The DNA sequence of the hydrogenase identified by the metagenomic approach was subsequently found to be 99% identical to the hyaA and hyaB genes of an Alteromonas macleodii hydrogenase, indicating that it belongs to the Alteromonas clade. We were able to express our new Alteromonas hydrogenase in T. roseopersicina. Expression was accomplished by coexpressing only two accessory genes, hyaD and hupH, without the need to express any of the hyp accessory genes (hypABCDEF). These results suggest that the native accessory proteins in T. roseopersicina could substitute for the Alteromonas counterparts that are absent in the host to facilitate the assembly of a functional Alteromonas hydrogenase. To further compare the complex assembly machineries of these two [NiFe] hydrogenases, we performed complementation experiments by introducing the new Alteromonas hyaD gene into the T. roseopersicina hynD mutant. Interestingly, Alteromonas endopeptidase HyaD could complement T. roseopersicina HynD to cleave endoproteolytically the C-terminal end of the T. roseopersicina HynL hydrogenase large subunit and activate the enzyme. This study refines our knowledge on the selectivity and pleiotropy of the elements of the [NiFe] hydrogenase assembly machineries. It also provides a model for functionally analyzing novel enzymes from environmental microbes in a culture-independent manner. PMID:19633107

  1. Functional gene-based discovery of phenazines from the actinobacteria associated with marine sponges in the South China Sea.

    PubMed

    Karuppiah, Valliappan; Li, Yingxin; Sun, Wei; Feng, Guofang; Li, Zhiyong

    2015-07-01

    Phenazines represent a large group of nitrogen-containing heterocyclic compounds produced by a diverse group of bacteria, including actinobacteria. In this study, a total of 197 actinobacterial strains were isolated from seven different marine sponge species in the South China Sea using five different culture media. Eighty-seven morphologically different actinobacterial strains were selected and grouped into 13 genera, including Actinoalloteichus, Kocuria, Micrococcus, Micromonospora, Mycobacterium, Nocardiopsis, Prauserella, Rhodococcus, Saccharopolyspora, Salinispora, Serinicoccus, and Streptomyces by phylogenetic analysis of the 16S rRNA gene. Based on the screening of phzE genes, ten strains, including five Streptomyces, two Nocardiopsis, one Salinispora, one Micrococcus, and one Serinicoccus, were found to have potential for phenazine production. The phzE gene was highly expressed in Nocardiopsis sp. 13-33-15, 13-12-13, and Serinicoccus sp. 13-12-4 on the fifth day of fermentation. Finally, 1,6-dihydroxy phenazine (1) from Nocardiopsis sp. 13-33-15 and 13-12-13, and 1,6-dimethoxy phenazine (2) from Nocardiopsis sp. 13-33-15 were isolated and identified successfully based on ESI-MS and NMR analysis. The compounds 1 and 2 showed antibacterial activity against Bacillus mycoides SJ14, Staphylococcus aureus SJ51, Escherichia coli SJ42, and Micrococcus luteus SJ47. This study suggests that the integrated approach of gene screening and chemical analysis is an effective strategy to find the target compounds and lays the basis for the production of phenazine from the sponge-associated actinobacteria.

  2. Gene expression analysis and SNP/InDel discovery to investigate yield heterosis of two rubber tree F1 hybrids

    PubMed Central

    Li, Dejun; Zeng, Rizhong; Li, Yan; Zhao, Manman; Chao, Jinquan; Li, Yu; Wang, Kai; Zhu, Lihuang; Tian, Wei-Min; Liang, Chengzhi

    2016-01-01

    As an important industrial material, natural rubber is mainly harvested from the rubber tree. Rubber tree breeding is inefficient, expensive and time-consuming, whereas marker-assisted selection is a feasible method for early selection of high-yield hybrids. We thus sequenced and analyzed the transcriptomes of two parent rubber trees (RRIM 600 and PR 107) and their most productive hybrids (RY 7-33-97 and RY 7-20-59) to understand their gene expression patterns and genetic variations including single nucleotide polymorphisms (SNPs) and small insertions/deletions (InDels). We discovered >31,000 genetic variations in 112,702 assembled unigenes. Our results showed that the higher yield in F1 hybrids was positively associated with their higher genome heterozygosity, which was further confirmed by genotyping 10 SNPs in 20 other varieties. We also showed that RY 7-33-97 and RY 7-20-59 were genetically closer to RRIM 600 and PR 107, respectively, in agreement with both their phenotypic similarities and gene expression profiles. After identifying ethylene- and jasmonic acid–responsive genes at the transcription level, we compared and analyzed the genetic variations underlying rubber biosynthesis and the jasmonic acid and ethylene pathways in detail. Our results suggest that genome-wide genetic variations play a substantive role in maintaining rubber tree heterosis. PMID:27108962

  3. Tissue-specific laser microdissection of the Brassica napus funiculus improves gene discovery and spatial identification of biological processes.

    PubMed

    Chan, Ainsley C; Khan, Deirdre; Girard, Ian J; Becker, Michael G; Millar, Jenna L; Sytnik, David; Belmonte, Mark F

    2016-05-01

    The three primary tissue systems of the funiculus each undergo unique developmental programs to support the growth and development of the filial seed. To understand the underlying transcriptional mechanisms that orchestrate development of the funiculus at the globular embryonic stage of seed development, we used laser microdissection coupled with RNA-sequencing to produce a high-resolution dataset of the mRNAs present in the epidermis, cortex, and vasculature of the Brassica napus (canola) funiculus. We identified 7761 additional genes in these tissues compared with the whole funiculus organ alone using this technology. Differential expression and enrichment analyses were used to identify several biological processes associated with each tissue system. Our data show that cell wall modification and lipid metabolism are prominent in the epidermis, cell growth and modification occur in the cortex, and vascular tissue proliferation and differentiation occur in the central vascular strand. We provide further evidence that each of the three tissue systems of the globular stage funiculus are involved in specific biological processes that all co-ordinate to support seed development. The identification of genes and gene regulators responsible for tissue-specific developmental processes of the canola funiculus now serves as a valuable resource for seed improvement research. PMID:27194740

  4. Tissue-specific laser microdissection of the Brassica napus funiculus improves gene discovery and spatial identification of biological processes

    PubMed Central

    Chan, Ainsley C.; Khan, Deirdre; Girard, Ian J.; Becker, Michael G.; Millar, Jenna L.; Sytnik, David; Belmonte, Mark F.

    2016-01-01

    The three primary tissue systems of the funiculus each undergo unique developmental programs to support the growth and development of the filial seed. To understand the underlying transcriptional mechanisms that orchestrate development of the funiculus at the globular embryonic stage of seed development, we used laser microdissection coupled with RNA-sequencing to produce a high-resolution dataset of the mRNAs present in the epidermis, cortex, and vasculature of the Brassica napus (canola) funiculus. We identified 7761 additional genes in these tissues compared with the whole funiculus organ alone using this technology. Differential expression and enrichment analyses were used to identify several biological processes associated with each tissue system. Our data show that cell wall modification and lipid metabolism are prominent in the epidermis, cell growth and modification occur in the cortex, and vascular tissue proliferation and differentiation occur in the central vascular strand. We provide further evidence that each of the three tissue systems of the globular stage funiculus are involved in specific biological processes that all co-ordinate to support seed development. The identification of genes and gene regulators responsible for tissue-specific developmental processes of the canola funiculus now serves as a valuable resource for seed improvement research. PMID:27194740

  5. Discovery of bacterial polyhydroxyalkanoate synthase (PhaC)-encoding genes from seasonal Baltic Sea ice and cold estuarine waters.

    PubMed

    Pärnänen, Katariina; Karkman, Antti; Virta, Marko; Eronen-Rasimus, Eeva; Kaartokallio, Hermanni

    2015-01-01

    Polyhydroxyalkanoates (PHAs) are macromolecules produced by bacteria as means for storing carbon and energy in intracellular granules. PHAs have physical properties similar to those of plastics and have become of interest to industry as materials for environmentally friendly bioplastic production. There is an ongoing search for new PHA-producing bacterial strains and PHA-synthesizing enzymes tolerating extreme conditions to find ways of producing PHAs at cold temperatures and high solute concentrations. Moreover, the study of PHA producers in the sea-ice biome can aid in understanding the microbial ecology of carbon cycling in ice-associated ecosystems. In this study, PHA producers and PHA synthase genes were examined under the extreme environmental conditions of sea ice and cold seawater to find evidence of PHA production in an environment requiring adaptation to high salinity and cold temperatures. Sea ice and cold estuarine water samples were collected from the northern Baltic Sea and evidence of PHA production was gathered, using microscopy with Nile Blue A staining of PHA-granules and PCR assays detecting PHA-synthesis genes. The PHA granules and PHA synthases were found at all sampling locations, in both sea ice and water, and throughout the sampling period spanning over 10 years. Our study shows, for the first time, that PHA synthesis occurs in Baltic Sea cold-adapted bacteria in their natural environment, which makes the Baltic Sea and its cold environments an interesting choice in the quest for PHA-synthesizing bacteria and synthesis genes. PMID:25280551

  6. A new approach to the rationale discovery of polymeric biomaterials

    PubMed Central

    Kohn, Joachim; Welsh, William J.; Knight, Doyle

    2007-01-01

    This paper attempts to illustrate both the need for new approaches to biomaterials discovery as well as the significant promise inherent in the use of combinatorial and computational design strategies. The key observation of this Leading Opinion Paper is that the biomaterials community has been slow to embrace advanced biomaterials discovery tools such as combinatorial methods, high throughput experimentation, and computational modeling in spite of the significant promise shown by these discovery tools in materials science, medicinal chemistry and the pharmaceutical industry. It seems that the complexity of living cells and their interactions with biomaterials has been a conceptual as well as a practical barrier to the use of advanced discovery tools in biomaterials science. However, with the continued increase in computer power, the goal of predicting the biological response of cells in contact with biomaterials surfaces is within reach. Once combinatorial synthesis, high throughput experimentation, and computational modeling are integrated into the biomaterials discovery process, a significant acceleration is possible in the pace of development of improved medical implants, tissue regeneration scaffolds, and gene/drug delivery systems. PMID:17644176

  7. Accelerated evolution of functional plastid rRNA and elongation factor genes due to reduced protein synthetic load after the loss of photosynthesis in the chlorophyte alga Polytoma.

    PubMed

    Vernon, D; Gutell, R R; Cannone, J J; Rumpf, R W; Birky, C W

    2001-09-01

    Polytoma obtusum and Polytoma uvella are members of a clade of nonphotosynthetic chlorophyte algae closely related to Chlamydomonas humicola and other photosynthetic members of the Chlamydomonadaceae. Descended from a nonphotosynthetic mutant, these obligate heterotrophs retain a plastid (leucoplast) with a functional protein synthetic system, and a plastid genome (lpDNA) with functional genes encoding proteins required for transcription and translation. Comparative studies of the evolution of genes in chloroplasts and leucoplasts can identify modes of selection acting on the plastid genome. Two plastid genes--rrn16, encoding the plastid small-subunit rRNA, and tufA, encoding elongation factor Tu--retain their functions in protein synthesis after the loss of photosynthesis in two nonphotosynthetic Polytoma clades but show a substantially accelerated rate of base substitution in the P. uvella clade. The accelerated evolution of tufA is due, at least partly, to relaxed codon bias favoring codons that can be read without wobble, mainly in three amino acids. Selection for these codons may be relaxed because leucoplasts are required to synthesize fewer protein molecules per unit time than are chloroplasts (reduced protein synthetic load) and thus require a lower rate of synthesis of elongation factor Tu. Relaxed selection due to a lower protein synthetic load is also a plausible explanation for the accelerated rate of evolution of rrn16, but the available data are insufficient to test the hypothesis for this gene. The tufA and rrn16 genes in Polytoma oviforme, the sole member of a second nonphotosynthetic clade, are also functional but show no sign of relaxed selection.

  8. Gene discovery for enzymes involved in limonene modification or utilization by the mountain pine beetle-associated pathogen Grosmannia clavigera.

    PubMed

    Wang, Ye; Lim, Lynette; Madilao, Lina; Lah, Ljerka; Bohlmann, Joerg; Breuil, Colette

    2014-08-01

    To successfully colonize and eventually kill pine trees, Grosmannia clavigera (Gs cryptic species), the main fungal pathogen associated with the mountain pine beetle (Dendroctonus ponderosae), has developed multiple mechanisms to overcome host tree chemical defenses, of which terpenoids are a major component. In addition to a monoterpene efflux system mediated by a recently discovered ABC transporter, Gs has genes that are highly induced by monoterpenes and that encode enzymes that modify or utilize monoterpenes [especially (+)-limonene]. We showed that pine-inhabiting Ophiostomale fungi are tolerant to monoterpenes, but only a few, including Gs, are known to utilize monoterpenes as a carbon source. Gas chromatography-mass spectrometry (GC-MS) revealed that Gs can modify (+)-limonene through various oxygenation pathways, producing carvone, p-mentha-2,8-dienol, perillyl alcohol, and isopiperitenol. It can also degrade (+)-limonene through the C-1-oxygenated pathway, producing limonene-1,2-diol as the most abundant intermediate. Transcriptome sequencing (RNA-seq) data indicated that Gs may utilize limonene 1,2-diol through beta-oxidation and then valine and tricarboxylic acid (TCA) metabolic pathways. The data also suggested that at least two gene clusters, located in genome contigs 108 and 161, were highly induced by monoterpenes and may be involved in monoterpene degradation processes. Further, gene knockouts indicated that limonene degradation required two distinct Baeyer-Villiger monooxygenases (BVMOs), an epoxide hydrolase and an enoyl coenzyme A (enoyl-CoA) hydratase. Our work provides information on enzyme-mediated limonene utilization or modification and a more comprehensive understanding of the interaction between an economically important fungal pathogen and its host's defense chemicals.

  9. Gene Discovery for Enzymes Involved in Limonene Modification or Utilization by the Mountain Pine Beetle-Associated Pathogen Grosmannia clavigera

    PubMed Central

    Wang, Ye; Lim, Lynette; Madilao, Lina; Lah, Ljerka; Bohlmann, Joerg

    2014-01-01

    To successfully colonize and eventually kill pine trees, Grosmannia clavigera (Gs cryptic species), the main fungal pathogen associated with the mountain pine beetle (Dendroctonus ponderosae), has developed multiple mechanisms to overcome host tree chemical defenses, of which terpenoids are a major component. In addition to a monoterpene efflux system mediated by a recently discovered ABC transporter, Gs has genes that are highly induced by monoterpenes and that encode enzymes that modify or utilize monoterpenes [especially (+)-limonene]. We showed that pine-inhabiting Ophiostomale fungi are tolerant to monoterpenes, but only a few, including Gs, are known to utilize monoterpenes as a carbon source. Gas chromatography-mass spectrometry (GC-MS) revealed that Gs can modify (+)-limonene through various oxygenation pathways, producing carvone, p-mentha-2,8-dienol, perillyl alcohol, and isopiperitenol. It can also degrade (+)-limonene through the C-1-oxygenated pathway, producing limonene-1,2-diol as the most abundant intermediate. Transcriptome sequencing (RNA-seq) data indicated that Gs may utilize limonene 1,2-diol through beta-oxidation and then valine and tricarboxylic acid (TCA) metabolic pathways. The data also suggested that at least two gene clusters, located in genome contigs 108 and 161, were highly induced by monoterpenes and may be involved in monoterpene degradation processes. Further, gene knockouts indicated that limonene degradation required two distinct Baeyer-Villiger monooxygenases (BVMOs), an epoxide hydrolase and an enoyl coenzyme A (enoyl-CoA) hydratase. Our work provides information on enzyme-mediated limonene utilization or modification and a more comprehensive understanding of the interaction between an economically important fungal pathogen and its host's defense chemicals. PMID:24837377

  10. Insights into shell deposition in the Antarctic bivalve Laternula elliptica: gene discovery in the mantle transcriptome using 454 pyrosequencing

    PubMed Central

    2010-01-01

    Background: The Antarctic clam, Laternula elliptica, is an infaunal stenothermal bivalve mollusc with a circumpolar distribution. It plays a significant role in bentho-pelagic coupling and hence has been proposed as a sentinel species for climate change monitoring. Previous studies have shown that this mollusc displays a high level of plasticity with regard to shell deposition and damage repair against a background of genetic homogeneity. The Southern Ocean has amongst the lowest present-day CaCO3 saturation rate of any ocean region, and is predicted to be among the first to become undersaturated under current ocean acidification scenarios. Hence, this species presents as an ideal candidate for studies into the processes of calcium regulation and shell deposition in our changing ocean environments. Results: 454 sequencing of L. elliptica mantle tissue generated 18,290 contigs with an average size of 535 bp (ranging between 142 bp-5.591 kb). BLAST sequence similarity searching assigned putative function to 17% of the data set, with a significant proportion of these transcripts being involved in binding and potentially of a secretory nature, as defined by GO molecular function and biological process classifications. These results indicated that the mantle is a transcriptionally active tissue which is actively proliferating. All transcripts were screened against an in-house database of genes shown to be involved in extracellular matrix formation and calcium homeostasis in metazoans. Putative identifications were made for a number of classical shell deposition genes, such as tyrosinase, carbonic anhydrase and metalloprotease 1, along with novel members of the family 2 G-Protein Coupled Receptors (GPCRs). A membrane transport protein (SEC61) was also characterised and this demonstrated the utility of the clam sequence data as a resource for examining cold adapted amino acid substitutions. The sequence data contained 46,235 microsatellites and 13,084 Single Nucleotide

  11. Array-Based Gene Discovery with Three Unrelated Subjects Shows SCARB2/LIMP-2 Deficiency Causes Myoclonus Epilepsy and Glomerulosclerosis

    PubMed Central

    Berkovic, Samuel F.; Dibbens, Leanne M.; Oshlack, Alicia; Silver, Jeremy D.; Katerelos, Marina; Vears, Danya F.; Lüllmann-Rauch, Renate; Blanz, Judith; Zhang, Ke Wei; Stankovich, Jim; Kalnins, Renate M.; Dowling, John P.; Andermann, Eva; Andermann, Frederick; Faldini, Enrico; D'Hooge, Rudi; Vadlamudi, Lata; Macdonell, Richard A.; Hodgson, Bree L.; Bayly, Marta A.; Savige, Judy; Mulley, John C.; Smyth, Gordon K.; Power, David A.; Saftig, Paul; Bahlo, Melanie

    2008-01-01

    Action myoclonus-renal failure syndrome (AMRF) is an autosomal-recessive disorder with the remarkable combination of focal glomerulosclerosis, frequently with glomerular collapse, and progressive myoclonus epilepsy associated with storage material in the brain. Here, we employed a novel combination of molecular strategies to find the responsible gene and show its effects in an animal model. Utilizing only three unrelated affected individuals and their relatives, we used homozygosity mapping with single-nucleotide polymorphism chips to localize AMRF. We then used microarray-expression analysis to prioritize candidates prior to sequencing. The disorder was mapped to 4q13-21, and microarray-expression analysis identified SCARB2/Limp2, which encodes a lysosomal-membrane protein, as the likely candidate. Mutations in SCARB2/Limp2 were found in all three families used for mapping and subsequently confirmed in two other unrelated AMRF families. The mutations were associated with lack of SCARB2 protein. Reanalysis of an existing Limp2 knockout mouse showed intracellular inclusions in cerebral and cerebellar cortex, and the kidneys showed subtle glomerular changes. This study highlights that recessive genes can be identified with a very small number of subjects. The ancestral lysosomal-membrane protein SCARB2/LIMP-2 is responsible for AMRF. The heterogeneous pathology in the kidney and brain suggests that SCARB2/Limp2 has pleiotropic effects that may be relevant to understanding the pathogenesis of other forms of glomerulosclerosis or collapse and myoclonic epilepsies. PMID:18308289
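    The homozygosity-mapping step described above can be pictured with a minimal sketch. It is not the authors' pipeline: the function name, the two-character genotype encoding ("AA", "AB", ...) and the `min_snps` threshold are all assumptions made for illustration. It scans ordered SNPs and reports stretches where every affected individual is homozygous, without requiring the same allele in each individual, since unrelated families may carry different haplotypes.

```python
from typing import Dict, List, Tuple


def shared_homozygous_runs(
    genotypes: Dict[str, List[str]], min_snps: int = 3
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) SNP index ranges where every
    individual is homozygous for at least `min_snps` consecutive SNPs.

    `genotypes` maps a (hypothetical) sample ID to genotype calls at the
    same ordered SNPs, each call written as two allele characters.
    """
    if not genotypes:
        return []
    n_snps = len(next(iter(genotypes.values())))
    runs: List[Tuple[int, int]] = []
    start = None  # index where the current shared run began, if any
    for i in range(n_snps):
        all_homozygous = all(calls[i][0] == calls[i][1] for calls in genotypes.values())
        if all_homozygous:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_snps:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last SNP.
    if start is not None and n_snps - start >= min_snps:
        runs.append((start, n_snps - 1))
    return runs


# Toy data for two hypothetical affected individuals.
example = {
    "p1": ["AA", "AB", "BB", "AA", "AA", "AB"],
    "p2": ["AB", "AA", "AA", "BB", "AA", "AA"],
}
print(shared_homozygous_runs(example, min_snps=3))
```

    Real analyses work on physical or genetic distance rather than SNP counts and must tolerate genotyping errors and missing calls, which this sketch ignores.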

  12. Low-coverage, whole-genome sequencing of Artocarpus camansi (Moraceae) for phylogenetic marker development and gene discovery

    PubMed Central

    Gardner, Elliot M.; Johnson, Matthew G.; Ragone, Diane; Wickett, Norman J.; Zerega, Nyree J. C.

    2016-01-01

    Premise of the study: We used moderately low-coverage (17×) whole-genome sequencing of Artocarpus camansi (Moraceae) to develop genomic resources for Artocarpus and Moraceae. Methods and Results: A de novo assembly of Illumina short reads (251,378,536 pairs, 2 × 100 bp) accounted for 93% of the predicted genome size. Predicted coding regions were used in a three-way orthology search with published genomes of Morus notabilis and Cannabis sativa. Phylogenetic markers for Moraceae were developed from 333 inferred single-copy exons. Ninety-eight putative MADS-box genes were identified. Analysis of all predicted coding regions resulted in preliminary annotation of 49,089 genes. An analysis of synonymous substitutions for pairs of orthologs (Ks analysis) in M. notabilis and A. camansi strongly suggested a lineage-specific whole-genome duplication in Artocarpus. Conclusions: This study substantially increases the genomic resources available for Artocarpus and Moraceae and demonstrates the value of low-coverage de novo assemblies for nonmodel organisms with moderately large genomes. PMID:27437173

  13. Automated Discovery of Tissue-Targeting Enhancers and Transcription Factors from Binding Motif and Gene Function Data

    PubMed Central

    Tuteja, Geetu; Moreira, Karen Betancourt; Chung, Tisha; Chen, Jenny; Wenger, Aaron M.; Bejerano, Gill

    2014-01-01

    Identifying enhancers regulating gene expression remains an important and challenging task. While recent sequencing-based methods provide epigenomic characteristics that correlate well with enhancer activity, it remains onerous to comprehensively identify all enhancers across development. Here we introduce a computational framework to identify tissue-specific enhancers evolving under purifying selection. First, we incorporate high-confidence binding site predictions with target gene functional enrichment analysis to identify transcription factors (TFs) likely functioning in a particular context. We then search the genome for clusters of binding sites for these TFs, overcoming previous constraints associated with biased manual curation of TFs or enhancers. Applying our method to the placenta, we find 33 known and implicate 17 novel TFs in placental function, and discover 2,216 putative placenta enhancers. Using luciferase reporter assays, 31/36 (86%) tested candidates drive activity in placental cells. Our predictions agree well with recent epigenomic data in human and mouse, yet over half our loci, including 7/8 (87%) tested regions, are novel. Finally, we establish that our method is generalizable by applying it to 5 additional tissues: heart, pancreas, blood vessel, bone marrow, and liver. PMID:24499934
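    As a rough illustration of what searching the genome for clusters of binding sites can look like, the sketch below chains predicted site positions on one chromosome whenever neighbouring sites lie within a maximum gap, and keeps chains with enough sites. This gap-based chaining is a swapped-in simplification and not the method of the paper; the function name and the `max_gap` and `min_sites` parameters are hypothetical.

```python
from typing import Iterable, List, Tuple


def cluster_sites(
    positions: Iterable[int], max_gap: int, min_sites: int
) -> List[Tuple[int, int, int]]:
    """Group site positions into clusters and return (start, end, count).

    Consecutive sites (after sorting) are placed in the same cluster when
    they are at most `max_gap` bp apart; clusters with fewer than
    `min_sites` sites are discarded.
    """
    clusters: List[Tuple[int, int, int]] = []
    current: List[int] = []
    for pos in sorted(positions):
        # A gap larger than max_gap closes the current cluster.
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_sites:
                clusters.append((current[0], current[-1], len(current)))
            current = []
        current.append(pos)
    if len(current) >= min_sites:
        clusters.append((current[0], current[-1], len(current)))
    return clusters


# Toy positions (bp) of predicted binding sites on one chromosome.
sites = [100, 150, 190, 1000, 1020, 5000]
print(cluster_sites(sites, max_gap=100, min_sites=3))
```

    A fuller treatment would also weight sites by prediction confidence and by sequence conservation, which the abstract indicates are part of the framework but which are not modelled here.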

  14. De novo characterization of the Dialeurodes citri transcriptome: mining genes involved in stress resistance and simple sequence repeats (SSRs) discovery.

    PubMed

    Chen, E-H; Wei, D-D; Shen, G-M; Yuan, G-R; Bai, P-P; Wang, J-J

    2014-02-01

    The citrus whitefly, Dialeurodes citri (Ashmead), is one of the three economically important whitefly species that infest citrus plants around the world; however, limited genetic research has been focused on D. citri, partly because of lack of genomic resources. In this study, we performed de novo assembly of a transcriptome using Illumina paired-end sequencing technology (Illumina Inc., San Diego, CA, USA). In total, 36,766 unigenes with a mean length of 497 bp were identified. Of these unigenes, we identified 17,788 matched known proteins in the National Center for Biotechnology Information database, as determined by Blast search, with 5731, 4850 and 14,441 unigenes assigned to clusters of orthologous groups (COG), gene ontology (GO), and SwissProt, respectively. In total, 7507 unigenes were assigned to 308 known pathways. In-depth analysis of the data showed that 117 unigenes were identified as potentially involved in the detoxification of xenobiotics and 67 heat shock protein (Hsp) genes were associated with environmental stress. In addition, these enzymes were searched against the GO and COG database, and the results showed that the three major detoxification enzymes and Hsps were classified into 18 and 3, 6, and 8 annotations, respectively. In addition, 149 simple sequence repeats were detected. The results facilitate the investigation of molecular resistance mechanisms to insecticides and environmental stress, and contribute to molecular marker development. The findings greatly improve our genetic understanding of D. citri, and lay the foundation for future functional genomics studies on this species.

  15. High-throughput discovery of mutations in tef semi-dwarfing genes by next-generation sequencing analysis.

    PubMed

    Zhu, Qihui; Smith, Shavannor M; Ayele, Mulu; Yang, Lixing; Jogi, Ansuya; Chaluvadi, Srinivasa R; Bennetzen, Jeffrey L

    2012-11-01

    Tef (Eragrostis tef) is a major cereal crop in Ethiopia. Lodging is the primary constraint to increasing productivity in this allotetraploid species, accounting for losses of ∼15-45% in yield each year. As a first step toward identifying semi-dwarf varieties that might have improved lodging resistance, an ∼6× fosmid library was constructed and used to identify both homeologues of the dw3 semi-dwarfing gene of Sorghum bicolor. An EMS mutagenized population, consisting of ∼21,210 tef plants, was planted and leaf materials were collected into 23 superpools. Two dwarfing candidate genes, homeologues of dw3 of sorghum and rht1 of wheat, were sequenced directly from each superpool with 454 technology, and 120 candidate mutations were identified. Out of 10 candidates tested, six independent mutations were validated by Sanger sequencing, including two predicted detrimental mutations in both dw3 homeologues with a potential to improve lodging resistance in tef through further breeding. This study demonstrates that high-throughput sequencing can identify potentially valuable mutations in under-studied plant species like tef and has provided mutant lines that can now be combined and tested in breeding programs for improved lodging resistance.

  16. Lignification in sugarcane: biochemical characterization, gene discovery, and expression analysis in two genotypes contrasting for lignin content.

    PubMed

    Bottcher, Alexandra; Cesarino, Igor; Santos, Adriana Brombini dos; Vicentini, Renato; Mayer, Juliana Lischka Sampaio; Vanholme, Ruben; Morreel, Kris; Goeminne, Geert; Moura, Jullyana Cristina Magalhães Silva; Nobile, Paula Macedo; Carmello-Guerreiro, Sandra Maria; Anjos, Ivan Antonio dos; Creste, Silvana; Boerjan, Wout; Landell, Marcos Guimarães de Andrade; Mazzafera, Paulo

    2013-12-01

    Sugarcane (Saccharum spp.) is currently one of the most efficient crops in the production of first-generation biofuels. However, the bagasse represents an additional abundant lignocellulosic resource that has the potential to increase the ethanol production per plant. To achieve a more efficient conversion of bagasse into ethanol, a better understanding of the main factors affecting biomass recalcitrance is needed. Because several studies have shown a negative effect of lignin on saccharification yield, the characterization of lignin biosynthesis, structure, and deposition in sugarcane is an important goal. Here, we present, to our knowledge, the first systematic study of lignin deposition during sugarcane stem development, using histological, biochemical, and transcriptional data derived from two sugarcane genotypes with contrasting lignin contents. Lignin amount and composition were determined in rind (outer) and pith (inner) tissues throughout stem development. In addition, the phenolic metabolome was analyzed by ultra-high-performance liquid chromatography-mass spectrometry, which allowed the identification of 35 compounds related to the phenylpropanoid pathway and monolignol biosynthesis. Furthermore, the Sugarcane EST Database was extensively surveyed to identify lignin biosynthetic gene homologs, and the expression of all identified genes during stem development was determined by quantitative reverse transcription-polymerase chain reaction. Our data provide, to our knowledge, the first in-depth characterization of lignin biosynthesis in sugarcane and form the baseline for the rational metabolic engineering of sugarcane feedstock for bioenergy purposes.

  17. The Hexosamine Template – A Platform for Modulating Gene Expression and for Sugar-based Drug Discovery

    PubMed Central

    Elmouelhi, Noha; Aich, Udayanath; Paruchuri, Venkata D.P.; Meledeo, M. Adam; Campbell, Christopher T.; Wang, Jean J.; Srinivas, Raja; Khanna, Hargun S.; Yarema, Kevin J.

    2009-01-01

    This study investigates the breadth of cellular responses engendered by short chain fatty acid (SCFA)-hexosamine hybrid molecules, a class of compounds long used in ‘metabolic glycoengineering’ that are now emerging as drug candidates. First, a ‘mix-and-match’ strategy showed that different SCFA (n-butyrate and acetate) appended to the same core sugar altered biological activity, complementing previous results [Campbell et al., (2008) J. Med. Chem. 51, 8135–8147] where a single type of SCFA elicited distinct responses. Microarray profiling then compared transcriptional responses engendered by regioisomerically-modified ManNAc, GlcNAc, and GalNAc analogs in MDA-MB-231 cells. These data – which were validated by qRT-PCR or Western analysis for ID1, TP53, HPSE, NQO1, EGR1 and VEGFA – showed a two-pronged response where a core set of genes was coordinately regulated by all analogs while each analog simultaneously uniquely regulated a larger number of genes. Finally, AutoDock modeling supported a mechanism where the analogs directly interact with elements of the NF-κB pathway. Together, these results establish the SCFA-hexosamine template as a versatile platform for modulating biological activity and developing new therapeutics. PMID:19326913

  18. Automated conserved non-coding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses

    PubMed Central

    Turco, Gina; Schnable, James C.; Pedersen, Brent; Freeling, Michael

    2013-01-01

    Conserved non-coding sequences (CNS) are islands of non-coding sequence that, like protein coding exons, show less divergence in sequence between related species than functionless DNA. Several CNSs have been demonstrated experimentally to function as cis-regulatory regions. However, the specific functions of most CNSs remain unknown. Previous searches for CNS in plants have either anchored on exons and only identified nearby sequences or required years of painstaking manual annotation. Here we present an open source tool that can accurately identify CNSs between any two related species with sequenced genomes, including both those immediately adjacent to exons and distal sequences separated by >12 kb of non-coding sequence. We have used this tool to characterize new motifs, associate CNSs with additional functions, and identify previously undetected genes encoding RNA and protein in the genomes of five grass species. We provide a list of 15,363 orthologous CNSs conserved across all grasses tested. We were also able to identify regulatory sequences present in the common ancestor of grasses that have been lost in one or more extant grass lineages. Lists of orthologous gene pairs and associated CNSs are provided for reference inbred lines of arabidopsis, Japonica rice, foxtail millet, sorghum, brachypodium, and maize. PMID:23874343

  19. Lignification in Sugarcane: Biochemical Characterization, Gene Discovery, and Expression Analysis in Two Genotypes Contrasting for Lignin Content

    PubMed Central

    Bottcher, Alexandra; Cesarino, Igor; Brombini dos Santos, Adriana; Vicentini, Renato; Mayer, Juliana Lischka Sampaio; Vanholme, Ruben; Morreel, Kris; Goeminne, Geert; Moura, Jullyana Cristina Magalhães Silva; Nobile, Paula Macedo; Carmello-Guerreiro, Sandra Maria; Antonio dos Anjos, Ivan; Creste, Silvana; Boerjan, Wout; Landell, Marcos Guimarães de Andrade; Mazzafera, Paulo

    2013-01-01

    Sugarcane (Saccharum spp.) is currently one of the most efficient crops in the production of first-generation biofuels. However, the bagasse represents an additional abundant lignocellulosic resource that has the potential to increase the ethanol production per plant. To achieve a more efficient conversion of bagasse into ethanol, a better understanding of the main factors affecting biomass recalcitrance is needed. Because several studies have shown a negative effect of lignin on saccharification yield, the characterization of lignin biosynthesis, structure, and deposition in sugarcane is an important goal. Here, we present, to our knowledge, the first systematic study of lignin deposition during sugarcane stem development, using histological, biochemical, and transcriptional data derived from two sugarcane genotypes with contrasting lignin contents. Lignin amount and composition were determined in rind (outer) and pith (inner) tissues throughout stem development. In addition, the phenolic metabolome was analyzed by ultra-high-performance liquid chromatography-mass spectrometry, which allowed the identification of 35 compounds related to the phenylpropanoid pathway and monolignol biosynthesis. Furthermore, the Sugarcane EST Database was extensively surveyed to identify lignin biosynthetic gene homologs, and the expression of all identified genes during stem development was determined by quantitative reverse transcription-polymerase chain reaction. Our data provide, to our knowledge, the first in-depth characterization of lignin biosynthesis in sugarcane and form the baseline for the rational metabolic engineering of sugarcane feedstock for bioenergy purposes. PMID:24144790

  20. ACBN0-tool for accelerated materials discovery

    NASA Astrophysics Data System (ADS)

    Gopal, Priya; Liyanage, Laalitha; Agapito, Luis; Lee, Seunghun; Takeuchi, Ichiro; Hart, Gus; Curtarolo, Stefano; Fornari, Marco; Buongiorno Nardelli, Marco

    High-throughput QM computation of material properties by ab initio methods has become the foundation of an effective approach to materials design. One of the major challenges in mapping the materials genome is developing efficient computational tools that are cost-effective and accurate at the same time. In this talk, we discuss the newly developed ACBN0 pseudo-hybrid Hubbard density functional, in which the Hubbard energy within the DFT + U formulation is calculated self-consistently. The U depends on the electron density and on both the geometry and the chemical environment of the system. We show that ACBN0 improves the description of both the structural and electronic properties in a range of complex materials from Zn/Cd based chalcogenides to the TMOs. The magnetic properties are better described compared to the LDA/GGA functionals. We will also discuss the application of the ACBN0 approach to surfaces, doped and multi-valent systems where it is possible to evaluate U for different sites and chemical bonding. For all the complex materials studied here, we find that the electronic properties are significantly improved over the DFT values and the accuracy is on par with the HSE values at a fraction of the computational cost.

  1. Berkeley Lab Computing Sciences: Accelerating Scientific Discovery

    SciTech Connect

    Hules, John A

    2008-12-12

    Scientists today rely on advances in computer science, mathematics, and computational science, as well as large-scale computing and networking facilities, to increase our understanding of ourselves, our planet, and our universe. Berkeley Lab's Computing Sciences organization researches, develops, and deploys new tools and technologies to meet these needs and to advance research in such areas as global climate change, combustion, fusion energy, nanotechnology, biology, and astrophysics.

  2. Linear Accelerators

    SciTech Connect

    Sidorin, Anatoly

    2010-01-05

    In linear accelerators the particles are accelerated by either electrostatic fields or oscillating Radio Frequency (RF) fields. Accordingly, linear accelerators are divided into three large groups: electrostatic, induction and RF accelerators. An overview of the different types of accelerators is given. Stability of longitudinal and transverse motion in RF linear accelerators is briefly discussed. The methods of beam focusing in linacs are described.

  3. Gene discovery in Eimeria tenella by immunoscreening cDNA expression libraries of sporozoites and schizonts with chicken intestinal antibodies.

    PubMed

    Réfega, Susana; Girard-Misguich, Fabienne; Bourdieu, Christiane; Péry, Pierre; Labbé, Marie

    2003-04-01

    Specific antibodies were produced ex vivo from intestinal culture of Eimeria tenella infected chickens. The specificity of these intestinal antibodies was tested against different parasite stages. These antibodies were used to immunoscreen first generation schizont and sporozoite cDNA libraries permitting the identification of new E. tenella antigens. We obtained a total of 119 cDNA clones which were subjected to sequence analysis. The sequences coding for the proteins inducing local immune responses were compared with nucleotide or protein databases and with expressed sequence tags (ESTs) databases. We identified new Eimeria genes coding for heat shock proteins, a ribosomal protein, a pyruvate kinase and a pyridoxine kinase. Specific features of other sequences are discussed.

  4. From amplification to gene in thyroid cancer: A high-resolution mapped bacterial-artificial-chromosome resource for cancer chromosome aberrations guides gene discovery after comparative genome hybridization

    SciTech Connect

    Chen, X.N.; Gonsky, R.; Korenberg, J.R.; Knauf, J.A.; Fagin, J.A.; Chissoe, S.

    1998-08-01

    Chromosome rearrangements associated with neoplasms provide a rich resource for definition of the pathways of tumorigenesis. The power of comparative genome hybridization (CGH) to identify novel genes depends on the existence of suitable markers, which are lacking throughout most of the genome. The authors now report a general approach that translates CGH data into higher-resolution genomic-clone data that are then used to define the genes located in aneuploid regions. They used CGH to study 33 thyroid-tumor DNAs and two tumor-cell-line DNAs. The results revealed amplifications of chromosome band 2p21, with less-intense amplification on 2p13, 19q13.1, and 1p36 and with least-intense amplification on 1p34, 1q42, 5q31, 5q33-34, 9q32-34, and 14q32. To define the 2p21 region amplified, a dense array of 373 FISH-mapped chromosome 2 bacterial artificial chromosomes (BACs) was constructed, and 87 of these were hybridized to a tumor-cell line. Four BACs carried genomic DNA that was amplified in these cells. The maximum amplified region was narrowed to 3-6 Mb by multicolor FISH with the flanking BACs, and the minimum amplicon size was defined by a contig of 420 kb. Sequence analysis of the amplified BAC 1D9 revealed a fragment of the gene encoding protein kinase C epsilon (PKCε) that was then shown to be amplified and rearranged in tumor cells. In summary, CGH combined with a dense mapped resource of BACs and large-scale sequencing has led directly to the definition of PKCε as a previously unmapped candidate gene involved in thyroid tumorigenesis.

  5. Discovery of miRNAs and Their Corresponding miRNA Genes in Atlantic Cod (Gadus morhua): Use of Stable miRNAs as Reference Genes Reveals Subgroups of miRNAs That Are Highly Expressed in Particular Organs

    PubMed Central

    Andreassen, Rune; Rangnes, Fredrik; Sivertsen, Maria; Chiang, Michelle; Tran, Michelle; Worren, Merete Molton

    2016-01-01

    Background Atlantic cod (Gadus morhua) is among the economically most important species in the northern Atlantic Ocean and a model species for studying development of the immune system in vertebrates. MicroRNAs (miRNAs) are an abundant class of small RNA molecules that regulate fundamental biological processes at the post-transcriptional level. Detailed knowledge about a species' miRNA repertoire is necessary to study how the miRNA transcriptome modulates gene expression. We have therefore discovered and characterized mature miRNAs and their corresponding miRNA genes in Atlantic cod. We have also performed a validation study to identify suitable reference genes for RT-qPCR analysis of miRNA expression in Atlantic cod. Finally, we utilized the newly characterized miRNA repertoire and the dedicated RT-qPCR method to reveal miRNAs that are highly expressed in certain organs. Results The discovery analysis revealed 490 mature miRNAs (401 unique sequences) along with precursor sequences and genomic location of the miRNA genes. Twenty-six of these were novel miRNA genes. Validation studies ranked gmo-miR-17-1-5p or the two-gene combination gmo-miR25-3p and gmo-miR210-5p as most suitable qPCR reference genes. Analysis by RT-qPCR revealed 45 miRNAs with significantly higher expression in tissues from one or a few organs. Comparisons to other vertebrates indicate that some of these miRNAs may regulate processes like growth, lipid metabolism, immune response to microbial infections and scar damage repair. Three teleost-specific and three novel Atlantic cod miRNAs were among the differentially expressed miRNAs. Conclusions The number of known mature miRNAs was considerably increased by our identification of miRNAs and miRNA genes in Atlantic cod. This will benefit further functional studies of miRNA expression using deep sequencing methods. The validation study showed that stable miRNAs are suitable reference genes for RT-qPCR analysis of miRNA expression. Applying RT-qPCR we

  6. Gene discovery, evolutionary affinity and molecular detection of Oxyspirura petrowi, an eye worm parasite of game birds

    PubMed Central

    2013-01-01

    Background Oxyspirura petrowi appears to be emerging as a nematode parasite that could negatively impact Northern Bobwhite quail individuals and populations within Texas and other regions of the United States. Despite this eye worm's potential importance in the conservation of wild quail, little is known about the general biology and genome composition of O. petrowi. To fill the knowledge gap, we performed a small scale random genome sequence survey, sequenced its 18S rRNA and the intergenic region between the 18S and 28S rRNA genes, studied its phylogenetic affinity, and developed a PCR protocol for the detection of this eye worm. Results We have generated ~240 kb of genome sequence data derived from 348 clones by a random genome survey of an O. petrowi genomic library. The eye worm genome is AT-rich (i.e., 62.2% AT-content), and contains a high number of microsatellite sequences. The discovered genes encode a wide range of proteins including hypothetical proteins, enzymes, and nematode-specific proteins. Phylogenetic analysis based on 18S rRNA sequences indicates that the Spiruroidea is paraphyletic, in which Oxyspirura and its closely related species are sisters to the filarial nematodes. We have also developed a PCR protocol based on the ITS2 sequence that allows sensitive and specific detection of eye worm DNA in feces. Using this newly developed protocol, we have determined that ~28% to 33% of the fecal samples collected from Northern Bobwhites and Scaled Quail in Texas in the spring of 2013 are O. petrowi positive. Conclusions The O. petrowi genome is rich in microsatellite sequences that may be used in future genotyping and molecular fingerprinting analysis. This eye worm is evolutionarily close to the filarial nematodes, implying that therapeutic strategies for filariasis such as Loa loa would be referential in developing treatments for the Thelazoidea parasites. Our qPCR-based survey has confirmed that O. petrowi infection is of potential concern to quail

  7. De Novo Deep Transcriptome Analysis of Medicinal Plants for Gene Discovery in Biosynthesis of Plant Natural Products.

    PubMed

    Han, R; Rai, A; Nakamura, M; Suzuki, H; Takahashi, H; Yamazaki, M; Saito, K

    2016-01-01

    The study of the transcriptome, the entire pool of transcripts in an organism or in single cells at a certain physiological or pathological stage, is indispensable in unraveling the connection and regulation between DNA and protein. Before the advent of deep sequencing, the microarray was the main approach to handling transcripts. Despite obvious shortcomings, including limited dynamic range and difficulty in comparing results from distinct experiments, microarrays were widely applied. During the past decade, next-generation sequencing (NGS) has revolutionized our understanding of genomics in a fast, high-throughput, cost-effective, and tractable manner. The adoption of NGS has profoundly enhanced the efficiency and outcomes of efforts to elucidate the genes responsible for producing active compounds in medicinal plants. The whole process involves several steps, from plant material sampling, to cDNA library preparation, to deep sequencing, after which bioinformatics takes over to assemble enormous, yet fragmentary, data from which to comb and extract information. The unprecedentedly rapid development of such technologies provides so many choices to facilitate the task that choosing a suitable methodology for a specific purpose can be confusing. Here, we review the general approaches for deep transcriptome analysis and then focus on their application in discovering biosynthetic pathways of medicinal plants that produce important secondary metabolites. PMID:27480681

  8. Space Discovery.

    ERIC Educational Resources Information Center

    Blackman, Joan

    1998-01-01

    Describes one teacher's experience taking Space Discovery courses that were sponsored by the United States Space Foundation (USSF). These courses examine the history of space science, theory of orbits and rocketry, the effects of living in outer space on humans, and space weather. (DDR)

  9. PosMed (Positional Medline): prioritizing genes with an artificial neural network comprising medical documents to accelerate positional cloning.

    PubMed

    Yoshida, Yuko; Makita, Yuko; Heida, Naohiko; Asano, Satomi; Matsushima, Akihiro; Ishii, Manabu; Mochizuki, Yoshiki; Masuya, Hiroshi; Wakana, Shigeharu; Kobayashi, Norio; Toyoda, Tetsuro

    2009-07-01

    PosMed (http://omicspace.riken.jp/) prioritizes candidate genes for positional cloning by employing our original database search engine GRASE, which uses an inferential process similar to an artificial neural network comprising documental neurons (or 'documentrons') that represent each document contained in databases such as MEDLINE and OMIM. Given a user-specified query, PosMed initially performs a full-text search of each documentron in the first-layer artificial neurons and then calculates the statistical significance of the connections between the hit documentrons and the second-layer artificial neurons representing each gene. When a chromosomal interval(s) is specified, PosMed explores the second-layer and third-layer artificial neurons representing genes within the chromosomal interval by evaluating the combined significance of the connections from the hit documentrons to the genes. PosMed is, therefore, a powerful tool that immediately ranks the candidate genes by connecting phenotypic keywords to the genes through connections representing not only gene-gene interactions but also other biological interactions (e.g. metabolite-gene, mutant mouse-gene, drug-gene, disease-gene and protein-protein interactions) and ortholog data. By utilizing orthologous connections, PosMed facilitates the ranking of human genes based on evidence found in other model species such as mouse. Currently, PosMed, an artificial superbrain that has learned a vast amount of biological knowledge ranging from genomes to phenomes (or 'omic space'), supports the prioritization of positional candidate genes in humans, mouse, rat and Arabidopsis thaliana.
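
    The two-layer ranking idea described above (query to documents, then documents to genes) can be illustrated with a toy sketch. This is not PosMed's or GRASE's actual implementation: the hypergeometric enrichment test, the substring search, and all names (`rank_genes`, `DOCS`, `LINKS`, `gene_a`, `gene_b`) are assumptions chosen only for illustration.

```python
from math import comb


def hypergeom_tail(total_docs, hit_docs, gene_docs, overlap):
    """P(at least `overlap` of a gene's documents are hits) under random linkage.

    Assumed stand-in for the 'statistical significance of the connections';
    the abstract does not say which statistic PosMed uses.
    """
    upper = min(hit_docs, gene_docs)
    favourable = sum(
        comb(hit_docs, i) * comb(total_docs - hit_docs, gene_docs - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(total_docs, gene_docs)


def rank_genes(query, documents, gene_links):
    """Rank genes by how strongly their linked documents overlap the query hits."""
    # First layer: naive full-text search over every document.
    hits = {doc_id for doc_id, text in documents.items()
            if query.lower() in text.lower()}
    # Second layer: score each gene by the enrichment of hits among its documents.
    scored = [
        (gene, hypergeom_tail(len(documents), len(hits),
                              len(linked), len(hits & linked)))
        for gene, linked in gene_links.items()
    ]
    return sorted(scored, key=lambda pair: pair[1])  # smallest p-value first


# Hypothetical toy corpus: 10 documents, 3 of which mention the keyword.
DOCS = {
    f"d{i}": "toy abstract mentioning ataxia" if i <= 3
    else "toy abstract on another topic"
    for i in range(1, 11)
}
# Hypothetical document-to-gene links.
LINKS = {"gene_a": {"d1", "d2"}, "gene_b": {"d4", "d5"}}

ranking = rank_genes("ataxia", DOCS, LINKS)
```

    In this toy corpus `gene_a` ranks first because both of its linked documents match the query, whereas none of the documents linked to `gene_b` do. Restricting `gene_links` to genes inside a chromosomal interval would mimic the positional filtering step.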

  10. Discovery of potential new gene variants and inflammatory cytokine associations with fibromyalgia syndrome by whole exome sequencing.

    PubMed

    Feng, Jinong; Zhang, Zhifang; Wu, Xiwei; Mao, Allen; Chang, Frances; Deng, Xutao; Gao, Harry; Ouyang, Ching; Dery, Kenneth J; Le, Keith; Longmate, Jeffrey; Marek, Claudia; St Amand, R Paul; Krontiris, Theodore G; Shively, John E

    2013-01-01

    Fibromyalgia syndrome (FMS) is a chronic musculoskeletal pain disorder affecting 2% to 5% of the general population. Both genetic and environmental factors may be involved. To ascertain in an unbiased manner which genes play a role in the disorder, we performed complete exome sequencing on a subset of FMS patients. Out of 150 nuclear families (trios), DNA from 19 probands was subjected to complete exome sequencing. Since >80,000 SNPs were found per proband, the data were further filtered, including analysis of those with stop codons, a rare frequency (<2.5%) in the 1000 Genomes database, and presence in at least 2/19 probands sequenced. Two nonsense mutations, W32X in C11orf40 and Q100X in ZNF77, among 150 FMS trios had a significantly elevated frequency of transmission to affected probands (p = 0.026 and p = 0.032, respectively) and were present in a subset of 13% and 11% of FMS patients, respectively. Among 9 patients bearing more than one of the variants we have described, 4 had onset of symptoms between the ages of 10 and 18. The subset with the C11orf40 mutation had elevated plasma levels of the inflammatory cytokines, MCP-1 and IP-10, compared with unaffected controls or FMS patients with the wild-type allele. Similarly, patients with the ZNF77 mutation have elevated levels of the inflammatory cytokine, IL-12, compared with controls or patients with the wild type allele. Our results strongly implicate an inflammatory basis for FMS, as well as specific cytokine dysregulation, in at least 35% of our FMS cohort.
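
    The three filtering criteria named in this abstract (stop codon, frequency below 2.5% in the 1000 Genomes database, presence in at least 2 of the sequenced probands) can be sketched as a simple predicate. The record layout, the `stop_gained` label, and the example variants below are hypothetical and are not taken from the study's pipeline or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical per-variant record; field names are illustrative only."""
    name: str
    consequence: str         # e.g. "stop_gained" or "missense"
    population_freq: float   # allele frequency in a reference panel
    carriers: frozenset      # IDs of sequenced probands carrying the variant


def filter_candidates(variants, max_freq=0.025, min_carriers=2):
    """Keep rare stop-codon variants seen in at least `min_carriers` probands."""
    return [
        v for v in variants
        if v.consequence == "stop_gained"       # introduces a stop codon
        and v.population_freq < max_freq        # rare: below 2.5%
        and len(v.carriers) >= min_carriers     # present in >= 2 probands
    ]


# Made-up example records, not data from the study.
EXAMPLE = [
    Variant("variant_a", "stop_gained", 0.010, frozenset({"p01", "p07"})),
    Variant("variant_b", "stop_gained", 0.200, frozenset({"p02", "p03"})),  # too common
    Variant("variant_c", "missense", 0.001, frozenset({"p04", "p05"})),     # no stop codon
    Variant("variant_d", "stop_gained", 0.005, frozenset({"p06"})),         # one carrier only
]

kept = [v.name for v in filter_candidates(EXAMPLE)]
```

    Only `variant_a` passes all three criteria here. The abstract says the filtering "included" these criteria, so the real analysis may have applied further steps not shown in this sketch.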

  11. Developmental gene discovery in a hemimetabolous insect: de novo assembly and annotation of a transcriptome for the cricket Gryllus bimaculatus.

    PubMed

    Zeng, Victor; Ewen-Campen, Ben; Horch, Hadley W; Roth, Siegfried; Mito, Taro; Extavour, Cassandra G

    2013-01-01

    Most genomic resources available for insects represent the Holometabola, which are insects that undergo complete metamorphosis like beetles and flies. In contrast, the Hemimetabola (direct developing insects), representing the basal branches of the insect tree, have very few genomic resources. We have therefore created a large and publicly available transcriptome for the hemimetabolous insect Gryllus bimaculatus (cricket), a well-developed laboratory model organism whose potential for functional genetic experiments is currently limited by the absence of genomic resources. cDNA was prepared using mRNA obtained from adult ovaries containing all stages of oogenesis, and from embryo samples on each day of embryogenesis. Using 454 Titanium pyrosequencing, we sequenced over four million raw reads, and assembled them into 21,512 isotigs (predicted transcripts) and 120,805 singletons with an average coverage per base pair of 51.3. We annotated the transcriptome manually for over 400 conserved genes involved in embryonic patterning, gametogenesis, and signaling pathways. BLAST comparison of the transcriptome against the NCBI non-redundant protein database (nr) identified significant similarity to nr sequences for 55.5% of transcriptome sequences, and suggested that the transcriptome may contain 19,874 unique transcripts. For predicted transcripts without significant similarity to known sequences, we assessed their similarity to other orthopteran sequences, and determined that these transcripts contain recognizable protein domains, largely of unknown function. We created a searchable, web-based database to allow public access to all raw, assembled and annotated data. This database is to our knowledge the largest de novo assembled and annotated transcriptome resource available for any hemimetabolous insect. We therefore anticipate that these data will contribute significantly to more effective and higher-throughput deployment of molecular analysis tools in Gryllus.

  12. Discovery of precursor and mature microRNAs and their putative gene targets using high-throughput sequencing in pineapple (Ananas comosus var. comosus).

    PubMed

    Yusuf, Noor Hydayaty Md; Ong, Wen Dee; Redwan, Raimi Mohamed; Latip, Mariam Abd; Kumar, S Vijay

    2015-10-15

    MicroRNAs (miRNAs) are a class of small, endogenous non-coding RNAs that negatively regulate gene expression, resulting in the silencing of target mRNA transcripts through mRNA cleavage or translational inhibition. MiRNAs play significant roles in various biological and physiological processes in plants. However, the miRNA-mediated gene regulatory network in pineapple, the model tropical non-climacteric fruit, remains largely unexplored. Here, we report a complete list of pineapple mature miRNAs obtained from high-throughput small RNA sequencing and precursor miRNAs (pre-miRNAs) obtained from ESTs. Two small RNA libraries were constructed from pineapple fruits and leaves, respectively, using Illumina's Solexa technology. Sequence similarity analysis using miRBase revealed 579,179 reads homologous to 153 miRNAs from 41 miRNA families. In addition, a pineapple fruit transcriptome library consisting of approximately 30,000 EST contigs constructed using Solexa sequencing was used for the discovery of pre-miRNAs. In all, four pre-miRNAs were identified (MIR156, MIR399, MIR444 and MIR2673). Furthermore, the same pineapple transcriptome was used to dissect the function of the miRNAs in pineapple by predicting their putative targets in conjunction with their regulatory networks. In total, 23 metabolic pathways were found to be regulated by miRNAs in pineapple. The use of high-throughput sequencing in pineapples to unveil the presence of miRNAs and their regulatory pathways provides insight into the repertoire of miRNA regulation used exclusively in this non-climacteric model plant.

  13. Systems biology discoveries using non-human primate pluripotent stem and germ cells: novel gene and genomic imprinting interactions as well as unique expression patterns.

    PubMed

    Ben-Yehudah, Ahmi; Easley, Charles A; Hermann, Brian P; Castro, Carlos; Simerly, Calvin; Orwig, Kyle E; Mitalipov, Shoukhrat; Schatten, Gerald

    2010-08-05

    The study of pluripotent stem cells has generated much interest in both biology and medicine. Understanding the fundamentals of biological decisions, including what permits a cell to maintain pluripotency, that is, its ability to self-renew and thereby remain immortal, or to differentiate into multiple types of cells, is of profound importance. For clinical applications, pluripotent cells, including both embryonic stem cells and adult stem cells, have been proposed for cell replacement therapy for a number of human diseases and disorders, including Alzheimer's, Parkinson's, spinal cord injury and diabetes. One challenge in their usage for such therapies is understanding the mechanisms that allow the maintenance of pluripotency and controlling the specific differentiation into required functional target cells. Because of regulatory restrictions and biological feasibilities, there are many crucial investigations that are just impossible to perform using pluripotent stem cells (PSCs) from humans (for example, direct comparisons among panels of inbred embryonic stem cells from prime embryos obtained from pedigreed and fertile donors; genomic analysis of parent versus progeny PSCs and their identical differentiated tissues; intraspecific chimera analyses for pluripotency testing; and so on). However, PSCs from nonhuman primates are being investigated to bridge these knowledge gaps between discoveries in mice and vital information necessary for appropriate clinical evaluations. In this review, we consider the mRNAs and novel genes with unique expression and imprinting patterns that were discovered using systems biology approaches with primate pluripotent stem and germ cells.

  14. Generalized and specific neurocognitive deficits in psychotic disorders: utility for evaluating pharmacological treatment effects and as intermediate phenotypes for gene discovery.

    PubMed

    Reilly, James L; Sweeney, John A

    2014-05-01

    A growing body of research suggests that schizophrenia and bipolar disorder share overlapping clinical, neurobiological, and genetic features, raising important questions about the boundaries and distinctiveness of these 2 major psychiatric disorders. A generalized cognitive impairment has long been understood to be a core feature of schizophrenia. More recently, it has become apparent that cognitive impairment also occurs in bipolar disorder, particularly in those patients with a history of psychotic symptoms. Whether a generalized deficit exists across a spectrum of psychotic disorders is less clearly established. Additionally, in the context of a broad impairment, it remains a significant challenge to identify deficits in specific cognitive processes that may have distinct neurochemical or regional brain substrates and linkages to particular risk-associated genetic factors. In this article, we review the findings from neuropsychological studies across a spectrum that includes schizophrenia, schizoaffective and bipolar disorders, and conclude the available evidence strongly supports that a generalized deficit is present across psychotic disorders that differs in severity more so than form. We then consider the implications of generalized and specific deficits in psychosis for 2 areas of research--the evaluation of pharmacological treatments targeting cognitive deficits, and the investigation of cognitive intermediate phenotypes in family genetic studies. Examples from the literature that touch on the relevance of the generalized deficit in these contexts are provided, as well as consideration for the continued need to identify specific impairments that are separable from the generalized deficit in order to advance drug and gene discovery.

  15. Accelerated rates of protein evolution in barley grain and pistil biased genes might be legacy of domestication.

    PubMed

    Shi, Tao; Dimitrov, Ivan; Zhang, Yinling; Tax, Frans E; Yi, Jing; Gou, Xiaoping; Li, Jia

    2015-10-01

    Traits related to grain and reproductive organs in grass crops have been under continuous directional selection during domestication. Barley is one of the oldest domesticated crops in human history. Thus genes associated with the grain and reproductive organs in barley may show evidence of dramatic evolutionary change. To understand how artificial selection contributes to protein evolution of biased genes in different barley organs, we used Digital Gene Expression analysis of six barley organs (grain, pistil, anther, leaf, stem and root) to identify genes with biased expression in specific organs. Pairwise comparisons of orthologs between barley and Brachypodium distachyon, as well as between highland and lowland barley cultivars mutually indicated that grain and pistil biased genes show relatively higher protein evolutionary rates compared with the median of all orthologs and other organ biased genes. Lineage-specific protein evolutionary rates estimation showed similar patterns with elevated protein evolution in barley grain and pistil biased genes, yet protein sequences generally evolve much faster in the lowland barley cultivar. Further functional annotations revealed that some of these grain and pistil biased genes with rapid protein evolution are related to nutrient biosynthesis and cell cycle/division. Our analyses provide insights into how domestication differentially shaped the evolution of genes specific to different organs of a crop species, and implications for future functional studies of domestication genes.

  16. Can Accelerators Accelerate Learning?

    NASA Astrophysics Data System (ADS)

    Santos, A. C. F.; Fonseca, P.; Coelho, L. F. S.

    2009-03-01

    The 'Young Talented' education program developed by the Brazilian State Funding Agency (FAPERJ) [1] makes it possible for students from public high schools to perform activities in scientific laboratories. In the Atomic and Molecular Physics Laboratory at Federal University of Rio de Janeiro (UFRJ), the students are confronted with modern research tools like the 1.7 MV ion accelerator. Being a user-friendly machine, the accelerator is easily manageable by the students, who can perform simple hands-on activities, stimulating interest in physics, and getting the students close to modern laboratory techniques.

  17. Can Accelerators Accelerate Learning?

    SciTech Connect

    Santos, A. C. F.; Fonseca, P.; Coelho, L. F. S.

    2009-03-10

    The 'Young Talented' education program developed by the Brazilian State Funding Agency (FAPERJ) [1] makes it possible for students from public high schools to perform activities in scientific laboratories. In the Atomic and Molecular Physics Laboratory at Federal University of Rio de Janeiro (UFRJ), the students are confronted with modern research tools like the 1.7 MV ion accelerator. Being a user-friendly machine, the accelerator is easily manageable by the students, who can perform simple hands-on activities, stimulating interest in physics, and getting the students close to modern laboratory techniques.

  18. Silencing SlELP2L, a tomato Elongator complex protein 2-like gene, inhibits leaf growth, accelerates leaf, sepal senescence, and produces dark-green fruit.

    PubMed

    Zhu, Mingku; Li, Yali; Chen, Guoping; Ren, Lijun; Xie, Qiaoli; Zhao, Zhiping; Hu, Zongli

    2015-01-09

    The multi-subunit complex Elongator interacts with elongating RNA polymerase II (RNAPII) and is thought to facilitate transcription through histone acetylation. Elongator is highly conserved in eukaryotes, yet has multiple kingdom-specific functions in diverse organisms. Recent genetic studies performed in Arabidopsis have demonstrated that Elongator functions in plant growth and development, and in response to biotic and abiotic stress. However, little is known about its roles in other plant species. Here, we study the function of an Elongator complex protein 2-like gene in tomato, here designated as SlELP2L, through RNAi-mediated gene silencing. Silencing SlELP2L in tomato inhibits leaf growth, accelerates leaf and sepal senescence, and produces dark-green fruit with reduced GA and IAA contents in leaves, and increased chlorophyll accumulation in pericarps. Gene expression analysis indicated that SlELP2L-silenced plants had reduced transcript levels of ethylene- and ripening-related genes during fruit ripening with slightly decreased carotenoid content in fruits, while the expression of DNA methyltransferase genes was up-regulated, indicating that SlELP2L may modulate DNA methylation in tomato. Besides, silencing SlELP2L increases ABA sensitivity in inhibiting seedling growth. These results suggest that SlELP2L plays important roles in regulating plant growth and development, as well as in response to ABA in tomato.

  19. Lineage-specific duplications of Muroidea Faim and Spag6 genes and atypical accelerated evolution of the parental Spag6 gene.

    PubMed

    Qiu, Huan; Gołas, Aniela; Grzmil, Paweł; Wojnowski, Leszek

    2013-09-01

    Gene duplications restricted to a single lineage, combined with an asymmetric evolution of the resulting genes, may play particularly important roles in this lineage's biology. We searched for and identified asymmetrical evolution in nine gene families that duplicated exclusively in rodents and are present as single copies in human, dog, cow, elephant, opossum, chicken, lizard, and Western clawed frog. Among those nine gene families are Fas apoptosis inhibitory molecule (Faim), implicated in apoptosis, and Sperm antigen 6 (Spag6), implicated in sperm mobility. Both genes were duplicated in or before the Muroidea ancestor. Due to the highly asymmetric evolution of the resulting paralogs, the existence of these duplications had been previously overlooked. Interestingly, Spag6, previously regarded and characterized as a single-copy ortholog of human Spag6, turns out to be a Muroidea-specific paralog. Conversely, the newly identified, highly divergent Spag6-BC061194 is in fact the parental gene. In consequence, this gene represents a rare exception to the general rule of rapid evolution of derived rather than parental genes following gene duplication. Unusual genes such as murine Spag6 may help to understand which mechanisms are responsible for this rule.

  20. Cosmic Discovery

    NASA Astrophysics Data System (ADS)

    Harwit, Martin

    1984-04-01

    In the remarkable opening section of this book, a well-known Cornell astronomer gives precise thumbnail histories of the 43 basic cosmic discoveries - stars, planets, novae, pulsars, comets, gamma-ray bursts, and the like - that form the core of our knowledge of the universe. Many of them, he points out, were made accidentally and outside the mainstream of astronomical research and funding. This observation leads him to speculate on how many more major phenomena there might be and how they might be most effectively sought out in a field now dominated by large instruments and complex investigative modes and observational conditions. The book also examines discovery in terms of its political, financial, and sociological context - the role of new technologies and of industry and the military in revealing new knowledge; and methods of funding, of peer review, and of allotting time on our largest telescopes. It concludes with specific recommendations for organizing astronomy in ways that will best lead to the discovery of the many - at least sixty - phenomena that Harwit estimates are still waiting to be found.

  1. Discovery and molecular mapping of a new gene conferring resistance to stem rust, Sr53, derived from Aegilops geniculata and characterization of spontaneous translocation stocks with reduced alien chromatin

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This study reports the discovery and molecular mapping of a resistance gene effective against stem rust races RKQQC and TTKSK (Ug99) derived from Aegilops geniculata (2n=4x=28, UgUgMgMg). Two populations from the crosses TA5599 (T5DL-5MgL.5MgS)/TA3809 (ph1b mutant in Chinese Spring background) and T...

  2. Differences in saccharin preference and genetic alterations of the Tas1r3 gene among senescence-accelerated mouse strains and their parental AKR/J strain.

    PubMed

    Niimi, Kimie; Takahashi, Eiki

    2014-05-10

    The senescence-accelerated mouse (SAM) is used as an animal model of senescence acceleration and age-associated disorders. SAM is derived from unexpected crosses between the AKR/J and unknown mouse strains. There are nine senescence-prone (SAMP) strains and three senescence-resistant (SAMR) strains. Although SAMP strains exhibit strain-specific and age-related pathological changes, the genes responsible for the pathologic changes in SAMP strains have not been comprehensively identified. In the present study, we evaluated sweet taste perception using the two-bottle test. We compared genotypes of the taste related gene, Tas1r3, using SAM strains and the parental AKR/J strain. The two-bottle test revealed that SAMR1 (R1), SAMP6 (P6), SAMP8 (P8), and SAMP10 (P10) mice were saccharin-preferring strains, whereas AKR/J did not prefer saccharin. All genotypes of the R1, P6, P8, and P10 strains at the polymorphic sites in Tas1r3, which is known to influence saccharin preference, were identical to those of C57BL6/J, a well-known saccharin-preferring strain, and were completely different from those of the parental AKR/J strain. These genetic alterations in SAM strains appear to arise from an unknown strain that is thought to have been crossed with AKR/J initially. PMID:24726396

  3. A comparison of mutations induced by accelerated iron particles versus those induced by low earth orbit space radiation in the FEM-3 gene of Caenorhabditis elegans.

    PubMed

    Hartman, P S; Hlavacek, A; Wilde, H; Lewicki, D; Schubert, W; Kern, R G; Kazarians, G A; Benton, E V; Benton, E R; Nelson, G A

    2001-03-01

    The fem-3 gene of Caenorhabditis elegans was employed to determine the mutation frequency as well as the nature of mutations induced by low earth orbit space radiation ambient to Space Shuttle flight STS-76. Recovered mutations were compared to those induced by accelerated iron ions generated by the AGS synchrotron accelerator at Brookhaven National Laboratory. For logistical reasons, dauer larvae were prepared at TCU, transported to either Kennedy Space Center or Brookhaven National Laboratory, flown in space or irradiated, returned to TCU and screened for mutants. A total of 25 fem-3 mutants were recovered after the shuttle flight and yielded a mutation frequency of 2.1 × 10^-5, roughly 3.3-fold higher than the spontaneous rate of 6.3 × 10^-6. Four of the mutations were homozygous inviable, suggesting that they were large deletions encompassing fem-3 as well as neighboring, essential genes. Southern blot analyses revealed that one of the 25 contained a polymorphism in fem-3, further evidence that space radiation can induce deletions. While no polymorphisms were detected among the iron ion-induced mutations, three of the 15 mutants were homozygous inviable, which is in keeping with previous observations that high LET iron particles generate deficiencies. These data provide evidence, albeit indirect, that an important mutagenic component of ambient space radiation is high LET charged particles such as iron ions.
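
    As a quick check of the arithmetic in this abstract, the fold increase follows from dividing the post-flight mutation frequency by the spontaneous rate. The sketch below uses only the two frequencies quoted above; the function name and variable names are illustrative assumptions, not part of the study.

```python
def fold_change(observed: float, baseline: float) -> float:
    """Return how many times larger `observed` is than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline frequency must be positive")
    return observed / baseline


# Frequencies quoted in the abstract (mutants per screened genome).
flight_frequency = 2.1e-5       # after Space Shuttle flight STS-76
spontaneous_frequency = 6.3e-6  # spontaneous rate

ratio = fold_change(flight_frequency, spontaneous_frequency)
print(f"{ratio:.1f}-fold higher than the spontaneous rate")  # prints "3.3-fold higher than the spontaneous rate"
```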

  4. A comparison of mutations induced by accelerated iron particles versus those induced by low earth orbit space radiation in the FEM-3 gene of Caenorhabditis elegans

    NASA Technical Reports Server (NTRS)

    Hartman, P. S.; Hlavacek, A.; Wilde, H.; Lewicki, D.; Schubert, W.; Kern, R. G.; Kazarians, G. A.; Benton, E. V.; Benton, E. R.; Nelson, G. A.

    2001-01-01

    The fem-3 gene of Caenorhabditis elegans was employed to determine the mutation frequency as well as the nature of mutations induced by low earth orbit space radiation ambient to Space Shuttle flight STS-76. Recovered mutations were compared to those induced by accelerated iron ions generated by the AGS synchrotron accelerator at Brookhaven National Laboratory. For logistical reasons, dauer larvae were prepared at TCU, transported to either Kennedy Space Center or Brookhaven National Laboratory, flown in space or irradiated, returned to TCU and screened for mutants. A total of 25 fem-3 mutants were recovered after the shuttle flight and yielded a mutation frequency of 2.1 × 10^-5, roughly 3.3-fold higher than the spontaneous rate of 6.3 × 10^-6. Four of the mutations were homozygous inviable, suggesting that they were large deletions encompassing fem-3 as well as neighboring, essential genes. Southern blot analyses revealed that one of the 25 contained a polymorphism in fem-3, further evidence that space radiation can induce deletions. While no polymorphisms were detected among the iron ion-induced mutations, three of the 15 mutants were homozygous inviable, which is in keeping with previous observations that high LET iron particles generate deficiencies. These data provide evidence, albeit indirect, that an important mutagenic component of ambient space radiation is high LET charged particles such as iron ions.

  5. Differences in saccharin preference and genetic alterations of the Tas1r3 gene among senescence-accelerated mouse strains and their parental AKR/J strain.

    PubMed

    Niimi, Kimie; Takahashi, Eiki

    2014-05-10

    The senescence-accelerated mouse (SAM) is used as an animal model of senescence acceleration and age-associated disorders. SAM is derived from unexpected crosses between the AKR/J and unknown mouse strains. There are nine senescence-prone (SAMP) strains and three senescence-resistant (SAMR) strains. Although SAMP strains exhibit strain-specific and age-related pathological changes, the genes responsible for the pathologic changes in SAMP strains have not been comprehensively identified. In the present study, we evaluated sweet taste perception using the two-bottle test. We compared genotypes of the taste related gene, Tas1r3, using SAM strains and the parental AKR/J strain. The two-bottle test revealed that SAMR1 (R1), SAMP6 (P6), SAMP8 (P8), and SAMP10 (P10) mice were saccharin-preferring strains, whereas AKR/J did not prefer saccharin. All genotypes of the R1, P6, P8, and P10 strains at the polymorphic sites in Tas1r3, which is known to influence saccharin preference, were identical to those of C57BL6/J, a well-known saccharin-preferring strain, and were completely different from those of the parental AKR/J strain. These genetic alterations in SAM strains appear to arise from an unknown strain that is thought to have been crossed with AKR/J initially.

  6. Surface functionalization of inorganic nano-crystals with fibronectin and E-cadherin chimera synergistically accelerates trans-gene delivery into embryonic stem cells.

    PubMed

    Kutsuzawa, K; Chowdhury, E H; Nagaoka, M; Maruyama, K; Akiyama, Y; Akaike, T

    2006-11-24

    Stem cells holding great promises in regenerative medicine have the potential to be differentiated to a specific cell type through genetic manipulation. However, conventional ways of gene transfer to such progenitor cells suffer from a number of disadvantages particularly involving safety and efficacy issues. Here, we report on the development of a bio-functionalized inorganic nano-carrier of DNA by embedding fibronectin and E-cadherin chimera on the carrier, leading to its high affinity interactions with embryonic stem cell surface and accelerated trans-gene delivery for subsequent expression. While only apatite nano-particles were very inefficient in transfecting embryonic stem cells, fibronectin-anchored particles and to a more significant extent, fibronectin and E-cadherin-Fc-associated particles dramatically enhanced trans-gene delivery with a value notably higher than that of commercially available lipofection system. The involvement of both cell surface integrin and E-cadherin in mediating intracellular localization of the hybrid carrier was verified by blocking integrin binding site with excess free fibronectin and up-regulating both integrin and E-cadherin through PKC activation. Thus, the new establishment of a bio-functional hybrid gene-carrier would promote and facilitate development of stem cell-based therapy in regenerative medicine.

  7. Surface functionalization of inorganic nano-crystals with fibronectin and E-cadherin chimera synergistically accelerates trans-gene delivery into embryonic stem cells

    SciTech Connect

    Kutsuzawa, K.; Chowdhury, E.H.; Nagaoka, M.; Maruyama, K.; Akiyama, Y.; Akaike, T. (E-mail: takaike@bio.titech.ac.jp)

    2006-11-24

    Stem cells holding great promises in regenerative medicine have the potential to be differentiated to a specific cell type through genetic manipulation. However, conventional ways of gene transfer to such progenitor cells suffer from a number of disadvantages particularly involving safety and efficacy issues. Here, we report on the development of a bio-functionalized inorganic nano-carrier of DNA by embedding fibronectin and E-cadherin chimera on the carrier, leading to its high affinity interactions with embryonic stem cell surface and accelerated trans-gene delivery for subsequent expression. While only apatite nano-particles were very inefficient in transfecting embryonic stem cells, fibronectin-anchored particles and to a more significant extent, fibronectin and E-cadherin-Fc-associated particles dramatically enhanced trans-gene delivery with a value notably higher than that of commercially available lipofection system. The involvement of both cell surface integrin and E-cadherin in mediating intracellular localization of the hybrid carrier was verified by blocking integrin binding site with excess free fibronectin and up-regulating both integrin and E-cadherin through PKC activation. Thus, the new establishment of a bio-functional hybrid gene-carrier would promote and facilitate development of stem cell-based therapy in regenerative medicine.

  8. Characterization of Capsicum annuum genetic diversity and population structure based on parallel polymorphism discovery with a 30K unigene Pepper GeneChip.

    PubMed

    Hill, Theresa A; Ashrafi, Hamid; Reyes-Chin-Wo, Sebastian; Yao, JiQiang; Stoffel, Kevin; Truco, Maria-Jose; Kozik, Alexander; Michelmore, Richard W; Van Deynze, Allen

    2013-01-01

    The widely cultivated pepper, Capsicum spp., important as a vegetable and spice crop world-wide, is one of the most diverse crops. To enhance breeding programs, a detailed characterization of Capsicum diversity including morphological, geographical and molecular data is required. Currently, molecular data characterizing Capsicum genetic diversity is limited. The development and application of high-throughput genome-wide markers in Capsicum will facilitate more detailed molecular characterization of germplasm collections, genetic relationships, and the generation of ultra-high density maps. We have developed the Pepper GeneChip® array from Affymetrix for polymorphism detection and expression analysis in Capsicum. Probes on the array were designed from 30,815 unigenes assembled from expressed sequence tags (ESTs). Our array design provides a maximum redundancy of 13 probes per base pair position allowing integration of multiple hybridization values per position to detect single position polymorphism (SPP). Hybridization of genomic DNA from 40 diverse C. annuum lines, used in breeding and research programs, and a representative from three additional cultivated species (C. frutescens, C. chinense and C. pubescens) detected 33,401 SPP markers within 13,323 unigenes. Among the C. annuum lines, 6,426 SPPs covering 3,818 unigenes were identified. An estimated three-fold reduction in diversity was detected in non-pungent compared with pungent lines; however, we were able to detect 251 highly informative markers across these C. annuum lines. In addition, an 8.7 cM region without polymorphism was detected around Pun1 in non-pungent C. annuum. An analysis of genetic relatedness and diversity using the software Structure revealed clustering of the germplasm, which was confirmed with statistical support by principal components analysis (PCA) and phylogenetic analysis. This research demonstrates the effectiveness of parallel high-throughput discovery and application of genome

  9. Characterization of Capsicum annuum Genetic Diversity and Population Structure Based on Parallel Polymorphism Discovery with a 30K Unigene Pepper GeneChip

    PubMed Central

    Hill, Theresa A.; Ashrafi, Hamid; Reyes-Chin-Wo, Sebastian; Yao, JiQiang; Stoffel, Kevin; Truco, Maria-Jose; Kozik, Alexander; Michelmore, Richard W.; Van Deynze, Allen

    2013-01-01

    The widely cultivated pepper, Capsicum spp., important as a vegetable and spice crop world-wide, is one of the most diverse crops. To enhance breeding programs, a detailed characterization of Capsicum diversity including morphological, geographical and molecular data is required. Currently, molecular data characterizing Capsicum genetic diversity is limited. The development and application of high-throughput genome-wide markers in Capsicum will facilitate more detailed molecular characterization of germplasm collections, genetic relationships, and the generation of ultra-high density maps. We have developed the Pepper GeneChip® array from Affymetrix for polymorphism detection and expression analysis in Capsicum. Probes on the array were designed from 30,815 unigenes assembled from expressed sequence tags (ESTs). Our array design provides a maximum redundancy of 13 probes per base pair position allowing integration of multiple hybridization values per position to detect single position polymorphism (SPP). Hybridization of genomic DNA from 40 diverse C. annuum lines, used in breeding and research programs, and a representative from three additional cultivated species (C. frutescens, C. chinense and C. pubescens) detected 33,401 SPP markers within 13,323 unigenes. Among the C. annuum lines, 6,426 SPPs covering 3,818 unigenes were identified. An estimated three-fold reduction in diversity was detected in non-pungent compared with pungent lines; however, we were able to detect 251 highly informative markers across these C. annuum lines. In addition, an 8.7 cM region without polymorphism was detected around Pun1 in non-pungent C. annuum. An analysis of genetic relatedness and diversity using the software Structure revealed clustering of the germplasm, which was confirmed with statistical support by principal components analysis (PCA) and phylogenetic analysis. This research demonstrates the effectiveness of parallel high-throughput discovery and application of genome

  10. Characterization of Capsicum annuum genetic diversity and population structure based on parallel polymorphism discovery with a 30K unigene Pepper GeneChip.

    PubMed

    Hill, Theresa A; Ashrafi, Hamid; Reyes-Chin-Wo, Sebastian; Yao, JiQiang; Stoffel, Kevin; Truco, Maria-Jose; Kozik, Alexander; Michelmore, Richard W; Van Deynze, Allen

    2013-01-01

    The widely cultivated pepper, Capsicum spp., important as a vegetable and spice crop world-wide, is one of the most diverse crops. To enhance breeding programs, a detailed characterization of Capsicum diversity including morphological, geographical and molecular data is required. Currently, molecular data characterizing Capsicum genetic diversity is limited. The development and application of high-throughput genome-wide markers in Capsicum will facilitate more detailed molecular characterization of germplasm collections, genetic relationships, and the generation of ultra-high density maps. We have developed the Pepper GeneChip® array from Affymetrix for polymorphism detection and expression analysis in Capsicum. Probes on the array were designed from 30,815 unigenes assembled from expressed sequence tags (ESTs). Our array design provides a maximum redundancy of 13 probes per base pair position allowing integration of multiple hybridization values per position to detect single position polymorphism (SPP). Hybridization of genomic DNA from 40 diverse C. annuum lines, used in breeding and research programs, and a representative from three additional cultivated species (C. frutescens, C. chinense and C. pubescens) detected 33,401 SPP markers within 13,323 unigenes. Among the C. annuum lines, 6,426 SPPs covering 3,818 unigenes were identified. An estimated three-fold reduction in diversity was detected in non-pungent compared with pungent lines; however, we were able to detect 251 highly informative markers across these C. annuum lines. In addition, an 8.7 cM region without polymorphism was detected around Pun1 in non-pungent C. annuum. An analysis of genetic relatedness and diversity using the software Structure revealed clustering of the germplasm, which was confirmed with statistical support by principal components analysis (PCA) and phylogenetic analysis. This research demonstrates the effectiveness of parallel high-throughput discovery and application of genome

  11. Search for major genes with progeny test data to accelerate the development of genetically superior loblolly pine. Technical progress report

    SciTech Connect

    2000-02-15

    This report details the progress of the three tasks of this project. The tasks are: (1) develop genetic models and analytical methods; (2) molecular confirmation of major gene segregation; and (3) develop strategies for marker-assisted breeding.

  12. Transcriptome Analysis for Identification of Genes Related to Gonad Differentiation, Growth, Immune Response and Marker Discovery in The Turbot (Scophthalmus maximus)

    PubMed Central

    Ma, Deyou; Ma, Aijun; Huang, Zhihui; Wang, Guangning; Wang, Ting; Xia, Dandan; Ma, Benhe

    2016-01-01

    Background: Turbot Scophthalmus maximus is an economically important species extensively aquacultured in China. A genetic selection program is necessary and urgent for the sustainable development of this industry, and it requires more and more genomic background knowledge. Transcriptome sequencing is an excellent alternative way to identify transcripts involved in specific biological processes and to exploit a considerable quantity of molecular markers when no genome sequences are available. In this study, a comprehensive transcript dataset for major tissues of S. maximus was produced on the basis of an Illumina platform. Results: Total RNA was isolated from liver, spleen, kidney, cerebrum, gonad (testis and ovary) and muscle. Equal quantities of RNA from each tissue type were pooled to construct two cDNA libraries (male and female). Using the Illumina paired-end sequencing technology, nearly 44.22 million clean reads of 100 bp in length were generated and then assembled into 106,643 contigs, of which 71,107 were named unigenes with an average length of 892 bp after the elimination of redundancies. Of these, 24,052 unigenes (33.83% of the total) were successfully annotated. GO, KEGG pathway mapping and COG analysis were performed to predict potential genes and their functions. Based on our sequence analysis and published documents, many candidate genes with fundamental roles in sex determination and gonad differentiation (dmrt1), growth (ghrh, myf5, prl/prlr) and immune response (TLR1/TLR21/TLR22, IL-15/IL-34) were identified for the first time in this species. In addition, a large number of credible genetic markers, including 21,192 SSRs and 8,642 SNPs, were identified in the present dataset. Conclusion: This informative transcriptome provides valuable new data to increase genomic resources of Scophthalmus maximus. The future studies of corresponding gene functions will be very useful for the management of reproduction, growth and disease control in turbot aquaculture

  13. Tissue distribution of products of the mouse decay-accelerating factor (DAF) genes. Exploitation of a Daf1 knock-out mouse and site-specific monoclonal antibodies.

    PubMed

    Lin, F; Fukuoka, Y; Spicer, A; Ohta, R; Okada, N; Harris, C L; Emancipator, S N; Medof, M E

    2001-10-01

    Decay-accelerating factor (DAF) is a membrane regulator of C3 activation that protects self cells from autologous complement attack. In humans, DAF is uniformly expressed as a glycosylphosphatidylinositol (GPI)-anchored molecule. In mice, both GPI-anchored and transmembrane-anchored DAF proteins are produced, each of which can be derived from two different genes (Daf1 and Daf2). In this report, we describe a Daf1 gene knock-out mouse arising as the first product of a strategy for targeting one or both Daf genes. As part of the work, we characterize recently described monoclonal antibodies against murine DAF protein using deletion mutants synthesized in yeast, and then employ the monoclonal antibodies in conjunction with wild-type and the Daf1 knock-out mice to determine the tissue distribution of the mouse Daf1 and Daf2 gene products. To enhance the immunohistochemical detection of murine DAF protein, we utilized the sensitive tyramide fluorescence method. In wild-type mice, we found strong DAF labelling of glomeruli, airway and gut epithelium, the spleen, vascular endothelium throughout all tissues, and seminiferous tubules of the testis. In Daf1 knock-out mice, DAF labelling was ablated in most tissues, but strong labelling of the testis and splenic dendritic cells remained. In both sites, reverse transcription-polymerase chain reaction analyses identified both GPI and transmembrane forms of Daf2 gene-derived protein. The results have relevance for studies of in vivo murine DAF function and of murine DAF structure.

  14. Construction and evaluation of normalized cDNA libraries enriched with full-length sequences for rapid discovery of new genes from Sisal (Agave sisalana Perr.) different developmental stages.

    PubMed

    Zhou, Wen-Zhao; Zhang, Yan-Mei; Lu, Jun-Ying; Li, Jun-Feng

    2012-01-01

    To provide a resource of sisal-specific expressed sequence data and facilitate this powerful approach in new gene research, the preparation of normalized cDNA libraries enriched with full-length sequences is necessary. Four libraries were produced with RNA pooled from multiple Agave sisalana tissues, using the SMART™ method and duplex-specific nuclease (DSN), to increase the efficiency of normalization and maximize the number of independent genes. This procedure kept the proportion of full-length cDNAs in the subtracted/normalized libraries and dramatically enhanced the discovery of new genes. Sequencing of 3875 cDNA clones from the libraries revealed 3320 unigenes with an average insert length of about 1.2 kb, indicating that the non-redundancy of the libraries was about 85.7%. These unigene functions were predicted by comparing their sequences to functional domain databases and extensively annotated with Gene Ontology (GO) terms. Comparative analysis of sisal unigenes and other plant genomes revealed that four putative MADS-box genes and a knotted-like homeobox (knox) gene were obtained from a total of 1162 full-length transcripts. Furthermore, real-time PCR showed that the characteristics of their transcripts mainly depended on the tight expression regulation of a number of genes during leaf and flower development. Analysis of individual library sequence data indicated that the pooled-tissue approach was highly effective in discovering new genes and preparing libraries for efficient deep sequencing. PMID:23202944
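
    The non-redundancy figure quoted in this abstract appears to be the share of sequenced clones that represent distinct unigenes. The sketch below reproduces that percentage from the two counts given above; reading non-redundancy as unigenes divided by clones is an assumption about the authors' definition, and the function name is illustrative.

```python
def non_redundancy_percent(unigenes: int, clones_sequenced: int) -> float:
    """Percentage of sequenced clones that correspond to distinct unigenes."""
    if clones_sequenced <= 0:
        raise ValueError("clones_sequenced must be positive")
    if unigenes > clones_sequenced:
        raise ValueError("unigenes cannot exceed clones sequenced")
    return 100.0 * unigenes / clones_sequenced


# Counts quoted in the abstract.
percent = non_redundancy_percent(unigenes=3320, clones_sequenced=3875)
print(f"non-redundancy: {percent:.1f}%")  # prints "non-redundancy: 85.7%"
```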

  15. Construction and evaluation of normalized cDNA libraries enriched with full-length sequences for rapid discovery of new genes from Sisal (Agave sisalana Perr.) different developmental stages.

    PubMed

    Zhou, Wen-Zhao; Zhang, Yan-Mei; Lu, Jun-Ying; Li, Jun-Feng

    2012-10-12

    To provide a resource of sisal-specific expressed sequence data and facilitate this powerful approach in new gene research, the preparation of normalized cDNA libraries enriched with full-length sequences is necessary. Four libraries were produced with RNA pooled from multiple Agave sisalana tissues, using the SMART™ method and duplex-specific nuclease (DSN), to increase the efficiency of normalization and maximize the number of independent genes. This procedure kept the proportion of full-length cDNAs in the subtracted/normalized libraries and dramatically enhanced the discovery of new genes. Sequencing of 3875 cDNA clones from the libraries revealed 3320 unigenes with an average insert length of about 1.2 kb, indicating that the non-redundancy of the libraries was about 85.7%. These unigene functions were predicted by comparing their sequences to functional domain databases and extensively annotated with Gene Ontology (GO) terms. Comparative analysis of sisal unigenes and other plant genomes revealed that four putative MADS-box genes and a knotted-like homeobox (knox) gene were obtained from a total of 1162 full-length transcripts. Furthermore, real-time PCR showed that the characteristics of their transcripts mainly depended on the tight expression regulation of a number of genes during leaf and flower development. Analysis of individual library sequence data indicated that the pooled-tissue approach was highly effective in discovering new genes and preparing libraries for efficient deep sequencing.

  16. Construction and Evaluation of Normalized cDNA Libraries Enriched with Full-Length Sequenc