comparative genomic profiling: Topics by Science.gov

Sample records for comparative genomic profiling

Assigning protein functions by comparative genome analysis protein phylogenetic profiles

DOEpatents

Pellegrini, Matteo; Marcotte, Edward M.; Thompson, Michael J.; Eisenberg, David; Grothe, Robert; Yeates, Todd O.

2003-05-13

A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequences of two or more organisms to create a phylogenetic profile for each protein indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method, a combination of the above two methods is used to predict functional links.
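
The phylogenetic-profile idea in the abstract above can be illustrated with a minimal sketch. It is not the patented method: the genome names, family names, and the exact-match (Hamming distance) linking rule are all assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical input: genome name -> set of protein families detected in it.
genomes = {
    "genome_1": {"famA", "famB", "famC"},
    "genome_2": {"famA", "famB"},
    "genome_3": {"famC"},
    "genome_4": {"famA", "famB", "famD"},
}

def build_profiles(genomes):
    """Return {family: tuple of 0/1 flags}, one flag per genome in sorted name order."""
    names = sorted(genomes)
    families = sorted(set().union(*genomes.values()))
    return {f: tuple(int(f in genomes[g]) for g in names) for f in families}

def hamming(p, q):
    """Number of genomes in which two profiles disagree."""
    return sum(a != b for a, b in zip(p, q))

def linked_pairs(profiles, max_distance=0):
    """Pairs of families whose profiles differ in at most max_distance genomes."""
    return [
        (f, g)
        for f, g in combinations(sorted(profiles), 2)
        if hamming(profiles[f], profiles[g]) <= max_distance
    ]

profiles = build_profiles(genomes)
print(profiles["famA"])        # presence/absence of famA across the four genomes
print(linked_pairs(profiles))  # candidate functional links
```

In this toy data, famA and famB occur in exactly the same genomes, so they are the only pair reported as a candidate functional link.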
Comparative genomic analysis of Lactobacillus plantarum ZJ316 reveals its genetic adaptation and potential probiotic profiles

PubMed Central

Li, Ping; Li, Xuan; Gu, Qing; Lou, Xiu-yu; Zhang, Xiao-mei; Song, Da-feng; Zhang, Chen

2016-01-01

Objective: In previous studies, Lactobacillus plantarum ZJ316 showed probiotic properties, such as antimicrobial activity against various pathogens and the capacity to significantly improve pig growth and pork quality. The purpose of this study was to reveal the genes potentially related to its genetic adaptation and probiotic profiles based on comparative genomic analysis. Methods: The genome sequence of L. plantarum ZJ316 was compared with those of eight L. plantarum strains deposited in GenBank. BLASTN, Mauve, and MUMmer programs were used for genome alignment and comparison. CRISPRFinder was applied for searching the clustered regularly interspaced short palindromic repeats (CRISPRs). Results: We identified genes that encode proteins related to genetic adaptation and probiotic profiles, including carbohydrate transport and metabolism, proteolytic enzyme systems and amino acid biosynthesis, CRISPR adaptive immunity, stress responses, bile salt resistance, ability to adhere to the host intestinal wall, exopolysaccharide (EPS) biosynthesis, and bacteriocin biosynthesis. Conclusions: Comparative characterization of the L. plantarum ZJ316 genome provided the genetic basis for further elucidating the functional mechanisms of its probiotic properties. ZJ316 could be considered a potential probiotic candidate. PMID:27487802
Comparative genomic analysis of Lactobacillus plantarum ZJ316 reveals its genetic adaptation and potential probiotic profiles.

PubMed

Li, Ping; Li, Xuan; Gu, Qing; Lou, Xiu-Yu; Zhang, Xiao-Mei; Song, Da-Feng; Zhang, Chen

2016-08-01

In previous studies, Lactobacillus plantarum ZJ316 showed probiotic properties, such as antimicrobial activity against various pathogens and the capacity to significantly improve pig growth and pork quality. The purpose of this study was to reveal the genes potentially related to its genetic adaptation and probiotic profiles based on comparative genomic analysis. The genome sequence of L. plantarum ZJ316 was compared with those of eight L. plantarum strains deposited in GenBank. BLASTN, Mauve, and MUMmer programs were used for genome alignment and comparison. CRISPRFinder was applied for searching the clustered regularly interspaced short palindromic repeats (CRISPRs). We identified genes that encode proteins related to genetic adaptation and probiotic profiles, including carbohydrate transport and metabolism, proteolytic enzyme systems and amino acid biosynthesis, CRISPR adaptive immunity, stress responses, bile salt resistance, ability to adhere to the host intestinal wall, exopolysaccharide (EPS) biosynthesis, and bacteriocin biosynthesis. Comparative characterization of the L. plantarum ZJ316 genome provided the genetic basis for further elucidating the functional mechanisms of its probiotic properties. ZJ316 could be considered a potential probiotic candidate.
GEMINI: a computationally-efficient search engine for large gene expression datasets.

PubMed

DeFreitas, Timothy; Saddiki, Hachem; Flaherty, Patrick

2016-02-24

Low-cost DNA sequencing allows organizations to accumulate massive amounts of genomic data and use that data to answer a diverse range of research questions. Presently, users must search for relevant genomic data using a keyword, accession number or meta-data tag. However, in this search paradigm the form of the query - a text-based string - is mismatched with the form of the target - a genomic profile. To improve access to massive genomic data resources, we have developed a fast search engine, GEMINI, that uses a genomic profile as a query to search for similar genomic profiles. GEMINI implements a nearest-neighbor search algorithm using a vantage-point tree to store a database of n profiles and in certain circumstances achieves an [Formula: see text] expected query time in the limit. We tested GEMINI on breast and ovarian cancer gene expression data from The Cancer Genome Atlas project and show that it achieves a query time that scales as the logarithm of the number of records in practice on genomic data. In a database with 10^5 samples, GEMINI identifies the nearest neighbor in 0.05 sec compared to a brute force search time of 0.6 sec. GEMINI is a fast search engine that uses a query genomic profile to search for similar profiles in a very large genomic database. It enables users to identify similar profiles independent of sample label, data origin or other meta-data information.
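
To make the vantage-point tree search mentioned above concrete, here is a minimal sketch of the general technique. It is not GEMINI's code: the Euclidean metric, the choice of the first point as vantage point, the random 5-dimensional profiles, and all names are assumptions for illustration.

```python
import math
import random

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class VPNode:
    def __init__(self, point, radius, inside, outside):
        self.point = point      # vantage point stored at this node
        self.radius = radius    # median distance from the vantage point
        self.inside = inside    # subtree of points closer than radius
        self.outside = outside  # subtree of points at radius or beyond

def build_vp_tree(points, dist=euclidean):
    """Recursively split points by their distance to a vantage point."""
    if not points:
        return None
    vantage, rest = points[0], points[1:]  # assumption: first point is the vantage point
    if not rest:
        return VPNode(vantage, 0.0, None, None)
    distances = [dist(vantage, p) for p in rest]
    radius = sorted(distances)[len(distances) // 2]
    inside = [p for p, d in zip(rest, distances) if d < radius]
    outside = [p for p, d in zip(rest, distances) if d >= radius]
    return VPNode(vantage, radius, build_vp_tree(inside, dist), build_vp_tree(outside, dist))

def nearest(node, query, dist=euclidean, best=None):
    """Return (distance, point) for the stored point closest to query."""
    if node is None:
        return best
    d = dist(query, node.point)
    if best is None or d < best[0]:
        best = (d, node.point)
    if d < node.radius:
        best = nearest(node.inside, query, dist, best)
        # The outside shell can only hold a closer point if the search ball reaches it.
        if d + best[0] >= node.radius:
            best = nearest(node.outside, query, dist, best)
    else:
        best = nearest(node.outside, query, dist, best)
        # The inside ball can only hold a closer point if the search ball overlaps it.
        if d - best[0] <= node.radius:
            best = nearest(node.inside, query, dist, best)
    return best

random.seed(0)
profiles = [tuple(random.random() for _ in range(5)) for _ in range(200)]
tree = build_vp_tree(profiles)
query = tuple(random.random() for _ in range(5))
distance, match = nearest(tree, query)
print(distance, match)
```

The triangle inequality is what allows whole subtrees to be skipped, which is where the sub-linear query time comes from when the data are well spread.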
Comparative genome analysis in the integrated microbial genomes (IMG) system.

PubMed

Markowitz, Victor M; Kyrpides, Nikos C

2007-01-01

Comparative genome analysis is critical for the effective exploration of a rapidly growing number of complete and draft sequences for microbial genomes. The Integrated Microbial Genomes (IMG) system (img.jgi.doe.gov) has been developed as a community resource that provides support for comparative analysis of microbial genomes in an integrated context. IMG allows users to navigate the multidimensional microbial genome data space and focus their analysis on a subset of genes, genomes, and functions of interest. IMG provides graphical viewers, summaries, and occurrence profile tools for comparing genes, pathways, and functions (terms) across specific genomes. Genes can be further examined using gene neighborhoods and compared with sequence alignment tools.
Phylo_dCor: distance correlation as a novel metric for phylogenetic profiling.

PubMed

Sferra, Gabriella; Fratini, Federica; Ponzi, Marta; Pizzi, Elisabetta

2017-09-05

Elaboration of powerful methods to predict functional and/or physical protein-protein interactions from genome sequence is one of the main tasks in the post-genomic era. Phylogenetic profiling allows the prediction of protein-protein interactions at a whole genome level in both Prokaryotes and Eukaryotes. For this reason it is considered one of the most promising methods. Here, we propose an improvement of phylogenetic profiling that enables the handling of large genomic datasets and the inference of global protein-protein interactions. This method uses the distance correlation as a new measure of phylogenetic profile similarity. We constructed robust reference sets and developed Phylo-dCor, a parallelized version of the algorithm for calculating the distance correlation that makes it applicable to large genomic data. Using Saccharomyces cerevisiae and Escherichia coli genome datasets, we showed that Phylo-dCor outperforms phylogenetic profiling methods previously described based on the mutual information and Pearson's correlation as measures of profile similarity. In this work, we constructed and assessed robust reference sets and propose the distance correlation as a measure for comparing phylogenetic profiles. To make it applicable to large genomic data, we developed Phylo-dCor, a parallelized version of the algorithm for calculating the distance correlation. Two R scripts that can be run on a wide range of machines are available upon request.
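
For readers unfamiliar with the metric, the sample distance correlation between two profiles can be sketched as follows. This is the textbook (unparallelized) definition based on double-centered distance matrices, not the Phylo-dCor R scripts; the function names and the toy profiles are assumptions.

```python
import math

def _centered_distances(x):
    """Double-centered matrix of pairwise absolute differences."""
    n = len(x)
    d = [[abs(x[i] - x[j]) for j in range(n)] for i in range(n)]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # d is symmetric, so column means equal row means.
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]

def distance_correlation(x, y):
    """Sample distance correlation of two equal-length numeric profiles (0..1)."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2 = mean_product(a, b)
    denom = math.sqrt(mean_product(a, a) * mean_product(b, b))
    if denom == 0:
        return 0.0  # a constant profile carries no information
    return math.sqrt(max(dcov2, 0.0) / denom)

# Hypothetical profiles: one value per genome (1 = present, 0 = absent).
protein_x = [1, 1, 0, 1, 0, 0, 1, 0]
protein_y = [1, 1, 0, 1, 0, 1, 1, 0]
print(distance_correlation(protein_x, protein_y))
```

Because the metric works on pairwise distances, it also accepts real-valued profiles (for example, similarity scores instead of binary presence calls) without modification.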
Arpeggio: harmonic compression of ChIP-seq data reveals protein-chromatin interaction signatures

PubMed Central

Stanton, Kelly Patrick; Parisi, Fabio; Strino, Francesco; Rabin, Neta; Asp, Patrik; Kluger, Yuval

2013-01-01

Researchers generating new genome-wide data in an exploratory sequencing study can gain biological insights by comparing their data with well-annotated data sets possessing similar genomic patterns. Data compression techniques are needed for efficient comparisons of a new genomic experiment with large repositories of publicly available profiles. Furthermore, data representations that allow comparisons of genomic signals from different platforms and across species enhance our ability to leverage these large repositories. Here, we present a signal processing approach that characterizes protein–chromatin interaction patterns at length scales of several kilobases. This allows us to efficiently compare numerous chromatin-immunoprecipitation sequencing (ChIP-seq) data sets consisting of many types of DNA-binding proteins collected from a variety of cells, conditions and organisms. Importantly, these interaction patterns broadly reflect the biological properties of the binding events. To generate these profiles, termed Arpeggio profiles, we applied harmonic deconvolution techniques to the autocorrelation profiles of the ChIP-seq signals. We used 806 publicly available ChIP-seq experiments and showed that Arpeggio profiles with similar spectral densities shared biological properties. Arpeggio profiles of ChIP-seq data sets revealed characteristics that are not easily detected by standard peak finders. They also allowed us to relate sequencing data sets from different genomes, experimental platforms and protocols. Arpeggio is freely available at http://sourceforge.net/p/arpeggio/wiki/Home/. PMID:23873955
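
The autocorrelation profile that the abstract above takes as its starting point can be sketched in a few lines. This covers only the autocorrelation step, not Arpeggio's harmonic deconvolution; the binned coverage vector and the function name are hypothetical.

```python
def autocorrelation(signal, max_lag):
    """Normalized autocorrelation of a 1-D signal for lags 0..max_lag (max_lag < len(signal))."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    variance_sum = sum(c * c for c in centered)
    if variance_sum == 0:
        return [0.0] * (max_lag + 1)  # flat signal: no structure to report
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / variance_sum
        for lag in range(max_lag + 1)
    ]

# Hypothetical binned read coverage with a peak every fourth bin.
coverage = [1, 0, 0, 0] * 10
profile = autocorrelation(coverage, 8)
print([round(r, 2) for r in profile])
```

A signal with regularly spaced enrichment shows autocorrelation peaks at multiples of the spacing, which is the kind of length-scale information such a profile summarizes.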
Arpeggio: harmonic compression of ChIP-seq data reveals protein-chromatin interaction signatures.

PubMed

Stanton, Kelly Patrick; Parisi, Fabio; Strino, Francesco; Rabin, Neta; Asp, Patrik; Kluger, Yuval

2013-09-01

Researchers generating new genome-wide data in an exploratory sequencing study can gain biological insights by comparing their data with well-annotated data sets possessing similar genomic patterns. Data compression techniques are needed for efficient comparisons of a new genomic experiment with large repositories of publicly available profiles. Furthermore, data representations that allow comparisons of genomic signals from different platforms and across species enhance our ability to leverage these large repositories. Here, we present a signal processing approach that characterizes protein-chromatin interaction patterns at length scales of several kilobases. This allows us to efficiently compare numerous chromatin-immunoprecipitation sequencing (ChIP-seq) data sets consisting of many types of DNA-binding proteins collected from a variety of cells, conditions and organisms. Importantly, these interaction patterns broadly reflect the biological properties of the binding events. To generate these profiles, termed Arpeggio profiles, we applied harmonic deconvolution techniques to the autocorrelation profiles of the ChIP-seq signals. We used 806 publicly available ChIP-seq experiments and showed that Arpeggio profiles with similar spectral densities shared biological properties. Arpeggio profiles of ChIP-seq data sets revealed characteristics that are not easily detected by standard peak finders. They also allowed us to relate sequencing data sets from different genomes, experimental platforms and protocols. Arpeggio is freely available at http://sourceforge.net/p/arpeggio/wiki/Home/.
The Cancer Genome Atlas Pan-Cancer analysis project.

PubMed

Weinstein, John N; Collisson, Eric A; Mills, Gordon B; Shaw, Kenna R Mills; Ozenberger, Brad A; Ellrott, Kyle; Shmulevich, Ilya; Sander, Chris; Stuart, Joshua M

2013-10-01

The Cancer Genome Atlas (TCGA) Research Network has profiled and analyzed large numbers of human tumors to discover molecular aberrations at the DNA, RNA, protein and epigenetic levels. The resulting rich data provide a major opportunity to develop an integrated picture of commonalities, differences and emergent themes across tumor lineages. The Pan-Cancer initiative compares the first 12 tumor types profiled by TCGA. Analysis of the molecular aberrations and their functional roles across tumor types will teach us how to extend therapies effective in one cancer type to others with a similar genomic profile.
Comparative whole genome DNA methylation profiling of cattle sperm and somatic tissues reveals striking hypomethylated patterns in sperm

USDA-ARS's Scientific Manuscript database

Using whole-genome bisulfite sequencing (WGBS), we profiled the DNA methylome of cattle sperm through comparison with three bovine somatic tissues (mammary gland, brain and blood). Large differences between them were observed in the methylation patterns of global CpGs, pericentromeric satellites, p...
Clonality: an R package for testing clonal relatedness of two tumors from the same patient based on their genomic profiles.

PubMed

Ostrovnaya, Irina; Seshan, Venkatraman E; Olshen, Adam B; Begg, Colin B

2011-06-15

If a cancer patient develops multiple tumors, it is sometimes impossible to determine whether these tumors are independent or clonal based solely on pathological characteristics. Investigators have studied how to address this diagnostic challenge by comparing the presence of loss of heterozygosity (LOH) at selected genetic locations of tumor samples, or by comparing genomewide copy number array profiles. We have previously developed statistical methodology to compare such genomic profiles for evidence of clonality. We assembled the software for these tests in a new R package called 'Clonality'. For LOH profiles, the package contains significance tests. The analysis of copy number profiles includes a likelihood ratio statistic and reference distribution, as well as an option to produce various plots that summarize the results. Availability: Bioconductor (http://bioconductor.org/packages/release/bioc/html/Clonality.html) and http://www.mskcc.org/mskcc/html/13287.cfm.
Inversions and inverted transpositions as the basis for an almost universal "format" of genome sequences.

PubMed

Albrecht-Buehler, Guenter

2007-09-01

In genome duplexes that exceed 100 kb the frequency distributions of their trinucleotides (triplet profiles) are the same in both strands. This remarkable symmetry, sometimes called Chargaff's second parity rule, is not the result of base pairing, but can be explained as the result of countless inversions and inverted transpositions that occurred throughout evolution (G. Albrecht-Buehler, 2006, Proc. Natl. Acad. Sci. USA 103, 17828-17833). Furthermore, comparing the triplet profiles of genomes from a large number of different taxa and species revealed that they were not only strand-symmetrical, but even surprisingly similar to one another (majority profile; G. Albrecht-Buehler, 2007, Genomics 89, 596-601). The present article proposes that the same inversion/transposition mechanism(s) that created the strand symmetry may also explain the existence of the majority profile. Thus they may be key factors in the creation of an almost universal "format" in which genome sequences are written. One may speculate that this universality of genome format may facilitate horizontal gene transfer and, thus, accelerate evolution.
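
The strand-symmetry claim above can be checked on any sequence with a short sketch like the following. The asymmetry score (sum of absolute frequency differences) and the toy sequences are assumptions made for illustration, not the measure used in the cited articles.

```python
from collections import Counter
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")
TRIPLETS = ["".join(t) for t in product("ACGT", repeat=3)]

def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def triplet_profile(seq):
    """Relative frequency of each of the 64 trinucleotides in overlapping windows."""
    counts = Counter(seq[i:i + 3] for i in range(len(seq) - 2))
    total = sum(counts[t] for t in TRIPLETS)
    return {t: (counts[t] / total if total else 0.0) for t in TRIPLETS}

def strand_asymmetry(seq):
    """Sum of absolute differences between the two strands' triplet profiles (0 = symmetric)."""
    forward = triplet_profile(seq)
    reverse = triplet_profile(reverse_complement(seq))
    return sum(abs(forward[t] - reverse[t]) for t in TRIPLETS)

fragment = "ATGGCGAAATTTGGC"  # hypothetical sequence
print(strand_asymmetry(fragment))
# Appending an inverted copy yields a sequence equal to its own reverse complement.
print(strand_asymmetry(fragment + reverse_complement(fragment)))
```

The second call shows, in miniature, how inserting inverted copies of a segment pushes the two strands' triplet profiles toward each other.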
UFO: a web server for ultra-fast functional profiling of whole genome protein sequences.

PubMed

Meinicke, Peter

2009-09-02

Functional profiling is a key technique to characterize and compare the functional potential of entire genomes. The estimation of profiles according to an assignment of sequences to functional categories is a computationally expensive task because it requires the comparison of all protein sequences from a genome with a usually large database of annotated sequences or sequence families. Based on machine learning techniques for Pfam domain detection, the UFO web server for ultra-fast functional profiling allows researchers to process large protein sequence collections instantaneously. Besides the frequencies of Pfam and GO categories, the user also obtains the sequence specific assignments to Pfam domain families. In addition, a comparison with existing genomes provides dissimilarity scores with respect to 821 reference proteomes. Considering the underlying UFO domain detection, the results on 206 test genomes indicate a high sensitivity of the approach. In comparison with current state-of-the-art HMMs, the runtime measurements show a considerable speed up in the range of four orders of magnitude. For an average size prokaryotic genome, the computation of a functional profile together with its comparison typically requires about 10 seconds of processing time. For the first time the UFO web server makes it possible to get a quick overview on the functional inventory of newly sequenced organisms. The genome scale comparison with a large number of precomputed profiles allows a first guess about functionally related organisms. The service is freely available and does not require user registration or specification of a valid email address.
The Cancer Genome Atlas Pan-Cancer Analysis Project

PubMed Central

Weinstein, John N.; Collisson, Eric A.; Mills, Gordon B.; Shaw, Kenna M.; Ozenberger, Brad A.; Ellrott, Kyle; Shmulevich, Ilya; Sander, Chris; Stuart, Joshua M.

2014-01-01

Cancer can take hundreds of different forms depending on the location, cell of origin and spectrum of genomic alterations that promote oncogenesis and affect therapeutic response. Although many genomic events with direct phenotypic impact have been identified, much of the complex molecular landscape remains incompletely charted for most cancer lineages. For that reason, The Cancer Genome Atlas (TCGA) Research Network has profiled and analyzed large numbers of human tumours to discover molecular aberrations at the DNA, RNA, protein, and epigenetic levels. The resulting rich data provide a major opportunity to develop an integrated picture of commonalities, differences, and emergent themes across tumour lineages. The Pan-Cancer initiative compares the first twelve tumour types profiled by TCGA. Analysis of the molecular aberrations and their functional roles across tumour types will teach us how to extend therapies effective in one cancer type to others with a similar genomic profile. PMID:24071849
Customized Oligonucleotide Array-Based Comparative Genomic Hybridization as a Clinical Assay for Genomic Profiling of Chronic Lymphocytic Leukemia

PubMed Central

Sargent, Rachel; Jones, Dan; Abruzzo, Lynne V.; Yao, Hui; Bonderover, Jaime; Cisneros, Marissa; Wierda, William G.; Keating, Michael J.; Luthra, Rajyalakshmi

2009-01-01

Chromosome gains and losses used for risk stratification in chronic lymphocytic leukemia (CLL) are commonly assessed by multiprobe fluorescence in situ hybridization (FISH) studies. We designed and validated a customized array-comparative genomic hybridization (aCGH) platform as a clinical assay for CLL genomic profiling. A 60-mer, 44,000-probe oligonucleotide array with a 50-kb average spatial resolution was augmented with high-density probe tiling at loci that are frequently aberrant in CLL. Aberrations identified by aCGH were compared with those identified by a FISH panel, including locus-specific probes to ATM (11q22.3), the centromeric region of chromosome 12 (12p11.1–q11), D13S319 (13q14.3), LAMP1 (13q34), and TP53 (17p13.1). In 100 CLL samples, aCGH/FISH concordance was seen for 89% of FISH-called aberrations at the ATM (n = 18), D13S319 (n = 42), LAMP1 (n = 12), and TP53 (n = 22) loci and for chromosome 12 (n = 14). Eighty-four percent of FISH/aCGH discordant calls were in samples either at or below the limit of aCGH sensitivity (10% to 25% FISH aberration-containing cells). Therefore, aCGH profiling is a feasible routine clinical test with comparable results to multiprobe FISH studies; however, it may be less sensitive than FISH in cases with low-level aberrations. Further, a customized array design can provide comprehensive genomic profiling with additional accuracy in both identifying and defining the extent of small aberrations at target loci. PMID:19074592
Kullback Leibler divergence in complete bacterial and phage genomes

PubMed Central

Akhter, Sajia; Kashef, Mona T.; Ibrahim, Eslam S.; Bailey, Barbara

2017-01-01

The amino acid content of the proteins encoded by a genome may predict the coding potential of that genome and may reflect lifestyle restrictions of the organism. Here, we calculated the Kullback–Leibler divergence from the mean amino acid content as a metric to compare the amino acid composition for a large set of bacterial and phage genome sequences. Using these data, we demonstrate that (i) there is a significant difference between amino acid utilization in different phylogenetic groups of bacteria and phages; (ii) many of the bacteria with the most skewed amino acid utilization profiles, or the bacteria that host phages with the most skewed profiles, are endosymbionts or parasites; (iii) the skews in the distribution are not restricted to certain metabolic processes but are common across all bacterial genomic subsystems; (iv) amino acid utilization profiles strongly correlate with GC content in bacterial genomes but very weakly correlate with the G+C percent in phage genomes. These findings might be exploited to distinguish coding from non-coding sequences in large data sets, such as metagenomic sequence libraries, to help in prioritizing subsequent analyses. PMID:29204318
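
The metric described above, the Kullback-Leibler divergence of a genome's amino acid composition from the mean composition, can be sketched as follows. The pseudocount, the unweighted mean across genomes, the use of bits, and the toy proteomes are assumptions made for illustration and may differ from the published calculation.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def amino_acid_frequencies(proteins, pseudocount=1.0):
    """Relative amino acid frequencies over all proteins; the pseudocount avoids zeros."""
    counts = Counter()
    for protein in proteins:
        counts.update(ch for ch in protein if ch in AMINO_ACIDS)
    total = sum(counts[a] for a in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    return {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}

def kl_divergence(p, q):
    """D_KL(p || q) in bits for two distributions over the same keys."""
    return sum(p[a] * math.log2(p[a] / q[a]) for a in p if p[a] > 0)

def divergence_from_mean(genomes):
    """KL divergence of each genome's composition from the mean composition."""
    freqs = {name: amino_acid_frequencies(prots) for name, prots in genomes.items()}
    mean = {a: sum(f[a] for f in freqs.values()) / len(freqs) for a in AMINO_ACIDS}
    return {name: kl_divergence(f, mean) for name, f in freqs.items()}

# Hypothetical proteomes: two balanced, one dominated by K, N and I.
genomes = {
    "balanced_1": [AMINO_ACIDS * 3],
    "balanced_2": [AMINO_ACIDS[::-1] * 3],
    "skewed": ["KKNNI" * 12],
}
scores = divergence_from_mean(genomes)
for name, score in sorted(scores.items()):
    print(name, round(score, 3))
```

The skewed proteome receives the largest divergence, which mirrors how genomes with unusual amino acid usage stand out under this metric.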
Kullback Leibler divergence in complete bacterial and phage genomes.

PubMed

Akhter, Sajia; Aziz, Ramy K; Kashef, Mona T; Ibrahim, Eslam S; Bailey, Barbara; Edwards, Robert A

2017-01-01

The amino acid content of the proteins encoded by a genome may predict the coding potential of that genome and may reflect lifestyle restrictions of the organism. Here, we calculated the Kullback-Leibler divergence from the mean amino acid content as a metric to compare the amino acid composition for a large set of bacterial and phage genome sequences. Using these data, we demonstrate that (i) there is a significant difference between amino acid utilization in different phylogenetic groups of bacteria and phages; (ii) many of the bacteria with the most skewed amino acid utilization profiles, or the bacteria that host phages with the most skewed profiles, are endosymbionts or parasites; (iii) the skews in the distribution are not restricted to certain metabolic processes but are common across all bacterial genomic subsystems; (iv) amino acid utilization profiles strongly correlate with GC content in bacterial genomes but very weakly correlate with the G+C percent in phage genomes. These findings might be exploited to distinguish coding from non-coding sequences in large data sets, such as metagenomic sequence libraries, to help in prioritizing subsequent analyses.
Developing a Drosophila Model of Schwannomatosis

DTIC Science & Technology

2013-02-01

Drosophila melanogaster has become an important model system for cancer studies. Reduced redundancy in the Drosophila genome compared with that of...of high-resolution deletion coverage of the Drosophila melanogaster genome . Nat. Genet. 36, 288-292. Pastor-Pareja, J. C., Wu, M. and Xu. T. (2008...microarray analysis of the entire Drosophila melanogaster genome and compared gene expression profiles of wild type, dCap-D3 and rbf1 mutant
Determining protein function and interaction from genome analysis

DOEpatents

Eisenberg, David; Marcotte, Edward M.; Thompson, Michael J.; Pellegrini, Matteo; Yeates, Todd O.

2004-08-03

A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequences of two or more organisms to create a phylogenetic profile for each protein indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method, a combination of the above two methods is used to predict functional links.
Comparison of sequencing-based methods to profile DNA methylation and identification of monoallelic epigenetic modifications

PubMed Central

Harris, R. Alan; Wang, Ting; Coarfa, Cristian; Nagarajan, Raman P.; Hong, Chibo; Downey, Sara L.; Johnson, Brett E.; Fouse, Shaun D.; Delaney, Allen; Zhao, Yongjun; Olshen, Adam; Ballinger, Tracy; Zhou, Xin; Forsberg, Kevin J.; Gu, Junchen; Echipare, Lorigail; O’Geen, Henriette; Lister, Ryan; Pelizzola, Mattia; Xi, Yuanxin; Epstein, Charles B.; Bernstein, Bradley E.; Hawkins, R. David; Ren, Bing; Chung, Wen-Yu; Gu, Hongcang; Bock, Christoph; Gnirke, Andreas; Zhang, Michael Q.; Haussler, David; Ecker, Joseph; Li, Wei; Farnham, Peggy J.; Waterland, Robert A.; Meissner, Alexander; Marra, Marco A.; Hirst, Martin; Milosavljevic, Aleksandar; Costello, Joseph F.

2010-01-01

Sequencing-based DNA methylation profiling methods are comprehensive and, as accuracy and affordability improve, will increasingly supplant microarrays for genome-scale analyses. Here, four sequencing-based methodologies were applied to biological replicates of human embryonic stem cells to compare their CpG coverage genome-wide and in transposons, resolution, cost, concordance and its relationship with CpG density and genomic context. The two bisulfite methods reached concordance of 82% for CpG methylation levels and 99% for non-CpG cytosine methylation levels. Using binary methylation calls, two enrichment methods were 99% concordant, while regions assessed by all four methods were 97% concordant. To achieve comprehensive methylome coverage while reducing cost, an approach integrating two complementary methods was examined. The integrative methylome profile along with histone methylation, RNA, and SNP profiles derived from the sequence reads allowed genome-wide assessment of allele-specific epigenetic states, identifying most known imprinted regions and new loci with monoallelic epigenetic marks and monoallelic expression. PMID:20852635

BloodChIP: a database of comparative genome-wide transcription factor binding profiles in human blood cells.

PubMed

Chacon, Diego; Beck, Dominik; Perera, Dilmi; Wong, Jason W H; Pimanda, John E

2014-01-01

The BloodChIP database (http://www.med.unsw.edu.au/CRCWeb.nsf/page/BloodChIP) supports exploration and visualization of combinatorial transcription factor (TF) binding at a particular locus in human CD34-positive and other normal and leukaemic cells or retrieval of target gene sets for user-defined combinations of TFs across one or more cell types. Increasing numbers of genome-wide TF binding profiles are being added to public repositories, and this trend is likely to continue. For the power of these data sets to be fully harnessed by experimental scientists, there is a need for these data to be placed in context and easily accessible for downstream applications. To this end, we have built a user-friendly database that has at its core the genome-wide binding profiles of seven key haematopoietic TFs in human stem/progenitor cells. These binding profiles are compared with binding profiles in normal differentiated and leukaemic cells. We have integrated these TF binding profiles with chromatin marks and expression data in normal and leukaemic cell fractions. All queries can be exported into external sites to construct TF-gene and protein-protein networks and to evaluate the association of genes with cellular processes and tissue expression.
cisMEP: an integrated repository of genomic epigenetic profiles and cis-regulatory modules in Drosophila

PubMed Central

2014-01-01

Background: Cis-regulatory modules (CRMs), or the DNA sequences required for regulating gene expression, play a central role in biological research on transcriptional regulation in metazoan species. Nowadays, the systematic understanding of CRMs still mainly resorts to computational methods due to the time-consuming and small-scale nature of experimental methods. But the accuracy and reliability of different CRM prediction tools are still unclear. Without comparative cross-analysis of the results and combinatorial consideration with extra experimental information, there is no easy way to assess the confidence of the predicted CRMs. This limits the genome-wide understanding of CRMs. Description: It is known that transcription factor binding and epigenetic profiles tend to determine functions of CRMs in gene transcriptional regulation. Thus integration of the genome-wide epigenetic profiles with systematically predicted CRMs can greatly help researchers evaluate and decipher the prediction confidence and possible transcriptional regulatory functions of these potential CRMs. However, these data are still fragmentary in the literature. Here we performed the computational genome-wide screening for potential CRMs using different prediction tools and constructed the pioneer database, cisMEP (cis-regulatory module epigenetic profile database), to integrate these computationally identified CRMs with genomic epigenetic profile data. cisMEP collects the literature-curated TFBS location data and nine genres of epigenetic data for assessing the confidence of these potential CRMs and deciphering the possible CRM functionality. Conclusions: cisMEP aims to provide a user-friendly interface for researchers to assess the confidence of different potential CRMs and to understand the functions of CRMs through experimentally-identified epigenetic profiles. The deposited potential CRMs and experimental epigenetic profiles for confidence assessment provide experimentally testable hypotheses for the molecular mechanisms of metazoan gene regulation. We believe that the information deposited in cisMEP will greatly facilitate the comparative usage of different CRM prediction tools and will help biologists to study the modular regulatory mechanisms between different TFs and their target genes. PMID:25521507
Rosetta stone method for detecting protein function and protein-protein interactions from genome sequences

DOEpatents

Eisenberg, David; Marcotte, Edward M.; Pellegrini, Matteo; Thompson, Michael J.; Yeates, Todd O.

2002-10-15

A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequences of two or more organisms to create a phylogenetic profile for each protein indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method, a combination of the above two methods is used to predict functional links.
Comparative transcriptional profiling identifies takeout as a gene that regulates life span

PubMed Central

Bauer, Johannes; Antosh, Michael; Chang, Chengyi; Schorl, Christoph; Kolli, Santharam; Neretti, Nicola; Helfand, Stephen L.

2010-01-01

A major challenge in translating the positive effects of dietary restriction (DR) for the improvement of human health is the development of therapeutic mimics. One approach to finding DR mimics is based upon identification of the proximal effectors of DR life span extension. Whole genome profiling of DR in Drosophila shows a large number of changes in gene expression, making it difficult to establish which changes are involved in life span determination as opposed to other unrelated physiological changes. We used comparative whole genome expression profiling to discover genes whose change in expression is shared between DR and two molecular genetic life span extending interventions related to DR, increased dSir2 and decreased Dmp53 activity. We find twenty-one genes shared among the three related life span extending interventions. One of these genes, takeout, thought to be involved in circadian rhythms, feeding behavior and juvenile hormone binding is also increased in four other life span extending conditions: Rpd3, Indy, chico and methuselah. We demonstrate takeout is involved in longevity determination by specifically increasing adult takeout expression and extending life span. These studies demonstrate the power of comparative whole genome transcriptional profiling for identifying specific downstream elements of the DR life span extending pathway. PMID:20519778
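The comparative step described here, keeping only genes whose expression change is shared across several interventions, amounts to a set intersection. The sketch below is a hedged illustration, not the authors' pipeline: the function name, the requirement that the direction of change agree, and the placeholder gene names are assumptions.

```python
def shared_changes(*conditions):
    """Return genes changed in every condition with the same direction.

    Each condition is a dict mapping gene name -> +1 (up) or -1 (down).
    """
    first = conditions[0]
    # Genes reported as changed in all conditions.
    common = set(first).intersection(*conditions[1:])
    # Keep only those whose direction of change agrees everywhere (assumption).
    return {g: first[g] for g in common
            if all(c[g] == first[g] for c in conditions)}

# Placeholder data; gene names and directions are invented for illustration.
dietary_restriction = {"geneA": +1, "geneB": -1, "geneC": +1, "geneD": -1}
intervention_2 = {"geneA": +1, "geneB": +1, "geneC": +1}
intervention_3 = {"geneA": +1, "geneB": -1, "geneC": +1, "geneE": -1}

shared = shared_changes(dietary_restriction, intervention_2, intervention_3)
print(sorted(shared))
```

Here geneB is dropped because its direction differs in one condition, and geneD and geneE are dropped because they are not changed in all three.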
ReprDB and panDB: minimalist databases with maximal microbial representation.

PubMed

Zhou, Wei; Gay, Nicole; Oh, Julia

2018-01-18

Profiling of shotgun metagenomic samples is hindered by a lack of unified microbial reference genome databases that (i) assemble genomic information from all open access microbial genomes, (ii) have relatively small sizes, and (iii) are compatible with various metagenomic read mapping tools. Moreover, computational tools to rapidly compile and update such databases to accommodate the rapid increase in new reference genomes do not exist. As a result, database-guided analyses often fail to profile a substantial fraction of metagenomic shotgun sequencing reads from complex microbiomes. We report pipelines that efficiently traverse all open access microbial genomes and assemble non-redundant genomic information. The pipelines result in two species-resolution microbial reference databases of relatively small sizes: reprDB, which assembles microbial representative or reference genomes, and panDB, for which we developed a novel iterative alignment algorithm to identify and assemble non-redundant genomic regions in multiple sequenced strains. With the databases, we managed to assign taxonomic labels and genome positions to the majority of metagenomic reads from human skin and gut microbiomes, demonstrating a significant improvement over a previous database-guided analysis on the same datasets. reprDB and panDB leverage the rapid increases in the number of open access microbial genomes to more fully profile metagenomic samples. Additionally, the databases exclude redundant sequence information to avoid inflated storage or memory space and indexing or analyzing time. Finally, the novel iterative alignment algorithm significantly increases efficiency in pan-genome identification and can be useful in comparative genomic analyses.
Molecular profiling reveals frequent gain of MYCN and anaplasia-specific loss of 4q and 14q in Wilms tumor.

PubMed

Williams, Richard D; Al-Saadi, Reem; Natrajan, Rachael; Mackay, Alan; Chagtai, Tasnim; Little, Suzanne; Hing, Sandra N; Fenwick, Kerry; Ashworth, Alan; Grundy, Paul; Anderson, James R; Dome, Jeffrey S; Perlman, Elizabeth J; Jones, Chris; Pritchard-Jones, Kathy

2011-12-01

Anaplasia in Wilms tumor, a distinctive histology characterized by abnormal mitoses, is associated with poor patient outcome. While anaplastic tumors frequently harbour TP53 mutations, little is otherwise known about their molecular biology. We have used array comparative genomic hybridization (aCGH) and cDNA microarray expression profiling to compare anaplastic and favorable histology Wilms tumors to determine their common and differentiating features. In addition to changes on 17p, consistent with TP53 deletion, recurrent anaplasia-specific genomic loss and under-expression were noted in several other regions, most strikingly 4q and 14q. Further aberrations, including gain of 1q and loss of 16q were common to both histologies. Focal gain of MYCN, initially detected by high resolution aCGH profiling in 6/61 anaplastic samples, was confirmed in a significant proportion of both tumor types by a genomic quantitative PCR survey of over 400 tumors. Overall, these results are consistent with a model where anaplasia, rather than forming an entirely distinct molecular entity, arises from the general continuum of Wilms tumor by the acquisition of additional genomic changes at multiple loci. Copyright © 2011 Wiley Periodicals, Inc.
Comprehensive Genomic Profiling Aids in Distinguishing Metastatic Recurrence from Second Primary Cancers

PubMed Central

Weinberg, Benjamin A.; Gowen, Kyle; Lee, Thomas K.; Ou, Sai‐Hong Ignatius; Bristow, Robert; Krill, Lauren; Almira‐Suarez, M. Isabel; Ali, Siraj M.; Miller, Vincent A.; Liu, Stephen V.

2017-01-01

Abstract Background. Metastatic recurrence after treatment for locoregional cancer is a major cause of morbidity and cancer‐specific mortality. Distinguishing metastatic recurrence from the development of a second primary cancer has important prognostic and therapeutic value and represents a difficult clinical scenario. Advances beyond histopathological comparison are needed. We sought to interrogate the ability of comprehensive genomic profiling (CGP) to aid in distinguishing between these clinical scenarios. Materials and Methods. We identified three prospective cases of recurrent tumors in patients previously treated for localized cancers in which histologic analyses suggested subsequent development of a distinct second primary. Paired samples from the original primary and recurrent tumor were subjected to hybrid capture next‐generation sequencing‐based CGP to identify base pair substitutions, insertions, deletions, copy number alterations (CNA), and chromosomal rearrangements. Genomic profiles between paired samples were compared using previously established statistical clonality assessment software to gauge relatedness beyond global CGP similarities. Results. A high degree of similarity was observed among genomic profiles from morphologically distinct primary and recurrent tumors. Genomic information suggested reclassification as recurrent metastatic disease, and patients received therapy for metastatic disease based on the molecular determination. Conclusions. Our cases demonstrate an important adjunct role for CGP technologies in separating metastatic recurrence from development of a second primary cancer. Larger series are needed to confirm our observations, but comparative CGP may be considered in patients for whom distinguishing metastatic recurrence from a second primary would alter the therapeutic approach. Implications for Practice. 
Distinguishing a metastatic recurrence from a second primary cancer can represent a difficult clinicopathologic problem but has important prognostic and therapeutic implications. Approaches to aid histologic analysis may improve clinician and pathologist confidence in this increasingly common clinical scenario. Our series provides early support for incorporating paired comprehensive genomic profiling in clinical situations in which determination of metastatic recurrence versus a distinct second primary cancer would influence patient management. PMID:28193735
Comparative genomics and transcriptional profiles of Saccharopolyspora erythraea NRRL 2338 and a classically improved erythromycin over-producing strain

PubMed Central

2012-01-01

Background The molecular mechanisms altered by the traditional mutation and screening approach during the improvement of antibiotic-producing microorganisms are still poorly understood, although this information is essential to design rational strategies for industrial strain improvement. In this study, we applied comparative genomics to identify all genetic changes occurring during the development of an erythromycin overproducer obtained using the traditional mutate-and-screen method. Results Compared with the parental Saccharopolyspora erythraea NRRL 2338, the genome of the overproducing strain presents 117 deletion, 78 insertion and 12 transposition sites, with 71 insertion/deletion sites mapping within coding sequences (CDSs) and generating frame-shift mutations. Single nucleotide variations are present in 144 CDSs. Overall, the genomic variations affect 227 proteins of the overproducing strain, and a considerable number of mutations alter genes of key enzymes in the central carbon and nitrogen metabolism and in the biosynthesis of secondary metabolites, resulting in the redirection of common precursors toward erythromycin biosynthesis. Interestingly, several mutations inactivate genes coding for proteins that play fundamental roles in the basic transcription and translation machineries, including the transcription anti-termination factor NusB and the transcription elongation factor Efp. These mutations, along with those affecting genes coding for pleiotropic or pathway-specific regulators, affect the global expression profile, as demonstrated by a comparative analysis of the parental and overproducer expression profiles. Finally, genomic data suggest that the mutate-and-screen process might have been accelerated by mutations in DNA repair genes. Conclusions This study helps to clarify the mechanisms underlying antibiotic overproduction, providing valuable information about new possible molecular targets for rational strain improvement. PMID:22401291
CGI: Java Software for Mapping and Visualizing Data from Array-based Comparative Genomic Hybridization and Expression Profiling

PubMed Central

Gu, Joyce Xiuweu-Xu; Wei, Michael Yang; Rao, Pulivarthi H.; Lau, Ching C.; Behl, Sanjiv; Man, Tsz-Kwong

2007-01-01

With the increasing application of various genomic technologies in biomedical research, there is a need to integrate these data to correlate candidate genes/regions that are identified by different genomic platforms. Although there are tools that can analyze data from individual platforms, essential software for integration of genomic data is still lacking. Here, we present a novel Java-based program called CGI (Cytogenetics-Genomics Integrator) that matches the BAC clones from array-based comparative genomic hybridization (aCGH) to genes from RNA expression profiling datasets. The matching is computed via a fast, backend MySQL database containing UCSC Genome Browser annotations. This program also provides an easy-to-use graphical user interface for visualizing and summarizing the correlation of DNA copy number changes and RNA expression patterns from a set of experiments. In addition, CGI uses a Java applet to display the copy number values of a specific BAC clone in aCGH experiments side by side with the expression levels of genes that are mapped back to that BAC clone from the microarray experiments. The CGI program is built on top of extensible, reusable graphic components specifically designed for biologists. It is cross-platform compatible and the source code is freely available under the General Public License. PMID:19936083
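The core matching step, mapping genes back to the BAC clone whose genomic span they fall in, is an interval-overlap query. The sketch below is a minimal Python illustration under stated assumptions, not CGI's code: CGI is described as a Java program backed by a MySQL database of UCSC Genome Browser annotations, whereas this example uses in-memory lists, half-open coordinates, and invented clone and gene names.

```python
from collections import namedtuple

# Half-open genomic interval [start, end) on a chromosome (assumed convention).
Interval = namedtuple("Interval", "name chrom start end")

def overlaps(a, b):
    """True when two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end

def genes_per_clone(clones, genes):
    """Map each BAC clone name to the names of the genes it overlaps."""
    return {c.name: [g.name for g in genes if overlaps(c, g)] for c in clones}

# Hypothetical coordinates for illustration only.
clones = [
    Interval("BAC_A", "chr1", 100, 300),
    Interval("BAC_B", "chr2", 0, 150),
]
genes = [
    Interval("geneX", "chr1", 250, 400),
    Interval("geneY", "chr1", 300, 500),  # starts exactly where BAC_A ends
    Interval("geneZ", "chr2", 100, 200),
]

mapping = genes_per_clone(clones, genes)
print(mapping)
```

With thousands of clones and genes, a sorted sweep or an interval tree would replace the nested scan; the quadratic version is kept here only because it makes the overlap rule easy to read.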
CGI: Java software for mapping and visualizing data from array-based comparative genomic hybridization and expression profiling.

PubMed

Gu, Joyce Xiuweu-Xu; Wei, Michael Yang; Rao, Pulivarthi H; Lau, Ching C; Behl, Sanjiv; Man, Tsz-Kwong

2007-10-06

With the increasing application of various genomic technologies in biomedical research, there is a need to integrate these data to correlate candidate genes/regions that are identified by different genomic platforms. Although there are tools that can analyze data from individual platforms, essential software for integration of genomic data is still lacking. Here, we present a novel Java-based program called CGI (Cytogenetics-Genomics Integrator) that matches the BAC clones from array-based comparative genomic hybridization (aCGH) to genes from RNA expression profiling datasets. The matching is computed via a fast, backend MySQL database containing UCSC Genome Browser annotations. This program also provides an easy-to-use graphical user interface for visualizing and summarizing the correlation of DNA copy number changes and RNA expression patterns from a set of experiments. In addition, CGI uses a Java applet to display the copy number values of a specific BAC clone in aCGH experiments side by side with the expression levels of genes that are mapped back to that BAC clone from the microarray experiments. The CGI program is built on top of extensible, reusable graphic components specifically designed for biologists. It is cross-platform compatible and the source code is freely available under the General Public License.
Integrated analysis of copy number alteration and RNA expression profiles of cancer using a high-resolution whole-genome oligonucleotide array.

PubMed

Jung, Seung-Hyun; Shin, Seung-Hun; Yim, Seon-Hee; Choi, Hye-Sun; Lee, Sug-Hyung; Chung, Yeun-Jun

2009-07-31

Recently, microarray-based comparative genomic hybridization (array-CGH) has emerged as a very efficient technology with higher resolution for the genome-wide identification of copy number alterations (CNAs). Although CNAs are thought to affect gene expression, there is no platform currently available for integrated CNA-expression analysis. To achieve high-resolution copy number analysis integrated with expression profiles, we established a human 30k oligoarray-based genome-wide copy number analysis system and explored the applicability of this system for integrated genome and transcriptome analysis using the MDA-MB-231 cell line. We compared the CNAs detected by the oligoarray with those detected by the 3k BAC array for validation. The oligoarray identified single copy differences more accurately and sensitively than the BAC array. Seventeen CNAs detected by both platforms in MDA-MB-231, such as gains of 5p15.33-13.1, 8q11.22-8q21.13, and 17p11.2, and losses of 1p32.3, 8p23.3-8p11.21, and 9p21, were consistently identified in previous studies on breast cancer. There were 122 other small CNAs (mean size 1.79 Mb) that were detected by the oligoarray only, not by the BAC array. We performed genomic qPCR targeting 7 CNA regions detected by the oligoarray only and one non-CNA region to validate the oligoarray CNA detection. All qPCR results were consistent with the oligoarray-CGH results. When we explored the possibility of combined interpretation of both DNA copy number and RNA expression profiles, mean DNA copy number and RNA expression levels showed a significant correlation. In conclusion, this 30k oligoarray-CGH system can be a reasonable choice for analyzing whole genome CNAs and RNA expression profiles at a lower cost.
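The final integration step reports a correlation between mean DNA copy number and RNA expression. The abstract does not say which statistic was used, so the sketch below assumes a Pearson correlation coefficient; the function name and the per-region values are invented for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical per-region means: copy number and relative expression.
copy_number = [1, 2, 2, 3, 4]
expression = [0.5, 1.0, 1.1, 1.6, 2.0]

r = pearson(copy_number, expression)
print(round(r, 3))
```

A rank-based statistic such as Spearman's correlation is a common alternative when expression values are skewed.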
High quality de novo sequencing and assembly of the Saccharomyces arboricolus genome

PubMed Central

2013-01-01

Background Comparative genomics is a formidable tool to identify functional elements throughout a genome. In the past ten years, studies in the budding yeast Saccharomyces cerevisiae and a set of closely related species have been instrumental in showing the benefit of analyzing patterns of sequence conservation. Increasing the number of closely related genome sequences makes the comparative genomics approach more powerful and accurate. Results Here, we report the genome sequence and analysis of Saccharomyces arboricolus, a yeast species recently isolated in China, that is closely related to S. cerevisiae. We obtained high quality de novo sequence and assemblies using a combination of next generation sequencing technologies, established the phylogenetic position of this species and considered its phenotypic profile under multiple environmental conditions in the light of its gene content and phylogeny. Conclusions We suggest that the genome of S. arboricolus will be useful in future comparative genomics analysis of the Saccharomyces sensu stricto yeasts. PMID:23368932
Plant-RRBS, a bisulfite and next-generation sequencing-based methylome profiling method enriching for coverage of cytosine positions.

PubMed

Schmidt, Martin; Van Bel, Michiel; Woloszynska, Magdalena; Slabbinck, Bram; Martens, Cindy; De Block, Marc; Coppens, Frederik; Van Lijsebettens, Mieke

2017-07-06

Cytosine methylation in plant genomes is important for the regulation of gene transcription and transposon activity. Genome-wide methylomes are studied upon mutation of the DNA methyltransferases, adaptation to environmental stresses or during development. However, from basic biology to breeding programs, there is a need to monitor multiple samples to determine transgenerational methylation inheritance or differential cytosine methylation. Methylome data obtained by sodium hydrogen sulfite (bisulfite)-conversion and next-generation sequencing (NGS) provide genome-wide information on cytosine methylation. However, a profiling method that detects cytosine methylation state dispersed over the genome would allow high-throughput analysis of multiple plant samples with distinct epigenetic signatures. We use specific restriction endonucleases to enrich for cytosine coverage in a bisulfite and NGS-based profiling method, which was compared to whole-genome bisulfite sequencing of the same plant material. We established an effective methylome profiling method in plants, termed plant-reduced representation bisulfite sequencing (plant-RRBS), using optimized double restriction endonuclease digestion, fragment end repair, adapter ligation, followed by bisulfite conversion, PCR amplification and NGS. We report a performant laboratory protocol and a straightforward bioinformatics data analysis pipeline for plant-RRBS, applicable for any reference-sequenced plant species. As a proof of concept, methylome profiling was performed using an Oryza sativa ssp. indica pure breeding line and a derived epigenetically altered line (epiline). Plant-RRBS detects methylation levels at tens of millions of cytosine positions deduced from bisulfite conversion in multiple samples. To evaluate the method, the coverage of cytosine positions, the intra-line similarity and the differential cytosine methylation levels between the pure breeding line and the epiline were determined. 
Plant-RRBS reproducibly covers commonly up to one fourth of the cytosine positions in the rice genome when using MspI-DpnII within a group of five biological replicates of a line. The method predominantly detects cytosine methylation in putative promoter regions and not-annotated regions in rice. Plant-RRBS offers high-throughput and broad, genome-dispersed methylation detection by effective read number generation obtained from reproducibly covered genome fractions using optimized endonuclease combinations, facilitating comparative analyses of multi-sample studies for cytosine methylation and transgenerational stability in experimental material and plant breeding populations.
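The enrichment principle, a double restriction digest followed by keeping fragments in a size window, can be explored in silico. The sketch below is a simplified illustration and not the plant-RRBS pipeline: it uses the standard recognition sites of MspI (C^CGG) and DpnII (^GATC), while the size window, the toy sequence, and the assumption that every cytosine in a retained fragment counts as covered (real reads cover only fragment ends) are simplifications of my own.

```python
import re

# Recognition site and cut offset within the site (top strand).
ENZYMES = {"MspI": ("CCGG", 1), "DpnII": ("GATC", 0)}

def cut_positions(seq, enzymes):
    """Sorted cut coordinates for the given enzymes on the top strand."""
    cuts = set()
    for name in enzymes:
        site, offset = ENZYMES[name]
        # A lookahead pattern also finds overlapping occurrences of a site.
        for match in re.finditer(f"(?={site})", seq):
            cuts.add(match.start() + offset)
    return sorted(cuts)

def digest(seq, enzymes=("MspI", "DpnII")):
    """Split seq at every cut position and return the fragments in order."""
    bounds = [0] + cut_positions(seq, enzymes) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]

def covered_cytosine_fraction(seq, min_len, max_len):
    """Fraction of cytosines (both strands) in size-selected fragments."""
    def count_c(s):
        # A 'G' on the top strand is a 'C' on the bottom strand.
        return s.count("C") + s.count("G")
    kept = [f for f in digest(seq) if min_len <= len(f) <= max_len]
    return sum(count_c(f) for f in kept) / count_c(seq)

toy = "AACCGGTTTGATCAACCGGA"  # invented sequence
fragments = digest(toy)
fraction = covered_cytosine_fraction(toy, min_len=5, max_len=10)
print(fragments, fraction)
```

Both sites are palindromic, so scanning the top strand alone finds every cut.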
A critical appraisal of the scientific basis of commercial genomic profiles used to assess health risks and personalize health interventions.

PubMed

Janssens, A Cecile J W; Gwinn, Marta; Bradley, Linda A; Oostra, Ben A; van Duijn, Cornelia M; Khoury, Muin J

2008-03-01

Predictive genomic profiling used to produce personalized nutrition and other lifestyle health recommendations is currently offered directly to consumers. By examining previous meta-analyses and HuGE reviews, we assessed the scientific evidence supporting the purported gene-disease associations for genes included in genomic profiles offered online. We identified seven companies that offer predictive genomic profiling. We searched PubMed for meta-analyses and HuGE reviews of studies of gene-disease associations published from 2000 through June 2007 in which the genotypes of people with a disease were compared with those of a healthy or general-population control group. The seven companies tested at least 69 different polymorphisms in 56 genes. Of the 56 genes tested, 24 (43%) were not reviewed in meta-analyses. For the remaining 32 genes, we found 260 meta-analyses that examined 160 unique polymorphism-disease associations, of which only 60 (38%) were found to be statistically significant. Even the 60 significant associations, which involved 29 different polymorphisms and 28 different diseases, were generally modest, with synthetic odds ratios ranging from 0.54 to 0.88 for protective variants and from 1.04 to 3.2 for risk variants. Furthermore, genes in cardiogenomic profiles were more frequently associated with noncardiovascular diseases than with cardiovascular diseases, and though two of the five genes of the osteogenomic profiles did show significant associations with disease, the associations were not with bone diseases. There is insufficient scientific evidence to conclude that genomic profiles are useful in measuring genetic risk for common diseases or in developing personalized diet and lifestyle recommendations for disease prevention.
Approaches to integrating germline and tumor genomic data in cancer research

PubMed Central

Feigelson, Heather Spencer; Goddard, Katrina A.B.; Hollombe, Celine; Tingle, Sharna R.; Gillanders, Elizabeth M.; Mechanic, Leah E.; Nelson, Stefanie A.

2014-01-01

Cancer is characterized by a diversity of genetic and epigenetic alterations occurring in both the germline and somatic (tumor) genomes. Hundreds of germline variants associated with cancer risk have been identified, and large amounts of data identifying mutations in the tumor genome that participate in tumorigenesis have been generated. Increasingly, these two genomes are being explored jointly to better understand how cancer risk alleles contribute to carcinogenesis and whether they influence development of specific tumor types or mutation profiles. To understand how data from germline risk studies and tumor genome profiling are being integrated, we reviewed 160 articles describing research that incorporated data from both genomes, published between January 2009 and December 2012, and summarized the current state of the field. We identified three principal types of research questions being addressed using these data: (i) use of tumor data to determine the putative function of germline risk variants; (ii) identification and analysis of relationships between host genetic background and particular tumor mutations or types; and (iii) use of tumor molecular profiling data to reduce genetic heterogeneity or refine phenotypes for germline association studies. We also found descriptive studies that compared germline and tumor genomic variation in a gene or gene family, and papers describing research methods, data sources, or analytical tools. We identified a large set of tools and data resources that can be used to analyze and integrate data from both genomes. Finally, we discuss opportunities and challenges for cancer research that integrates germline and tumor genomics data. PMID:25115441
Improving prokaryotic transposable elements identification using a combination of de novo and profile HMM methods.

PubMed

Kamoun, Choumouss; Payen, Thibaut; Hua-Van, Aurélie; Filée, Jonathan

2013-10-11

Insertion Sequences (ISs) and their non-autonomous derivatives (MITEs) are important components of prokaryotic genomes, inducing duplications, deletions, rearrangements or lateral gene transfers. Although ISs and MITEs are relatively simple and basic genetic elements, their detection remains a difficult task due to their remarkable sequence diversity. With the advent of high-throughput genome and metagenome sequencing technologies, the development of fast, reliable and sensitive methods for IS and MITE detection has become an important challenge. So far, almost all studies dealing with prokaryotic transposons have used classical BLAST-based detection methods against reference libraries. Here we introduce alternative methods of detection, either taking advantage of the structural properties of the elements (de novo methods) or using an additional library-based method based on profile HMM searches. In this study, we have developed three different workflows dedicated to IS and MITE detection: the first two use de novo methods detecting either repeated sequences or the presence of Inverted Repeats; the third one uses 28 in-house transposase alignment profiles with HMM search methods. We have compared the respective performances of each method using a reference dataset of 30 archaeal and 30 bacterial genomes in addition to simulated and real metagenomes. Compared to a BLAST-based method using ISFinder as library, de novo methods significantly improve IS and MITE detection. For example, in the 30 archaeal genomes, we discovered 30 new elements (+20%) in addition to the 141 multi-copy elements already detected by the BLAST approach. Many of the new elements correspond to ISs belonging to unknown or highly divergent families. The total number of MITEs has even doubled with the discovery of elements displaying very limited sequence similarities with their respective autonomous partners (mainly in the Inverted Repeats of the elements). 
Concerning metagenomes, with the exception of short-read data (<300 bp), for which both techniques seem equally limited, profile HMM searches considerably improve the detection of transposase-encoding genes (up to +50%) while generating a low level of false positives compared to BLAST-based methods. Compared to classical BLAST-based methods, the sensitivity of the de novo and profile HMM methods developed in this study allows a better and more reliable detection of transposons in prokaryotic genomes and metagenomes. We believe that future studies involving IS and MITE identification in genomic data should combine at least one de novo and one library-based method, with optimal results obtained by running the two de novo methods in addition to a library-based search. For metagenomic data, profile HMM searches should be favored; a BLAST-based step is only useful for the final annotation into groups and families.
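One of the de novo strategies mentioned in this abstract relies on the presence of Inverted Repeats at the ends of an element. The Python sketch below shows the underlying check in its simplest form and is not the authors' workflow: it assumes a candidate element has already been delimited, allows no mismatches, and uses an invented sequence and a hypothetical `min_len` threshold.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def terminal_inverted_repeat(seq, min_len=4):
    """Return the longest perfect terminal inverted repeat, or '' if too short.

    The left end of an element with inverted repeats equals the reverse
    complement of its right end, so seq and revcomp(seq) share a prefix.
    """
    rc = revcomp(seq)
    limit = len(seq) // 2  # the two repeat arms must not overlap
    n = 0
    while n < limit and seq[n] == rc[n]:
        n += 1
    return seq[:n] if n >= min_len else ""

# Invented candidate: 6 bp arms flanking a 5 bp spacer.
candidate = "GGTACT" + "TTTTT" + "AGTACC"
tir = terminal_inverted_repeat(candidate)
print(tir)
```

Real elements usually carry imperfect repeats, so a practical detector would tolerate mismatches and scan a genome for candidate boundaries rather than testing a single pre-cut sequence.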
A Comparative Genomic Study in Schizophrenic and in Bipolar Disorder Patients, Based on Microarray Expression Profiling Meta-Analysis

PubMed Central

Logotheti, Marianthi; Papadodima, Olga; Venizelos, Nikolaos; Chatziioannou, Aristotelis; Kolisis, Fragiskos

2013-01-01

Schizophrenia affecting almost 1% and bipolar disorder affecting almost 3%–5% of the global population constitute two severe mental disorders. The catecholaminergic and the serotonergic pathways have been proved to play an important role in the development of schizophrenia, bipolar disorder, and other related psychiatric disorders. The aim of the study was to perform and interpret the results of a comparative genomic profiling study in schizophrenic patients as well as in healthy controls and in patients with bipolar disorder and try to relate and integrate our results with an aberrant amino acid transport through cell membranes. In particular we have focused on genes and mechanisms involved in amino acid transport through cell membranes from whole genome expression profiling data. We performed bioinformatic analysis on raw data derived from four different published studies. In two studies postmortem samples from prefrontal cortices, derived from patients with bipolar disorder, schizophrenia, and control subjects, have been used. In another study we used samples from postmortem orbitofrontal cortex of bipolar subjects while the final study was performed based on raw data from a gene expression profiling dataset in the postmortem superior temporal cortex of schizophrenics. The data were downloaded from NCBI's GEO datasets. PMID:23554570
StreptoBase: An Oral Streptococcus mitis Group Genomic Resource and Analysis Platform.

PubMed

Zheng, Wenning; Tan, Tze King; Paterson, Ian C; Mutha, Naresh V R; Siow, Cheuk Chuen; Tan, Shi Yang; Old, Lesley A; Jakubovics, Nicholas S; Choo, Siew Woh

2016-01-01

The oral streptococci are spherical Gram-positive bacteria categorized under the phylum Firmicutes which are among the most common causative agents of bacterial infective endocarditis (IE) and are also important agents in septicaemia in neutropenic patients. The Streptococcus mitis group is comprised of 13 species including some of the most common human oral colonizers such as S. mitis, S. oralis, S. sanguinis and S. gordonii as well as species such as S. tigurinus, S. oligofermentans and S. australis that have only recently been classified and are poorly understood at present. We present StreptoBase, which provides a specialized free resource focusing on the genomic analyses of oral species from the mitis group. It currently hosts 104 S. mitis group genomes including 27 novel mitis group strains that we sequenced using the high throughput Illumina HiSeq technology platform, and provides a comprehensive set of genome sequences for analyses, particularly comparative analyses and visualization of both cross-species and cross-strain characteristics of S. mitis group bacteria. StreptoBase incorporates sophisticated in-house designed bioinformatics web tools such as Pairwise Genome Comparison (PGC) tool and Pathogenomic Profiling Tool (PathoProT), which facilitate comparative pathogenomics analysis of Streptococcus strains. Examples are provided to demonstrate how StreptoBase can be employed to compare genome structure of different S. mitis group bacteria and putative virulence genes profile across multiple streptococcal strains. In conclusion, StreptoBase offers access to a range of streptococci genomic resources as well as analysis tools and will be an invaluable platform to accelerate research in streptococci. Database URL: http://streptococcus.um.edu.my.
Integrating Microarray Analysis and the Soybean Genome to Understand the Soybean's Iron Deficiency Response

USDA-ARS's Scientific Manuscript database

Transcriptional profiles of soybean (Glycine max (L.) Merr.) near isogenic lines Clark (PI548553, iron efficient) and IsoClark (PI547430, iron inefficient) were analyzed and compared using the Affymetrix® GeneChip® Soybean Genome Array. A comparison of plants grown under Fe-sufficient and Fe-limited ...
Mutational burdens and evolutionary ages of thyroid follicular adenoma are comparable to those of follicular carcinoma

PubMed Central

Jung, Seung-Hyun; Kim, Min Sung; Jung, Chan Kwon; Park, Hyun-Chun; Kim, So Youn; Liu, Jieying; Bae, Ja-Seong; Lee, Sung Hak; Kim, Tae-Min; Lee, Sug Hyung; Chung, Yeun-Jun

2016-01-01

Follicular thyroid adenoma (FTA) precedes follicular thyroid carcinoma (FTC) by definition with a favorable prognosis compared to FTC. However, the genetic mechanism of FTA to FTC progression remains unknown. For this, it is required to disclose FTA and FTC genomes in mutational and evolutionary perspectives. We performed whole-exome sequencing and copy number profiling of 14 FTAs and 13 FTCs, which exhibited previously-known gene mutations (NRAS, HRAS, BRAF, TSHR and EIF1AX) and copy number alterations (CNAs) (22q loss and 1q gain) in follicular tumors. In addition, we found eleven potential cancer-related genes with mutations (EZH1, SPOP, NF1, TCF12, IGF2BP3, KMT2C, CNOT1, BRIP1, KDM5C, STAG2 and MAP4K3) that have not been reported in thyroid follicular tumors. Of note, FTA genomes showed comparable levels of mutations to FTC in terms of the number, sequence composition and functional consequences (potential driver mutations) of mutations. Analyses of evolutionary ages using somatic mutations as molecular clocks further identified that FTA genomes were as old as FTC genomes. Whole-transcriptome sequencing did not find any gene fusions with potential significance. Our data indicate that FTA genomes may be as old as FTC genomes, thus suggesting that follicular thyroid tumor genomes during the transition from FTA to FTC may stand stable at genomic levels in contrast to the discernable changes at pathologic and clinical levels. Also, the data suggest a possibility that the mutational profiles obtained from early biopsies may be useful for the molecular diagnosis and therapeutics of follicular tumor patients. PMID:27626165

Strand-specific transcriptome profiling with directly labeled RNA on genomic tiling microarrays

PubMed Central

2011-01-01

Background: With lower manufacturing cost, high spot density, and flexible probe design, genomic tiling microarrays are ideal for comprehensive transcriptome studies. Typically, transcriptome profiling using microarrays involves reverse transcription, which converts RNA to cDNA. The cDNA is then labeled and hybridized to the probes on the arrays, so the RNA signals are detected indirectly. Reverse transcription is known to generate artifactual cDNA, in particular through the synthesis of second-strand cDNA, leading to false discovery of antisense RNA. To address this issue, we have developed an effective method using RNA that is directly labeled, thus bypassing cDNA generation. This paper describes this method and its application to the mapping of transcriptome profiles. Results: RNA extracted from laboratory cultures of Porphyromonas gingivalis was fluorescently labeled with an alkylation reagent and hybridized directly to probes on genomic tiling microarrays specifically designed for this periodontal pathogen. The generated transcriptome profile was strand-specific and produced signals close to background level in most antisense regions of the genome. In contrast, high levels of signal were detected in the antisense regions when the hybridization was done with cDNA. Five antisense areas were tested with independent strand-specific RT-PCR and no or negligible amplification was detected, indicating that the strong antisense cDNA signals were experimental artifacts. Conclusions: An efficient method was developed for mapping transcriptome profiles specific to both coding strands of a bacterial genome. This method chemically labels extracted RNA and uses it directly in microarray hybridization. The generated transcriptome profile was free of cDNA artifactual signals. In addition, this method requires fewer processing steps and is potentially more sensitive in detecting small amounts of RNA compared to conventional end-labeling methods due to the incorporation of more fluorescent molecules per RNA fragment. PMID:21235785
Comparative genome analysis and characterization of the Salmonella Typhimurium strain CCRJ_26 isolated from swine carcasses using whole-genome sequencing approach.

PubMed

Panzenhagen, P H N; Cabral, C C; Suffys, P N; Franco, R M; Rodrigues, D P; Conte-Junior, C A

2018-04-01

Salmonella pathogenicity relies on virulence factors, many of which are clustered within the Salmonella pathogenicity islands. Salmonella also harbours mobile genetic elements such as virulence plasmids, prophage-like elements and antimicrobial resistance genes, which can contribute to increased pathogenicity. Here, we have genetically characterized a selected S. Typhimurium strain (CCRJ_26) from our previous study, which has a multidrug-resistant profile and a high-frequency PFGE clonal profile and apparently persists in the pork production centre of Rio de Janeiro State, Brazil. By whole-genome sequencing, we described the virulence content of the strain's genome and characterized its repertoire of plasmids, antibiotic resistance genes and prophage-like elements. We show evidence that the genome of strain CCRJ_26 possibly represents a virulence-associated phenotype which may be virulent in human infection. Whole-genome sequencing technologies are still costly and remain underexplored for applied microbiology in Brazil. Hence, this genomic description of S. Typhimurium strain CCRJ_26 will help future molecular epidemiological studies. The analysis described here provides a quick and useful pipeline for bacterial virulence characterization using a whole-genome sequencing approach. © 2018 The Society for Applied Microbiology.
Differential analysis between somatic mutation and germline variation profiles reveals cancer-related genes.

PubMed

Przytycki, Pawel F; Singh, Mona

2017-08-25

A major aim of cancer genomics is to pinpoint which somatically mutated genes are involved in tumor initiation and progression. We introduce a new framework for uncovering cancer genes, differential mutation analysis, which compares the mutational profiles of genes across cancer genomes with their natural germline variation across healthy individuals. We present DiffMut, a fast and simple approach for differential mutational analysis, and demonstrate that it is more effective in discovering cancer genes than considerably more sophisticated approaches. We conclude that germline variation across healthy human genomes provides a powerful means for characterizing somatic mutation frequency and identifying cancer driver genes. DiffMut is available at https://github.com/Singh-Lab/Differential-Mutation-Analysis .
Quantifying whole transcriptome size, a prerequisite for understanding transcriptome evolution across species: an example from a plant allopolyploid.

PubMed

Coate, Jeremy E; Doyle, Jeff J

2010-01-01

Evolutionary biologists are increasingly comparing gene expression patterns across species. Due to the way in which expression assays are normalized, such studies provide no direct information about expression per gene copy (dosage responses) or per cell and can give a misleading picture of genes that are differentially expressed. We describe an assay for estimating relative expression per cell. When used in conjunction with transcript profiling data, it is possible to compare the sizes of whole transcriptomes, which in turn makes it possible to compare expression per cell for each gene in the transcript profiling data set. We applied this approach, using quantitative reverse transcriptase-polymerase chain reaction and high throughput RNA sequencing, to a recently formed allopolyploid and showed that its leaf transcriptome was approximately 1.4-fold larger than either progenitor transcriptome (70% of the sum of the progenitor transcriptomes). In contrast, the allopolyploid genome is 94.3% as large as the sum of its progenitor genomes and retains ≥93.5% of the sum of its progenitor gene complements. Thus, "transcriptome downsizing" is greater than genome downsizing. Using this transcriptome size estimate, we inferred dosage responses for several thousand genes and showed that the majority exhibit partial dosage compensation. Homoeologue silencing is nonrandomly distributed across dosage responses, with genes showing extreme responses in either direction significantly more likely to have a silent homoeologue. This experimental approach will add value to transcript profiling experiments involving interspecies and interploidy comparisons by converting expression per transcriptome to expression per genome, eliminating the need for assumptions about transcriptome size.
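The figures quoted in this abstract can be related with a few lines of arithmetic. The sketch below is only an illustration: it assumes the two progenitor transcriptomes are of equal size (which is what makes "1.4-fold larger than either progenitor" equal to 70% of their sum), and the function names are hypothetical, not taken from the paper.

```python
# Worked arithmetic for the transcriptome and genome sizes quoted above.
# Assumption: both progenitor transcriptomes have the same size, so the
# sum of the progenitors is twice the size of either one.

def fraction_of_progenitor_sum(fold_over_single_progenitor: float) -> float:
    """Size of the allopolyploid relative to the sum of two equal progenitors."""
    return fold_over_single_progenitor / 2.0


def downsizing(fraction_of_sum: float) -> float:
    """Proportion 'lost' relative to a simple sum of the progenitors."""
    return 1.0 - fraction_of_sum


transcriptome_fraction = fraction_of_progenitor_sum(1.4)  # 1.4 / 2 = 0.70
genome_fraction = 0.943                                   # quoted in the abstract

transcriptome_downsizing = downsizing(transcriptome_fraction)  # about 0.30
genome_downsizing = downsizing(genome_fraction)                # about 0.057

print(f"transcriptome downsizing: {transcriptome_downsizing:.1%}")
print(f"genome downsizing:        {genome_downsizing:.1%}")
```

Under this assumption the transcriptome is reduced by roughly 30% relative to the progenitor sum while the genome is reduced by under 6%, which is the sense in which transcriptome downsizing exceeds genome downsizing.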
Comparison of Models and Whole-Genome Profiling Approaches for Genomic-Enabled Prediction of Septoria Tritici Blotch, Stagonospora Nodorum Blotch, and Tan Spot Resistance in Wheat.

PubMed

Juliana, Philomin; Singh, Ravi P; Singh, Pawan K; Crossa, Jose; Rutkoski, Jessica E; Poland, Jesse A; Bergstrom, Gary C; Sorrells, Mark E

2017-07-01

The leaf spotting diseases in wheat, which include Septoria tritici blotch (STB) caused by Zymoseptoria tritici, Stagonospora nodorum blotch (SNB) caused by Parastagonospora nodorum, and tan spot (TS) caused by Pyrenophora tritici-repentis, pose challenges to breeding programs in selecting for resistance. A promising approach that could enable selection prior to phenotyping is genomic selection (GS), which uses genome-wide markers to estimate breeding values (BVs) for quantitative traits. To evaluate this approach for seedling and/or adult plant resistance (APR) to STB, SNB, and TS, we compared the predictive ability of the least-squares (LS) approach with genomic-enabled prediction models including genomic best linear unbiased predictor (GBLUP), Bayesian ridge regression (BRR), Bayes A (BA), Bayes B (BB), Bayes Cπ (BC), Bayesian least absolute shrinkage and selection operator (BL), and reproducing kernel Hilbert spaces markers (RKHS-M), a pedigree-based model (RKHS-P) and RKHS markers and pedigree (RKHS-MP). We observed that LS gave the lowest prediction accuracies and RKHS-MP the highest. The genomic-enabled prediction models and RKHS-P gave similar accuracies. The increase in accuracy using genomic prediction models over LS was 48%. The mean genomic prediction accuracies were 0.45 for STB (APR), 0.55 for SNB (seedling), 0.66 for TS (seedling) and 0.48 for TS (APR). We also compared markers from two whole-genome profiling approaches, genotyping by sequencing (GBS) and diversity arrays technology sequencing (DArTseq), for prediction. While GBS markers performed slightly better than DArTseq, combining markers from the two approaches did not improve accuracies. We conclude that implementing GS in breeding for these diseases would help to achieve higher accuracies and rapid gains from selection. Copyright © 2017 Crop Science Society of America.
Genome-wide comparative analysis of DNA methylation between soybean cytoplasmic male-sterile line NJCMS5A and its maintainer NJCMS5B.

PubMed

Li, Yanwei; Ding, Xianlong; Wang, Xuan; He, Tingting; Zhang, Hao; Yang, Longshu; Wang, Tanliu; Chen, Linfeng; Gai, Junyi; Yang, Shouping

2017-08-10

DNA methylation is an important epigenetic modification. It can regulate the expression of many key genes without changing the primary structure of the genomic DNA, and plays a vital role in the growth and development of the organism. The genome-wide DNA methylation profile of the cytoplasmic male sterile (CMS) line in soybean has not been reported so far. In this study, genome-wide comparative analysis of DNA methylation between soybean CMS line NJCMS5A and its maintainer NJCMS5B was conducted by whole-genome bisulfite sequencing. The analysis identified 3527 differentially methylated regions (DMRs) and 485 differentially methylated genes (DMGs), including 353 high-credibility methylated genes, 56 methylated genes coding for unknown proteins and 76 novel methylated genes with no known function. Among them, 25 DMRs were validated by bisulfite treatment, confirming that the genome-wide DNA methylation data were reliable, and for 9 DMRs the relationship between DNA methylation and gene expression was confirmed by qRT-PCR. Finally, 8 key DMGs possibly associated with soybean CMS were identified. The genome-wide DNA methylation profile of the soybean CMS line NJCMS5A and its maintainer NJCMS5B was obtained for the first time. Several specific DMGs that participate in pollen and flower development were further identified as probably associated with soybean CMS. This study will contribute to further understanding of the molecular mechanism behind soybean CMS.
Comparative and Evolutionary Analysis of Grass Pollen Allergens Using Brachypodium distachyon as a Model System.

PubMed

Sharma, Akanksha; Sharma, Niharika; Bhalla, Prem; Singh, Mohan

2017-01-01

Comparative genomics has facilitated the mining of biological information from a genome sequence, through the detection of similarities and differences with genomes of closely or more distantly related species. By using such comparative approaches, knowledge can be transferred from model to non-model organisms and insights can be gained into the structural and evolutionary patterns of specific genes. In the absence of sequenced genomes for allergenic grasses, this study aimed to understand the structure, organisation and expression profiles of grass pollen allergens using genomic data from Brachypodium distachyon, as it is phylogenetically related to the allergenic grasses. Combining genomic data with the anther RNA-Seq dataset revealed 24 pollen allergen genes belonging to eight allergen groups mapping on the five chromosomes in B. distachyon. High levels of anther-specific expression were observed for the 24 identified putative allergen-encoding genes in Brachypodium. The genomic evidence suggests that the gene encoding the group 5 allergen, the most potent trigger of hay fever and allergic asthma, originated as a pollen-specific orphan gene in a common grass ancestor of the Brachypodium and Triticeae clades. Gene structure analysis showed that the putative allergen-encoding genes in Brachypodium either lack introns or contain a reduced number of them. Promoter analysis of the identified Brachypodium genes revealed the presence of specific cis-regulatory sequences likely responsible for high anther/pollen-specific expression. With the identification of putative allergen-encoding genes in Brachypodium, this study has also described some important plant gene families (e.g. expansin superfamily, EF-Hand family, profilins etc.) for the first time in the model plant Brachypodium. Altogether, the present study provides new insights into structural characterization and evolution of pollen allergens and will further serve as a base for their functional characterization in related grass species.
Discovery of the New Plant Growth-Regulating Compound LYXLF2 Based on Manipulating the Halogenase in Amycolatopsis orientalis.

PubMed

Xu, Li; Han, Ting; Ge, Mei; Zhu, Li; Qian, XiuPing

2016-09-01

Analysis of the Amycolatopsis orientalis HCCB10007 genome revealed new gene clusters involved in natural product biosynthesis that were not associated with the production of known compounds. Halogenases are a type of tailoring enzyme usually found within these secondary gene clusters. In this study, we identified an indole-type halometabolite, 6-chloro-1H-indole-3-carboxamide, named LYXLF2, by whole genome mining and metabolic profiling of a flavin-dependent halogenase mutant. LYXLF2 is a new plant growth-regulating compound that promotes root elongation. The results of this study demonstrate that the gene knock-out/comparative metabolic profiling approach provides a powerful tool for the discovery of novel natural products by genome mining.
Genome profiling of ovarian adenocarcinomas using pangenomic BACs microarray comparative genomic hybridization

PubMed Central

Caserta, Donatella; Benkhalifa, Moncef; Baldi, Marina; Fiorentino, Francesco; Qumsiyeh, Mazin; Moscarini, Massimo

2008-01-01

Background: Routine cytogenetic investigations for ovarian cancers are limited by culture failure and poor growth of cancer cells compared to normal cells. Fluorescence in situ hybridization (FISH) and classical comparative genome hybridization techniques also have their own limitations in detecting genome imbalance, especially for small changes that are not known ahead of time and for which FISH probes therefore could not be designed. Methods: We applied microarray comparative genomic hybridization (A-CGH) using one-megabase BAC arrays to investigate chromosomal disorders in ovarian adenocarcinoma in patients with a family history. Results: Our data on 10 cases of ovarian cancer revealed losses of 6q (4 cases, mainly mosaic loss), 9p (4 cases), 10q (3 cases), 21q (3 cases) and 22q (4 cases), with association to a monosomy X, and gains of 8q and 9q (occurring together in 8 cases) and gain of 12p. Other abnormalities, such as loss of 17p, were noted in two profiles of the studied cases. Total or mosaic segmental gains of 2p, 3q, 4q, 7q and 13q were also observed. Seven of 10 patients were investigated by FISH to control the array CGH results. The FISH data showed concordance between the 2 methods. Conclusion: The data suggest that A-CGH detects unique and common abnormalities, with certain exceptions such as tetraploidy and balanced translocation, which may lead to understanding the progression of genetic changes as well as aid in early diagnosis and have an impact on therapy and prognosis. PMID:18492273
Epigenomics of Total Acute Sleep Deprivation in Relation to Genome-Wide DNA Methylation Profiles and RNA Expression.

PubMed

Nilsson, Emil K; Boström, Adrian E; Mwinyi, Jessica; Schiöth, Helgi B

2016-06-01

Despite an established link between sleep deprivation and epigenetic processes in humans, it remains unclear to what extent sleep deprivation modulates DNA methylation. We performed a within-subject randomized blinded study with 16 healthy subjects to examine the effect of one night of total sleep deprivation (TSD) on the genome-wide methylation profile in blood compared with that in normal sleep. Genome-wide differences in methylation between the two conditions were assessed by applying a paired regression model that corrected for monocyte subpopulations. In addition, the correlations between the methylation of genes found to be modulated by TSD and gene expression were examined in a separate, publicly available cohort of 10 healthy male donors (E-GEOD-49065). Sleep deprivation significantly affected the DNA methylation profile both independently of and in dependence on shifts in monocyte composition. Our study detected differential methylation of 269 probes. Notably, one CpG site was located 69 bp upstream of ING5, which has been shown to be differentially expressed after sleep deprivation. Gene set enrichment analysis detected the Notch and Wnt signaling pathways to be enriched among the differentially methylated genes. These results provide evidence that total acute sleep deprivation alters the methylation profile in healthy human subjects. This is, to our knowledge, the first study that systematically investigated the impact of total acute sleep deprivation on genome-wide DNA methylation profiles in blood and related the epigenomic findings to the expression data.
An integrated workflow for analysis of ChIP-chip data.

PubMed

Weigelt, Karin; Moehle, Christoph; Stempfl, Thomas; Weber, Bernhard; Langmann, Thomas

2008-08-01

Although ChIP-chip is a powerful tool for genome-wide discovery of transcription factor target genes, the steps involving raw data analysis, identification of promoters, and correlation with binding sites are still laborious processes. Therefore, we report an integrated workflow for the analysis of promoter tiling arrays with the Genomatix ChipInspector system. We compare this tool with open-source software packages to identify PU.1 regulated genes in mouse macrophages. Our results suggest that ChipInspector data analysis, comparative genomics for binding site prediction, and pathway/network modeling significantly facilitate and enhance whole-genome promoter profiling to reveal in vivo sites of transcription factor-DNA interactions.
Single-cell genomic profiling of acute myeloid leukemia for clinical use: A pilot study

PubMed Central

Yan, Benedict; Hu, Yongli; Ban, Kenneth H.K.; Tiang, Zenia; Ng, Christopher; Lee, Joanne; Tan, Wilson; Chiu, Lily; Tan, Tin Wee; Seah, Elaine; Ng, Chin Hin; Chng, Wee-Joo; Foo, Roger

2017-01-01

Although bulk high-throughput genomic profiling studies have led to a significant increase in the understanding of cancer biology, there is increasing awareness that bulk profiling approaches do not completely elucidate tumor heterogeneity. Single-cell genomic profiling enables the distinction of tumor heterogeneity, and may improve clinical diagnosis through the identification and characterization of putative subclonal populations. In the present study, the challenges associated with a single-cell genomics profiling workflow for clinical diagnostics were investigated. Single-cell RNA-sequencing (RNA-seq) was performed on 20 cells from an acute myeloid leukemia bone marrow sample. Putative blasts were identified based on their gene expression profiles and principal component analysis was performed to identify outlier cells. Variant calling was performed on the single-cell RNA-seq data. The present pilot study demonstrates a proof of concept for clinical single-cell genomic profiling. The recognized limitations include significant stochastic RNA loss and the relatively low throughput of the current proposed platform. Although the results of the present study are promising, further technological advances and protocol optimization are necessary for single-cell genomic profiling to be clinically viable. PMID:28454300
Diagnosis and therapy of oral squamous cell carcinoma.

PubMed

Konkimalla, V Badireenath; Suhas, Venkatramana Laxminarayana; Chandra, Nagasuma R; Gebhart, Erich; Efferth, Thomas

2007-03-01

Oral squamous cell carcinoma ranks among the top ten most common cancers worldwide. Despite the success in diagnosis and therapy during the past 30 years, oral squamous cell carcinoma still belongs to the tumor types with a very unfavorable prognosis. In an effort to identify genomic alterations with prognostic relevance, we applied the comparative genomic hybridization technique on oral squamous cell carcinoma. The tumors exhibited from five up to 47 DNA copy number alterations, indicating a considerable degree of genomic imbalance. Out of 35 tumors, 19 showed a gain of chromosome band 7p12. Genomic imbalances were investigated by hierarchical cluster analysis and clustered image mapping to investigate whether genomic profiles correlate with clinical data. Results of the present investigation show that profiling of genomic imbalances in general, and especially of the epidermal growth factor receptor (EGFR) on 7p12, may be suitable as prognostic factors. In order to identify small-molecule inhibitors for EGFR, we established a database of 531 natural compounds derived from medicinal plants used in traditional Chinese medicine. Candidate compounds were identified by correlation analysis using the Kendall tau-test of IC50 values of tumor cell lines and microarray-based EGFR mRNA expression. Further validation was performed by molecular docking studies using the AutoDock program with the crystal structure of EGFR tyrosine kinase domain as docking template. We estimate these results will be a further step toward the ultimate goal of individualized, patient-adapted tumor treatment based on tumor molecular profiling.
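The correlation step mentioned above (Kendall's tau between IC50 values and EGFR mRNA expression across cell lines) can be sketched as follows. This is a minimal pure-Python illustration of the tau-a statistic, not the authors' analysis; the cell-line values are invented for the example, and tied pairs are simply counted as neither concordant nor discordant.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # sign == 0 means a tie in x or y; tau-a ignores it in the numerator
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical data for five cell lines (illustrative values only).
egfr_expression = [2.1, 3.4, 5.0, 6.2, 7.8]   # microarray-based mRNA level
log_ic50 = [1.9, 1.5, 1.6, 0.9, 0.4]          # response to one compound

tau = kendall_tau_a(egfr_expression, log_ic50)
print(f"Kendall tau-a = {tau:.2f}")
```

A strongly negative tau for a compound would mean that cell lines with higher EGFR expression tend to have lower IC50 values, which is the kind of relationship a screen for candidate EGFR inhibitors would look for. A library routine such as `scipy.stats.kendalltau` also handles ties and reports a p-value.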
Comparative (Meta)genomic Analysis and Ecological Profiling of Human Gut-Specific Bacteriophage φB124-14

PubMed Central

Ogilvie, Lesley A.; Caplin, Jonathan; Dedi, Cinzia; Diston, David; Cheek, Elizabeth; Bowler, Lucas; Taylor, Huw; Ebdon, James; Jones, Brian V.

2012-01-01

Bacteriophage associated with the human gut microbiome are likely to have an important impact on community structure and function, and provide a wealth of biotechnological opportunities. Despite this, knowledge of the ecology and composition of bacteriophage in the gut bacterial community remains poor, with few well characterized gut-associated phage genomes currently available. Here we describe the identification and in-depth (meta)genomic, proteomic, and ecological analysis of a human gut-specific bacteriophage (designated φB124-14). In doing so we illuminate a fraction of the biological dark matter extant in this ecosystem and its surrounding eco-genomic landscape, identifying a novel and uncharted bacteriophage gene-space in this community. φB124-14 infects only a subset of closely related gut-associated Bacteroides fragilis strains, and the circular genome encodes functions previously found to be rare in viral genomes and human gut viral metagenome sequences, including those which potentially confer advantages upon phage and/or host bacteria. Comparative genomic analyses revealed φB124-14 is most closely related to φB40-8, the only other publicly available Bacteroides sp. phage genome, whilst comparative metagenomic analysis of both phage failed to identify any homologous sequences in 136 non-human gut metagenomic datasets searched, supporting the human gut-specific nature of this phage. Moreover, a potential geographic variation in the carriage of these and related phage was revealed by analysis of their distribution and prevalence within 151 human gut microbiomes and viromes from Europe, America and Japan. Finally, ecological profiling of φB124-14 and φB40-8 was undertaken using both gene-centric alignment-driven phylogenetic analyses and alignment-free gene-independent approaches. This not only verified the human gut-specific nature of both phage, but also indicated that these phage populate a distinct and unexplored ecological landscape within the human gut microbiome. PMID:22558115
Characterization of a Genomic Signature of Pregnancy in the Breast

PubMed Central

Belitskaya-Lévy, Ilana; Zeleniuch-Jacquotte, Anne; Russo, Jose; Russo, Irma H.; Bordás, Pal; Åhman, Janet; Afanasyeva, Yelena; Johansson, Robert; Lenner, Per; Li, Xiaochun; de Cicco, Ricardo López; Peri, Suraj; Ross, Eric; Russo, Patricia A.; Santucci-Pereira, Julia; Sheriff, Fathima S.; Slifker, Michael; Hallmans, Göran; Toniolo, Paolo; Arslan, Alan A.

2012-01-01

The objective of the current study was to comprehensively compare the genomic profiles in the breast of parous and nulliparous postmenopausal women to identify genes that permanently change their expression following pregnancy. The study was designed as a two-phase approach. In the discovery phase, we compared breast genomic profiles of 37 parous with 18 nulliparous postmenopausal women. In the validation phase, confirmation of the genomic patterns observed in the discovery phase was sought in an independent set of 30 parous and 22 nulliparous postmenopausal women. RNA was hybridized to Affymetrix HG_U133 Plus 2.0 oligonucleotide arrays containing probes to 54,675 transcripts; scanned and the images analyzed using Affymetrix GCOS software. Surrogate variable analysis, logistic regression and significance analysis for microarrays were used to identify statistically significant differences in expression of genes. The False Discovery Rate (FDR) approach was used to control for multiple comparisons. We found that 208 genes (305 probe sets) were differentially expressed between parous and nulliparous women in both discovery and validation phases of the study at a FDR of 10% and with at least a 1.25-fold change. These genes are involved in regulation of transcription, centrosome organization, RNA splicing, cell cycle control, adhesion and differentiation. The results provide persuasive evidence that full-term pregnancy induces long-term genomic changes in the breast. The genomic signature of pregnancy could be used as an intermediate marker to assess potential chemopreventive interventions with hormones mimicking the effects of pregnancy for prevention of breast cancer. PMID:21622728
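The selection criteria stated above (an FDR of 10% combined with at least a 1.25-fold change) can be illustrated with a short sketch. The abstract does not say which FDR procedure was applied, so the Benjamini-Hochberg adjustment used here is an assumption, and the probe identifiers and numbers are invented for the example.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def select_probe_sets(records, fdr=0.10, min_fold=1.25):
    """records: (probe_id, p_value, fold_change) with fold_change as a ratio > 0.

    A probe set is kept when its adjusted p-value is <= fdr and its change is
    at least min_fold in either direction (up or down).
    """
    adjusted = benjamini_hochberg([p for _, p, _ in records])
    selected = []
    for (probe_id, _, fold_change), q_value in zip(records, adjusted):
        magnitude = fold_change if fold_change >= 1.0 else 1.0 / fold_change
        if q_value <= fdr and magnitude >= min_fold:
            selected.append(probe_id)
    return selected


# Hypothetical probe sets: (id, raw p-value, parous/nulliparous expression ratio).
example_records = [
    ("probe_a", 0.001, 1.60),  # significant, up-regulated
    ("probe_b", 0.020, 1.10),  # significant, but change too small
    ("probe_c", 0.030, 0.70),  # significant, down-regulated (1/0.70 = 1.43-fold)
    ("probe_d", 0.500, 2.00),  # large change, not significant
]

print(select_probe_sets(example_records))
```

Requiring both conditions, and requiring them in both the discovery and validation sets as the study did, is a conservative design: it filters out changes that are statistically detectable but small, as well as large changes that are not reproducible.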
Comparative Genomics of Klebsiella pneumoniae Strains with Different Antibiotic Resistance Profiles

PubMed Central

Kumar, Vinod; Sun, Peng; Vamathevan, Jessica; Li, Yong; Ingraham, Karen; Palmer, Leslie; Huang, Jianzhong; Brown, James R.

2011-01-01

There is a global emergence of multidrug-resistant (MDR) strains of Klebsiella pneumoniae, a Gram-negative enteric bacterium that causes nosocomial and urinary tract infections. While the epidemiology of K. pneumoniae strains and occurrences of specific antibiotic resistance genes, such as plasmid-borne extended-spectrum β-lactamases (ESBLs), have been extensively studied, only four complete genomes of K. pneumoniae are available. To better understand the multidrug resistance factors in K. pneumoniae, we determined by pyrosequencing the nearly complete genome DNA sequences of two strains with disparate antibiotic resistance profiles, broadly drug-susceptible strain JH1 and strain 1162281, which is resistant to multiple clinically used antibiotics, including extended-spectrum β-lactams, fluoroquinolones, aminoglycosides, trimethoprim, and sulfamethoxazoles. Comparative genomic analysis of JH1, 1162281, and other published K. pneumoniae genomes revealed a core set of 3,631 conserved orthologous proteins, which were used for reconstruction of whole-genome phylogenetic trees. The close evolutionary relationship between JH1 and 1162281 relative to other K. pneumoniae strains suggests that a large component of the genetic and phenotypic diversity of clinical isolates is due to horizontal gene transfer. Using curated lists of over 400 antibiotic resistance genes, we identified all of the elements that differentiated the antibiotic profile of MDR strain 1162281 from that of susceptible strain JH1, such as the presence of additional efflux pumps, ESBLs, and multiple mechanisms of fluoroquinolone resistance. Our study adds new and significant DNA sequence data on K. pneumoniae strains and demonstrates the value of whole-genome sequencing in characterizing multidrug resistance in clinical isolates. PMID:21746949
Flavivirus and Filovirus EvoPrinters: New alignment tools for the comparative analysis of viral evolution.

PubMed

Brody, Thomas; Yavatkar, Amarendra S; Park, Dong Sun; Kuzin, Alexander; Ross, Jermaine; Odenwald, Ward F

2017-06-01

Flavivirus and Filovirus infections are serious epidemic threats to human populations. Multi-genome comparative analysis of these evolving pathogens affords a view of their essential, conserved sequence elements as well as progressive evolutionary changes. While phylogenetic analysis has yielded important insights, the growing number of available genomic sequences makes comparisons between hundreds of viral strains challenging. We report here a new approach for the comparative analysis of these hemorrhagic fever viruses that can superimpose an unlimited number of one-on-one alignments to identify important features within genomes of interest. We have adapted EvoPrinter alignment algorithms for the rapid comparative analysis of Flavivirus or Filovirus sequences including Zika and Ebola strains. The user can input a full genome or partial viral sequence and then view either individual comparisons or generate color-coded readouts that superimpose hundreds of one-on-one alignments to identify unique or shared identity SNPs that reveal ancestral relationships between strains. The user can also opt to select a database genome in order to access a library of pre-aligned genomes of either 1,094 Flaviviruses or 460 Filoviruses for rapid comparative analysis with all database entries or a select subset. Using EvoPrinter search and alignment programs, we show the following: 1) superimposing alignment data from many related strains identifies lineage identity SNPs, which enable the assessment of sublineage complexity within viral outbreaks; 2) whole-genome SNP profile screens uncover novel Dengue2 and Zika recombinant strains and their parental lineages; 3) differential SNP profiling identifies host cell A-to-I hyper-editing within Ebola and Marburg viruses; and 4) hundreds of superimposed one-on-one Ebola genome alignments highlight ultra-conserved regulatory sequences, invariant amino acid codons and evolutionarily variable protein-encoding domains within a single genome. EvoPrinter allows the assessment of lineage complexity within Flavivirus or Filovirus outbreaks, identifies recombinant strains, highlights sequences that have undergone host cell A-to-I editing, and identifies unique input and database SNPs within highly conserved sequences. EvoPrinter's ability to superimpose alignment data from hundreds of strains onto a single genome has allowed us to identify unique Zika virus sublineages that are currently spreading in South, Central and North America, the Caribbean, and in China. This new set of integrated alignment programs should serve as a useful addition to existing tools for the comparative analysis of these viruses.
Global gene expression profiles of Phytophthora ramorum strain pr102 in response to plant host and tissue differentiation

Treesearch

Caroline M. Press; Niklaus J. Grunwald

2008-01-01

The release of the draft genome sequence of P. ramorum strain Pr102 enabled the construction of an oligonucleotide microarray covering the entire Pr102 genome. The array contains 344,680 features (oligos) that represent the transcriptome of Pr102. P. ramorum RNA was extracted from mycelium and sporangia and used to compare gene...
SeeGH--a software tool for visualization of whole genome array comparative genomic hybridization data.

PubMed

Chi, Bryan; DeLeeuw, Ronald J; Coe, Bradley P; MacAulay, Calum; Lam, Wan L

2004-02-09

Array comparative genomic hybridization (CGH) is a technique which detects copy number differences in DNA segments. Complete sequencing of the human genome and the development of an array representing a tiling set of tens of thousands of DNA segments spanning the entire human genome have made high resolution copy number analysis throughout the genome possible. Because array CGH provides a signal ratio for each DNA segment, visualization requires the reassembly of individual data points into chromosome profiles. We have developed a visualization tool for displaying whole genome array CGH data in the context of chromosomal location. SeeGH is an application that translates spot signal ratio data from array CGH experiments to displays of high resolution chromosome profiles. Data is imported from a simple tab delimited text file obtained from standard microarray image analysis software. SeeGH processes the signal ratio data and graphically displays it in a conventional CGH karyotype diagram with the added features of magnification and DNA segment annotation. In this process, SeeGH imports the data into a database, calculates the average ratio and standard deviation for each replicate spot, and links them to chromosome regions for graphical display. Once the data is displayed, users have the option of hiding or flagging DNA segments based on user defined criteria, and of retrieving annotation information such as clone name, NCBI sequence accession number, ratio, base pair position on the chromosome, and standard deviation. SeeGH represents a novel software tool used to view and analyze array CGH data. The software gives users the ability to view the data in an overall genomic view as well as magnify specific chromosomal regions, facilitating the precise localization of genetic alterations. SeeGH is easily installed and runs on Microsoft Windows 2000 or later environments.
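The replicate-averaging step described in this abstract can be illustrated with a minimal sketch. It is not SeeGH's own code: the column names (`clone`, `chrom`, `position`, `ratio`), the sample values, and the function name `summarize_replicates` are all assumptions made for illustration, and the abstract does not say which standard deviation estimator is used (the sample standard deviation is assumed here).

```python
import csv
import io
import statistics
from collections import defaultdict

# Hypothetical tab-delimited input: clone name, chromosome, base-pair position, signal ratio.
SAMPLE = (
    "clone\tchrom\tposition\tratio\n"
    "CLONE_A\t1\t1500000\t1.10\n"
    "CLONE_A\t1\t1500000\t1.14\n"
    "CLONE_A\t1\t1500000\t1.12\n"
    "CLONE_B\t1\t2900000\t0.48\n"
    "CLONE_B\t1\t2900000\t0.52\n"
)

def summarize_replicates(handle):
    """Return {clone: (chrom, position, mean ratio, sample SD)} across replicate spots."""
    ratios = defaultdict(list)
    location = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        ratios[row["clone"]].append(float(row["ratio"]))
        location[row["clone"]] = (row["chrom"], int(row["position"]))
    summary = {}
    for clone, values in ratios.items():
        # A single spot has no spread to estimate, so report 0.0 (an assumption).
        sd = statistics.stdev(values) if len(values) > 1 else 0.0
        summary[clone] = (*location[clone], statistics.mean(values), sd)
    return summary

summary = summarize_replicates(io.StringIO(SAMPLE))
# Print one line per clone, ordered by chromosome and position.
for clone, (chrom, pos, mean, sd) in sorted(summary.items(), key=lambda kv: kv[1][:2]):
    print(f"chr{chrom}:{pos}\t{clone}\tmean={mean:.3f}\tsd={sd:.3f}")
```

Keeping the chromosome and position alongside each summary is what would let a viewer place the averaged ratio on a chromosome profile.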

Comparative genome analysis of 24 bovine-associated Staphylococcus isolates with special focus on the putative virulence genes

PubMed Central

Åvall-Jääskeläinen, Silja; Paulin, Lars; Blom, Jochen

2018-01-01

Non-aureus staphylococci (NAS) are most commonly isolated from subclinical mastitis. Different NAS species may, however, have diverse effects on the inflammatory response in the udder. We determined the genome sequences of 20 staphylococcal isolates from clinical or subclinical bovine mastitis, belonging to the NAS species Staphylococcus agnetis, S. chromogenes, and S. simulans, and focused on the putative virulence factor genes present in the genomes. For comparison we used our previously published genome sequences of four S. aureus isolates from bovine mastitis. The pan-genome and core genomes of the non-aureus isolates were characterized. After that, putative virulence factor orthologues were searched for in silico. We compared the presence of putative virulence factors in the NAS species and S. aureus and evaluated the potential association between bacterial genotype and type of mastitis (clinical vs. subclinical). The NAS isolates had far fewer virulence gene orthologues than the S. aureus isolates. One third of the virulence genes were detected only in S. aureus. About 100 virulence genes were present in all S. aureus isolates, compared to about 40 to 50 in each NAS isolate. S. simulans differed the most. Several of the virulence genes detected among NAS were harbored only by S. simulans, but it also lacked a number of genes present both in S. agnetis and S. chromogenes. The type of mastitis was not associated with any specific virulence gene profile. It seems that the virulence gene profiles or cumulative number of different virulence genes are not directly associated with the type of mastitis (clinical or subclinical), indicating that host derived factors such as the immune status play a pivotal role in the manifestation of mastitis. PMID:29610707
Comparative Genomics Reveal That Host-Innate Immune Responses Influence the Clinical Prevalence of Legionella pneumophila Serogroups

PubMed Central

Khan, Mohammad Adil; Knox, Natalie; Prashar, Akriti; Alexander, David; Abdel-Nour, Mena; Duncan, Carla; Tang, Patrick; Amatullah, Hajera; Dos Santos, Claudia C.; Tijet, Nathalie; Low, Donald E.; Pourcel, Christine; Van Domselaar, Gary; Terebiznik, Mauricio; Ensminger, Alexander W.; Guyard, Cyril

2013-01-01

Legionella pneumophila is the primary etiologic agent of legionellosis, a potentially fatal respiratory illness. Amongst the sixteen described L. pneumophila serogroups, a majority of the clinical infections diagnosed using standard methods are serogroup 1 (Sg1). This high clinical prevalence of Sg1 is hypothesized to be linked to environmental specific advantages and/or to increased virulence of strains belonging to Sg1. The genetic determinants for this prevalence remain unknown primarily due to the limited genomic information available for non-Sg1 clinical strains. Through a systematic attempt to culture Legionella from patient respiratory samples, we have previously reported that 34% of all culture confirmed legionellosis cases in Ontario (n = 351) are caused by non-Sg1 Legionella. Phylogenetic analysis combining multiple-locus variable number tandem repeat analysis and sequence based typing profiles of all non-Sg1 identified that L. pneumophila clinical strains (n = 73) belonging to the two most prevalent molecular types were Sg6. We conducted whole genome sequencing of two strains representative of these sequence types and one distant neighbour. Comparative genomics of the three L. pneumophila Sg6 genomes reported here with published L. pneumophila serogroup 1 genomes identified genetic differences in the O-antigen biosynthetic cluster. Comparative optical mapping analysis between Sg6 and Sg1 further corroborated this finding. We confirmed an altered O-antigen profile of Sg6, and tested its possible effects on growth and replication in in vitro biological models and experimental murine infections. Our data indicates that while clinical Sg1 might not be better suited than Sg6 in colonizing environmental niches, increased bloodstream dissemination through resistance to the alternative pathway of complement mediated killing in the human host may explain its higher prevalence. PMID:23826259
Indexcov: fast coverage quality control for whole-genome sequencing.

PubMed

Pedersen, Brent S; Collins, Ryan L; Talkowski, Michael E; Quinlan, Aaron R

2017-11-01

The BAM and CRAM formats provide a supplementary linear index that facilitates rapid access to sequence alignments in arbitrary genomic regions. Comparing consecutive entries in a BAM or CRAM index allows one to infer the number of alignment records per genomic region for use as an effective proxy of sequence depth in each genomic region. Based on these properties, we have developed indexcov, an efficient estimator of whole-genome sequencing coverage to rapidly identify samples with aberrant coverage profiles, reveal large-scale chromosomal anomalies, recognize potential batch effects, and infer the sex of a sample. Indexcov is available at https://github.com/brentp/goleft under the MIT license. © The Authors 2017. Published by Oxford University Press.
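The idea of comparing consecutive index entries can be sketched as follows. This is not indexcov's implementation: it does not parse a real index file, the offsets are made-up integers, and the function name `scaled_depth` and the choice to scale by the median are assumptions for illustration. The one fixed fact used is that the BAM linear index has one entry per 16,384-base tile.

```python
import statistics

TILE_SIZE = 16_384  # bases per linear-index tile in the BAM index format

def scaled_depth(offsets):
    """Approximate relative depth per tile from consecutive index offsets.

    `offsets` is a hypothetical list of file offsets, one per tile, plus one
    final offset marking the end of the chromosome's data. The gap between
    neighbouring offsets stands in for the amount of alignment data in a tile.
    """
    sizes = [b - a for a, b in zip(offsets, offsets[1:])]
    median = statistics.median(sizes)
    if median == 0:
        raise ValueError("median tile size is zero; cannot scale")
    return [size / median for size in sizes]

# Made-up offsets: tiles 0-2 and 5 look typical, tiles 3-4 hold about half the data.
offsets = [0, 1000, 2000, 3000, 3500, 4000, 5000]
depth = scaled_depth(offsets)
for i, d in enumerate(depth):
    print(f"{i * TILE_SIZE}-{(i + 1) * TILE_SIZE}\t{d:.2f}")
```

Scaling by the median makes a typical tile read as 1.0, so a run of tiles near 0.5 or 1.5 stands out without any alignment records having been read.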
The utility of multiple molecular methods including whole genome sequencing as tools to differentiate Escherichia coli O157:H7 outbreaks.

PubMed

Berenger, Byron M; Berry, Chrystal; Peterson, Trevor; Fach, Patrick; Delannoy, Sabine; Li, Vincent; Tschetter, Lorelee; Nadon, Celine; Honish, Lance; Louie, Marie; Chui, Linda

2015-01-01

A standardised method for determining Escherichia coli O157:H7 strain relatedness using whole genome sequencing or virulence gene profiling is not yet established. We sought to assess the capacity of either high-throughput polymerase chain reaction (PCR) of 49 virulence genes, core-genome single nt variants (SNVs) or k-mer clustering to discriminate between outbreak-associated and sporadic E. coli O157:H7 isolates. Three outbreaks and multiple sporadic isolates from the province of Alberta, Canada were included in the study. Two of the outbreaks occurred concurrently in 2014 and one occurred in 2012. Pulsed-field gel electrophoresis (PFGE) and multilocus variable-number tandem repeat analysis (MLVA) were employed as comparator typing methods. The virulence gene profiles of isolates from the 2012 and 2014 Alberta outbreak events and contemporary sporadic isolates were mostly identical; therefore the set of virulence genes chosen in this study were not discriminatory enough to distinguish between outbreak clusters. Concordant with PFGE and MLVA results, core genome SNV and k-mer phylogenies clustered isolates from the 2012 and 2014 outbreaks as distinct events. k-mer phylogenies demonstrated increased discriminatory power compared with core SNV phylogenies. Prior to the widespread implementation of whole genome sequencing for routine public health use, issues surrounding cost, technical expertise, software standardisation, and data sharing/comparisons must be addressed.
CoryneBase: Corynebacterium Genomic Resources and Analysis Tools at Your Fingertips

PubMed Central

Tan, Mui Fern; Jakubovics, Nick S.; Wee, Wei Yee; Mutha, Naresh V. R.; Wong, Guat Jah; Ang, Mia Yang; Yazdi, Amir Hessam; Choo, Siew Woh

2014-01-01

Corynebacteria are used for a wide variety of industrial purposes but some species are associated with human diseases. With an increasing number of corynebacterial genomes sequenced, comparative analysis of these strains may provide a better understanding of their biology, phylogeny, virulence and taxonomy, which may lead to the discovery of beneficial industrial strains or contribute to better management of diseases. To facilitate ongoing research on corynebacteria, a specialized central repository and analysis platform for the corynebacterial research community is needed to host the fast-growing amount of genomic data and facilitate the analysis of these data. Here we present CoryneBase, a genomic database for Corynebacterium with diverse functionality for the analysis of genomes, which aims to provide: (1) annotated genome sequences of Corynebacterium where 165,918 coding sequences and 4,180 RNAs can be found in 27 species; (2) access to comprehensive Corynebacterium data through the use of advanced web technologies for interactive web interfaces; and (3) advanced bioinformatic analysis tools consisting of standard BLAST for homology search, VFDB BLAST for sequence homology search against the Virulence Factor Database (VFDB), Pairwise Genome Comparison (PGC) tool for comparative genomic analysis, and a newly designed Pathogenomics Profiling Tool (PathoProT) for comparative pathogenomic analysis. CoryneBase offers access to a range of Corynebacterium genomic resources as well as analysis tools for comparative genomics and pathogenomics. It is publicly available at http://corynebacterium.um.edu.my/. PMID:24466021
Gene integrated set profile analysis: a context-based approach for inferring biological endpoints

PubMed Central

Kowalski, Jeanne; Dwivedi, Bhakti; Newman, Scott; Switchenko, Jeffery M.; Pauly, Rini; Gutman, David A.; Arora, Jyoti; Gandhi, Khanjan; Ainslie, Kylie; Doho, Gregory; Qin, Zhaohui; Moreno, Carlos S.; Rossi, Michael R.; Vertino, Paula M.; Lonial, Sagar; Bernal-Mizrachi, Leon; Boise, Lawrence H.

2016-01-01

The identification of genes with specific patterns of change (e.g. down-regulated and methylated) as phenotype drivers or samples with similar profiles for a given gene set as drivers of clinical outcome, requires the integration of several genomic data types for which an ‘integrate by intersection’ (IBI) approach is often applied. In this approach, results from separate analyses of each data type are intersected, which has the limitation of a smaller intersection with more data types. We introduce a new method, GISPA (Gene Integrated Set Profile Analysis) for integrated genomic analysis and its variation, SISPA (Sample Integrated Set Profile Analysis) for defining respective genes and samples with the context of similar, a priori specified molecular profiles. With GISPA, the user defines a molecular profile that is compared among several classes and obtains ranked gene sets that satisfy the profile as drivers of each class. With SISPA, the user defines a gene set that satisfies a profile and obtains sample groups of profile activity. Our results from applying GISPA to human multiple myeloma (MM) cell lines contained genes of known profiles and importance, along with several novel targets, and their further SISPA application to MM coMMpass trial data showed clinical relevance. PMID:26826710
Comparative genomics analyses revealed two virulent Listeria monocytogenes strains isolated from ready-to-eat food.

PubMed

Lim, Shu Yong; Yap, Kien-Pong; Thong, Kwai Lin

2016-01-01

Listeria monocytogenes is an important foodborne pathogen that causes considerable morbidity in humans with high mortality rates. In this study, we have sequenced the genomes and performed comparative genomics analyses on two strains, LM115 and LM41, isolated from ready-to-eat food in Malaysia. The genome size of LM115 and LM41 was 2,959,041 and 2,963,111 bp, respectively. These two strains shared approximately 90% homologous genes. Comparative genomics and phylogenomic analyses revealed that LM115 and LM41 were more closely related to the reference strains F2365 and EGD-e, respectively. Our virulence profiling indicated a total of 31 virulence genes shared by both analysed strains. These shared genes included those that encode internalins and L. monocytogenes pathogenicity island 1 (LIPI-1). Both Malaysian L. monocytogenes strains also harboured several genes associated with stress tolerance to counter adverse conditions. Seven antibiotic and efflux pump related genes which may confer resistance against lincomycin, erythromycin, fosfomycin, quinolone, tetracycline, penicillin, and macrolides were identified in the genomes of both strains. Whole genome sequencing and comparative genomics analyses revealed two virulent L. monocytogenes strains isolated from ready-to-eat foods in Malaysia. The identification of strains with pathogenic, persistent, and antibiotic resistant potential in minimally processed food warrants close attention from both the healthcare and food industries.
Construction of the first compendium of chemical-genetic profiles in the fission yeast Schizosaccharomyces pombe and comparative compendium approach

DOE Office of Scientific and Technical Information (OSTI.GOV)

Han, Sangjo; Lee, Minho; Chang, Hyeshik

Highlights: •The first compendium of chemical-genetic profiles from fission yeast was generated. •The first HTS of drug mode-of-action in fission yeast was performed. •The first comparative chemical genetic analysis between two yeasts was conducted. -- Abstract: Genome-wide chemical genetic profiles in Saccharomyces cerevisiae, available since the construction of the budding yeast deletion library, have been successfully used to reveal unknown modes of action of drugs. Here, we introduce a comparative approach to infer drug target proteins more accurately using two compendiums of chemical-genetic profiles from the budding yeast S. cerevisiae and the fission yeast Schizosaccharomyces pombe. For the first time, we established DNA-chip based growth defect measurement of genome-wide deletion strains of S. pombe, and then applied 47 drugs to the pooled heterozygous deletion strains to generate chemical-genetic profiles in S. pombe. In our approach, putative drug targets were inferred from strains hypersensitive to given drugs by analyzing S. pombe and S. cerevisiae compendiums. Notably, much evidence in the literature revealed that the inferred target genes of fungicides and bactericides identified by this comparative approach are in fact the direct targets. Furthermore, by filtering out the genes with no essentiality, the multi-drug sensitivity genes, and the genes with less eukaryotic conservation, we created a set of drug target gene candidates that are expected to be directly affected by a given drug in human cells. Our study demonstrated that it is highly beneficial to construct multiple compendiums of chemical genetic profiles using many different species. The fission yeast chemical-genetic compendium is available at (http://pombe.kaist.ac.kr/compendium).
Comparison of gene expression in segregating families identifies genes and genomic regions involved in a novel adaptation, zinc hyperaccumulation.

PubMed

Filatov, Victor; Dowdle, John; Smirnoff, Nicholas; Ford-Lloyd, Brian; Newbury, H John; Macnair, Mark R

2006-09-01

One of the challenges of comparative genomics is to identify specific genetic changes associated with the evolution of a novel adaptation or trait. We need to be able to disassociate the genes involved with a particular character from all the other genetic changes that take place as lineages diverge. Here we show that by comparing the transcriptional profile of segregating families with that of parent species differing in a novel trait, it is possible to narrow down substantially the list of potential target genes. In addition, by assuming synteny with a related model organism for which the complete genome sequence is available, it is possible to use the cosegregation of markers differing in transcription level to identify regions of the genome which probably contain quantitative trait loci (QTLs) for the character. This novel combination of genomics and classical genetics provides a very powerful tool to identify candidate genes. We use this methodology to investigate zinc hyperaccumulation in Arabidopsis halleri, the sister species to the model plant, Arabidopsis thaliana. We compare the transcriptional profile of A. halleri with that of its sister nonaccumulator species, Arabidopsis petraea, and between accumulator and nonaccumulator F(3)s derived from the cross between the two species. We identify eight genes which consistently show greater expression in accumulator phenotypes in both roots and shoots, including two metal transporter genes (NRAMP3 and ZIP6), and cytoplasmic aconitase, a gene involved in iron homeostasis in mammals. We also show that there appear to be two QTLs for zinc accumulation, on chromosomes 3 and 7.
CHESS (CgHExpreSS): a comprehensive analysis tool for the analysis of genomic alterations and their effects on the expression profile of the genome.

PubMed

Lee, Mikyung; Kim, Yangseok

2009-12-16

Genomic alterations frequently occur in many cancer patients and play important mechanistic roles in the pathogenesis of cancer. Furthermore, they can modify the expression level of genes due to altered copy number in the corresponding region of the chromosome. An accumulating body of evidence supports the possibility that strong genome-wide correlation exists between DNA content and gene expression. Therefore, more comprehensive analysis is needed to quantify the relationship between genomic alteration and gene expression. A well-designed bioinformatics tool is essential to perform this kind of integrative analysis. A few programs have already been introduced for integrative analysis. However, published software is limited in its capacity for comprehensive integrated analysis because of limitations in the implemented algorithms and visualization modules. To address this issue, we have implemented the Java-based program CHESS to allow integrative analysis of two experimental data sets: genomic alteration and genome-wide expression profile. CHESS is composed of a genomic alteration analysis module and an integrative analysis module. The genomic alteration analysis module detects genomic alteration by applying a threshold based method or SW-ARRAY algorithm and investigates whether the detected alteration is phenotype specific or not. On the other hand, the integrative analysis module measures the genomic alteration's influence on gene expression. It is divided into two separate parts. The first part calculates overall correlation between comparative genomic hybridization ratio and gene expression level by applying the following three statistical methods: simple linear regression, Spearman rank correlation and Pearson's correlation. 
In the second part, CHESS detects the genes that are differentially expressed according to the genomic alteration pattern with three alternative statistical approaches: Student's t-test, Fisher's exact test and Chi square test. By successive operations of two modules, users can clarify how gene expression levels are affected by the phenotype specific genomic alterations. As CHESS was developed in both Java application and web environments, it can be run on a web browser or a local machine. It also supports all experimental platforms if a properly formatted text file is provided to include the chromosomal position of probes and their gene identifiers. CHESS is a user-friendly tool for investigating disease specific genomic alterations and quantitative relationships between those genomic alterations and genome-wide gene expression profiling.
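The three correlation measures named for the first part of the integrative module can be sketched in a few lines. This is a standard-library illustration and not CHESS code (CHESS itself is Java-based): the function names, the variable names `cgh_ratio` and `expression`, and the five paired values are invented for the example.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def linear_fit(x, y):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx

# Invented paired values for five genes: CGH log ratio and expression log ratio.
cgh_ratio = [-1.0, -0.5, 0.0, 0.5, 1.0]
expression = [-1.8, -0.4, 0.1, 0.3, 2.2]

r = pearson(cgh_ratio, expression)
rho = spearman(cgh_ratio, expression)
slope, intercept = linear_fit(cgh_ratio, expression)
print(f"Pearson r={r:.3f}  Spearman rho={rho:.3f}  slope={slope:.2f}  intercept={intercept:.2f}")
```

On these values the expression rises monotonically with copy number, so the rank correlation is perfect while the Pearson coefficient is slightly lower, which is one reason a tool might report both.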
Early experience with formalin-fixed paraffin-embedded (FFPE) based commercial clinical genomic profiling of gliomas-robust and informative with caveats.

PubMed

Movassaghi, Masoud; Shabihkhani, Maryam; Hojat, Seyed A; Williams, Ryan R; Chung, Lawrance K; Im, Kyuseok; Lucey, Gregory M; Wei, Bowen; Mareninov, Sergey; Wang, Michael W; Ng, Denise W; Tashjian, Randy S; Magaki, Shino; Perez-Rosendahl, Mari; Yang, Isaac; Khanlou, Negar; Vinters, Harry V; Liau, Linda M; Nghiemphu, Phioanh L; Lai, Albert; Cloughesy, Timothy F; Yong, William H

2017-08-01

Commercial targeted genomic profiling with next generation sequencing using formalin-fixed paraffin embedded (FFPE) tissue has recently entered into clinical use for diagnosis and for the guiding of therapy. However, there is limited independent data regarding the accuracy or robustness of commercial genomic profiling in gliomas. As part of patient care, FFPE samples of gliomas from 71 patients were submitted for targeted genomic profiling to one commonly used commercial vendor, Foundation Medicine. Genomic alterations were determined for the following grades or groups of gliomas; Grade I/II, Grade III, primary glioblastomas (GBMs), recurrent primary GBMs, and secondary GBMs. In addition, FFPE samples from the same patients were independently assessed with conventional methods such as immunohistochemistry (IHC), Quantitative real-time PCR (qRT-PCR), or Fluorescence in situ hybridization (FISH) for three genetic alterations: IDH1 mutations, EGFR amplification, and EGFRvIII expression. A total of 100 altered genes were detected by the aforementioned targeted genomic profiling assay. The number of different genomic alterations was significantly different between the five groups of gliomas and consistent with the literature. CDKN2A/B, TP53, and TERT were the most common genomic alterations seen in primary GBMs, whereas IDH1, TP53, and PIK3CA were the most common in secondary GBMs. Targeted genomic profiling demonstrated 92.3%-100% concordance with conventional methods. The targeted genomic profiling report provided an average of 5.5 drugs, and listed an average of 8.4 clinical trials for the 71 glioma patients studied but only a third of the trials were appropriate for glioma patients. In this limited comparison study, this commercial next generation sequencing based-targeted genomic profiling showed a high concordance rate with conventional methods for the 3 genetic alterations and identified mutations expected for the type of glioma. 
While it may not be feasible to exhaustively and independently validate a commercial genomic profiling assay, examination of a few markers provides some reassurance of its robustness. While potential targeted drugs are recommended based on genetic alterations, to date most targeted therapies have failed in glioblastomas, so the usefulness of such recommendations will increase with development of novel and efficacious drugs. Copyright © 2017. Published by Elsevier Inc.
Combined CRISPRi/a-Based Chemical Genetic Screens Reveal that Rigosertib Is a Microtubule-Destabilizing Agent.

Cancer.gov

Chemical libraries paired with phenotypic screens can now readily identify compounds with therapeutic potential. A central limitation to exploiting these compounds, however, has been in identifying their relevant cellular targets. Here, we present a two-tiered CRISPR-mediated chemical-genetic strategy for target identification: combined genome-wide knockdown and overexpression screening as well as focused, comparative chemical-genetic profiling.
Genome profiling of sterol synthesis shows convergent evolution in parasites and guides chemotherapeutic attack.

PubMed

Fügi, Matthias A; Gunasekera, Kapila; Ochsenreiter, Torsten; Guan, Xueli; Wenk, Markus R; Mäser, Pascal

2014-05-01

Sterols are an essential class of lipids in eukaryotes, where they serve as structural components of membranes and play important roles as signaling molecules. Sterols are also of high pharmacological significance: cholesterol-lowering drugs are blockbusters in human health, and inhibitors of ergosterol biosynthesis are widely used as antifungals. Inhibitors of ergosterol synthesis are also being developed for Chagas's disease, caused by Trypanosoma cruzi. Here we develop an in silico pipeline to globally evaluate sterol metabolism and perform comparative genomics. We generate a library of hidden Markov model-based profiles for 42 sterol biosynthetic enzymes, which allows expressing the genomic makeup of a given species as a numerical vector. Hierarchical clustering of these vectors functionally groups eukaryote proteomes and reveals convergent evolution, in particular metabolic reduction in obligate endoparasites. We experimentally explore sterol metabolism by testing a set of sterol biosynthesis inhibitors against trypanosomatids, Plasmodium falciparum, Giardia, and mammalian cells, and by quantifying the expression levels of sterol biosynthetic genes during the different life stages of T. cruzi and Trypanosoma brucei. The phenotypic data correlate with genomic makeup for simvastatin, which showed activity against trypanosomatids. Other findings, such as the activity of terbinafine against Giardia, are not in agreement with the genotypic profile.
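Representing each species as a numerical vector and clustering those vectors can be sketched as below. This is a toy illustration, not the authors' pipeline: the species labels and scores are invented, only four enzymes are shown instead of 42, and Euclidean distance with average linkage is an assumption, since the abstract does not state the metric or linkage used.

```python
import itertools
import math

# Hypothetical per-enzyme profile scores (for example a best HMM hit score, 0 = no hit).
# Species labels and numbers are invented for illustration.
PROFILES = {
    "free_living_1": [310.0, 280.0, 295.0, 260.0],
    "free_living_2": [305.0, 275.0, 300.0, 255.0],
    "parasite_1": [300.0, 0.0, 0.0, 10.0],
    "parasite_2": [290.0, 5.0, 0.0, 0.0],
}

def euclidean(u, v):
    """Euclidean distance between two equal-length score vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def average_linkage(profiles):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    clusters = {(name,): [vec] for name, vec in profiles.items()}
    merges = []
    while len(clusters) > 1:
        def dist(pair):
            # Mean pairwise distance between the members of two clusters.
            a, b = pair
            d = [euclidean(u, v) for u in clusters[a] for v in clusters[b]]
            return sum(d) / len(d)
        a, b = min(itertools.combinations(sorted(clusters), 2), key=dist)
        merges.append((a, b, dist((a, b))))
        clusters[tuple(sorted(a + b))] = clusters.pop(a) + clusters.pop(b)
    return merges

merges = average_linkage(PROFILES)
for a, b, d in merges:
    print(f"merge {a} + {b} at distance {d:.1f}")
```

In this toy data the two invented parasites group together because they share the same missing enzymes, which is the kind of pattern the abstract describes as metabolic reduction.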
Genome profiling of sterol synthesis shows convergent evolution in parasites and guides chemotherapeutic attack

PubMed Central

Fügi, Matthias A.; Gunasekera, Kapila; Ochsenreiter, Torsten; Guan, Xueli; Wenk, Markus R.; Mäser, Pascal

2014-01-01

Sterols are an essential class of lipids in eukaryotes, where they serve as structural components of membranes and play important roles as signaling molecules. Sterols are also of high pharmacological significance: cholesterol-lowering drugs are blockbusters in human health, and inhibitors of ergosterol biosynthesis are widely used as antifungals. Inhibitors of ergosterol synthesis are also being developed for Chagas’s disease, caused by Trypanosoma cruzi. Here we develop an in silico pipeline to globally evaluate sterol metabolism and perform comparative genomics. We generate a library of hidden Markov model-based profiles for 42 sterol biosynthetic enzymes, which allows expressing the genomic makeup of a given species as a numerical vector. Hierarchical clustering of these vectors functionally groups eukaryote proteomes and reveals convergent evolution, in particular metabolic reduction in obligate endoparasites. We experimentally explore sterol metabolism by testing a set of sterol biosynthesis inhibitors against trypanosomatids, Plasmodium falciparum, Giardia, and mammalian cells, and by quantifying the expression levels of sterol biosynthetic genes during the different life stages of T. cruzi and Trypanosoma brucei. The phenotypic data correlate with genomic makeup for simvastatin, which showed activity against trypanosomatids. Other findings, such as the activity of terbinafine against Giardia, are not in agreement with the genotypic profile. PMID:24627128
Breeding and identification of novel koji molds with high activity of acid protease by genome recombination between Aspergillus oryzae and Aspergillus niger.

PubMed

Xu, Defeng; Pan, Li; Zhao, Haifeng; Zhao, Mouming; Sun, Jiaxin; Liu, Dongmei

2011-09-01

Acid protease is essential for degradation of proteins during soy sauce fermentation. To breed more suitable koji molds with high activity of acid protease, interspecific genome recombination between A. oryzae and A. niger was performed. Through stabilization with d-camphor and haploidization with benomyl, several stable fusants with higher activity of acid protease were obtained, showing different degrees of improvement in acid protease activity compared with the parental strain A. oryzae. In addition, analyses of mycelial morphology, expression profiles of extracellular proteins, esterase isoenzyme profiles, and random amplified polymorphic DNA (RAPD) were applied to identify the fusants through their phenotypic and genetic relationships. Morphology analysis of the mycelial shape of fusants indicated a phenotype intermediate between A. oryzae and A. niger. The profiles of extracellular proteins and esterase isoenzyme electrophoresis showed the occurrence of genome recombination during or after protoplast fusion. The dendrogram constructed from RAPD data revealed great heterogeneity, and genetic dissimilarity indices showed there were considerable differences between the fusants and their parental strains. This investigation suggests that genome recombination is a powerful tool for improvement of food-grade industrial strains. Furthermore, the presented strain improvement procedure will be applicable for widespread use for other industrial strains.
Genomic profiling of human penile carcinoma predicts worse prognosis and survival.

PubMed

Busso-Lopes, Ariane F; Marchi, Fábio A; Kuasne, Hellen; Scapulatempo-Neto, Cristovam; Trindade-Filho, José Carlos S; de Jesus, Carlos Márcio N; Lopes, Ademar; Guimarães, Gustavo C; Rogatto, Silvia R

2015-02-01

The molecular mechanisms underlying penile carcinoma are still poorly understood, and the detection of genetic markers would be of great benefit for these patients. In this study, we assessed the genomic profile with the aim of identifying potential prognostic biomarkers in penile carcinoma. In total, 46 penile carcinoma samples were evaluated for DNA copy-number alterations via array comparative genomic hybridization (aCGH) combined with human papillomavirus (HPV) genotyping. Specific genes were investigated by using qPCR, FISH, and RT-qPCR. Genomic alterations mapped at 3p and 8p were related to worse prognostic features, including advanced T and clinical stage, recurrence and death from the disease. Losses of 3p21.1-p14.3 and gains of 3q25.31-q29 were associated with reduced cancer-specific and disease-free survival. Genomic alterations detected for chromosome 3 (LAMP3, PPARG, TNFSF10 genes) and 8 (DLC1) were evaluated by qPCR. DLC1 and PPARG losses were associated with poor prognosis characteristics. Losses of DLC1 were an independent risk factor for recurrence on multivariate analysis. The gene-expression analysis showed downexpression of DLC1 and PPARG and overexpression of LAMP3 and TNFSF10 genes. Chromosome Y losses and MYC gene (8q24) gains were confirmed by FISH. HPV infection was detected in 34.8% of the samples, and 19 differential genomic regions were obtained related to viral status. For the first time, we describe recurrent copy-number alterations and their potential prognostic value in penile carcinomas. We also show a specific genomic profile according to HPV infection, supporting the hypothesis that penile tumors present distinct etiologies according to virus status. ©2014 American Association for Cancer Research.
Identification of copy number variants in whole-genome data using Reference Coverage Profiles

PubMed Central

Glusman, Gustavo; Severson, Alissa; Dhankani, Varsha; Robinson, Max; Farrah, Terry; Mauldin, Denise E.; Stittrich, Anna B.; Ament, Seth A.; Roach, Jared C.; Brunkow, Mary E.; Bodian, Dale L.; Vockley, Joseph G.; Shmulevich, Ilya; Niederhuber, John E.; Hood, Leroy

2015-01-01

The identification of DNA copy numbers from short-read sequencing data remains a challenge for both technical and algorithmic reasons. The raw data for these analyses are measured in tens to hundreds of gigabytes per genome; transmitting, storing, and analyzing such large files is cumbersome, particularly for methods that analyze several samples simultaneously. We developed a very efficient representation of depth of coverage (150–1000× compression) that enables such analyses. Current methods for analyzing variants in whole-genome sequencing (WGS) data frequently miss copy number variants (CNVs), particularly hemizygous deletions in the 1–100 kb range. To fill this gap, we developed a method to identify CNVs in individual genomes, based on comparison to joint profiles pre-computed from a large set of genomes. We analyzed depth of coverage in over 6000 high quality (>40×) genomes. The depth of coverage has strong sequence-specific fluctuations only partially explained by global parameters like %GC. To account for these fluctuations, we constructed multi-genome profiles representing the observed or inferred diploid depth of coverage at each position along the genome. These Reference Coverage Profiles (RCPs) take into account the diverse technologies and pipeline versions used. Normalization of the scaled coverage to the RCP followed by hidden Markov model (HMM) segmentation enables efficient detection of CNVs and large deletions in individual genomes. Use of pre-computed multi-genome coverage profiles improves our ability to analyze each individual genome. We make available RCPs and tools for performing these analyses on personal genomes. We expect the increased sensitivity and specificity for individual genome analysis to be critical for achieving clinical-grade genome interpretation. PMID:25741365
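The normalization step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' tool: the bin layout, the thresholds, and the function names `normalize_to_profile` and `call_segments` are assumptions made for illustration, and a simple threshold-and-run rule stands in for the hidden Markov model segmentation the abstract mentions.

```python
from statistics import median
from typing import List, Tuple


def normalize_to_profile(coverage: List[float], reference: List[float]) -> List[float]:
    """Divide sample coverage by the reference profile, bin by bin, then
    rescale so that the median ratio is 1.0 (the presumed diploid level)."""
    raw = [c / r if r > 0 else None for c, r in zip(coverage, reference)]
    usable = [x for x in raw if x is not None]
    center = median(usable)
    if center <= 0:
        raise ValueError("median coverage ratio must be positive")
    # Bins without reference coverage carry no information; treat them as neutral.
    return [x / center if x is not None else 1.0 for x in raw]


def call_segments(ratios: List[float], loss: float = 0.75, gain: float = 1.25,
                  min_bins: int = 3) -> List[Tuple[int, int, str]]:
    """Return (start, end, state) for runs of at least min_bins consecutive
    bins below `loss` or above `gain`; `end` is exclusive.
    Simplified stand-in for HMM segmentation (thresholds are hypothetical)."""
    def state(r: float) -> str:
        if r < loss:
            return "loss"
        if r > gain:
            return "gain"
        return "normal"

    segments = []
    start = 0
    for i in range(1, len(ratios) + 1):
        # Close the current run at the end of the data or when the state changes.
        if i == len(ratios) or state(ratios[i]) != state(ratios[start]):
            s = state(ratios[start])
            if s != "normal" and i - start >= min_bins:
                segments.append((start, i, s))
            start = i
    return segments


# Toy data: the sample is sequenced at twice the reference depth and carries
# a hemizygous deletion (half the expected coverage) over bins 3 to 6.
reference = [30, 40, 20, 30, 50, 40, 30, 20, 30, 40]
sample = [60, 80, 40, 30, 50, 40, 30, 40, 60, 80]
print(call_segments(normalize_to_profile(sample, reference)))  # [(3, 7, 'loss')]
```

Centering on the median ratio, rather than on total depth, keeps a large deletion from shifting the baseline for the remaining bins; that choice is part of this sketch, not a claim about the published method.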
Comparative genomic analysis reveals 2-oxoacid dehydrogenase complex lipoylation correlation with aerobiosis in archaea.

PubMed

Borziak, Kirill; Posner, Mareike G; Upadhyay, Abhishek; Danson, Michael J; Bagby, Stefan; Dorus, Steve

2014-01-01

Metagenomic analyses have advanced our understanding of ecological microbial diversity, but to what extent can metagenomic data be used to predict the metabolic capacity of difficult-to-study organisms and their abiotic environmental interactions? We tackle this question, using a comparative genomic approach, by considering the molecular basis of aerobiosis within archaea. Lipoylation, the covalent attachment of lipoic acid to 2-oxoacid dehydrogenase multienzyme complexes (OADHCs), is essential for metabolism in aerobic bacteria and eukarya. Lipoylation is catalysed either by lipoate protein ligase (LplA), which in archaea is typically encoded by two genes (LplA-N and LplA-C), or by a lipoyl(octanoyl) transferase (LipB or LipM) plus a lipoic acid synthetase (LipA). Does the genomic presence of lipoylation and OADHC genes across archaea from diverse habitats correlate with aerobiosis? First, analyses of 11,826 biotin protein ligase (BPL)-LplA-LipB transferase family members and 147 archaeal genomes identified 85 species with lipoylation capabilities and provided support for multiple ancestral acquisitions of lipoylation pathways during archaeal evolution. Second, with the exception of the Sulfolobales order, the majority of species possessing lipoylation systems exclusively retain LplA, or either LipB or LipM, consistent with archaeal genome streamlining. Third, obligate anaerobic archaea display widespread loss of lipoylation and OADHC genes. Conversely, a high level of correspondence is observed between aerobiosis and the presence of LplA/LipB/LipM, LipA and OADHC E2, consistent with the role of lipoylation in aerobic metabolism. This correspondence between OADHC lipoylation capacity and aerobiosis indicates that genomic pathway profiling in archaea is informative and that well characterized pathways may be predictive in relation to abiotic conditions in difficult-to-study extremophiles. 
Given the highly variable retention of gene repertoires across the archaea, the extension of comparative genomic pathway profiling to broader metabolic and homeostasis networks should be useful in revealing characteristics from metagenomic datasets related to adaptations to diverse environments.
Comparative and Evolutionary Analysis of Grass Pollen Allergens Using Brachypodium distachyon as a Model System

PubMed Central

Sharma, Akanksha; Sharma, Niharika; Bhalla, Prem; Singh, Mohan

2017-01-01

Comparative genomics has facilitated the mining of biological information from a genome sequence, through the detection of similarities and differences with genomes of closely or more distantly related species. By using such comparative approaches, knowledge can be transferred from model to non-model organisms and insights can be gained into the structural and evolutionary patterns of specific genes. In the absence of sequenced genomes for allergenic grasses, this study was aimed at understanding the structure, organisation and expression profiles of grass pollen allergens using the genomic data from Brachypodium distachyon, as it is phylogenetically related to the allergenic grasses. Combining genomic data with the anther RNA-Seq dataset revealed 24 pollen allergen genes belonging to eight allergen groups mapping on the five chromosomes in B. distachyon. High levels of anther-specific expression were observed for the 24 identified putative allergen-encoding genes in Brachypodium. The genomic evidence suggests that the gene encoding the group 5 allergen, the most potent trigger of hay fever and allergic asthma, originated as a pollen-specific orphan gene in a common grass ancestor of the Brachypodium and Triticeae clades. Gene structure analysis showed that the putative allergen-encoding genes in Brachypodium either lack introns or contain a reduced number of them. Promoter analysis of the identified Brachypodium genes revealed the presence of specific cis-regulatory sequences likely responsible for high anther/pollen-specific expression. With the identification of putative allergen-encoding genes in Brachypodium, this study has also described some important plant gene families (e.g. expansin superfamily, EF-hand family, profilins, etc.) for the first time in the model plant Brachypodium. 
Altogether, the present study provides new insights into structural characterization and evolution of pollen allergens and will further serve as a base for their functional characterization in related grass species. PMID:28103252
Prediction of individualized therapeutic vulnerabilities in cancer from genomic profiles

PubMed Central

Aksoy, Bülent Arman; Demir, Emek; Babur, Özgün; Wang, Weiqing; Jing, Xiaohong; Schultz, Nikolaus; Sander, Chris

2014-01-01

Motivation: Somatic homozygous deletions of chromosomal regions in cancer, while not necessarily oncogenic, may lead to therapeutic vulnerabilities specific to cancer cells compared with normal cells. A recently reported example is the loss of one of the two isoenzymes in glioblastoma cancer cells such that the use of a specific inhibitor selectively inhibited growth of the cancer cells, which had become fully dependent on the second isoenzyme. We have now made use of the unprecedented conjunction of large-scale cancer genomics profiling of tumor samples in The Cancer Genome Atlas (TCGA) and of tumor-derived cell lines in the Cancer Cell Line Encyclopedia, as well as the availability of integrated pathway information systems, such as Pathway Commons, to systematically search for a comprehensive set of such epistatic vulnerabilities. Results: Based on homozygous deletions affecting metabolic enzymes in 16 TCGA cancer studies and 972 cancer cell lines, we identified 4104 candidate metabolic vulnerabilities present in 1019 tumor samples and 482 cell lines. Up to 44% of these vulnerabilities can be targeted with at least one Food and Drug Administration-approved drug. We suggest focused experiments to test these vulnerabilities and clinical trials based on personalized genomic profiles of those that pass preclinical filters. We conclude that genomic profiling will in the future provide a promising basis for network pharmacology of epistatic vulnerabilities as a promising therapeutic strategy. Availability and implementation: A web-based tool for exploring all vulnerabilities and their details is available at http://cbio.mskcc.org/cancergenomics/statius/ along with supplemental data files. Contact: statius@cbio.mskcc.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24665131
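The isoenzyme logic in this abstract (a tumor that has lost one enzyme copy becomes dependent on the remaining one) can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' pipeline: the function name `find_vulnerabilities`, the input layout, and the gene and reaction names are all hypothetical, and drug matching is left out.

```python
from typing import Dict, List, Set, Tuple


def find_vulnerabilities(isoenzyme_groups: Dict[str, Set[str]],
                         deleted: Dict[str, Set[str]]) -> List[Tuple[str, str, str]]:
    """Return (sample, reaction, remaining_gene) whenever homozygous deletions
    leave exactly one isoenzyme able to catalyse a reaction in that sample."""
    hits = []
    for sample in sorted(deleted):
        lost = deleted[sample]
        for reaction in sorted(isoenzyme_groups):
            genes = isoenzyme_groups[reaction]
            remaining = genes - lost
            # A redundant reaction (two or more isoenzymes) reduced to a single
            # gene means the cell now depends entirely on that gene.
            if len(genes) >= 2 and len(remaining) == 1:
                hits.append((sample, reaction, next(iter(remaining))))
    return hits


# Hypothetical inputs: reactions mapped to their isoenzyme genes, and the
# homozygously deleted genes observed in each sample.
groups = {
    "rxn1": {"GENE_A1", "GENE_A2"},
    "rxn2": {"GENE_B1", "GENE_B2", "GENE_B3"},
}
deletions = {
    "tumor_1": {"GENE_A1"},
    "tumor_2": {"GENE_B1"},
    "tumor_3": set(),
}
print(find_vulnerabilities(groups, deletions))  # [('tumor_1', 'rxn1', 'GENE_A2')]
```

In this toy input only `tumor_1` is flagged: `tumor_2` still has two isoenzymes for `rxn2`, so inhibiting one of them would not be selective.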

In silico identification and comparative analysis of differentially expressed genes in human and mouse tissues

PubMed Central

Pao, Sheng-Ying; Lin, Win-Li; Hwang, Ming-Jing

2006-01-01

Background: Screening for differentially expressed genes on the genomic scale and comparative analysis of the expression profiles of orthologous genes between species to study gene function and regulation are becoming increasingly feasible. Expressed sequence tags (ESTs) are an excellent source of data for such studies using bioinformatic approaches because of the rich libraries and tremendous amount of data now available in the public domain. However, any large-scale EST-based bioinformatics analysis must deal with the heterogeneous, and often ambiguous, tissue and organ terms used to describe EST libraries. Results: To deal with the issue of tissue source, in this work, we carefully screened and organized more than 8 million human and mouse ESTs into 157 human and 108 mouse tissue/organ categories, to which we applied an established statistical test using different thresholds of the p value to identify genes differentially expressed in different tissues. Further analysis of the tissue distribution and level of expression of human and mouse orthologous genes showed that tissue-specific orthologs tended to have more similar expression patterns than those lacking significant tissue specificity. On the other hand, a number of orthologs were found to have significant disparity in their expression profiles, hinting at novel functions, divergent regulation, or new ortholog relationships. Conclusion: Comprehensive statistics on the tissue-specific expression of human and mouse genes were obtained in this very large-scale, EST-based analysis. These statistical results have been organized into a database, freely accessible at our website, for easy searching of human and mouse tissue-specific genes and for investigating gene expression profiles in the context of comparative genomics. 
Comparative analysis showed that, although highly tissue-specific genes tend to exhibit similar expression profiles in human and mouse, there are significant exceptions, indicating that orthologous genes, while sharing basic genomic properties, could result in distinct phenotypes. PMID:16626500
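The abstract does not name the statistical test applied to the EST counts. As one plausible illustration, and purely as an assumption, the sketch below uses a one-sided hypergeometric test (equivalent to a one-sided Fisher's exact test) to ask whether a gene's ESTs are over-represented in one tissue category. The function name `enrichment_p_value` and its parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(gene_in_tissue: int, tissue_total: int,
                       gene_total: int, grand_total: int) -> float:
    """Probability of seeing at least `gene_in_tissue` ESTs of a gene in a
    tissue, if the tissue's `tissue_total` ESTs were drawn at random from
    `grand_total` ESTs of which `gene_total` belong to the gene."""
    upper = min(gene_total, tissue_total)
    # Sum the hypergeometric tail with exact integer arithmetic.
    tail = sum(
        comb(gene_total, i) * comb(grand_total - gene_total, tissue_total - i)
        for i in range(gene_in_tissue, upper + 1)
    )
    return tail / comb(grand_total, tissue_total)


# Toy counts: 10 ESTs overall, 3 from the gene of interest, 3 sampled from
# the tissue, and all 3 tissue ESTs come from that gene.
print(enrichment_p_value(3, 3, 3, 10))  # 1/120, about 0.0083
```

A gene would be called tissue-specific when this p value falls below a chosen threshold; with many genes and tissues, a multiple-testing correction would also be needed, which this sketch omits.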
Genomic and Expression Profiling of Benign and Malignant Nerve Sheath Tumors in Neurofibromatosis Patients

DTIC Science & Technology

2007-05-01

Award Number: DAMD17-03-1-0297. Title: Genomic and Expression Profiling of Benign and Malignant Nerve Sheath Tumors in Neurofibromatosis Patients. Principal Investigator: Matt van de Rijn, M.D., Ph.D. Torsten... Report type: Annual. Dates covered: 1 May 2006 – 30 Apr 2007.
DNA methylation profiling identifies global methylation differences and markers of adrenocortical tumors.

PubMed

Rechache, Nesrin S; Wang, Yonghong; Stevenson, Holly S; Killian, J Keith; Edelman, Daniel C; Merino, Maria; Zhang, Lisa; Nilubol, Naris; Stratakis, Constantine A; Meltzer, Paul S; Kebebew, Electron

2012-06-01

It is not known whether there are any DNA methylation alterations in adrenocortical tumors. The objective of the study was to determine the methylation profile of normal adrenal cortex and benign and malignant adrenocortical tumors. Genome-wide methylation status of CpG regions was determined in normal (n = 19), benign (n = 48), primary malignant (n = 8), and metastatic malignant (n = 12) adrenocortical tissue samples. An integrated analysis of genome-wide methylation and mRNA expression in benign vs. malignant adrenocortical tissue samples was also performed. Methylation profiling revealed the following: 1) that methylation patterns were distinctly different and could distinguish normal, benign, primary malignant, and metastatic tissue samples; 2) that malignant samples have global hypomethylation; and 3) that the methylation of CpG regions differs in benign adrenocortical tumors by functional status. Normal compared with benign samples had the fewest methylation differences, whereas normal compared with primary and metastatic adrenocortical carcinoma samples had the greatest variability in methylation (adjusted P ≤ 0.01). Of 215 down-regulated genes (≥2-fold, adjusted P ≤ 0.05) in malignant primary adrenocortical tumor samples, 52 were also hypermethylated. Malignant adrenocortical tumors are globally hypomethylated as compared with normal and benign tumors. Methylation profile differences may accurately distinguish between primary benign and malignant adrenocortical tumors. Several differentially methylated sites are associated with genes known to be dysregulated in malignant adrenocortical tumors.
Adeno-Associated Virus Type 2 Wild-Type and Vector-Mediated Genomic Integration Profiles of Human Diploid Fibroblasts Analyzed by Third-Generation PacBio DNA Sequencing

PubMed Central

Hüser, Daniela; Gogol-Döring, Andreas; Chen, Wei

2014-01-01

ABSTRACT Genome-wide analysis of adeno-associated virus (AAV) type 2 integration in HeLa cells has shown that wild-type AAV integrates at numerous genomic sites, including AAVS1 on chromosome 19q13.42. Multiple GAGY/C repeats, resembling consensus AAV Rep-binding sites are preferred, whereas rep-deficient AAV vectors (rAAV) regularly show a random integration profile. This study is the first study to analyze wild-type AAV integration in diploid human fibroblasts. Applying high-throughput third-generation PacBio-based DNA sequencing, integration profiles of wild-type AAV and rAAV are compared side by side. Bioinformatic analysis reveals that both wild-type AAV and rAAV prefer open chromatin regions. Although genomic features of AAV integration largely reproduce previous findings, the pattern of integration hot spots differs from that described in HeLa cells before. DNase-Seq data for human fibroblasts and for HeLa cells reveal variant chromatin accessibility at preferred AAV integration hot spots that correlates with variant hot spot preferences. DNase-Seq patterns of these sites in human tissues, including liver, muscle, heart, brain, skin, and embryonic stem cells further underline variant chromatin accessibility. In summary, AAV integration is dependent on cell-type-specific, variant chromatin accessibility leading to random integration profiles for rAAV, whereas wild-type AAV integration sites cluster near GAGY/C repeats. IMPORTANCE Adeno-associated virus type 2 (AAV) is assumed to establish latency by chromosomal integration of its DNA. This is the first genome-wide analysis of wild-type AAV2 integration in diploid human cells and the first to compare wild-type to recombinant AAV vector integration side by side under identical experimental conditions. Major determinants of wild-type AAV integration represent open chromatin regions with accessible consensus AAV Rep-binding sites. 
The variant chromatin accessibility of different human tissues or cell types will have impact on vector targeting to be considered during gene therapy. PMID:25031342
The spectrum of genomic signatures: from dinucleotides to chaos game representation.

PubMed

Wang, Yingwei; Hill, Kathleen; Singh, Shiva; Kari, Lila

2005-02-14

In the post genomic era, access to complete genome sequence data for numerous diverse species has opened multiple avenues for examining and comparing primary DNA sequence organization of entire genomes. Previously, the concept of a genomic signature was introduced with the observation of species-type specific Dinucleotide Relative Abundance Profiles (DRAPs); dinucleotides were identified as the subsequences with the greatest bias in representation in a majority of genomes. Herein, we demonstrate that DRAP is one particular genomic signature contained within a broader spectrum of signatures. Within this spectrum, an alternative genomic signature, Chaos Game Representation (CGR), provides a unique visualization of patterns in sequence organization. A genomic signature is associated with a particular integer order or subsequence length that represents a measure of the resolution or granularity in the analysis of primary DNA sequence organization. We quantitatively explore the organizational information provided by genomic signatures of different orders through different distance measures, including a novel Image Distance. The Image Distance and other existing distance measures are evaluated by comparing the phylogenetic trees they generate for 26 complete mitochondrial genomes from a diversity of species. The phylogenetic tree generated by the Image Distance is compatible with the known relatedness of species. Quantitative evaluation of the spectrum of genomic signatures may be used to ultimately gain insight into the determinants and biological relevance of the genome signatures.
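The two genomic signatures named in this abstract can be sketched briefly in Python. The sketch rests on stated assumptions: the dinucleotide profile is computed on a single strand as observed dinucleotide frequency divided by the product of the two mononucleotide frequencies (published DRAP variants often also pool the reverse complement), the input is assumed to contain only A, C, G and T, and the CGR corner layout is one common convention rather than something the abstract specifies. Function names are hypothetical, and the paper's Image Distance is not reproduced.

```python
from collections import Counter
from itertools import product
from typing import Dict, List, Tuple

# One common corner assignment for the CGR unit square (an assumption here).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def dinucleotide_relative_abundance(seq: str) -> Dict[str, float]:
    """Return f(XY) / (f(X) * f(Y)) for all 16 dinucleotides, single strand."""
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(n - 1))
    profile = {}
    for x, y in product("ACGT", repeat=2):
        fx, fy = mono[x] / n, mono[y] / n
        fxy = di[x + y] / (n - 1)
        # Dinucleotides involving an absent base are reported as 0.0.
        profile[x + y] = fxy / (fx * fy) if fx and fy else 0.0
    return profile


def cgr_points(seq: str) -> List[Tuple[float, float]]:
    """Chaos Game Representation: start at the centre of the square and move
    halfway towards the corner of each successive base."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguity codes such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


print(cgr_points("AG"))  # [(0.25, 0.25), (0.625, 0.625)]
```

Binning the CGR points on a 2^k by 2^k grid yields counts of all subsequences of length k, which is how the order (resolution) of the signature mentioned in the abstract relates to the picture.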
A combinatorial approach of comprehensive QTL-based comparative genome mapping and transcript profiling identified a seed weight-regulating candidate gene in chickpea

PubMed Central

Bajaj, Deepak; Upadhyaya, Hari D.; Khan, Yusuf; Das, Shouvik; Badoni, Saurabh; Shree, Tanima; Kumar, Vinod; Tripathi, Shailesh; Gowda, C. L. L.; Singh, Sube; Sharma, Shivali; Tyagi, Akhilesh K.; Chattopdhyay, Debasis; Parida, Swarup K.

2015-01-01

A high experimental validation/genotyping success rate (94–96%) and intra-specific polymorphic potential (82–96%) were obtained for 1536 SNP and 472 SSR markers showing in silico polymorphism between desi ICC 4958 and kabuli ICC 12968 chickpea, in a 190-individual mapping population (ICC 4958 × ICC 12968) and 92 diverse desi and kabuli genotypes. The high-density intra-specific genetic linkage map constructed from 2001 markers, comprising eight linkage groups (LGs), is considerably more saturated (mean map density: 0.94 cM) than existing intra-specific genetic maps in chickpea. Fifteen robust QTLs (PVE: 8.8–25.8% with LOD: 7.0–13.8) associated with pod and seed number/plant (PN and SN) and 100 seed weight (SW) were identified and mapped on 10 major genomic regions of eight LGs. One 126.8 kb major genomic region harbouring a strong SW-associated robust QTL (Caq'SW1.1: 169.1–171.3 cM) was delineated by integrating high-resolution QTL mapping with comprehensive marker-based comparative genome mapping and differential expression profiling. This identified one potential regulatory SNP (G/A) in the cis-acting element of a candidate ERF (ethylene responsive factor) TF (transcription factor) gene governing seed weight in chickpea. The functionally relevant molecular tags identified have the potential to be utilized for marker-assisted genetic improvement of chickpea. PMID:25786576
Comparing Patterns of Natural Selection across Species Using Selective Signatures

PubMed Central

Shapiro, B. Jesse; Alm, Eric J

2008-01-01

Comparing gene expression profiles over many different conditions has led to insights that were not obvious from single experiments. In the same way, comparing patterns of natural selection across a set of ecologically distinct species may extend what can be learned from individual genome-wide surveys. Toward this end, we show how variation in protein evolutionary rates, after correcting for genome-wide effects such as mutation rate and demographic factors, can be used to estimate the level and types of natural selection acting on genes across different species. We identify unusually rapidly and slowly evolving genes, relative to empirically derived genome-wide and gene family-specific background rates for 744 core protein families in 30 γ-proteobacterial species. We describe the pattern of fast or slow evolution across species as the “selective signature” of a gene. Selective signatures represent a profile of selection across species that is predictive of gene function: pairs of genes with correlated selective signatures are more likely to share the same cellular function, and genes in the same pathway can evolve in concert. For example, glycolysis and phenylalanine metabolism genes evolve rapidly in Idiomarina loihiensis, mirroring an ecological shift in carbon source from sugars to amino acids. In a broader context, our results suggest that the genomic landscape is organized into functional modules even at the level of natural selection, and thus it may be easier than expected to understand the complex evolutionary pressures on a cell. PMID:18266472
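The idea of a selective signature, a gene's pattern of unusually fast or slow evolution across species after removing genome-wide and gene-specific background rates, can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' procedure: background effects are removed by double-centering log rates (an additive gene effect plus an additive species effect), signatures are compared by Pearson correlation, and all names and numbers are hypothetical.

```python
from math import log, sqrt
from typing import Dict, List


def selective_signatures(rates: Dict[str, Dict[str, float]]) -> Dict[str, List[float]]:
    """Map each gene to its residual log rate per species (species sorted by
    name) after subtracting the gene mean and the species mean."""
    genes = sorted(rates)
    species = sorted(next(iter(rates.values())))
    logs = {g: {s: log(rates[g][s]) for s in species} for g in genes}
    gene_mean = {g: sum(logs[g].values()) / len(species) for g in genes}
    species_mean = {s: sum(logs[g][s] for g in genes) / len(genes) for s in species}
    grand = sum(gene_mean.values()) / len(genes)
    return {
        g: [logs[g][s] - gene_mean[g] - species_mean[s] + grand for s in species]
        for g in genes
    }


def pearson(a: List[float], b: List[float]) -> float:
    """Pearson correlation; returns 0.0 if either vector has no variance."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb) if va and vb else 0.0


# Hypothetical evolutionary rates: g1 and g2 are both fast in species s1,
# g4 is fast in s3 instead, and g3 shows no species-specific shift.
rates = {
    "g1": {"s1": 4.0, "s2": 1.0, "s3": 1.0},
    "g2": {"s1": 4.0, "s2": 1.0, "s3": 1.0},
    "g3": {"s1": 1.0, "s2": 1.0, "s3": 1.0},
    "g4": {"s1": 1.0, "s2": 1.0, "s3": 4.0},
}
sig = selective_signatures(rates)
print(round(pearson(sig["g1"], sig["g2"]), 6))  # 1.0
print(pearson(sig["g1"], sig["g4"]) < 0)        # True
```

Gene pairs with highly correlated signatures, such as `g1` and `g2` here, are the ones the abstract reports as more likely to share a cellular function.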
In silico genomic analyses reveal three distinct lineages of Escherichia coli O157:H7, one of which is associated with hyper-virulence.

PubMed

Laing, Chad R; Buchanan, Cody; Taboada, Eduardo N; Zhang, Yongxiang; Karmali, Mohamed A; Thomas, James E; Gannon, Victor Pj

2009-06-29

Many approaches have been used to study the evolution, population structure and genetic diversity of Escherichia coli O157:H7; however, observations made with different genotyping systems are not easily relatable to each other. Three genetic lineages of E. coli O157:H7 designated I, II and I/II have been identified using octamer-based genome scanning and microarray comparative genomic hybridization (mCGH). Each lineage contains significant phenotypic differences, with lineage I strains being the most commonly associated with human infections. Similarly, a clade of hyper-virulent O157:H7 strains implicated in the 2006 spinach and lettuce outbreaks has been defined using single-nucleotide polymorphism (SNP) typing. In this study an in silico comparison of six different genotyping approaches was performed on 19 E. coli genome sequences from 17 O157:H7 strains and single O145:NM and K12 MG1655 strains to provide an overall picture of diversity of the E. coli O157:H7 population, and to compare genotyping methods for O157:H7 strains. In silico determination of lineage, Shiga-toxin bacteriophage integration site, comparative genomic fingerprint, mCGH profile, novel region distribution profile, SNP type and multi-locus variable number tandem repeat analysis type was performed and a supernetwork based on the combination of these methods was produced. This supernetwork showed three distinct clusters of strains that were O157:H7 lineage-specific, with the SNP-based hyper-virulent clade 8 synonymous with O157:H7 lineage I/II. Lineage I/II/clade 8 strains clustered closest on the supernetwork to E. coli K12 and E. coli O55:H7, O145:NM and sorbitol-fermenting O157 strains. The results of this study highlight the similarities in relationships derived from multi-locus genome sampling methods and suggest a "common genotyping language" may be devised for population genetics and epidemiological studies. 
Future genotyping methods should provide data that can be stored centrally and accessed locally in an easily transferable, informative and extensible format based on comparative genomic analyses.
Renal cell carcinoma primary cultures maintain genomic and phenotypic profile of parental tumor tissues.

PubMed

Cifola, Ingrid; Bianchi, Cristina; Mangano, Eleonora; Bombelli, Silvia; Frascati, Fabio; Fasoli, Ester; Ferrero, Stefano; Di Stefano, Vitalba; Zipeto, Maria A; Magni, Fulvio; Signorini, Stefano; Battaglia, Cristina; Perego, Roberto A

2011-06-13

Clear cell renal cell carcinoma (ccRCC) is characterized by recurrent copy number alterations (CNAs) and loss of heterozygosity (LOH), which may have potential diagnostic and prognostic applications. Here, we explored whether ccRCC primary cultures, established from surgical tumor specimens, maintain the DNA profile of parental tumor tissues, allowing more confident discrimination of CNAs and LOH than in the original tissues. We established a collection of 9 phenotypically well-characterized ccRCC primary cell cultures. Using the Affymetrix SNP array technology, we performed genome-wide copy number (CN) profiling of both cultures and corresponding tumor tissues. Global concordance for each culture/tissue pair was assayed by evaluating the correlations between whole-genome CN profiles and SNP allelic calls. CN analysis was performed using two software packages, CNAG v3.0 and Partek, and comparing the results returned by two different algorithms (Hidden Markov Model and Genomic Segmentation). A very good overlap between the CNAs of each culture and corresponding tissue was observed. The finding, reinforced by high whole-genome CN correlations and SNP call concordances, provided evidence that each culture was derived from its corresponding tissue and maintained the genomic alterations of the parental tumor. In addition, the primary culture DNA profile remained stable for at least 3 weeks, up to the third passage. These cultures showed greater cell homogeneity and enrichment in tumor component than the original tissues, thus enabling better discrimination of CNAs and LOH. Especially for hemizygous deletions, primary cultures presented more evident CN losses, typically accompanied by LOH; in contrast, in the original tissues the intensity of these deletions was weakened by normal cell contamination and LOH calls were missed. 
ccRCC primary cultures are a reliable in vitro model, well-reproducing original tumor genetics and phenotype, potentially useful for future functional approaches aimed to study genes or pathways involved in ccRCC etiopathogenesis and to identify novel clinical markers or therapeutic targets. Moreover, SNP array technology proved to be a powerful tool to better define the cell composition and homogeneity of RCC primary cultures. © 2011 Cifola et al; licensee BioMed Central Ltd.
Development of a tissue-specific ribosome profiling approach in Drosophila enables genome-wide evaluation of translational adaptations

PubMed Central

2017-01-01

Recent advances in next-generation sequencing approaches have revolutionized our understanding of transcriptional expression in diverse systems. However, measurements of transcription do not necessarily reflect gene translation, the process of ultimate importance in understanding cellular function. To circumvent this limitation, biochemical tagging of ribosome subunits to isolate ribosome-associated mRNA has been developed. However, this approach, called TRAP, lacks quantitative resolution compared to a superior technology, ribosome profiling. Here, we report the development of an optimized ribosome profiling approach in Drosophila. We first demonstrate successful ribosome profiling from a specific tissue, larval muscle, with enhanced resolution compared to conventional TRAP approaches. We next validate the ability of this technology to define genome-wide translational regulation. This technology is leveraged to test the relative contributions of transcriptional and translational mechanisms in the postsynaptic muscle that orchestrate the retrograde control of presynaptic function at the neuromuscular junction. Surprisingly, we find no evidence that significant changes in the transcription or translation of specific genes are necessary to enable retrograde homeostatic signaling, implying that post-translational mechanisms ultimately gate instructive retrograde communication. Finally, we show that a global increase in translation induces adaptive responses in both transcription and translation of protein chaperones and degradation factors to promote cellular proteostasis. Together, this development and validation of tissue-specific ribosome profiling enables sensitive and specific analysis of translation in Drosophila. PMID:29194454
DNA methylation profiling of genomic DNA isolated from urine in diabetic chronic kidney disease: A pilot study

PubMed Central

Sexton-Oates, Alexandra; Carmody, Jake; Ekinci, Elif I.; Dwyer, Karen M.; Saffery, Richard

2018-01-01

Aim: To characterise the genomic DNA (gDNA) yield from urine and the quality of derived methylation data generated from the widely used Illumina Infinium MethylationEPIC (HM850K) platform and compare this with buffy coat samples. Background: DNA methylation is the most widely studied epigenetic mark, and variations in DNA methylation profile have been implicated in diabetes, which affects approximately 415 million people worldwide. Methods: QIAamp Viral RNA Mini Kit and QIAamp DNA Micro Kit were used to extract DNA from frozen and fresh urine samples as well as increasing volumes of fresh urine. Buffy coats matched to the frozen urine were also obtained, and DNA was extracted from the buffy coats using the QIAamp DNA Mini Kit. Genomic DNA samples with a concentration greater than 20 μg/mL were used for methylation analysis using the HM850K array. Results: Irrespective of extraction technique or the use of fresh versus frozen urine samples, limited genomic DNA was obtained using a starting sample volume of 5 mL (0–0.86 μg/mL). In order to optimize the yield, we increased starting volumes to 50 mL fresh urine, which yielded only 0–9.66 μg/mL. A different kit, QIAamp DNA Micro Kit, was trialled in six fresh urine samples and ten frozen urine samples with inadequate DNA yields of 0–17.7 μg/mL and 0–1.6 μg/mL, respectively. Sufficient genomic DNA was obtained from only 4 of the initial 41 frozen urine samples (10%) for DNA methylation profiling. In comparison, all four buffy coat samples (100%) provided sufficient genomic DNA. Conclusion: High quality data can be obtained provided a sufficient yield of genomic DNA is isolated. Despite optimizing various extraction methodologies, the modest amount of genomic DNA derived from urine may limit the generalisability of this approach for the identification of DNA methylation biomarkers of chronic diabetic kidney disease. PMID:29462136
DNA methylation profiling of genomic DNA isolated from urine in diabetic chronic kidney disease: A pilot study.

PubMed

Lecamwasam, Ashani; Sexton-Oates, Alexandra; Carmody, Jake; Ekinci, Elif I; Dwyer, Karen M; Saffery, Richard

2018-01-01

To characterise the genomic DNA (gDNA) yield from urine and the quality of derived methylation data generated from the widely used Illumina Infinium MethylationEPIC (HM850K) platform and compare this with buffy coat samples. DNA methylation is the most widely studied epigenetic mark, and variations in DNA methylation profile have been implicated in diabetes, which affects approximately 415 million people worldwide. QIAamp Viral RNA Mini Kit and QIAamp DNA Micro Kit were used to extract DNA from frozen and fresh urine samples as well as increasing volumes of fresh urine. Buffy coats matched to the frozen urine were also obtained, and DNA was extracted from the buffy coats using the QIAamp DNA Mini Kit. Genomic DNA samples with a concentration greater than 20 μg/mL were used for methylation analysis using the HM850K array. Irrespective of extraction technique or the use of fresh versus frozen urine samples, limited genomic DNA was obtained using a starting sample volume of 5 mL (0-0.86 μg/mL). In order to optimize the yield, we increased starting volumes to 50 mL fresh urine, which yielded only 0-9.66 μg/mL. A different kit, QIAamp DNA Micro Kit, was trialled in six fresh urine samples and ten frozen urine samples with inadequate DNA yields of 0-17.7 μg/mL and 0-1.6 μg/mL, respectively. Sufficient genomic DNA was obtained from only 4 of the initial 41 frozen urine samples (10%) for DNA methylation profiling. In comparison, all four buffy coat samples (100%) provided sufficient genomic DNA. High quality data can be obtained provided a sufficient yield of genomic DNA is isolated. Despite optimizing various extraction methodologies, the modest amount of genomic DNA derived from urine may limit the generalisability of this approach for the identification of DNA methylation biomarkers of chronic diabetic kidney disease.
Molecular Characteristics of Malignant Ovarian Germ Cell Tumors and Comparison With Testicular Counterparts: Implications for Pathogenesis

PubMed Central

Kraggerud, Sigrid Marie; Hoei-Hansen, Christina E.; Alagaratnam, Sharmini; Skotheim, Rolf I.; Abeler, Vera M.

2013-01-01

This review focuses on the molecular characteristics and development of rare malignant ovarian germ cell tumors (mOGCTs). We provide an overview of the genomic aberrations assessed by ploidy, cytogenetic banding, and comparative genomic hybridization. We summarize and discuss the transcriptome profiles of mRNA and microRNA (miRNA), and biomarkers (DNA methylation, gene mutation, individual protein expression) for each mOGCT histological subtype. Parallels between the origin of mOGCT and their male counterpart testicular GCT (TGCT) are discussed from the perspective of germ cell development, endocrinological influences, and pathogenesis, as is the GCT origin in patients with disorders of sex development. Integrated molecular profiles of the 3 main histological subtypes, dysgerminoma (DG), yolk sac tumor (YST), and immature teratoma (IT), are presented. DGs show genomic aberrations comparable to TGCT. In contrast, the genome profiles of YST and IT are different both from each other and from DG/TGCT. Differences between DG and YST are underlined by their miRNA/mRNA expression patterns, suggesting preferential involvement of the WNT/β-catenin and TGF-β/bone morphogenetic protein signaling pathways among YSTs. Characteristic protein expression patterns are observed in DG, YST and IT. We propose that mOGCT develop through different developmental pathways, including one that is likely shared with TGCT and involves insufficient sexual differentiation of the germ cell niche. The molecular features of the mOGCTs underline their similarity to pluripotent precursor cells (primordial germ cells, PGCs) and other stem cells. This similarity, combined with the process of ovary development, explains why mOGCTs present earlier in life, and with greater histological complexity, than most somatic solid tumors. PMID:23575763
PanCoreGen - Profiling, detecting, annotating protein-coding genes in microbial genomes.

PubMed

Paul, Sandip; Bhardwaj, Archana; Bag, Sumit K; Sokurenko, Evgeni V; Chattopadhyay, Sujay

2015-12-01

A large amount of genomic data, especially from multiple isolates of a single species, has opened new vistas for microbial genomics analysis. Analyzing the pan-genome (i.e. the sum of genetic repertoire) of microbial species is crucial in understanding the dynamics of molecular evolution, where virulence evolution is of major interest. Here we present PanCoreGen - a standalone application for pan- and core-genomic profiling of microbial protein-coding genes. PanCoreGen overcomes key limitations of the existing pan-genomic analysis tools, and develops an integrated annotation-structure for a species-specific pan-genomic profile. It provides important new features for annotating draft genomes/contigs and detecting unidentified genes in annotated genomes. It also generates user-defined group-specific datasets within the pan-genome. Interestingly, analyzing an example-set of Salmonella genomes, we detect potential footprints of adaptive convergence of horizontally transferred genes in two human-restricted pathogenic serovars - Typhi and Paratyphi A. Overall, PanCoreGen represents a state-of-the-art tool for microbial phylogenomics and pathogenomics study. Copyright © 2015 Elsevier Inc. All rights reserved.
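As a rough illustration of the pan- and core-genome concepts in the record above (and not PanCoreGen's actual implementation), the pan-genome can be treated as the union of gene families across isolates, the core genome as their intersection, and a group-specific set as the genes shared by one user-defined group and absent elsewhere. The function names, isolate names, and gene identifiers below are hypothetical.

```python
def pan_and_core(gene_sets):
    """Return (pan, core) for a dict mapping isolate name -> set of gene families."""
    sets = list(gene_sets.values())
    if not sets:
        return set(), set()
    pan = set().union(*sets)          # every gene family seen in any isolate
    core = set.intersection(*sets)    # gene families present in all isolates
    return pan, core


def group_specific(gene_sets, group):
    """Gene families present in every isolate of `group` and in no other isolate."""
    inside = [genes for name, genes in gene_sets.items() if name in group]
    outside = [genes for name, genes in gene_sets.items() if name not in group]
    if not inside:
        return set()
    shared = set.intersection(*inside)
    return shared - set().union(*outside)


# Hypothetical toy input: three isolates with made-up gene family identifiers.
isolates = {
    "isolate_A": {"g1", "g2", "g3"},
    "isolate_B": {"g1", "g2", "g4"},
    "isolate_C": {"g1", "g5"},
}
pan, core = pan_and_core(isolates)
ab_only = group_specific(isolates, {"isolate_A", "isolate_B"})
```

Real tools must first decide which genes in different genomes belong to the same family (typically by sequence similarity), a step this sketch assumes has already been done.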
Functional genomic mRNA profiling of a large cancer data base demonstrates mesothelin overexpression in a broad range of tumor types.

PubMed

Lamberts, Laetitia E; de Groot, Derk Jan A; Bense, Rico D; de Vries, Elisabeth G E; Fehrmann, Rudolf S N

2015-09-29

The membrane-bound glycoprotein mesothelin (MSLN) is a highly specific tumor marker, which is currently exploited as a target for drugs. There are only limited data available on MSLN expression by human tumors. Therefore, we determined overexpression of MSLN across different tumor types with Functional Genomic mRNA (FGM) profiling of a large cancer database. Results were compared with data in articles reporting immunohistochemical (IHC) MSLN tumor expression. FGM profiling is a technique that allows prediction of biologically relevant overexpression of proteins from a robust data set of mRNA microarrays. This technique was used in a database comprising 19,746 tumors to identify for 41 tumor types the percentage of samples with an overexpression of MSLN compared to a normal background. A literature search was performed to compare the FGM profiling data with studies reporting IHC MSLN tumor expression. FGM profiling showed MSLN overexpression in gastrointestinal (12-36%) and gynecological tumors (20-66%), non-small cell lung cancer (21%) and synovial sarcomas (30%). The overexpression found in thyroid cancers (5%) and renal cell cancers (10%) was not yet reported with IHC analyses. We observed that MSLN amplification rate within esophageal cancer depends on the histotype (31% for adenocarcinomas versus 3% for squamous-cell carcinomas). Subset analysis in breast cancer showed MSLN amplification rates of 28% in triple-negative breast cancer (TNBC) and 33% in basal-like breast cancer. Further subtype analysis of TNBCs showed the highest amplification rate (42%) in the basal-like 1 subtype and the lowest amplification rate (9%) in the luminal androgen receptor subtype.
Salivary gland carcinosarcoma: oligonucleotide array CGH reveals similar genomic profiles in epithelial and mesenchymal components.

PubMed

Vékony, Hedy; Leemans, C René; Ylstra, Bauke; Meijer, Gerrit A; van der Waal, Isaäc; Bloemena, Elisabeth

2009-03-01

In this study, we present a case of parotid gland de novo carcinosarcoma. Salivary gland carcinosarcoma (or true malignant mixed tumor) is a rare biphasic neoplasm, composed of both malignant epithelial and malignant mesenchymal components. It is as yet unclear whether these two phenotypes occur by collision of two independent tumors or if they are of clonal origin. To analyze the clonality of the different morphologic tumor components, oligonucleotide microarray-based comparative genomic hybridization (oaCGH) was performed on the carcinoma and the sarcoma entity separately. This technique enables a high-resolution, genome-wide overview of the chromosomal alterations in the distinct tumor elements. Analysis of both fractions showed a high number of DNA copy number changes. Losses were more prevalent than gains (82 and 49, respectively). The carcinomatous element displayed more chromosomal aberrations than the sarcomatous component. Specific amplifications of MUC20 (in mesenchymal element) and BMI-1 (in both elements) loci were observed. Overall homology between the two genomic profiles was 75%. DNA copy number profiles of the epithelial and mesenchymal components in this salivary gland carcinosarcoma displayed extensive overlap, indicating a monoclonal origin. Since losses are shared to a larger extent than gains, they seem to be more essential for initial oncogenic events. Furthermore, specific amplifications of a mucin and a Polycomb group gene implicate these proteins in the tumorigenesis of carcinosarcomas.
Substrate utilization profiles of bacterial strains in plankton from the River Warnow, a humic and eutrophic river in north Germany.

PubMed

Freese, Heike M; Eggert, Anja; Garland, Jay L; Schumann, Rhena

2010-01-01

Bacteria are very important degraders of organic substances in aquatic environments. Despite their influential role in the carbon (and many other element) cycle(s), the specific genetic identity of active bacteria is mostly unknown, although contributing phylogenetic groups have been investigated. Moreover, the degree to which phenotypic potential (i.e., utilization of environmentally relevant carbon substrates) is related to the genomic identity of bacteria or bacterial groups is unclear. The present study compared the genomic fingerprints of 27 bacterial isolates from the humic River Warnow with their ability to utilize 14 environmentally relevant substrates. Acetate was the only substrate utilized by all bacterial strains. Only 60% of the strains respired glucose, but this substrate always stimulated the highest bacterial activity (respiration and growth). Two isolates, both closely related to the same Pseudomonas sp., also had very similar substrate utilization patterns. However, similar substrate utilization profiles commonly belonged to genetically different strains (e.g., the substrate profile of Janthinobacterium lividum OW6/RT-3 and Flavobacterium sp. OW3/15-5 differed by only three substrates). Substrate consumption was sometimes totally different for genetically related isolates. Thus, the genomic profiles of bacterial strains were not congruent with their different substrate utilization profiles. Additionally, changes in pre-incubation conditions strongly influenced substrate utilization. Therefore, it is problematic to infer substrate utilization and especially microbial dissolved organic matter transformation in aquatic systems from bacterial molecular taxonomy.
Copy number variation profile in the placental and parental genomes of recurrent pregnancy loss families

PubMed Central

Kasak, Laura; Rull, Kristiina; Sõber, Siim; Laan, Maris

2017-01-01

We have previously shown an extensive load of somatic copy number variations (CNVs) in the human placental genome with the highest fraction detected in normal term pregnancies. Hereby, we hypothesized that insufficient promotion of CNVs may impair placental development and lead to recurrent pregnancy loss (RPL). RPL affects ~3% of couples aiming at childbirth and idiopathic RPL represents ~50% of cases. We analysed placental and parental CNV profiles of idiopathic RPL trios (mother-father-placenta) and duos (mother-placenta). Consistent with the hypothesis, the placental genomes of RPL cases exhibited 2-fold fewer CNVs compared to uncomplicated 1st trimester pregnancies (P = 0.02). This difference mainly arose from a lower number of duplications. Overall, 1st trimester control placentas shared only 5.3% of identified CNV regions with RPL cases, whereas the respective fraction with term placentas was 35.1% (P = 1.1 × 10^-9). Disruption of the genes NUP98 (embryonic stem cell development) and MTRR (folate metabolism) was detected exclusively in RPL placentas, potentially indicative of novel loci implicated in RPL. Interestingly, genes with higher overall expression were prone to deletions (>3-fold higher median expression compared to genes unaffected by CNVs, P = 6.69 × 10^-20). Additionally, large pericentromeric and subtelomeric CNVs in parental genomes emerged as a risk factor for RPL. PMID:28345611
Gene Structures, Evolution and Transcriptional Profiling of the WRKY Gene Family in Castor Bean (Ricinus communis L.).

PubMed

Zou, Zhi; Yang, Lifu; Wang, Danhua; Huang, Qixing; Mo, Yeyong; Xie, Guishui

2016-01-01

WRKY proteins comprise one of the largest transcription factor families in plants and form key regulators of many plant processes. This study presents the characterization of 58 WRKY genes from the castor bean (Ricinus communis L., Euphorbiaceae) genome. Compared with the automatic genome annotation, one more WRKY-encoding locus was identified and 20 out of the 57 predicted gene models were manually corrected. All RcWRKY genes were shown to contain at least one intron in their coding sequences. According to the structural features of the present WRKY domains, the identified RcWRKY genes were assigned to three previously defined groups (I-III). Although castor bean underwent no recent whole-genome duplication event like physic nut (Jatropha curcas L., Euphorbiaceae), comparative genomics analysis indicated that one gene loss, one intron loss and one recent proximal duplication occurred in the RcWRKY gene family. The expression of all 58 RcWRKY genes was supported by ESTs and/or RNA sequencing reads derived from roots, leaves, flowers, seeds and endosperms. Further global expression profiles with RNA sequencing data revealed diverse expression patterns among various tissues. Results obtained from this study not only provide valuable information for future functional analysis and utilization of the castor bean WRKY genes, but also provide a useful reference to investigate the gene family expansion and evolution in euphorbiaceous plants.
MED: a new non-supervised gene prediction algorithm for bacterial and archaeal genomes.

PubMed

Zhu, Huaiqiu; Hu, Gang-Qing; Yang, Yi-Fan; Wang, Jin; She, Zhen-Su

2007-03-16

Despite remarkable success in the computational prediction of genes in Bacteria and Archaea, a lack of comprehensive understanding of prokaryotic gene structures prevents further elucidation of differences among genomes. It continues to be interesting to develop new ab initio algorithms which not only accurately predict genes, but also facilitate comparative studies of prokaryotic genomes. This paper describes a new prokaryotic gene-finding algorithm based on a comprehensive statistical model of protein coding Open Reading Frames (ORFs) and Translation Initiation Sites (TISs). The former is based on a linguistic "Entropy Density Profile" (EDP) model of coding DNA sequence and the latter comprises several relevant features related to the translation initiation. They are combined to form a so-called Multivariate Entropy Distance (MED) algorithm, MED 2.0, that incorporates several strategies in the iterative program. The iterations enable us to develop a non-supervised learning process and to obtain a set of genome-specific parameters for the gene structure, before making the prediction of genes. Results of extensive tests show that MED 2.0 achieves competitively high performance in gene prediction for both 5' and 3' end matches, compared to the current best prokaryotic gene finders. The advantage of MED 2.0 is particularly evident for GC-rich genomes and archaeal genomes. Furthermore, the genome-specific parameters given by MED 2.0 match the current understanding of prokaryotic genomes and may serve as tools for comparative genomic studies. In particular, MED 2.0 is shown to reveal divergent translation initiation mechanisms in archaeal genomes while making a more accurate prediction of TISs compared to the existing gene finders and the current GenBank annotation.

ChIP-chip versus ChIP-seq: Lessons for experimental design and data analysis

PubMed Central

2011-01-01

Background: Chromatin immunoprecipitation (ChIP) followed by microarray hybridization (ChIP-chip) or high-throughput sequencing (ChIP-seq) allows genome-wide discovery of protein-DNA interactions such as transcription factor bindings and histone modifications. Previous reports only compared a small number of profiles, and little has been done to compare histone modification profiles generated by the two technologies or to assess the impact of input DNA libraries in ChIP-seq analysis. Here, we performed a systematic analysis of a modENCODE dataset consisting of 31 pairs of ChIP-chip/ChIP-seq profiles of the coactivator CBP, RNA polymerase II (RNA PolII), and six histone modifications across four developmental stages of Drosophila melanogaster. Results: Both technologies produce highly reproducible profiles within each platform; ChIP-seq generally produces profiles with a better signal-to-noise ratio, and allows detection of more peaks and narrower peaks. The set of peaks identified by the two technologies can be significantly different, but the extent to which they differ varies depending on the factor and the analysis algorithm. Importantly, we found that there is a significant variation among multiple sequencing profiles of input DNA libraries and that this variation most likely arises from both differences in experimental condition and sequencing depth. We further show that using an inappropriate input DNA profile can impact the average signal profiles around genomic features and peak calling results, highlighting the importance of having high quality input DNA data for normalization in ChIP-seq analysis. Conclusions: Our findings highlight the biases present in each of the platforms, show the variability that can arise from both technology and analysis methods, and emphasize the importance of obtaining high quality and deeply sequenced input DNA libraries for ChIP-seq analysis. PMID:21356108
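To make the role of the input DNA profile in the record above more concrete, the sketch below shows one common, generic way of normalizing binned ChIP-seq read counts against an input library: scale the input to the ChIP sequencing depth, then take a log2 ratio with a pseudocount. This is an illustrative assumption, not the normalization used in the study; the function name, bin counts, and pseudocount value are hypothetical.

```python
import math


def log2_enrichment(chip_counts, input_counts, pseudocount=1.0):
    """Per-bin log2(ChIP / depth-scaled input) for two equal-length count lists."""
    if len(chip_counts) != len(input_counts):
        raise ValueError("ChIP and input must cover the same bins")
    chip_total = sum(chip_counts)
    input_total = sum(input_counts)
    if chip_total == 0 or input_total == 0:
        raise ValueError("both libraries need at least one read")
    # Rescale input so both libraries have the same total depth.
    scale = chip_total / input_total
    return [
        math.log2((c + pseudocount) / (i * scale + pseudocount))
        for c, i in zip(chip_counts, input_counts)
    ]


# Hypothetical counts over three genomic bins; the middle bin holds a peak.
chip = [10, 30, 10]
inp = [5, 5, 5]
enrichment = log2_enrichment(chip, inp)
```

Because every bin is divided by the scaled input, a shallow or mismatched input library changes all enrichment values at once, which is consistent with the abstract's emphasis on deeply sequenced, well-matched input DNA.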
funRNA: a fungi-centered genomics platform for genes encoding key components of RNAi.

PubMed

Choi, Jaeyoung; Kim, Ki-Tae; Jeon, Jongbum; Wu, Jiayao; Song, Hyeunjeong; Asiegbu, Fred O; Lee, Yong-Hwan

2014-01-01

RNA interference (RNAi) is involved in genome defense as well as diverse cellular, developmental, and physiological processes. Key components of RNAi are Argonaute, Dicer, and RNA-dependent RNA polymerase (RdRP), which have been functionally characterized mainly in model organisms. The key components are believed to exist throughout eukaryotes; however, there is no systematic platform for archiving and dissecting these important gene families. In addition, few fungi have been studied to date, limiting our understanding of RNAi in fungi. Here we present funRNA (http://funrna.riceblast.snu.ac.kr/), a fungal kingdom-wide comparative genomics platform for putative genes encoding Argonaute, Dicer, and RdRP. To identify and archive genes encoding the abovementioned key components, protein domain profiles were determined from reference sequences obtained from UniProtKB/SwissProt. The domain profiles were searched against fungal, metazoan, and plant genomes, as well as bacterial and archaeal genomes. In total, 1,163, 442, and 678 genes encoding Argonaute, Dicer, and RdRP, respectively, were predicted. Based on the identification results, active site variation of Argonaute, diversification of Dicer, and sequence analysis of RdRP were discussed in a fungus-oriented manner. funRNA provides results from diverse bioinformatics programs and job submission forms for BLAST, BLASTMatrix, and ClustalW. Furthermore, sequence collections created in funRNA are synced with several gene family analysis portals and databases, offering further analysis opportunities. funRNA provides identification results from a broad taxonomic range and diverse analysis functions, and could be used in diverse comparative and evolutionary studies. It could serve as a versatile genomics workbench for key components of RNAi.
Tumor Touch Imprints as Source for Whole Genome Analysis of Neuroblastoma Tumors

PubMed Central

Brunner, Clemens; Brunner-Herglotz, Bettina; Ziegler, Andrea; Frech, Christian; Amann, Gabriele; Ladenstein, Ruth; Ambros, Inge M.; Ambros, Peter F.

2016-01-01

Introduction: Tumor touch imprints (TTIs) are routinely used for the molecular diagnosis of neuroblastomas by interphase fluorescence in-situ hybridization (I-FISH). However, in order to facilitate a comprehensive, up-to-date molecular diagnosis of neuroblastomas and to identify new markers to refine risk and therapy stratification methods, whole genome approaches are needed. We examined the applicability of an ultra-high density SNP array platform that identifies copy number changes of varying sizes down to a few exons for the detection of genomic changes in tumor DNA extracted from TTIs. Material and Methods: DNAs were extracted from TTIs of 46 neuroblastoma and 4 other pediatric tumors. The DNAs were analyzed on the Cytoscan HD SNP array platform to evaluate numerical and structural genomic aberrations. The quality of the data obtained from TTIs was compared to that from randomly chosen fresh or fresh frozen solid tumors (n = 212) and I-FISH validation was performed. Results: SNP array profiles were obtained from 48 (out of 50) TTI DNAs, of which 47 showed genomic aberrations. The high marker density allowed for single gene analysis, e.g., loss of nine exons in the ATRX gene, and the visualization of chromothripsis. Data quality was comparable to fresh or fresh frozen tumor SNP profiles. SNP array results were confirmed by I-FISH. Conclusion: TTIs are an excellent source for SNP array processing with the advantage of simple handling, distribution and storage of tumor tissue on glass slides. The minimal amount of tumor tissue needed to analyze whole genomes makes TTIs an economic surrogate source in the molecular diagnostic work up of tumor samples. PMID:27560999
Technical Report: Benchmarking for Quasispecies Abundance Inference with Confidence Intervals from Metagenomic Sequence Data

DOE Office of Scientific and Technical Information (OSTI.GOV)

McLoughlin, K.

2016-01-22

The software application “MetaQuant” was developed by our group at Lawrence Livermore National Laboratory (LLNL). It is designed to profile microbial populations in a sample using data from whole-genome shotgun (WGS) metagenomic DNA sequencing. Several other metagenomic profiling applications have been described in the literature. We ran a series of benchmark tests to compare the performance of MetaQuant against that of a few existing profiling tools, using real and simulated sequence datasets. This report describes our benchmarking procedure and results.
Comprehensive evaluation of genome-wide 5-hydroxymethylcytosine profiling approaches in human DNA.

PubMed

Skvortsova, Ksenia; Zotenko, Elena; Luu, Phuc-Loi; Gould, Cathryn M; Nair, Shalima S; Clark, Susan J; Stirzaker, Clare

2017-01-01

The discovery that 5-methylcytosine (5mC) can be oxidized to 5-hydroxymethylcytosine (5hmC) by the ten-eleven translocation (TET) proteins has prompted wide interest in the potential role of 5hmC in reshaping the mammalian DNA methylation landscape. The gold-standard bisulphite conversion technologies to study DNA methylation do not distinguish between 5mC and 5hmC. However, new approaches to mapping 5hmC genome-wide have advanced rapidly, although it is unclear how the different methods compare in accurately calling 5hmC. In this study, we provide a comparative analysis on brain DNA using three 5hmC genome-wide approaches, namely whole-genome bisulphite/oxidative bisulphite sequencing (WG Bis/OxBis-seq), Infinium HumanMethylation450 BeadChip arrays coupled with oxidative bisulphite (HM450K Bis/OxBis) and antibody-based immunoprecipitation and sequencing of hydroxymethylated DNA (hMeDIP-seq). We also perform locus-specific TET-assisted bisulphite sequencing (TAB-seq) for validation of candidate regions. We show that whole-genome single-base resolution approaches have the advantage of providing precise 5hmC values but require high sequencing depth to accurately measure 5hmC, as this modification is commonly in low abundance in mammalian cells. HM450K arrays coupled with oxidative bisulphite provide a cost-effective representation of 5hmC distribution, at CpG sites with 5hmC levels >~10%. However, 5hmC analysis is restricted to the genomic location of the probes, which is an important consideration as 5hmC modification is commonly enriched at enhancer elements. Finally, we show that the widely used hMeDIP-seq method provides an efficient genome-wide profile of 5hmC and shows high correlation with WG Bis/OxBis-seq 5hmC distribution in brain DNA. However, in cell line DNA with low levels of 5hmC, hMeDIP-seq-enriched regions are not detected by WG Bis/OxBis or HM450K, either suggesting misinterpretation of 5hmC calls by hMeDIP or lack of sensitivity of the latter methods. We highlight both the advantages and caveats of three commonly used genome-wide 5hmC profiling technologies and show that interpretation of 5hmC data can be significantly influenced by the sensitivity of methods used, especially as the levels of 5hmC are low and vary in different cell types and different genomic locations.
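The paired bisulphite/oxidative bisulphite designs named in the record above rest on a simple subtraction: conventional bisulphite treatment reads out 5mC plus 5hmC, whereas oxidative bisulphite reads out 5mC alone, so the per-site 5hmC level is commonly estimated as the difference between the two. The sketch below illustrates only that arithmetic; the function name and example values are hypothetical and it is not the authors' analysis pipeline.

```python
def estimate_5hmc(bs_level, oxbs_level, floor_at_zero=True):
    """Estimate the 5hmC fraction at one CpG site.

    bs_level   -- methylation fraction from bisulphite (5mC + 5hmC), in [0, 1]
    oxbs_level -- methylation fraction from oxidative bisulphite (5mC only), in [0, 1]
    """
    for value in (bs_level, oxbs_level):
        if not 0.0 <= value <= 1.0:
            raise ValueError("methylation fractions must lie in [0, 1]")
    difference = bs_level - oxbs_level
    # Noise in either measurement can push the difference below zero;
    # clamping is one simple convention, not the only one.
    if floor_at_zero:
        difference = max(difference, 0.0)
    return difference


# Hypothetical site: 80% signal in BS, 65% in oxBS.
hmc = estimate_5hmc(0.80, 0.65)
```

Because the estimate is a difference of two noisy measurements, small true 5hmC levels are easily swamped by sampling error, which fits the abstract's point that single-base approaches need high sequencing depth.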
Global transcriptomic profiling using small volumes of whole blood: a cost-effective method for translational genomic biomarker identification in small animals.

PubMed

Fricano, Meagan M; Ditewig, Amy C; Jung, Paul M; Liguori, Michael J; Blomme, Eric A G; Yang, Yi

2011-01-01

Blood is an ideal tissue for the identification of novel genomic biomarkers for toxicity or efficacy. However, using blood for transcriptomic profiling presents significant technical challenges due to the transcriptomic changes induced by ex vivo handling and the interference of highly abundant globin mRNA. Most whole blood RNA stabilization and isolation methods also require significant volumes of blood, limiting their effective use in small animal species, such as rodents. To overcome these challenges, a QIAzol-based RNA stabilization and isolation method (QSI) was developed to isolate sufficient amounts of high quality total RNA from 25 to 500 μL of rat whole blood. The method was compared to the standard PAXgene Blood RNA System using blood collected from rats exposed to saline or lipopolysaccharide (LPS). The QSI method yielded an average of 54 ng total RNA per μL of rat whole blood with an average RNA Integrity Number (RIN) of 9, a performance comparable with the standard PAXgene method. Total RNA samples were further processed using the NuGEN Ovation Whole Blood Solution system and cDNA was hybridized to Affymetrix Rat Genome 230 2.0 Arrays. The microarray QC parameters using RNA isolated with the QSI method were within the acceptable range for microarray analysis. The transcriptomic profiles were highly correlated with those using RNA isolated with the PAXgene method and were consistent with expected LPS-induced inflammatory responses. The present study demonstrated that the QSI method coupled with NuGEN Ovation Whole Blood Solution system is cost-effective and particularly suitable for transcriptomic profiling of minimal volumes of whole blood, typical of those obtained with small animal species.
Theranostic Profiling for Actionable Aberrations in Advanced High Risk Osteosarcoma with Aggressive Biology Reveals High Molecular Diversity: The Human Fingerprint Hypothesis.

PubMed

Egas-Bejar, Daniela; Anderson, Pete M; Agarwal, Rishi; Corrales-Medina, Fernando; Devarajan, Eswaran; Huh, Winston W; Brown, Robert E; Subbiah, Vivek

2014-03-12

The survival of patients with advanced osteosarcoma is poor, with limited therapeutic options. There is an urgent need for new targeted therapies based on biomarkers. Recently, theranostic molecular profiling services for cancer patients by CLIA-certified commercial companies as well as in-house profiling in academic medical centers have expanded exponentially. We evaluated molecular profiles of patients with advanced osteosarcoma whose tumor tissue had been analyzed by one of the following methods: 1. 182-gene next-generation exome sequencing (Foundation Medicine, Boston, MA), 2. Immunohistochemistry (IHC)/PCR-based panel (CARIS Target Now, Irving, TX), 3. Comparative genome hybridization (Oncopath, San Antonio, TX), 4. Single-gene PCR assays, PTEN IHC (MDACC CLIA), 5. UT Houston morphoproteomics (Houston, TX). The most common actionable aberrations occur in the PI3K/PTEN/mTOR pathway. No patterns in genomic alterations beyond the above are readily identifiable, which suggests both high molecular diversity in osteosarcoma and the need for more analyses to define distinct subgroups of osteosarcoma by genomic alterations. Based on our preliminary observations, we hypothesize that the biology of aggressive and metastatic osteosarcoma at the molecular level is similar to human fingerprints, in that no two tumors are identical. Further large-scale analyses of osteosarcoma samples are warranted to test this hypothesis.
Comparative genomics uncovers the prolific and distinctive metabolic potential of the cyanobacterial genus Moorea

PubMed Central

Leao, Tiago; Castelão, Guilherme; Monroe, Emily A.; Podell, Sheila; Glukhov, Evgenia; Allen, Eric E.; Gerwick, William H.; Gerwick, Lena

2017-01-01

Cyanobacteria are major sources of oxygen, nitrogen, and carbon in nature. In addition to the importance of their primary metabolism, some cyanobacteria are prolific producers of unique and bioactive secondary metabolites. Chemical investigations of the cyanobacterial genus Moorea have resulted in the isolation of over 190 compounds in the last two decades. However, preliminary genomic analysis has suggested that genome-guided approaches can enable the discovery of novel compounds from even well-studied Moorea strains, highlighting the importance of obtaining complete genomes. We report a complete genome of a filamentous tropical marine cyanobacterium, Moorea producens PAL, which reveals that about one-fifth of its genome is devoted to production of secondary metabolites, an impressive four times the cyanobacterial average. Moreover, possession of the complete PAL genome has allowed improvement to the assembly of three other Moorea draft genomes. Comparative genomics revealed that they are remarkably similar to one another, despite their differences in geography, morphology, and secondary metabolite profiles. Gene cluster networking highlights that this genus is distinctive among cyanobacteria, not only in the number of secondary metabolite pathways but also in the content of many pathways, which are potentially distinct from all other bacterial gene clusters to date. These findings portend that future genome-guided secondary metabolite discovery and isolation efforts should be highly productive. PMID:28265051
Comparative Analysis of 35 Basidiomycete Genomes Reveals Diversity and Uniqueness of the Phylum

DOE Office of Scientific and Technical Information (OSTI.GOV)

Riley, Robert; Salamov, Asaf; Otillar, Robert

Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37% of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprobes including wood decaying fungi. To better understand the diversity of this phylum we compared the genomes of 35 basidiomycete fungi including 6 newly sequenced genomes. The genomes of basidiomycetes span extremes of genome size, gene number, and repeat content. A phylogenetic tree of Basidiomycota was generated using the Phyldog software, which uses all available protein sequence data to simultaneously infer gene and species trees. Analysis of core genes reveals that some 48% of basidiomycete proteins are unique to the phylum with nearly half of those (22%) comprising proteins found in only one organism. Phylogenetic patterns of plant biomass-degrading genes suggest a continuum rather than a sharp dichotomy between the white rot and brown rot modes of wood decay among the members of Agaricomycotina subphylum. There is a correlation of the profile of certain gene families to nutritional mode in Agaricomycotina. Based on phylogenetically-informed PCA analysis of such profiles, we predict that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has ligninolytic class II fungal peroxidases. Furthermore, we find that both fungi exhibit wood decay with white rot-like characteristics in growth assays. Analysis of the rate of discovery of proteins with no or few homologs suggests the high value of continued sequencing of basidiomycete fungi.
Molecular characterization of chronic-type adult T-cell leukemia/lymphoma.

PubMed

Yoshida, Noriaki; Karube, Kennosuke; Utsunomiya, Atae; Tsukasaki, Kunihiro; Imaizumi, Yoshitaka; Taira, Naoya; Uike, Naokuni; Umino, Akira; Arita, Kotaro; Suguro, Miyuki; Tsuzuki, Shinobu; Kinoshita, Tomohiro; Ohshima, Koichi; Seto, Masao

2014-11-01

Adult T-cell leukemia/lymphoma (ATL) is a human T-cell leukemia virus type-1-induced neoplasm with four clinical subtypes: acute, lymphoma, chronic, and smoldering. Although the chronic type is regarded as indolent ATL, about half of the cases progress to acute-type ATL. The molecular pathogenesis of acute transformation in chronic-type ATL is only partially understood. In an effort to determine the molecular pathogeneses of ATL, and especially the molecular mechanism of acute transformation, oligo-array comparative genomic hybridization and comprehensive gene expression profiling were applied to 27 and 35 cases of chronic and acute type ATL, respectively. The genomic profile of the chronic type was nearly identical to that of acute-type ATL, although more genomic alterations characteristic of acute-type ATL were observed. Among the genomic alterations frequently observed in acute-type ATL, the loss of CDKN2A, which is involved in cell-cycle deregulation, was especially characteristic of acute-type ATL compared with chronic-type ATL. Furthermore, we found that genomic alteration of CD58, which is implicated in escape from the immunosurveillance mechanism, is more frequently observed in acute-type ATL than in the chronic-type. Interestingly, the chronic-type cases with cell-cycle deregulation and disruption of immunosurveillance mechanism were associated with earlier progression to acute-type ATL. These findings suggested that cell-cycle deregulation and the immune escape mechanism play important roles in acute transformation of the chronic type and indicated that these alterations are good predictive markers for chronic-type ATL. ©2014 American Association for Cancer Research.
A Comparative Analysis of the Lyve-SET Phylogenomics Pipeline for Genomic Epidemiology of Foodborne Pathogens

PubMed Central

Katz, Lee S.; Griswold, Taylor; Williams-Newkirk, Amanda J.; Wagner, Darlene; Petkau, Aaron; Sieffert, Cameron; Van Domselaar, Gary; Deng, Xiangyu; Carleton, Heather A.

2017-01-01

Modern epidemiology of foodborne bacterial pathogens in industrialized countries relies increasingly on whole genome sequencing (WGS) techniques. As opposed to profiling techniques such as pulsed-field gel electrophoresis, WGS requires a variety of computational methods. Since 2013, United States agencies responsible for food safety including the CDC, FDA, and USDA, have been performing whole-genome sequencing (WGS) on all Listeria monocytogenes found in clinical, food, and environmental samples. Each year, more genomes of other foodborne pathogens such as Escherichia coli, Campylobacter jejuni, and Salmonella enterica are being sequenced. Comparing thousands of genomes across an entire species requires a fast method with coarse resolution; however, capturing the fine details of highly related isolates requires a computationally heavy and sophisticated algorithm. Most L. monocytogenes investigations employing WGS depend on being able to identify an outbreak clade whose inter-genomic distances are less than an empirically determined threshold. When the difference between a few single nucleotide polymorphisms (SNPs) can help distinguish between genomes that are likely outbreak-associated and those that are less likely to be associated, we require a fine-resolution method. To achieve this level of resolution, we have developed Lyve-SET, a high-quality SNP pipeline. We evaluated Lyve-SET by retrospectively investigating 12 outbreak data sets along with four other SNP pipelines that have been used in outbreak investigation or similar scenarios. To compare these pipelines, several distance and phylogeny-based comparison methods were applied, which collectively showed that multiple pipelines were able to identify most outbreak clusters and strains. 
Currently in the US PulseNet system, whole genome multi-locus sequence typing (wgMLST) is the preferred primary method for foodborne WGS cluster detection and outbreak investigation due to its ability to name standardized genomic profiles, its central database, and its ability to be run in a graphical user interface. However, creating a functional wgMLST scheme requires extended up-front development and subject-matter expertise. When a scheme does not exist or when the highest resolution is needed, SNP analysis is used. Using three Listeria outbreak data sets, we demonstrated the concordance between Lyve-SET SNP typing and wgMLST. Availability: Lyve-SET can be found at https://github.com/lskatz/Lyve-SET. PMID:28348549
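The threshold idea described in this abstract (an outbreak clade is a group of isolates whose inter-genomic distances fall below an empirically determined cutoff) can be illustrated with a minimal sketch. This is not the Lyve-SET implementation. The function names, the toy sequences, and the threshold value are assumptions made for illustration, and the input is assumed to be an already aligned set of SNP strings.

```python
from itertools import combinations


def snp_distance(a: str, b: str) -> int:
    """Count differing positions between two aligned sequences.

    Sites where either sequence has an ambiguous call (N or -) are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1 for x, y in zip(a, b)
        if x != y and x not in "N-" and y not in "N-"
    )


def cluster_by_threshold(seqs: dict, threshold: int) -> list:
    """Single-linkage grouping: link isolates whose distance <= threshold."""
    parent = {name: name for name in seqs}

    def find(x):
        # Union-find lookup with path compression.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if snp_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in seqs:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Hypothetical toy alignment; the threshold of 2 SNPs is illustrative only.
isolates = {
    "iso1": "ACGTACGT",
    "iso2": "ACGTACGA",
    "iso3": "ACGAACGA",
    "env1": "TTGAAGGA",
}
clusters = cluster_by_threshold(isolates, threshold=2)
```

Single linkage is only one possible grouping rule. In practice the cutoff is chosen empirically per organism, and a phylogeny is usually inspected alongside the pairwise distances.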
Integrative Genomics Reveals Mechanisms of Copy Number Alterations Responsible for Transcriptional Deregulation in Colorectal Cancer

PubMed Central

Camps, Jordi; Nguyen, Quang Tri; Padilla-Nash, Hesed M.; Knutsen, Turid; McNeil, Nicole E.; Wangsa, Danny; Hummon, Amanda B.; Grade, Marian; Ried, Thomas; Difilippantonio, Michael J.

2016-01-01

To evaluate the mechanisms and consequences of chromosomal aberrations in colorectal cancer (CRC), we used a combination of spectral karyotyping, array comparative genomic hybridization (aCGH), and array-based global gene expression profiling on 31 primary carcinomas and 15 established cell lines. Importantly, aCGH showed that the genomic profiles of primary tumors are recapitulated in the cell lines. We revealed a preponderance of chromosome breakpoints at sites of copy number variants (CNVs) in the CRC cell lines, a novel mechanism of DNA breakage in cancer. The integration of gene expression and aCGH led to the identification of 157 genes localized within high-level copy number changes whose transcriptional deregulation was significantly affected across all of the samples, thereby suggesting that these genes play a functional role in CRC. Genomic amplification at 8q24 was the most recurrent event and led to the overexpression of MYC and FAM84B. Copy number dependent gene expression resulted in deregulation of known cancer genes such as APC, FGFR2, and ERBB2. The identification of only 36 genes whose localization near a breakpoint could account for their observed deregulated expression demonstrates that the major mechanism for transcriptional deregulation in CRC is genomic copy number changes resulting from chromosomal aberrations. PMID:19691111
Genomic Heterogeneity as a Barrier to Precision Medicine in Gastroesophageal Adenocarcinoma.

PubMed

Pectasides, Eirini; Stachler, Matthew D; Derks, Sarah; Liu, Yang; Maron, Steven; Islam, Mirazul; Alpert, Lindsay; Kwak, Heewon; Kindler, Hedy; Polite, Blase; Sharma, Manish R; Allen, Kenisha; O'Day, Emily; Lomnicki, Samantha; Maranto, Melissa; Kanteti, Rajani; Fitzpatrick, Carrie; Weber, Christopher; Setia, Namrata; Xiao, Shu-Yuan; Hart, John; Nagy, Rebecca J; Kim, Kyoung-Mee; Choi, Min-Gew; Min, Byung-Hoon; Nason, Katie S; O'Keefe, Lea; Watanabe, Masayuki; Baba, Hideo; Lanman, Rick; Agoston, Agoston T; Oh, David J; Dunford, Andrew; Thorner, Aaron R; Ducar, Matthew D; Wollison, Bruce M; Coleman, Haley A; Ji, Yuan; Posner, Mitchell C; Roggin, Kevin; Turaga, Kiran; Chang, Paul; Hogarth, Kyle; Siddiqui, Uzma; Gelrud, Andres; Ha, Gavin; Freeman, Samuel S; Rhoades, Justin; Reed, Sarah; Gydush, Greg; Rotem, Denisse; Davison, Jon; Imamura, Yu; Adalsteinsson, Viktor; Lee, Jeeyun; Bass, Adam J; Catenacci, Daniel V

2018-01-01

Gastroesophageal adenocarcinoma (GEA) is a lethal disease where targeted therapies, even when guided by genomic biomarkers, have had limited efficacy. A potential reason for the failure of such therapies is that genomic profiling results could commonly differ between the primary and metastatic tumors. To evaluate genomic heterogeneity, we sequenced paired primary GEA and synchronous metastatic lesions across multiple cohorts, finding extensive differences in genomic alterations, including discrepancies in potentially clinically relevant alterations. Multiregion sequencing showed significant discrepancy within the primary tumor (PT) and between the PT and disseminated disease, with oncogene amplification profiles commonly discordant. In addition, a pilot analysis of cell-free DNA (cfDNA) sequencing demonstrated the feasibility of detecting genomic amplifications not detected in PT sampling. Lastly, we profiled paired primary tumors, metastatic tumors, and cfDNA from patients enrolled in the personalized antibodies for GEA (PANGEA) trial of targeted therapies in GEA and found that genomic biomarkers were recurrently discrepant between the PT and untreated metastases. Divergent primary and metastatic tissue profiling led to treatment reassignment in 32% (9/28) of patients. In discordant primary and metastatic lesions, we found 87.5% concordance for targetable alterations in metastatic tissue and cfDNA, suggesting the potential for cfDNA profiling to enhance selection of therapy. Significance: We demonstrate frequent baseline heterogeneity in targetable genomic alterations in GEA, indicating that current tissue sampling practices for biomarker testing do not effectively guide precision medicine in this disease and that routine profiling of metastatic lesions and/or cfDNA should be systematically evaluated. Cancer Discov; 8(1); 37-48. ©2017 AACR. See related commentary by Sundar and Tan, p. 14. See related article by Janjigian et al., p. 49. This article is highlighted in the In This Issue feature, p. 1.
Fungal Genomics Program

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grigoriev, Igor

The JGI Fungal Genomics Program aims to scale up sequencing and analysis of fungal genomes to explore the diversity of fungi important for energy and the environment, and to promote functional studies on a system level. Combining new sequencing technologies and comparative genomics tools, JGI is now leading the world in fungal genome sequencing and analysis. Over 120 sequenced fungal genomes with analytical tools are available via MycoCosm (www.jgi.doe.gov/fungi), a web-portal for fungal biologists. Our model of interacting with user communities, unique among other sequencing centers, helps organize these communities, improves genome annotation and analysis work, and facilitates new larger-scale genomic projects. This resulted in 20 high-profile papers published in 2011 alone and contributing to the Genomics Encyclopedia of Fungi, which targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts). Our next grand challenges include larger scale exploration of fungal diversity (1000 fungal genomes), developing molecular tools for DOE-relevant model organisms, and analysis of complex systems and metagenomes.
Initiative for Molecular Profiling and Advanced Cancer Therapy and challenges in the implementation of precision medicine.

PubMed

Tsimberidou, Apostolia-Maria

In the last decade, breakthroughs in technology have improved our understanding of genomic, transcriptional, proteomic, epigenetic aberrations and immune mechanisms in carcinogenesis. Genomics and model systems have enabled the validation of novel therapeutic strategies. Based on these developments, in 2007, we initiated the IMPACT (Initiative for Molecular Profiling and Advanced Cancer Therapy) study, the first personalized medicine program for patients with advanced cancer at The University of Texas MD Anderson Cancer Center. We demonstrated that in patients referred for Phase I clinical trials, the use of tumor molecular profiling and treatment with matched targeted therapy was associated with encouraging rates of response, progression-free survival and overall survival compared to non-matched therapy. We are currently conducting IMPACT2, a randomized study evaluating molecular profiling and targeted agents in patients with metastatic cancer. Optimization of innovative biomarker-driven clinical trials that include targeted therapy and/or immunotherapeutic approaches for carefully selected patients will accelerate the development of novel drugs and the implementation of precision medicine. Copyright © 2017 Elsevier Inc. All rights reserved.
Mutation-profile-based methods for understanding selection forces in cancer somatic mutations: a comparative analysis.

PubMed

Zhou, Zhan; Zou, Yangyun; Liu, Gangbiao; Zhou, Jingqi; Wu, Jingcheng; Zhao, Shimin; Su, Zhixi; Gu, Xun

2017-08-29

Human genes exhibit different effects on fitness in cancer and normal cells. Here, we present an evolutionary approach to measure the selection pressure on human genes, using the well-known ratio of the nonsynonymous to synonymous substitution rate in both cancer genomes (CN/CS) and normal populations (pN/pS). A new mutation-profile-based method that adopts sample-specific mutation rate profiles instead of conventional substitution models was developed. We found that cancer-specific selection pressure is quite different from the selection pressure at the species and population levels. Both the relaxation of purifying selection on passenger mutations and the positive selection of driver mutations may contribute to the increased CN/CS values of human genes in cancer genomes compared with the pN/pS values in human populations. The CN/CS values also contribute to the improved classification of cancer genes and a better understanding of the onco-functionalization of cancer genes during oncogenesis. The use of our computational pipeline to identify cancer-specific positively and negatively selected genes may provide useful information for understanding the evolution of cancers and identifying possible targets for therapeutic intervention.
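As a rough illustration of the ratio this abstract refers to, the sketch below computes a nonsynonymous-to-synonymous rate ratio from observed mutation counts and expected numbers of sites. It is not the authors' pipeline. The function name and the example numbers are hypothetical, and the expected site counts are simply taken as inputs here, whereas the abstract describes deriving them from sample-specific mutation rate profiles.

```python
def selection_ratio(n_nonsyn: float, n_syn: float,
                    sites_nonsyn: float, sites_syn: float) -> float:
    """Return (nonsynonymous rate) / (synonymous rate) for one gene.

    n_nonsyn, n_syn: observed mutation counts.
    sites_nonsyn, sites_syn: expected numbers of nonsynonymous and
    synonymous sites (assumed to be supplied by the caller).
    """
    if n_syn <= 0 or sites_nonsyn <= 0 or sites_syn <= 0:
        raise ValueError("synonymous count and site counts must be positive")
    rate_nonsyn = n_nonsyn / sites_nonsyn
    rate_syn = n_syn / sites_syn
    return rate_nonsyn / rate_syn


# Hypothetical gene: both classes mutate at the same per-site rate.
neutral = selection_ratio(n_nonsyn=30, n_syn=10,
                          sites_nonsyn=750, sites_syn=250)

# Hypothetical gene with an excess of nonsynonymous mutations.
elevated = selection_ratio(n_nonsyn=60, n_syn=10,
                           sites_nonsyn=750, sites_syn=250)
```

A ratio near 1 is conventionally read as neutral evolution, below 1 as purifying selection, and above 1 as a possible sign of positive selection.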
Transcription facilitated genome-wide recruitment of topoisomerase I and DNA gyrase.

PubMed

Ahmed, Wareed; Sala, Claudia; Hegde, Shubhada R; Jha, Rajiv Kumar; Cole, Stewart T; Nagaraja, Valakunja

2017-05-01

Movement of the transcription machinery along a template alters DNA topology, resulting in the accumulation of supercoils in DNA. The positive supercoils generated ahead of transcribing RNA polymerase (RNAP) and the negative supercoils accumulating behind it impose severe topological constraints that impede the transcription process. Previous studies have implied a role for topoisomerases in the removal of torsional stress and the maintenance of template topology, but the in vivo interaction of functionally distinct topoisomerases with heterogeneous chromosomal territories has not been deciphered. Moreover, how the transcription-induced supercoils influence the genome-wide recruitment of DNA topoisomerases remains to be explored in bacteria. Using ChIP-Seq, we show the genome-wide occupancy profile of both topoisomerase I and DNA gyrase in conjunction with RNAP in Mycobacterium tuberculosis, taking advantage of minimal topoisomerase representation in the organism. The study unveils the first in vivo genome-wide interaction of both the topoisomerases with the genomic regions and establishes that transcription-induced supercoils govern their recruitment at genomic sites. Distribution profiles revealed co-localization of RNAP and the two topoisomerases on the active transcriptional units (TUs). At a given locus, topoisomerase I and DNA gyrase were localized behind and ahead of RNAP, respectively, correlating with the twin-supercoiled domains generated. The recruitment of topoisomerases was higher at the genomic loci with higher transcriptional activity and/or at regions under high torsional stress compared to silent genomic loci. Importantly, the occupancy of DNA gyrase, the sole type II topoisomerase in Mtb, near the Ter domain of the Mtb chromosome validates its function as a decatenase.
Transcription facilitated genome-wide recruitment of topoisomerase I and DNA gyrase

PubMed Central

Ahmed, Wareed; Sala, Claudia; Hegde, Shubhada R.; Jha, Rajiv Kumar

2017-01-01

Movement of the transcription machinery along a template alters DNA topology, resulting in the accumulation of supercoils in DNA. The positive supercoils generated ahead of transcribing RNA polymerase (RNAP) and the negative supercoils accumulating behind it impose severe topological constraints that impede the transcription process. Previous studies have implied a role for topoisomerases in the removal of torsional stress and the maintenance of template topology, but the in vivo interaction of functionally distinct topoisomerases with heterogeneous chromosomal territories has not been deciphered. Moreover, how the transcription-induced supercoils influence the genome-wide recruitment of DNA topoisomerases remains to be explored in bacteria. Using ChIP-Seq, we show the genome-wide occupancy profile of both topoisomerase I and DNA gyrase in conjunction with RNAP in Mycobacterium tuberculosis, taking advantage of minimal topoisomerase representation in the organism. The study unveils the first in vivo genome-wide interaction of both the topoisomerases with the genomic regions and establishes that transcription-induced supercoils govern their recruitment at genomic sites. Distribution profiles revealed co-localization of RNAP and the two topoisomerases on the active transcriptional units (TUs). At a given locus, topoisomerase I and DNA gyrase were localized behind and ahead of RNAP, respectively, correlating with the twin-supercoiled domains generated. The recruitment of topoisomerases was higher at the genomic loci with higher transcriptional activity and/or at regions under high torsional stress compared to silent genomic loci. Importantly, the occupancy of DNA gyrase, the sole type II topoisomerase in Mtb, near the Ter domain of the Mtb chromosome validates its function as a decatenase. PMID:28463980
Array-Based Comparative Genomic Hybridization Analysis Reveals Chromosomal Copy Number Aberrations Associated with Clinical Outcome in Canine Diffuse Large B-Cell Lymphoma

PubMed Central

Bresolin, Silvia; Marconato, Laura; Comazzi, Stefano; Te Kronnie, Geertruy; Aresu, Luca

2014-01-01

Canine Diffuse Large B-cell Lymphoma (cDLBCL) is an aggressive cancer with variable clinical response. Despite recent attempts by gene expression profiling to identify the dog as a potential animal model for human DLBCL, this tumor remains biologically heterogeneous, with no biomarkers available to predict prognosis. The aim of this work was to identify copy number aberrations (CNAs) by high-resolution array comparative genomic hybridization (aCGH) in 12 dogs with newly diagnosed DLBCL. In a subset of these dogs, the genetic profiles at the end of therapy and at relapse were also assessed. In primary DLBCLs, 90 different genomic imbalances were counted, consisting of 46 gains and 44 losses. Two gains in chr13 were significantly correlated with clinical stage. In addition, specific regions of gains and losses were significantly associated with duration of remission. In primary DLBCLs, individual variability was found; however, 14 recurrent CNAs (>30%) were identified. Losses involving IGK, IGL and IGH were always found, and gains along the length of chr13 and chr31 were often observed (>41%). In these segments, MYC, LDHB, HSF1, KIT and PDGFRα are annotated. At the end of therapy, dogs in remission showed four new CNAs, whereas three new CNAs were observed in dogs at relapse compared with the previous profiles. One ex novo CNA, involving TCR, was present in dogs in remission after therapy, possibly induced by the autologous vaccine. Overall, aCGH identified small CNAs associated with outcome, which, along with future expression studies, may reveal target genes relevant to cDLBCL. PMID:25372838
Translational bioinformatics in the cloud: an affordable alternative

PubMed Central

2010-01-01

With the continued exponential expansion of publicly available genomic data and access to low-cost, high-throughput molecular technologies for profiling patient populations, computational technologies and informatics are becoming vital considerations in genomic medicine. Although cloud computing technology is being heralded as a key enabling technology for the future of genomic research, available case studies are limited to applications in the domain of high-throughput sequence data analysis. The goal of this study was to evaluate the computational and economic characteristics of cloud computing in performing a large-scale data integration and analysis representative of research problems in genomic medicine. We find that the cloud-based analysis compares favorably in both performance and cost in comparison to a local computational cluster, suggesting that cloud computing technologies might be a viable resource for facilitating large-scale translational research in genomic medicine. PMID:20691073

Quantitative analysis of comparative genomic hybridization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Manoir, S. du; Bentz, M.; Joos, S.

1995-01-01

Comparative genomic hybridization (CGH) is a new molecular cytogenetic method for the detection of chromosomal imbalances. Following cohybridization of DNA prepared from a sample to be studied and control DNA to normal metaphase spreads, probes are detected via different fluorochromes. The ratio of the test and control fluorescence intensities along a chromosome reflects the relative copy number of segments of a chromosome in the test genome. Quantitative evaluation of CGH experiments is required for the determination of low copy changes, e.g., monosomy or trisomy, and for the definition of the breakpoints involved in unbalanced rearrangements. In this study, a program for quantitation of CGH preparations is presented. This program is based on the extraction of the fluorescence ratio profile along each chromosome, followed by averaging of individual profiles from several metaphase spreads. Objective parameters critical for quantitative evaluations were tested, and the criteria for selection of suitable CGH preparations are described. The granularity of the chromosome painting and the regional inhomogeneity of fluorescence intensities in metaphase spreads proved to be crucial parameters. The coefficient of variation of the ratio value for chromosomes in balanced state (CVBS) provides a general quality criterion for CGH experiments. Different cutoff levels (thresholds) of average fluorescence ratio values were compared for their specificity and sensitivity with regard to the detection of chromosomal imbalances. 27 refs., 15 figs., 1 tab.
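The steps named in this abstract (averaging ratio profiles over several metaphase spreads, applying cutoff levels, and using a coefficient of variation as a quality criterion) can be sketched as follows. This is an illustrative reconstruction, not the program described in the record. The function names, the toy profiles, and the 1.25 and 0.75 cutoffs are assumptions, and the abstract itself notes that different cutoff levels were compared.

```python
from statistics import mean, pstdev


def average_profile(profiles):
    """Average test/control ratio profiles position by position.

    profiles: one list of ratios per metaphase spread, all of equal length.
    """
    if not profiles or len({len(p) for p in profiles}) != 1:
        raise ValueError("need at least one profile, all of equal length")
    return [mean(column) for column in zip(*profiles)]


def call_imbalances(avg_profile, gain=1.25, loss=0.75):
    """Label each position as gain, loss, or balanced (assumed cutoffs)."""
    return [
        "gain" if r >= gain else "loss" if r <= loss else "balanced"
        for r in avg_profile
    ]


def coefficient_of_variation(values):
    """Standard deviation divided by mean, e.g. over balanced chromosomes."""
    return pstdev(values) / mean(values)


# Hypothetical ratio profiles for one chromosome from three metaphases.
spreads = [
    [1.0, 1.0, 1.5, 0.5],
    [1.0, 1.1, 1.3, 0.7],
    [1.0, 0.9, 1.4, 0.6],
]
avg = average_profile(spreads)
calls = call_imbalances(avg)
cv_balanced = coefficient_of_variation(avg[:2])
```

Averaging over several metaphases reduces the noise from granular chromosome painting. A low coefficient of variation in regions expected to be balanced suggests the preparation is suitable for quantitative evaluation.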
Comparative Transcriptomes and EVO-DEVO Studies Depending on Next Generation Sequencing.

PubMed

Liu, Tiancheng; Yu, Lin; Liu, Lei; Li, Hong; Li, Yixue

2015-01-01

High-throughput technology has driven progress in omics studies, including genomics and transcriptomics. We review the improvements in comparative omics studies that are attributable to high-throughput measurement by next-generation sequencing technology. Comparative genomics has been successfully applied to evolutionary analysis, while comparative transcriptomics is used to compare expression profiles between two subjects by differential expression or differential coexpression, which enables its application in evolutionary developmental biology (EVO-DEVO) studies. EVO-DEVO studies focus on the evolutionary pressure affecting morphogenesis during development, and previous work has sought to identify the most conserved stages of embryonic development. Earlier measurements in these studies were based on morphological similarity at the macro level, whereas new technology enables detection of similarity in molecular mechanisms at the micro level. Evolutionary models of embryonic development, including the "funnel-like" model and the "hourglass" model, have been evaluated by combining these new comparative transcriptomic methods with prior comparative genomic information. Although the technology has moved EVO-DEVO studies into a new era, technological and material limitations still exist, and further investigations require more careful study design and procedures.
Spiked GBS: A unified, open platform for single marker genotyping and whole-genome profiling

USDA-ARS's Scientific Manuscript database

In plant breeding, there are two primary applications for DNA markers in selection: 1) selection of known genes using a single marker assay (marker-assisted selection; MAS); and 2) whole-genome profiling and prediction (genomic selection; GS). Typically, marker platforms have addressed only one of t...
YersiniaBase: a genomic resource and analysis platform for comparative analysis of Yersinia.

PubMed

Tan, Shi Yang; Dutta, Avirup; Jakubovics, Nicholas S; Ang, Mia Yang; Siow, Cheuk Chuen; Mutha, Naresh Vr; Heydari, Hamed; Wee, Wei Yee; Wong, Guat Jah; Choo, Siew Woh

2015-01-16

Yersinia is a genus of Gram-negative bacteria that includes serious pathogens such as Yersinia pestis, which causes plague, Yersinia pseudotuberculosis, and Yersinia enterocolitica. The remaining species are generally considered non-pathogenic to humans, although there is evidence that at least some of these species can cause occasional infections using distinct mechanisms from the more pathogenic species. With the advances in sequencing technologies, many genomes of Yersinia have been sequenced. However, there is currently no specialized platform to hold the rapidly-growing Yersinia genomic data and to provide analysis tools particularly for comparative analyses, which are required to provide improved insights into their biology, evolution and pathogenicity. To facilitate the ongoing and future research of Yersinia, especially of the species generally considered non-pathogenic, a well-defined repository and analysis platform is needed to hold the Yersinia genomic data and analysis tools for the Yersinia research community. Hence, we have developed YersiniaBase, a robust and user-friendly Yersinia resource and analysis platform for the analysis of Yersinia genomic data. YersiniaBase has a total of twelve species and 232 genome sequences, of which the majority are Yersinia pestis. In order to smooth the process of searching genomic data in a large database, we implemented an Asynchronous JavaScript and XML (AJAX)-based real-time searching system in YersiniaBase. Besides incorporating existing tools, which include JavaScript-based genome browser (JBrowse) and Basic Local Alignment Search Tool (BLAST), YersiniaBase also has in-house developed tools: (1) Pairwise Genome Comparison tool (PGC) for comparing two user-selected genomes; (2) Pathogenomics Profiling Tool (PathoProT) for comparative pathogenomics analysis of Yersinia genomes; (3) YersiniaTree for constructing phylogenetic trees of Yersinia. We ran analyses based on the tools and genomic data in YersiniaBase and the preliminary results showed differences in virulence genes found in Yersinia pestis and Yersinia pseudotuberculosis compared to other Yersinia species, and differences between Yersinia enterocolitica subsp. enterocolitica and Yersinia enterocolitica subsp. palearctica. YersiniaBase offers free access to a wide range of genomic data and analysis tools for the analysis of Yersinia. YersiniaBase can be accessed at http://yersinia.um.edu.my.
Hospital strain colonization by Staphylococcus epidermidis.

PubMed

Blum-Menezes, D; Bratfich, O J; Padoveze, M C; Moretti, M L

2009-03-01

The skin and mucous membranes of healthy subjects are colonized by strains of Staphylococcus epidermidis showing a high diversity of genomic DNA polymorphisms. Prolonged hospitalization and the use of invasive procedures promote changes in the microbiota with subsequent colonization by hospital strains. We report here a patient with prolonged hospitalization due to chronic pancreatitis who was treated with multiple antibiotics, invasive procedures and abdominal surgery. We studied the dynamics of skin colonization by S. epidermidis leading to the development of catheter-related infections and compared the genotypic profile of clinical and microbiota strains by pulsed field gel electrophoresis. During hospitalization, the normal S. epidermidis skin microbiota exhibiting a polymorphic genomic DNA profile was replaced with a hospital-acquired biofilm-producer S. epidermidis strain that subsequently caused repetitive catheter-related infections.
Managing the genomic revolution in cancer diagnostics.

PubMed

Nguyen, Doreen; Gocke, Christopher D

2017-08-01

Molecular tumor profiling is now a routine part of patient care, revealing targetable genomic alterations and molecularly distinct tumor subtypes with therapeutic and prognostic implications. The widespread adoption of next-generation sequencing technologies has greatly facilitated clinical implementation of genomic data and opened the door for high-throughput multigene-targeted sequencing. Herein, we discuss the variability of cancer genetic profiling currently offered by clinical laboratories, the challenges of applying rapidly evolving medical knowledge to individual patients, and the need for more standardized population-based molecular profiling.
Decomposing genomic variance using information from GWA, GWE and eQTL analysis.

PubMed

Ehsani, A; Janss, L; Pomp, D; Sørensen, P

2016-04-01

A commonly used procedure in genome-wide association (GWA), genome-wide expression (GWE) and expression quantitative trait locus (eQTL) analyses is based on a bottom-up experimental approach that attempts to individually associate molecular variants with complex traits. Top-down modeling of the entire set of genomic data and partitioning of the overall variance into subcomponents may provide further insight into the genetic basis of complex traits. To test this approach, we performed a whole-genome variance components analysis and partitioned the genomic variance using information from GWA, GWE and eQTL analyses of growth-related traits in a mouse F2 population. We characterized the mouse trait genetic architecture by ordering single nucleotide polymorphisms (SNPs) based on their P-values and studying the areas under the curve (AUCs). The observed traits were found to have a genomic variance profile that differed significantly from that expected of a trait under an infinitesimal model. This situation was particularly true for both body weight and body fat, for which the AUCs were much higher compared with that of glucose. In addition, SNPs with a high degree of trait-specific regulatory potential (SNPs associated with subset of transcripts that significantly associated with a specific trait) explained a larger proportion of the genomic variance than did SNPs with high overall regulatory potential (SNPs associated with transcripts using traditional eQTL analysis). We introduced AUC measures of genomic variance profiles that can be used to quantify relative importance of SNPs as well as degree of deviation of a trait's inheritance from an infinitesimal model. The shape of the curve aids global understanding of traits: The steeper the left-hand side of the curve, the fewer the number of SNPs controlling most of the phenotypic variance. © 2015 Stichting International Foundation for Animal Genetics.
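The AUC idea in this abstract can be made concrete with a small sketch: order SNPs by P-value, accumulate the share of genomic variance they explain, and integrate the resulting curve. The exact definition used by the authors is not given in the record, so the trapezoidal construction, the function name, and the toy numbers below are assumptions for illustration only.

```python
def variance_profile_auc(variances):
    """Area under the cumulative variance curve.

    variances: per-SNP variance contributions, already ordered from the
    smallest to the largest P-value. The x-axis is the fraction of SNPs
    included and the y-axis is the cumulative fraction of variance.
    """
    total = sum(variances)
    if not variances or total <= 0:
        raise ValueError("need positive total variance")
    n = len(variances)
    area = 0.0
    cumulative = 0.0
    for v in variances:
        previous = cumulative
        cumulative += v / total
        # Trapezoid of width 1/n between consecutive curve points.
        area += (previous + cumulative) / 2 / n
    return area


# Equal contributions from every SNP give a straight diagonal.
auc_even = variance_profile_auc([0.25, 0.25, 0.25, 0.25])

# One dominant SNP gives a curve that is steep on the left-hand side.
auc_skewed = variance_profile_auc([1.0, 0.0, 0.0, 0.0])
```

Under this construction, a trait whose SNPs all contribute equally yields an area of 0.5. A steeper left-hand side, where few SNPs carry most of the variance, pushes the area toward 1.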
Comparative Genomic Analysis of Globally Dominant ST131 Clone with Other Epidemiologically Successful Extraintestinal Pathogenic Escherichia coli (ExPEC) Lineages.

PubMed

Shaik, Sabiha; Ranjan, Amit; Tiwari, Sumeet K; Hussain, Arif; Nandanwar, Nishant; Kumar, Narender; Jadhav, Savita; Semmler, Torsten; Baddam, Ramani; Islam, Mohammed Aminul; Alam, Munirul; Wieler, Lothar H; Watanabe, Haruo; Ahmed, Niyaz

2017-10-24

Escherichia coli sequence type 131 (ST131), a pandemic clone responsible for the high incidence of extraintestinal pathogenic E. coli (ExPEC) infections, has been known widely for its contribution to the worldwide dissemination of multidrug resistance. Although other ExPEC-associated and extended-spectrum-β-lactamase (ESBL)-producing E. coli clones, such as ST38, ST405, and ST648, have been studied widely, no comparative genomic data with respect to other genotypes exist for ST131. In this study, comparative genomic analysis was performed for 99 ST131 E. coli strains with 40 genomes from three other STs, including ST38 (n = 12), ST405 (n = 10), and ST648 (n = 18), and functional studies were performed on five in-house strains corresponding to the four STs. Phylogenomic analysis results from this study corroborated the sequence type-specific clonality. Results from the genome-wide resistance profiling confirmed that all strains were inherently multidrug resistant. ST131 genomes showed unique virulence profiles, and analysis of mobile genetic elements and their associated methyltransferases (MTases) has revealed that several of them were missing from the majority of the non-ST131 strains. Despite the fact that non-ST131 strains lacked a few essential genes belonging to the serum resistome, the in-house strains representing all four STs demonstrated similar resistance levels to serum antibactericidal activity. Core genome analysis data revealed that non-ST131 strains usually lacked several ST131-defined genomic coordinates, and a significant number of genes were missing from the core of the ST131 genomes. Data from this study reinforce adaptive diversification of E. coli strains belonging to the ST131 lineage and provide new insights into the molecular mechanisms underlying clonal diversification of the ST131 lineage. IMPORTANCE E. coli, particularly the ST131 extraintestinal pathogenic E. coli (ExPEC) lineage, is an important cause of community- and hospital-acquired infections, such as urinary tract infections, surgical site infections, bloodstream infections, and sepsis. The treatment of infections caused by ExPEC has become very challenging due to the emergence of resistance to the first-line as well as the last-resort antibiotics. This study analyzes E. coli ST131 against three other important and globally distributed ExPEC lineages (ST38, ST405, and ST648) that also produced extended-spectrum β-lactamase (ESBL). This is perhaps the first study that employs the high-throughput whole-genome sequence-based approach to compare and study the genomic features of these four ExPEC lineages in relation to their functional properties. Findings from this study highlight the differences in the genomic coordinates of ST131 with respect to the other STs considered here. Results from this comparative genomics study can help in advancing the understanding of ST131 evolution and also offer a framework towards future developments in pathogen identification and targeted therapeutics to prevent diseases caused by this pandemic E. coli ST131 clone. Copyright © 2017 Shaik et al.
Regulatory variation: an emerging vantage point for cancer biology.

PubMed

Li, Luolan; Lorzadeh, Alireza; Hirst, Martin

2014-01-01

Transcriptional regulation involves complex and interdependent interactions of noncoding and coding regions of the genome with the proteins that interact with and modify them. Genetic variation/mutation in coding and noncoding regions of the genome can drive aberrant transcription and disease. Although noncoding DNA accounts for nearly 98% of the genome, comparatively little is known about the contribution of noncoding DNA elements to disease. Genome-wide association studies of complex human diseases, including cancer, have revealed enrichment for variants in the noncoding genome. A striking finding of recent cancer genome re-sequencing efforts has been the previously underappreciated frequency of mutations in epigenetic modifiers across a wide range of cancer types. Taken together, these results point to the importance of dysregulated transcriptional regulatory control in the genesis of cancer. Powered by recent technological advancements in functional genomic profiling, exploration of normal and transformed regulatory networks will provide novel insight into the initiation and progression of cancer and open new windows to future prognostic and diagnostic tools. © 2013 Wiley Periodicals, Inc.
i-ADHoRe 2.0: an improved tool to detect degenerated genomic homology using genomic profiles.

PubMed

Simillion, Cedric; Janssens, Koen; Sterck, Lieven; Van de Peer, Yves

2008-01-01

i-ADHoRe is a software tool that combines gene content and gene order information of homologous genomic segments into profiles to detect highly degenerated homology relations within and between genomes. The new version offers, besides a significant increase in performance, several optimizations to the algorithm, most importantly to the profile alignment routine. As a result, the annotations of multiple genomes, or parts thereof, can be fed simultaneously into the program, after which it will report all regions of homology, both within and between genomes. The i-ADHoRe 2.0 package contains the C++ source code for the main program as well as various Perl scripts and a fully documented Perl API to facilitate post-processing. The software runs on any Linux- or UNIX-based platform. The package is freely available for academic users and can be downloaded from http://bioinformatics.psb.ugent.be/
Methylation profiling identifies 2 groups of gliomas according to their tumorigenesis

PubMed Central

Laffaire, Julien; Everhard, Sibille; Idbaih, Ahmed; Crinière, Emmanuelle; Marie, Yannick; de Reyniès, Aurelien; Schiappa, Renaud; Mokhtari, Karima; Hoang-Xuan, Khê; Sanson, Marc; Delattre, Jean-Yves; Thillet, Joëlle; Ducray, François

2011-01-01

Extensive genomic and gene expression studies have been performed in gliomas, but the epigenetic alterations that characterize different subtypes of gliomas remain largely unknown. Here, we analyzed the methylation patterns of 807 genes (1536 CpGs) in a series of 33 low-grade gliomas (LGGs), 36 glioblastomas (GBMs), 8 paired initial and recurrent gliomas, and 9 controls. This analysis was performed with Illumina's Golden Gate Bead methylation arrays and was correlated with clinical, histological, genomic, gene expression, and genotyping data, including IDH1 mutations. Unsupervised hierarchical clustering resulted in 2 groups of gliomas: a group corresponding to de novo GBMs and a group consisting of LGGs, recurrent anaplastic gliomas, and secondary GBMs. When compared with de novo GBMs and controls, this latter group was characterized by a very high frequency of IDH1 mutations and by a hypermethylated profile similar to the recently described glioma CpG island methylator phenotype. MGMT methylation was more frequent in this group. Among the LGG cluster, 1p19q codeleted LGG displayed a distinct methylation profile. A study of paired initial and recurrent gliomas demonstrated that methylation profiles were remarkably stable across glioma evolution, even during anaplastic transformation, suggesting that epigenetic alterations occur early during gliomagenesis. Using the Cancer Genome Atlas data set, we demonstrated that GBM samples that had an LGG-like hypermethylated profile had a high rate of IDH1 mutations and a better outcome. Finally, we identified several hypermethylated and downregulated genes that may be associated with LGG and GBM oncogenesis, LGG oncogenesis, 1p19q codeleted LGG oncogenesis, and GBM oncogenesis. PMID:20926426
Insight into small RNA abundance and expression in high- and low-temperature stress response using deep sequencing in Arabidopsis.

PubMed

Baev, Vesselin; Milev, Ivan; Naydenov, Mladen; Vachev, Tihomir; Apostolova, Elena; Mehterov, Nikolay; Gozmanva, Mariyana; Minkov, Georgi; Sablok, Gaurav; Yahubyan, Galina

2014-11-01

Small RNA profiling, and the assessment of its dependence on changing environmental factors, has expanded our understanding of the transcriptional and post-transcriptional regulation of plant stress responses. Few data have previously been reported on the profiles of small RNA classes under temperature-associated stress, which has wide implications for climate change biology. In the present study, we report a comparative assessment of the genome-wide profiling of small RNAs in Arabidopsis thaliana under two conditions, high and low temperature. Genome-wide profiling of small RNAs revealed an abundance of 21 nt small RNAs at low temperature, while high temperature showed an abundance of 21 nt and 24 nt small RNAs. The two temperature treatments altered the expression of a specific subset of mature miRNAs and displayed differential expression of a number of miRNA isoforms (isomiRs). Comparative analysis demonstrated that a large number of protein-coding genes can give rise to differentially expressed small RNAs following temperature shifts. Low temperature caused accumulation of small RNAs corresponding to the sense strand of a number of cold-responsive genes. In contrast, high temperature stimulated the production of small RNAs of both polarities from genes encoding functionally diverse proteins. Copyright © 2014 Elsevier Masson SAS. All rights reserved.
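The length classes mentioned in this abstract (21 nt and 24 nt) are typically read off a read-length distribution. The sketch below shows one minimal way to tabulate such a distribution; it is an illustration only, not the authors' pipeline, and the function name and the toy reads are hypothetical.

```python
from collections import Counter

def length_distribution(reads):
    """Return the fraction of reads at each length, keyed by length in nt."""
    counts = Counter(len(r) for r in reads)
    total = sum(counts.values())
    return {length: counts[length] / total for length in sorted(counts)}

# Hypothetical toy reads (made-up sequences, not data from the study).
reads = [
    "ACGU" * 5 + "A",   # 21 nt
    "UGCA" * 5 + "G",   # 21 nt
    "GGAU" * 5 + "C",   # 21 nt
    "CAUG" * 5 + "U",   # 21 nt
    "ACGU" * 6,         # 24 nt
    "UUGC" * 6,         # 24 nt
    "ACGU" * 5 + "AC",  # 22 nt
    "ACGU" * 5,         # 20 nt
]

dist = length_distribution(reads)
for length, fraction in dist.items():
    print(f"{length} nt: {fraction:.3f}")
```

In a real analysis the reads would come from adapter-trimmed sequencing output, and the distributions of the two temperature treatments would be compared side by side.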
Mining the archives: a cross-platform analysis of gene ...

EPA Pesticide Factsheets

Formalin-fixed paraffin-embedded (FFPE) tissue samples represent a potentially invaluable resource for genomic research into the molecular basis of disease. However, use of FFPE samples in gene expression studies has been limited by technical challenges resulting from degradation of nucleic acids. Here we evaluated gene expression profiles derived from fresh-frozen (FRO) and FFPE mouse liver tissues using two DNA microarray protocols and two whole transcriptome sequencing (RNA-seq) library preparation methodologies. The ribo-depletion protocol outperformed the other three methods by having the highest correlations of differentially expressed genes (DEGs) and best overlap of pathways between FRO and FFPE groups. We next tested the effect of sample time in formalin (18 hours or 3 weeks) on gene expression profiles. Hierarchical clustering of the datasets indicated that test article treatment, and not preservation method, was the main driver of gene expression profiles. Meta- and pathway analyses indicated that biological responses were generally consistent for 18-hour and 3-week FFPE samples compared to FRO samples. However, clear erosion of signal intensity with time in formalin was evident, and DEG numbers differed by platform and preservation method. Lastly, we investigated the effect of age in FFPE block on genomic profiles. RNA-seq analysis of 8-, 19-, and 26-year-old control blocks using the ribo-depletion protocol resulted in comparable quality metrics, inc
Aggressive natural killer-cell leukemia mutational landscape and drug profiling highlight JAK-STAT signaling as therapeutic target.

PubMed

Dufva, Olli; Kankainen, Matti; Kelkka, Tiina; Sekiguchi, Nodoka; Awad, Shady Adnan; Eldfors, Samuli; Yadav, Bhagwan; Kuusanmäki, Heikki; Malani, Disha; Andersson, Emma I; Pietarinen, Paavo; Saikko, Leena; Kovanen, Panu E; Ojala, Teija; Lee, Dean A; Loughran, Thomas P; Nakazawa, Hideyuki; Suzumiya, Junji; Suzuki, Ritsuro; Ko, Young Hyeh; Kim, Won Seog; Chuang, Shih-Sung; Aittokallio, Tero; Chan, Wing C; Ohshima, Koichi; Ishida, Fumihiro; Mustjoki, Satu

2018-04-19

Aggressive natural killer-cell (NK-cell) leukemia (ANKL) is an extremely aggressive malignancy with dismal prognosis and lack of targeted therapies. Here, we elucidate the molecular pathogenesis of ANKL using a combination of genomic and drug sensitivity profiling. We study 14 ANKL patients using whole-exome sequencing (WES) and identify mutations in STAT3 (21%) and RAS-MAPK pathway genes (21%) as well as in DDX3X (29%) and epigenetic modifiers (50%). Additional alterations include JAK-STAT copy gains and tyrosine phosphatase mutations, which we show recurrent also in extranodal NK/T-cell lymphoma, nasal type (NKTCL) through integration of public genomic data. Drug sensitivity profiling further demonstrates the role of the JAK-STAT pathway in the pathogenesis of NK-cell malignancies, identifying NK cells to be highly sensitive to JAK and BCL2 inhibition compared to other hematopoietic cell lineages. Our results provide insight into ANKL genetics and a framework for application of targeted therapies in NK-cell malignancies.
Unique DNA methylome profiles in CpG island methylator phenotype colon cancers

PubMed Central

Xu, Yaomin; Hu, Bo; Choi, Ae-Jin; Gopalan, Banu; Lee, Byron H.; Kalady, Matthew F.; Church, James M.; Ting, Angela H.

2012-01-01

A subset of colorectal cancers was postulated to have the CpG island methylator phenotype (CIMP), a higher propensity for CpG island DNA methylation. The validity of CIMP, its molecular basis, and its prognostic value remain highly controversial. Using MBD-isolated genome sequencing, we mapped and compared genome-wide DNA methylation profiles of normal, non-CIMP, and CIMP colon specimens. Multidimensional scaling analysis revealed that each specimen could be clearly classified as normal, non-CIMP, and CIMP, thus signifying that these three groups have distinctly different global methylation patterns. We discovered 3780 sites in various genomic contexts that were hypermethylated in both non-CIMP and CIMP colon cancers when compared with normal colon. An additional 2026 sites were found to be hypermethylated in CIMP tumors only; and importantly, 80% of these sites were located in CpG islands. These data demonstrate on a genome-wide level that the additional hypermethylation seen in CIMP tumors occurs almost exclusively at CpG islands and support definitively that these tumors were appropriately named. When these sites were examined more closely, we found that 25% were adjacent to sites that were also hypermethylated in non-CIMP tumors. Thus, CIMP is also characterized by more extensive methylation of sites that are already prone to be hypermethylated in colon cancer. These observations indicate that CIMP tumors have specific defects in controlling both DNA methylation seeding and spreading and serve as an important first step in delineating molecular mechanisms that control these processes. PMID:21990380
The role of genetic and epigenetic alterations in neuroblastoma disease pathogenesis

PubMed Central

Domingo-Fernandez, Raquel; Watters, Karen; Piskareva, Olga; Bray, Isabella

2013-01-01

Neuroblastoma is a highly heterogeneous tumor accounting for 15% of all pediatric cancer deaths. Clinical behavior ranges from the spontaneous regression of localized, asymptomatic tumors, as well as metastasized tumors in infants, to rapid progression and resistance to therapy. Genomic amplification of the MYCN oncogene has been used to predict outcome in neuroblastoma for over 30 years; however, recent methodological advances, including miRNA and mRNA profiling, comparative genomic hybridization (array-CGH), and whole-genome sequencing, have enabled the detailed analysis of the neuroblastoma genome, leading to the identification of new prognostic markers and better patient stratification. In this review, we will describe the main genetic factors responsible for these diverse clinical phenotypes in neuroblastoma, the chronology of their discovery, and the impact on patient prognosis. PMID:23274701
Bluejay 1.0: genome browsing and comparison with rich customization provision and dynamic resource linking

PubMed Central

Soh, Jung; Gordon, Paul MK; Taschuk, Morgan L; Dong, Anguo; Ah-Seng, Andrew C; Turinsky, Andrei L; Sensen, Christoph W

2008-01-01

Background The Bluejay genome browser has been developed over several years to address the challenges posed by the ever increasing number of data types as well as the increasing volume of data in genome research. Beginning with a browser capable of rendering views of XML-based genomic information and providing scalable vector graphics output, we have now completed version 1.0 of the system with many additional features. Our development efforts were guided by our observation that biologists who use both gene expression profiling and comparative genomics gain functional insights above and beyond those provided by traditional per-gene analyses. Results Bluejay 1.0 is a genome viewer integrating genome annotation with: (i) gene expression information; and (ii) comparative analysis with an unlimited number of other genomes in the same view. This allows the biologist to see a gene not just in the context of its genome, but also its regulation and its evolution. Bluejay now has rich provision for personalization by users: (i) numerous display customization features; (ii) the availability of waypoints for marking multiple points of interest on a genome and subsequently utilizing them; and (iii) the ability to take user relevance feedback of annotated genes or textual items to offer personalized recommendations. Bluejay 1.0 also embeds the Seahawk browser for the Moby protocol, enabling users to seamlessly invoke hundreds of Web Services on genomic data of interest without any hard-coding. Conclusion Bluejay offers a unique set of customizable genome-browsing features, with the goal of allowing biologists to quickly focus on, analyze, compare, and retrieve related information on the parts of the genomic data they are most interested in. We expect these capabilities of Bluejay to benefit the many biologists who want to answer complex questions using the information available from completely sequenced genomes. PMID:18940007
A comparative analysis of whole genome sequencing of esophageal adenocarcinoma pre- and post-chemotherapy

PubMed Central

Noorani, Ayesha; Lynch, Andy G.; Achilleos, Achilleas; Eldridge, Matthew; Bower, Lawrence; Weaver, Jamie M.J.; Crawte, Jason; Ong, Chin-Ann; Shannon, Nicholas; MacRae, Shona; Grehan, Nicola; Nutzinger, Barbara; O'Donovan, Maria; Hardwick, Richard; Tavaré, Simon; Fitzgerald, Rebecca C.

2017-01-01

The scientific community has avoided using tissue samples from patients that have been exposed to systemic chemotherapy to infer the genomic landscape of a given cancer. Esophageal adenocarcinoma is a heterogeneous, chemoresistant tumor for which the availability and size of pretreatment endoscopic samples are limiting. This study compares whole-genome sequencing data obtained from chemo-naive and chemo-treated samples. The quality of whole-genomic sequencing data is comparable across all samples regardless of chemotherapy status. Inclusion of samples collected post-chemotherapy increased the proportion of late-stage tumors. When comparing matched pre- and post-chemotherapy samples from 10 cases, the mutational signatures, copy number, and SNV mutational profiles reflect the expected heterogeneity in this disease. Analysis of SNVs in relation to allele-specific copy-number changes pinpoints the common ancestor to a point prior to chemotherapy. For cases in which pre- and post-chemotherapy samples do show substantial differences, the timing of the divergence is near-synchronous with endoreduplication. Comparison across a large prospective cohort (62 treatment-naive, 58 chemotherapy-treated samples) reveals no significant differences in the overall mutation rate, mutation signatures, specific recurrent point mutations, or copy-number events in respect to chemotherapy status. In conclusion, whole-genome sequencing of samples obtained following neoadjuvant chemotherapy is representative of the genomic landscape of esophageal adenocarcinoma. Excluding these samples reduces the material available for cataloging and introduces a bias toward the earlier stages of cancer. PMID:28465312
Genome-Wide Characterization and Expression Profiling of the AUXIN RESPONSE FACTOR (ARF) Gene Family in Eucalyptus grandis

PubMed Central

Yu, Hong; Soler, Marçal; Mila, Isabelle; San Clemente, Hélène; Savelli, Bruno; Dunand, Christophe; Paiva, Jorge A. P.; Myburg, Alexander A.; Bouzayen, Mondher; Grima-Pettenati, Jacqueline; Cassan-Wang, Hua

2014-01-01

Auxin is a central hormone involved in a wide range of developmental processes including the specification of vascular stem cells. Auxin Response Factors (ARF) are important actors of the auxin signalling pathway, regulating the transcription of auxin-responsive genes through direct binding to their promoters. The recent availability of the Eucalyptus grandis genome sequence allowed us to examine the characteristics and evolutionary history of this gene family in a woody plant of high economic importance. With 17 members, the E. grandis ARF gene family is slightly contracted, as compared to those of most angiosperms studied hitherto, lacking traces of duplication events. In silico analysis of alternative transcripts and gene truncation suggested that these two mechanisms were preeminent in shaping the functional diversity of the ARF family in Eucalyptus. Comparative phylogenetic analyses with genomes of other taxonomic lineages revealed the presence of a new ARF clade found preferentially in woody and/or perennial plants. High-throughput expression profiling among different organs and tissues and in response to environmental cues highlighted genes expressed in vascular cambium and/or developing xylem, responding dynamically to various environmental stimuli. Finally, this study allowed identification of three ARF candidates potentially involved in the auxin-regulated transcriptional program underlying wood formation. PMID:25269088

Evaluating cell lines as tumour models by comparison of genomic profiles

PubMed Central

Domcke, Silvia; Sinha, Rileen; Levine, Douglas A.; Sander, Chris; Schultz, Nikolaus

2013-01-01

Cancer cell lines are frequently used as in vitro tumour models. Recent molecular profiles of hundreds of cell lines from The Cancer Cell Line Encyclopedia and thousands of tumour samples from the Cancer Genome Atlas now allow a systematic genomic comparison of cell lines and tumours. Here we analyse a panel of 47 ovarian cancer cell lines and identify those that have the highest genetic similarity to ovarian tumours. Our comparison of copy-number changes, mutations and mRNA expression profiles reveals pronounced differences in molecular profiles between commonly used ovarian cancer cell lines and high-grade serous ovarian cancer tumour samples. We identify several rarely used cell lines that more closely resemble cognate tumour profiles than commonly used cell lines, and we propose these lines as the most suitable models of ovarian cancer. Our results indicate that the gap between cell lines and tumours can be bridged by genomically informed choices of cell line models for all tumour types. PMID:23839242
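The idea of ranking cell lines by their similarity to tumour profiles can be sketched with a simple correlation score. The example below is an assumption-laden illustration, not the scoring scheme used in the study: it uses Pearson correlation on copy-number values only, and the cell line names and all numbers are hypothetical toy values.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical mean copy-number profile of tumour samples (one value per region).
tumour_mean = [0.8, -0.5, 0.1, 0.6, -0.7]

# Hypothetical copy-number profiles of three cell lines over the same regions.
cell_lines = {
    "line_A": [0.7, -0.4, 0.0, 0.5, -0.6],
    "line_B": [-0.2, 0.3, 0.1, -0.1, 0.4],
    "line_C": [0.1, 0.0, 0.2, 0.1, 0.0],
}

# Rank cell lines from most to least tumour-like.
ranked = sorted(
    cell_lines,
    key=lambda name: pearson(cell_lines[name], tumour_mean),
    reverse=True,
)
for name in ranked:
    print(name, round(pearson(cell_lines[name], tumour_mean), 3))
```

A fuller comparison would combine such a score with mutation and mRNA expression evidence, as the abstract describes.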
Comprehensive Genome-Wide Survey, Genomic Constitution and Expression Profiling of the NAC Transcription Factor Family in Foxtail Millet (Setaria italica L.)

PubMed Central

Puranik, Swati; Sahu, Pranav Pankaj; Mandal, Sambhu Nath; B., Venkata Suresh; Parida, Swarup Kumar; Prasad, Manoj

2013-01-01

The NAC proteins represent a major plant-specific transcription factor family that has established enormously diverse roles in various plant processes. Aided by the availability of complete genomes, several members of this family have been identified in Arabidopsis, rice, soybean and poplar. However, no comprehensive investigation has been presented for the recently sequenced, naturally stress-tolerant crop Setaria italica (foxtail millet), which is regarded as a model crop for bioenergy research. In this study, we identified 147 putative NAC domain-encoding genes from foxtail millet by systematic sequence analysis and physically mapped them onto nine chromosomes. Genomic organization suggested that inter-chromosomal duplications may have been responsible for expansion of this gene family in foxtail millet. Phylogenetically, they were arranged into 11 distinct sub-families (I-XI), with duplicated genes fitting into one cluster and possessing conserved motif compositions. Comparative mapping with other grass species revealed some orthologous relationships and chromosomal rearrangements including duplication, inversion and deletion of genes. The evolutionary significance of the duplication and divergence of NAC genes was assessed from their amino acid substitution rates. Expression profiling against various stresses and phytohormones provides novel insights into specific and/or overlapping expression patterns of SiNAC genes, which may be responsible for functional divergence among individual members in this crop. Further, we performed structure modeling and molecular simulation of a stress-responsive protein, SiNAC128, proffering an initial framework for understanding its molecular function. Taken together, this genome-wide identification and expression profiling unlocks new avenues for systematic functional analysis of novel NAC gene family candidates which may be applied for improving stress adaptation in plants. PMID:23691254
Genomic profiling of CHEK2*1100delC-mutated breast carcinomas.

PubMed

Massink, Maarten P G; Kooi, Irsan E; Martens, John W M; Waisfisz, Quinten; Meijers-Heijboer, Hanne

2015-11-09

CHEK2*1100delC is a moderate-risk breast cancer susceptibility allele with a high prevalence in the Netherlands. We performed copy number and gene expression profiling to investigate whether CHEK2*1100delC breast cancers harbor characteristic genomic aberrations, as seen for BRCA1 mutated breast cancers. We performed high-resolution SNP array and gene expression profiling of 120 familial breast carcinomas selected from a larger cohort of 155 familial breast tumors, including BRCA1, BRCA2, and CHEK2 mutant tumors. Gene expression analysis based on an mRNA immune signature was used to identify samples with relatively low amounts of tumor infiltrating lymphocytes (TILs), which were previously found to disturb tumor copy number and LOH (loss of heterozygosity) profiling. We specifically compared the genomic and gene expression profiles of CHEK2*1100delC breast cancers (n = 14) with BRCAX (familial non-BRCA1/BRCA2/CHEK2*1100delC mutated) breast cancers (n = 34) of the luminal intrinsic subtypes for which both SNP-array and gene expression data are available. High amounts of TILs were found in a relatively small number of luminal breast cancers as compared to breast cancers of the basal-like subtype. As expected, these samples mostly have very few copy number aberrations and no detectable regions of LOH. By unsupervised hierarchical clustering of copy number data we observed a great degree of heterogeneity amongst the CHEK2*1100delC breast cancers, comparable to the BRCAX breast cancers. Furthermore, copy number aberrations were mostly seen at low frequencies in both the CHEK2*1100delC and BRCAX group of breast cancers. However, supervised class comparison identified copy number loss of chromosomal arm 1p to be associated with CHEK2*1100delC status. In conclusion, in contrast to basal-like BRCA1 mutated breast cancers, no apparent specific somatic copy number aberration (CNA) profile for CHEK2*1100delC breast cancers was found, with the possible exception of copy number loss of chromosomal arm 1p in a subset of tumors, which might be involved in CHEK2 tumorigenesis. This difference in CNA profiles might be explained by the need for BRCA1-deficient tumor cells to acquire survival factors, for example through specific copy number aberrations, in order to expand. Such factors may not be needed for breast tumors with a defect in a non-essential gene such as CHEK2.
Genomic profiling of plasma cell disorders in a clinical setting: integration of microarray and FISH, after CD138 selection of bone marrow

PubMed Central

Berry, Nadine Kaye; Bain, Nicole L; Enjeti, Anoop K; Rowlings, Philip

2014-01-01

Aim To evaluate the role of whole genome comparative genomic hybridisation microarray (array-CGH) in detecting genomic imbalances as compared to conventional karyotype (GTG-analysis) or myeloma specific fluorescence in situ hybridisation (FISH) panel in a diagnostic setting for plasma cell dyscrasia (PCD). Methods A myeloma-specific interphase FISH (i-FISH) panel was carried out on CD138 PC-enriched bone marrow (BM) from 20 patients having BM biopsies for evaluation of PCD. Whole genome array-CGH was performed on reference (control) and neoplastic (test patient) genomic DNA extracted from CD138 PC-enriched BM and analysed. Results Comparison of techniques demonstrated a much higher detection rate of genomic imbalances using array-CGH. Genomic imbalances were detected in 1, 19 and 20 patients using GTG-analysis, i-FISH and array-CGH, respectively. Genomic rearrangements were detected in one patient using GTG-analysis and seven patients using i-FISH, while none were detected using array-CGH. I-FISH was the most sensitive method for detecting gene rearrangements and GTG-analysis was the least sensitive method overall. All copy number aberrations observed in GTG-analysis were detected using array-CGH and i-FISH. Conclusions We show that array-CGH performed on CD138-enriched PCs significantly improves the detection of clinically relevant and possibly novel genomic abnormalities in PCD, and thus could be considered as a standard diagnostic technique in combination with IGH rearrangement i-FISH. PMID:23969274
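The patient counts reported above can be turned into per-technique detection rates with simple arithmetic. The short sketch below does only that, using the counts stated in the abstract (20 patients in total); the variable and function names are illustrative.

```python
N_PATIENTS = 20  # patients evaluated, as stated in the abstract

# Number of patients in whom each technique detected an abnormality.
imbalances = {"GTG-analysis": 1, "i-FISH": 19, "array-CGH": 20}
rearrangements = {"GTG-analysis": 1, "i-FISH": 7, "array-CGH": 0}

def detection_rates(counts, n):
    """Convert per-technique patient counts into percentages of n patients."""
    return {technique: 100.0 * c / n for technique, c in counts.items()}

imbalance_rates = detection_rates(imbalances, N_PATIENTS)
rearrangement_rates = detection_rates(rearrangements, N_PATIENTS)

for technique in imbalances:
    print(
        f"{technique}: imbalances {imbalance_rates[technique]:.0f}%, "
        f"rearrangements {rearrangement_rates[technique]:.0f}%"
    )
```

The complementary pattern (array-CGH finds every imbalance but no rearrangements, i-FISH finds the rearrangements) is what motivates the combined approach suggested in the conclusions.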
MobilomeFINDER: web-based tools for in silico and experimental discovery of bacterial genomic islands

PubMed Central

Ou, Hong-Yu; He, Xinyi; Harrison, Ewan M.; Kulasekara, Bridget R.; Thani, Ali Bin; Kadioglu, Aras; Lory, Stephen; Hinton, Jay C. D.; Barer, Michael R.; Rajakumar, Kumar

2007-01-01

MobilomeFINDER (http://mml.sjtu.edu.cn/MobilomeFINDER) is an interactive online tool that facilitates bacterial genomic island or ‘mobile genome’ (mobilome) discovery; it integrates the ArrayOme and tRNAcc software packages. ArrayOme utilizes a microarray-derived comparative genomic hybridization input data set to generate ‘inferred contigs’ produced by merging adjacent genes classified as ‘present’. Collectively these ‘fragments’ represent a hypothetical ‘microarray-visualized genome (MVG)’. ArrayOme permits recognition of discordances between physical genome and MVG sizes, thereby enabling identification of strains rich in microarray-elusive novel genes. Individual tRNAcc tools facilitate automated identification of genomic islands by comparative analysis of the contents and contexts of tRNA sites and other integration hotspots in closely related sequenced genomes. Accessory tools facilitate design of hotspot-flanking primers for in silico and/or wet-science-based interrogation of cognate loci in unsequenced strains and analysis of islands for features suggestive of foreign origins; island-specific and genome-contextual features are tabulated and represented in schematic and graphical forms. To date we have used MobilomeFINDER to analyse several Enterobacteriaceae, Pseudomonas aeruginosa and Streptococcus suis genomes. MobilomeFINDER enables high-throughput island identification and characterization through increased exploitation of emerging sequence data and PCR-based profiling of unsequenced test strains; subsequent targeted yeast recombination-based capture permits full-length sequencing and detailed functional studies of novel genomic islands. PMID:17537813
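The 'inferred contig' idea described for ArrayOme can be illustrated with a minimal Python sketch. It assumes genes are supplied in reference-genome order as (name, start, end, present) tuples; the function names `inferred_contigs` and `mvg_size` are hypothetical and are not part of ArrayOme, whose actual handling of intergenic gaps and ambiguous calls may differ.

```python
from typing import List, Tuple

Gene = Tuple[str, int, int, bool]          # (name, start, end, present)
Contig = Tuple[int, int, List[str]]        # (start, end, gene names)


def inferred_contigs(calls: List[Gene]) -> List[Contig]:
    """Merge runs of adjacent genes called 'present' into inferred contigs.

    Assumption: `calls` is already sorted by position on the reference genome.
    """
    contigs: List[Contig] = []
    current = None  # [start, end, names] of the run being built
    for name, start, end, present in calls:
        if present:
            if current is None:
                current = [start, end, [name]]
            else:
                current[1] = max(current[1], end)
                current[2].append(name)
        elif current is not None:
            # An 'absent' gene closes the current run.
            contigs.append((current[0], current[1], current[2]))
            current = None
    if current is not None:
        contigs.append((current[0], current[1], current[2]))
    return contigs


def mvg_size(contigs: List[Contig]) -> int:
    """Total length of all inferred contigs (a stand-in for the MVG size)."""
    return sum(end - start + 1 for start, end, _ in contigs)


if __name__ == "__main__":
    calls = [
        ("a", 1, 100, True),
        ("b", 101, 200, True),
        ("c", 201, 300, False),
        ("d", 301, 400, True),
    ]
    contigs = inferred_contigs(calls)
    print(contigs, mvg_size(contigs))
```

Comparing `mvg_size` with a physically measured genome size is the kind of discordance check the abstract mentions; a large shortfall would hint at genes the microarray cannot see.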
An orthology-based analysis of pathogenic protozoa impacting global health: an improved comparative genomics approach with prokaryotes and model eukaryote orthologs.

PubMed

Cuadrat, Rafael R C; da Serra Cruz, Sérgio Manuel; Tschoeke, Diogo Antônio; Silva, Edno; Tosta, Frederico; Jucá, Henrique; Jardim, Rodrigo; Campos, Maria Luiza M; Mattoso, Marta; Dávila, Alberto M R

2014-08-01

A key focus in 21st century integrative biology and drug discovery for neglected tropical and other diseases has been the use of BLAST-based computational methods for identification of orthologous groups in pathogenic organisms to discern orthologs, with a view to evaluate similarities and differences among species, and thus allow the transfer of annotation from known/curated proteins to new/non-annotated ones. We used here a profile-based sensitive methodology to identify distant homologs, coupled to the NCBI's COG (Unicellular orthologs) and KOG (Eukaryote orthologs), permitting us to perform comparative genomics analyses on five protozoan genomes. OrthoSearch was used in five protozoan proteomes showing that 3901 and 7473 orthologs can be identified by comparison with COG and KOG proteomes, respectively. The core protozoa proteome inferred was 418 Protozoa-COG orthologous groups and 704 Protozoa-KOG orthologous groups: (i) 31.58% (132/418) belong to category J (translation, ribosomal structure, and biogenesis), and 9.81% (41/418) to category O (post-translational modification, protein turnover, chaperones) using COG; (ii) 21.45% (151/704) belong to category J, and 13.92% (98/704) to category O using KOG. The phylogenomic analysis showed four well-supported clades for Eukarya, discriminating Multicellular [(i) human, fly, plant and worm] and Unicellular [(ii) yeast, (iii) fungi, and (iv) protozoa] species. These encouraging results attest to the usefulness of the profile-based methodology for comparative genomics to accelerate semi-automatic re-annotation, especially of the protozoan proteomes. This approach may also lend itself to applications in global health, for example, in the case of novel drug target discovery against pathogenic organisms previously considered difficult to research with traditional drug discovery tools.
An Orthology-Based Analysis of Pathogenic Protozoa Impacting Global Health: An Improved Comparative Genomics Approach with Prokaryotes and Model Eukaryote Orthologs

PubMed Central

Cuadrat, Rafael R. C.; da Serra Cruz, Sérgio Manuel; Tschoeke, Diogo Antônio; Silva, Edno; Tosta, Frederico; Jucá, Henrique; Jardim, Rodrigo; Campos, Maria Luiza M.; Mattoso, Marta

2014-01-01

A key focus in 21st century integrative biology and drug discovery for neglected tropical and other diseases has been the use of BLAST-based computational methods for identification of orthologous groups in pathogenic organisms to discern orthologs, with a view to evaluate similarities and differences among species, and thus allow the transfer of annotation from known/curated proteins to new/non-annotated ones. We used here a profile-based sensitive methodology to identify distant homologs, coupled to the NCBI's COG (Unicellular orthologs) and KOG (Eukaryote orthologs), permitting us to perform comparative genomics analyses on five protozoan genomes. OrthoSearch was used in five protozoan proteomes showing that 3901 and 7473 orthologs can be identified by comparison with COG and KOG proteomes, respectively. The core protozoa proteome inferred was 418 Protozoa-COG orthologous groups and 704 Protozoa-KOG orthologous groups: (i) 31.58% (132/418) belong to category J (translation, ribosomal structure, and biogenesis), and 9.81% (41/418) to category O (post-translational modification, protein turnover, chaperones) using COG; (ii) 21.45% (151/704) belong to category J, and 13.92% (98/704) to category O using KOG. The phylogenomic analysis showed four well-supported clades for Eukarya, discriminating Multicellular [(i) human, fly, plant and worm] and Unicellular [(ii) yeast, (iii) fungi, and (iv) protozoa] species. These encouraging results attest to the usefulness of the profile-based methodology for comparative genomics to accelerate semi-automatic re-annotation, especially of the protozoan proteomes. This approach may also lend itself to applications in global health, for example, in the case of novel drug target discovery against pathogenic organisms previously considered difficult to research with traditional drug discovery tools. PMID:24960463
Comparison of Two Capillary Gel Electrophoresis Systems for Clostridium difficile Ribotyping, Using a Panel of Ribotype 027 Isolates and Whole-Genome Sequences as a Reference Standard

PubMed Central

Xiao, Meng; Kong, Fanrong; Jin, Ping; Wang, Qinning; Xiao, Kelin; Jeoffreys, Neisha; James, Gregory

2012-01-01

PCR ribotyping is the most commonly used Clostridium difficile genotyping method, but its utility is limited by lack of standardization. In this study, we analyzed four published whole genomes and tested an international collection of 21 well-characterized C. difficile ribotype 027 isolates as the basis for comparison of two capillary gel electrophoresis (CGE)-based ribotyping methods. There were unexpected differences between the 16S-23S rRNA intergenic spacer region (ISR) allelic profiles of the four ribotype 027 genomes, but six bands were identified in all four and a seventh in three genomes. All seven bands and another, not identified in any of the whole genomes, were found in all 21 isolates. We compared sequencer-based CGE (SCGE) with three different primer pairs to the Qiagen QIAxcel CGE (QCGE) platform. Deviations from individual reference/consensus band sizes were smaller for SCGE (0 to 0.2 bp) than for QCGE (4.2 to 9.5 bp). Compared with QCGE, SCGE more readily distinguished bands of similar length (more discriminatory), detected bands of larger size and lower intensity (more sensitive), and assigned band sizes more accurately and reproducibly, making it more suitable for standardization. Specifically, QCGE failed to identify the largest ISR amplicon. Based on several criteria, we recommend the primer set 16S-USA/23S-USA for use in a proposed standard SCGE method. Similar differences between SCGE and QCGE were found on testing of 14 isolates of four other C. difficile ribotypes. Based on our results, ISR profiles based on accurate sequencer-based band lengths would be preferable to agarose gel-based banding patterns for the assignment of ribotypes. PMID:22692737
Economic issues involved in integrating genomic testing into clinical care: the case of genomic testing to guide decision-making about chemotherapy for breast cancer patients.

PubMed

Marino, Patricia; Siani, Carole; Bertucci, François; Roche, Henri; Martin, Anne-Laure; Viens, Patrice; Seror, Valérie

2011-09-01

The use of taxanes to treat node-positive (N+) breast cancer patients is associated with heterogeneous benefits as well as with morbidity and financial costs. This study aimed to assess the economic impact of using gene expression profiling to guide decision-making about chemotherapy, and to discuss the coverage/reimbursement issues involved. Retrospective data on 246 patients included in a randomised trial (PACS01) were analyzed. Tumours were genotyped using DNA microarrays (189-gene signature), and patients were classified depending on whether or not they were likely to benefit from chemotherapy regimens without taxanes. Standard anthracyclines plus taxane chemotherapy (strategy AT) was compared with the innovative strategy based on genomic testing (GEN). Statistical analyses involved bootstrap methods and sensitivity analyses. The AT and GEN strategies yielded similar 5-year metastasis-free survival rates. In comparison with AT, GEN was cost-effective when genomic testing costs were less than 2,090€. With genomic testing costs higher than 2,919€, AT was cost-effective. Considering a 30% decrease in the price of docetaxel (the patent rights being about to expire), GEN was cost-effective if the cost of genomic testing was in the 0€-1,139€ range, whereas AT was cost-effective if genomic testing costs were higher than 1,891€. The use of gene expression profiling to guide decision-making about chemotherapy for N+ breast cancer patients is potentially cost-effective. Since genomic testing and the drugs targeted in these tests yield greater well-being than the sum of those resulting from separate use, questions arise about how to deal with extra well-being in decision-making about coverage/reimbursement.
Towards precision medicine-based therapies for glioblastoma: interrogating human disease genomics and mouse phenotypes.

PubMed

Chen, Yang; Gao, Zhen; Wang, Bingcheng; Xu, Rong

2016-08-22

Glioblastoma (GBM) is the most common and aggressive brain tumor. It has a poor prognosis even with optimal radio- and chemo-therapies. Since GBM is highly heterogeneous, drugs that target specific molecular profiles of individual tumors may achieve maximized efficacy. Currently, the Cancer Genome Atlas (TCGA) projects have identified hundreds of GBM-associated genes. We develop a drug repositioning approach combining disease genomics and mouse phenotype data towards predicting targeted therapies for GBM. We first identified disease specific mouse phenotypes using the most recently discovered GBM genes. Then we systematically searched all FDA-approved drugs for candidates that share similar mouse phenotype profiles with GBM. We evaluated the ranks for approved and novel GBM drugs, and compared them with an existing approach, which also uses the mouse phenotype data but not the disease genomics data. We achieved significantly higher ranks for the approved and novel GBM drugs than the earlier approach. For all positive examples of GBM drugs, we achieved a median rank of 9.2 45.6 of the top predictions have been demonstrated effective in inhibiting the growth of human GBM cells. We developed a computational drug repositioning approach based on both genomic and phenotypic data. Our approach prioritized existing GBM drugs and outperformed a recent approach. Overall, our approach shows potential in discovering new targeted therapies for GBM.
LS-CAP: an algorithm for identifying cytogenetic aberrations in hepatocellular carcinoma using microarray data.

PubMed

He, Xianmin; Wei, Qing; Sun, Meiqian; Fu, Xuping; Fan, Sichang; Li, Yao

2006-05-01

Biological techniques such as array comparative genomic hybridization (CGH), fluorescent in situ hybridization (FISH) and Affymetrix single nucleotide polymorphism (SNP) array have been used to detect cytogenetic aberrations. However, on a genomic scale, these techniques are labor intensive and time consuming. Comparative genomic microarray analysis (CGMA) has been used to identify cytogenetic changes in hepatocellular carcinoma (HCC) using gene expression microarray data. However, the CGMA algorithm cannot give precise localization of aberrations, fails to identify small cytogenetic changes, and exhibits false negatives and positives. Locally un-weighted smoothing cytogenetic aberrations prediction (LS-CAP) based on local smoothing and binomial distribution can be expected to address these problems. The LS-CAP algorithm was built and used on HCC microarray profiles. Eighteen cytogenetic abnormalities were identified; among them, 5 were reported previously, and 12 were proven by CGH studies. LS-CAP effectively reduced the false negatives and positives, and precisely located small fragments with cytogenetic aberrations.
What the Aspergillus genomes have told us.

PubMed

Nierman, W C; May, G; Kim, H S; Anderson, M J; Chen, D; Denning, D W

2005-05-01

The sequencing and annotation of the genomes of the first strains of Aspergillus nidulans, Aspergillus oryzae, and Aspergillus fumigatus will be seen in retrospect as a transformational event in Aspergillus biology. With this event the entire genetic composition of A. nidulans, the sexual experimental model organism of the genus Aspergillus, A. oryzae, the food biotechnology organism which is the product of centuries of cultivation, and A. fumigatus, the most common causative agent of invasive aspergillosis, is now revealed to the extent that we are at present able to understand. Each genome exhibits a large set of genes common to the three as well as a much smaller set of genes unique to each. Moreover, these sequences serve as resources providing the major tool for expanding our understanding of the biology of each. Transcription profiling of A. fumigatus at high temperatures and comparative genomic hybridization between A. fumigatus and a closely related Aspergillus species provide microarray-based examples of the beginning of functional analysis of the genomes of these organisms going forward from the genome sequence.
Cell-type-specific profiling of protein-DNA interactions without cell isolation using targeted DamID with next-generation sequencing.

PubMed

Marshall, Owen J; Southall, Tony D; Cheetham, Seth W; Brand, Andrea H

2016-09-01

This protocol is an extension to: Nat. Protoc. 2, 1467-1478 (2007); doi:10.1038/nprot.2007.148; published online 7 June 2007. The ability to profile transcription and chromatin binding in a cell-type-specific manner is a powerful aid to understanding cell-fate specification and cellular function in multicellular organisms. We recently developed targeted DamID (TaDa) to enable genome-wide, cell-type-specific profiling of DNA- and chromatin-binding proteins in vivo without cell isolation. As a protocol extension, this article describes substantial modifications to an existing protocol, and it offers additional applications. TaDa builds upon DamID, a technique for detecting genome-wide DNA-binding profiles of proteins, by coupling it with the GAL4 system in Drosophila to enable both temporal and spatial resolution. TaDa ensures that Dam-fusion proteins are expressed at very low levels, thus avoiding toxicity and potential artifacts from overexpression. The modifications to the core DamID technique presented here also increase the speed of sample processing and throughput, and adapt the method to next-generation sequencing technology. TaDa is robust, reproducible and highly sensitive. Compared with other methods for cell-type-specific profiling, the technique requires no cell-sorting, cross-linking or antisera, and binding profiles can be generated from as few as 10,000 total induced cells. By profiling the genome-wide binding of RNA polymerase II (Pol II), TaDa can also identify transcribed genes in a cell-type-specific manner. Here we describe a detailed protocol for carrying out TaDa experiments and preparing the material for next-generation sequencing. Although we developed TaDa in Drosophila, it should be easily adapted to other organisms with an inducible expression system. Once transgenic animals are obtained, the entire experimental procedure, from collecting tissue samples to generating sequencing libraries, can be accomplished within 5 d.
Genome organization of epidemic Acinetobacter baumannii strains.

PubMed

Di Nocera, Pier Paolo; Rocco, Francesco; Giannouli, Maria; Triassi, Maria; Zarrilli, Raffaele

2011-10-10

Acinetobacter baumannii is an opportunistic pathogen responsible for hospital-acquired infections. A. baumannii epidemics described world-wide were caused by few genotypic clusters of strains. The occurrence of epidemics caused by multi-drug resistant strains assigned to novel genotypes has been reported over the last few years. In the present study, we compared whole genome sequences of three A. baumannii strains assigned to genotypes ST2, ST25 and ST78, representative of the most frequent genotypes responsible for epidemics in several Mediterranean hospitals, and four complete genome sequences of A. baumannii strains assigned to genotypes ST1, ST2 and ST77. Comparative genome analysis showed extensive synteny and identified 3068 coding regions which are conserved, at the same chromosomal position, in all A. baumannii genomes. Genome alignments also identified 63 DNA regions, ranging in size from 4 to 126 kb, all defined as genomic islands, which were present in some genomes, but were either missing or replaced by non-homologous DNA sequences in others. Some islands are involved in resistance to drugs and metals, others carry genes encoding surface proteins or enzymes involved in specific metabolic pathways, and others correspond to prophage-like elements. Accessory DNA regions encode 12 to 19% of the potential gene products of the analyzed strains. The analysis of a collection of epidemic A. baumannii strains showed that some islands were restricted to specific genotypes. The definition of the genome components of A. baumannii provides a scaffold to rapidly evaluate the genomic organization of novel clinical A. baumannii isolates. Changes in island profiling will be useful in genomic epidemiology of A. baumannii population.
Characterization of canine osteosarcoma by array comparative genomic hybridization and RT-qPCR: signatures of genomic imbalance in canine osteosarcoma parallel the human counterpart.

PubMed

Angstadt, Andrea Y; Motsinger-Reif, Alison; Thomas, Rachael; Kisseberth, William C; Guillermo Couto, C; Duval, Dawn L; Nielsen, Dahlia M; Modiano, Jaime F; Breen, Matthew

2011-11-01

Osteosarcoma (OS) is the most commonly diagnosed malignant bone tumor in humans and dogs, characterized in both species by extremely complex karyotypes exhibiting high frequencies of genomic imbalance. Evaluation of genomic signatures in human OS using array comparative genomic hybridization (aCGH) has assisted in uncovering genetic mechanisms that result in disease phenotype. Previous low-resolution (10-20 Mb) aCGH analysis of canine OS identified a wide range of recurrent DNA copy number aberrations, indicating extensive genomic instability. In this study, we profiled 123 canine OS tumors by 1 Mb-resolution aCGH to generate a dataset for direct comparison with current data for human OS, concluding that several high frequency aberrations in canine and human OS are orthologous. To ensure complete coverage of gene annotation, we identified the human refseq genes that map to these orthologous aberrant dog regions and found several candidate genes warranting evaluation for OS involvement. Specifically, subsequent FISH and qRT-PCR analysis of RUNX2, TUSC3, and PTEN indicated that expression levels correlated with genomic copy number status, showcasing RUNX2 as an OS associated gene and TUSC3 as a possible tumor suppressor candidate. Together these data demonstrate the ability of genomic comparative oncology to identify genetic aberrations which may be important for OS progression. Large scale screening of genomic imbalance in canine OS further validates the use of the dog as a suitable model for human cancers, supporting the idea that dysregulation discovered in canine cancers will provide an avenue for complementary study in human counterparts. Copyright © 2011 Wiley-Liss, Inc.
Base-By-Base: single nucleotide-level analysis of whole viral genome alignments.

PubMed

Brodie, Ryan; Smith, Alex J; Roper, Rachel L; Tcherepanov, Vasily; Upton, Chris

2004-07-14

With ever increasing numbers of closely related virus genomes being sequenced, it has become desirable to be able to compare two genomes at a level more detailed than gene content because two strains of an organism may share the same set of predicted genes but still differ in their pathogenicity profiles. For example, detailed comparison of multiple isolates of the smallpox virus genome (each approximately 200 kb, with 200 genes) is not feasible without new bioinformatics tools. A software package, Base-By-Base, has been developed that provides visualization tools to enable researchers to 1) rapidly identify and correct alignment errors in large, multiple genome alignments; and 2) generate tabular and graphical output of differences between the genomes at the nucleotide level. Base-By-Base uses detailed annotation information about the aligned genomes and can list each predicted gene with nucleotide differences, display whether variations occur within promoter regions or coding regions and whether these changes result in amino acid substitutions. Base-By-Base can connect to our mySQL database (Virus Orthologous Clusters; VOCs) to retrieve detailed annotation information about the aligned genomes or use information from text files. Base-By-Base enables users to quickly and easily compare large viral genomes; it highlights small differences that may be responsible for important phenotypic differences such as virulence. It is available via the Internet using Java Web Start and runs on Macintosh, PC and Linux operating systems with the Java 1.4 virtual machine.
A Bayesian deconvolution strategy for immunoprecipitation-based DNA methylome analysis

PubMed Central

Down, Thomas A.; Rakyan, Vardhman K.; Turner, Daniel J.; Flicek, Paul; Li, Heng; Kulesha, Eugene; Gräf, Stefan; Johnson, Nathan; Herrero, Javier; Tomazou, Eleni M.; Thorne, Natalie P.; Bäckdahl, Liselotte; Herberth, Marlis; Howe, Kevin L.; Jackson, David K.; Miretti, Marcos M.; Marioni, John C.; Birney, Ewan; Hubbard, Tim J. P.; Durbin, Richard; Tavaré, Simon; Beck, Stephan

2009-01-01

DNA methylation is an indispensable epigenetic modification of mammalian genomes. Consequently there is great interest in strategies for genome-wide/whole-genome DNA methylation analysis, and immunoprecipitation-based methods have proven to be a powerful option. Such methods are rapidly shifting the bottleneck from data generation to data analysis, necessitating the development of better analytical tools. Until now, a major analytical difficulty associated with immunoprecipitation-based DNA methylation profiling has been the inability to estimate absolute methylation levels. Here we report the development of a novel cross-platform algorithm – Bayesian Tool for Methylation Analysis (Batman) – for analyzing Methylated DNA Immunoprecipitation (MeDIP) profiles generated using arrays (MeDIP-chip) or next-generation sequencing (MeDIP-seq). The latter is an approach we have developed to elucidate the first high-resolution whole-genome DNA methylation profile (DNA methylome) of any mammalian genome. MeDIP-seq/MeDIP-chip combined with Batman represent robust, quantitative, and cost-effective functional genomic strategies for elucidating the function of DNA methylation. PMID:18612301
Integrative Analysis of Complex Cancer Genomics and Clinical Profiles Using the cBioPortal

PubMed Central

Gao, Jianjiong; Aksoy, Bülent Arman; Dogrusoz, Ugur; Dresdner, Gideon; Gross, Benjamin; Sumer, S. Onur; Sun, Yichao; Jacobsen, Anders; Sinha, Rileen; Larsson, Erik; Cerami, Ethan; Sander, Chris; Schultz, Nikolaus

2014-01-01

The cBioPortal for Cancer Genomics (http://cbioportal.org) provides a Web resource for exploring, visualizing, and analyzing multidimensional cancer genomics data. The portal reduces molecular profiling data from cancer tissues and cell lines into readily understandable genetic, epigenetic, gene expression, and proteomic events. The query interface combined with customized data storage enables researchers to interactively explore genetic alterations across samples, genes, and pathways and, when available in the underlying data, to link these to clinical outcomes. The portal provides graphical summaries of gene-level data from multiple platforms, network visualization and analysis, survival analysis, patient-centric queries, and software programmatic access. The intuitive Web interface of the portal makes complex cancer genomics profiles accessible to researchers and clinicians without requiring bioinformatics expertise, thus facilitating biological discoveries. Here, we provide a practical guide to the analysis and visualization features of the cBioPortal for Cancer Genomics. PMID:23550210

NGSmethDB 2017: enhanced methylomes and differential methylation

PubMed Central

Lebrón, Ricardo; Gómez-Martín, Cristina; Carpena, Pedro; Bernaola-Galván, Pedro; Barturen, Guillermo; Hackenberg, Michael; Oliver, José L.

2017-01-01

The 2017 update of NGSmethDB stores whole genome methylomes generated from short-read data sets obtained by bisulfite sequencing (WGBS) technology. To generate high-quality methylomes, stringent quality controls were integrated with third-party software, adding also a two-step mapping process to exploit the advantages of the new genome assembly models. The samples were all profiled under constant parameter settings, thus enabling comparative downstream analyses. Besides a significant increase in the number of samples, NGSmethDB now includes two additional data-types, which are a valuable resource for the discovery of methylation epigenetic biomarkers: (i) differentially methylated single-cytosines; and (ii) methylation segments (i.e. genome regions of homogeneous methylation). The NGSmethDB back-end is now based on MongoDB, a NoSQL hierarchical database using JSON-formatted documents and dynamic schemas, thus accelerating sample comparative analyses. Besides conventional database dumps, track hubs were implemented, which improved database access, visualization in genome browsers and comparative analyses to third-party annotations. In addition, the database can also be accessed through a RESTful API. Lastly, a Python client and a multiplatform virtual machine allow for program-driven access from the user desktop. This way, private methylation data can be compared to NGSmethDB without the need to upload them to public servers. Database website: http://bioinfo2.ugr.es/NGSmethDB. PMID:27794041
A comprehensive profile of DNA copy number variations in a Korean population: identification of copy number invariant regions among Koreans.

PubMed

Jeon, Jae Pil; Shim, Sung Mi; Jung, Jong Sun; Nam, Hye Young; Lee, Hye Jin; Oh, Berm Seok; Kim, Kuchan; Kim, Hyung Lae; Han, Bok Ghee

2009-09-30

To examine copy number variations among the Korean population, we compared individual genomes with the Korean reference genome assembly using the publicly available Korean HapMap SNP 50 k chip data from 90 individuals. Korean individuals exhibited 123 copy number variation regions (CNVRs) covering 27.2 Mb, equivalent to 1.0% of the genome in the copy number variation (CNV) analysis using the combined criteria of P value (P < 0.01) and standard deviation of copy numbers (SD ≥ 0.25) among study subjects. In contrast, when compared to the Affymetrix reference genome assembly from multiple ethnic groups, considerably more CNVRs (n=643) were detected in larger proportions (5.0%) of the genome covering 135.1 Mb even by more stringent criteria (P < 0.001 and SD ≥ 0.25), reflecting ethnic diversity of structural variations between Korean and other populations. Some CNVRs were validated by the quantitative multiplex PCR of short fluorescent fragment (QMPSF) method, and then copy number invariant regions were detected among the study subjects. These copy number invariant regions would be used as good internal controls for further CNV studies. Lastly, we demonstrated that the CNV information could stratify even a single ethnic population with a proper reference genome assembly from multiple heterogeneous populations.
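The combined CNVR criterion quoted in this abstract (a P value cutoff together with a minimum standard deviation of copy numbers across subjects) can be written as a small Python filter. This is only a sketch under stated assumptions: the function name `is_cnvr` is hypothetical, the P value is taken as already computed by some upstream test, and the sample standard deviation is used because the abstract does not say which estimator the authors applied.

```python
from statistics import stdev
from typing import Sequence


def is_cnvr(
    copy_numbers: Sequence[float],
    p_value: float,
    p_cutoff: float = 0.01,
    sd_cutoff: float = 0.25,
) -> bool:
    """Return True when a region passes both criteria.

    copy_numbers: estimated copy number of the region in each subject.
    p_value: precomputed significance of the region (assumed input).
    Defaults mirror the P < 0.01 and SD >= 0.25 thresholds in the abstract.
    """
    return p_value < p_cutoff and stdev(copy_numbers) >= sd_cutoff


if __name__ == "__main__":
    variable_region = [2, 2, 2, 3, 1, 2]
    flat_region = [2.0, 2.0, 2.0, 2.0]
    print(is_cnvr(variable_region, p_value=0.005))                   # passes both
    print(is_cnvr(flat_region, p_value=0.005))                       # fails on SD
    print(is_cnvr(variable_region, p_value=0.005, p_cutoff=0.001))   # fails on P
```

Passing `p_cutoff=0.001` corresponds to the more stringent setting the abstract reports for the multi-ethnic reference comparison.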
Comparative Genomics of Burkholderia singularis sp. nov., a Low G+C Content, Free-Living Bacterium That Defies Taxonomic Dissection of the Genus Burkholderia

PubMed Central

Vandamme, Peter; Peeters, Charlotte; De Smet, Birgit; Price, Erin P.; Sarovich, Derek S.; Henry, Deborah A.; Hird, Trevor J.; Zlosnik, James E. A.; Mayo, Mark; Warner, Jeffrey; Baker, Anthony; Currie, Bart J.; Carlier, Aurélien

2017-01-01

Four Burkholderia pseudomallei-like isolates of human clinical origin were examined by a polyphasic taxonomic approach that included comparative whole genome analyses. The results demonstrated that these isolates represent a rare and unusual, novel Burkholderia species for which we propose the name B. singularis. The type strain is LMG 28154T (=CCUG 65685T). Its genome sequence has an average mol% G+C content of 64.34%, which is considerably lower than that of other Burkholderia species. The reduced G+C content of strain LMG 28154T was characterized by a genome wide AT bias that was not due to reduced GC-biased gene conversion or reductive genome evolution, but might have been caused by an altered DNA base excision repair pathway. B. singularis can be differentiated from other Burkholderia species by multilocus sequence analysis, MALDI-TOF mass spectrometry and a distinctive biochemical profile that includes the absence of nitrate reduction, a mucoid appearance on Columbia sheep blood agar, and a slowly positive oxidase reaction. Comparisons with publicly available whole genome sequences demonstrated that strain TSV85, an Australian water isolate, also represents the same species and therefore, to date, B. singularis has been recovered from human or environmental samples on three continents. PMID:28932212
The Essential Genome of Escherichia coli K-12

PubMed Central

2018-01-01

Transposon-directed insertion site sequencing (TraDIS) is a high-throughput method coupling transposon mutagenesis with short-fragment DNA sequencing. It is commonly used to identify essential genes. Single gene deletion libraries are considered the gold standard for identifying essential genes. Currently, the TraDIS method has not been benchmarked against such libraries, and therefore, it remains unclear whether the two methodologies are comparable. To address this, a high-density transposon library was constructed in Escherichia coli K-12. Essential genes predicted from sequencing of this library were compared to existing essential gene databases. To decrease false-positive identification of essential genes, statistical data analysis included corrections for both gene length and genome length. Through this analysis, new essential genes and genes previously incorrectly designated essential were identified. We show that manual analysis of TraDIS data reveals novel features that would not have been detected by statistical analysis alone. Examples include short essential regions within genes, orientation-dependent effects, and fine-resolution identification of genome and protein features. Recognition of these insertion profiles in transposon mutagenesis data sets will assist genome annotation of less well characterized genomes and provides new insights into bacterial physiology and biochemistry. PMID:29463657
Matching phenotypes to whole genomes: Lessons learned from four iterations of the personal genome project community challenges.

PubMed

Cai, Binghuang; Li, Biao; Kiga, Nikki; Thusberg, Janita; Bergquist, Timothy; Chen, Yun-Ching; Niknafs, Noushin; Carter, Hannah; Tokheim, Collin; Beleva-Guthrie, Violeta; Douville, Christopher; Bhattacharya, Rohit; Yeo, Hui Ting Grace; Fan, Jean; Sengupta, Sohini; Kim, Dewey; Cline, Melissa; Turner, Tychele; Diekhans, Mark; Zaucha, Jan; Pal, Lipika R; Cao, Chen; Yu, Chen-Hsin; Yin, Yizhou; Carraro, Marco; Giollo, Manuel; Ferrari, Carlo; Leonardi, Emanuela; Tosatto, Silvio C E; Bobe, Jason; Ball, Madeleine; Hoskins, Roger A; Repo, Susanna; Church, George; Brenner, Steven E; Moult, John; Gough, Julian; Stanke, Mario; Karchin, Rachel; Mooney, Sean D

2017-09-01

The advent of next-generation sequencing has dramatically decreased the cost for whole-genome sequencing and increased the viability for its application in research and clinical care. The Personal Genome Project (PGP) provides unrestricted access to genomes of individuals and their associated phenotypes. This resource enabled the Critical Assessment of Genome Interpretation (CAGI) to create a community challenge to assess the bioinformatics community's ability to predict traits from whole genomes. In the CAGI PGP challenge, researchers were asked to predict whether an individual had a particular trait or profile based on their whole genome. Several approaches were used to assess submissions, including ROC AUC (area under receiver operating characteristic curve), probability rankings, the number of correct predictions, and statistical significance simulations. Overall, we found that prediction of individual traits is difficult, relying on a strong knowledge of trait frequency within the general population, whereas matching genomes to trait profiles relies heavily upon a small number of common traits including ancestry, blood type, and eye color. When a rare genetic disorder is present, profiles can be matched when one or more pathogenic variants are identified. Prediction accuracy has improved substantially over the last 6 years due to improved methodology and a better understanding of features. © 2017 Wiley Periodicals, Inc.
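For readers unfamiliar with the ROC AUC measure named in this abstract, here is a small sketch using the rank-based definition: the probability that a randomly chosen individual with the trait is scored higher than one without it. The function name and toy data are illustrative assumptions and are not taken from the CAGI assessment itself.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 if the individual has the trait, 0 otherwise.
    scores: predicted probability of the trait. Ties count as half.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy submission: five individuals, two of whom have the trait.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking.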
Genomic profiling reveals extensive heterogeneity in somatic DNA copy number aberrations of canine hemangiosarcoma

PubMed Central

Thomas, Rachael; Borst, Luke; Rotroff, Daniel; Motsinger-Reif, Alison; Lindblad-Toh, Kerstin; Modiano, Jaime F.; Breen, Matthew

2017-01-01

Canine hemangiosarcoma is a highly aggressive vascular neoplasm associated with extensive clinical and anatomical heterogeneity and a grave prognosis. Comprehensive molecular characterization of hemangiosarcoma may identify novel therapeutic targets and advanced clinical management strategies, but there are no published reports of tumor-associated genome instability and disrupted gene dosage in this cancer. We performed genome-wide microarray-based somatic DNA copy number profiling of 75 primary intra-abdominal hemangiosarcomas from five popular dog breeds that are highly predisposed to this disease. The cohort exhibited limited global genomic instability, compared to other canine sarcomas studied to date, and DNA copy number aberrations (CNAs) were predominantly of low amplitude. Recurrent imbalances of several key cancer-associated genes were evident; however the global penetrance of any single CNA was low and no distinct hallmark aberrations were evident. Copy number gains of dog chromosomes 13, 24 and 31, and loss of chromosome 16, were the most recurrent CNAs involving large chromosome regions, but their relative distribution within and between cases suggests they most likely represent passenger aberrations. CNAs involving CDKN2A, VEGFA and the SKI oncogene were identified as potential driver aberrations of hemangiosarcoma development, highlighting potential targets for therapeutic modulation. CNA profiles were broadly conserved between the five breeds, although subregional variation was evident, including a near two-fold lower incidence of VEGFA gain in Golden Retrievers versus other breeds (22% versus 40%). These observations support prior transcriptional studies suggesting that the clinical heterogeneity of this cancer may reflect the existence of multiple, molecularly-distinct subtypes of canine hemangiosarcoma. PMID:24599718
Genomic profiling reveals extensive heterogeneity in somatic DNA copy number aberrations of canine hemangiosarcoma.

PubMed

Thomas, Rachael; Borst, Luke; Rotroff, Daniel; Motsinger-Reif, Alison; Lindblad-Toh, Kerstin; Modiano, Jaime F; Breen, Matthew

2014-09-01

Canine hemangiosarcoma is a highly aggressive vascular neoplasm associated with extensive clinical and anatomical heterogeneity and a grave prognosis. Comprehensive molecular characterization of hemangiosarcoma may identify novel therapeutic targets and advanced clinical management strategies, but there are no published reports of tumor-associated genome instability and disrupted gene dosage in this cancer. We performed genome-wide microarray-based somatic DNA copy number profiling of 75 primary intra-abdominal hemangiosarcomas from five popular dog breeds that are highly predisposed to this disease. The cohort exhibited limited global genomic instability, compared to other canine sarcomas studied to date, and DNA copy number aberrations (CNAs) were predominantly of low amplitude. Recurrent imbalances of several key cancer-associated genes were evident; however, the global penetrance of any single CNA was low and no distinct hallmark aberrations were evident. Copy number gains of dog chromosomes 13, 24, and 31, and loss of chromosome 16, were the most recurrent CNAs involving large chromosome regions, but their relative distribution within and between cases suggests they most likely represent passenger aberrations. CNAs involving CDKN2A, VEGFA, and the SKI oncogene were identified as potential driver aberrations of hemangiosarcoma development, highlighting potential targets for therapeutic modulation. CNA profiles were broadly conserved between the five breeds, although subregional variation was evident, including a near twofold lower incidence of VEGFA gain in Golden Retrievers versus other breeds (22 versus 40 %). These observations support prior transcriptional studies suggesting that the clinical heterogeneity of this cancer may reflect the existence of multiple, molecularly distinct subtypes of canine hemangiosarcoma.
Automated array-based genomic profiling in chronic lymphocytic leukemia: Development of a clinical tool and discovery of recurrent genomic alterations

PubMed Central

Schwaenen, Carsten; Nessling, Michelle; Wessendorf, Swen; Salvi, Tatjana; Wrobel, Gunnar; Radlwimmer, Bernhard; Kestler, Hans A.; Haslinger, Christian; Stilgenbauer, Stephan; Döhner, Hartmut; Bentz, Martin; Lichter, Peter

2004-01-01

B cell chronic lymphocytic leukemia (B-CLL) is characterized by a highly variable clinical course. Recurrent chromosomal imbalances provide significant prognostic markers. Risk-adapted therapy based on genomic alterations has become an option that is currently being tested in clinical trials. To supply a robust tool for such large scale studies, we developed a comprehensive DNA microarray dedicated to the automated analysis of recurrent genomic imbalances in B-CLL by array-based comparative genomic hybridization (matrix–CGH). Validation of this chip in a series of 106 B-CLL cases revealed a high specificity and sensitivity that fulfils the criteria for application in clinical oncology. This chip is immediately applicable within clinical B-CLL treatment trials that evaluate whether B-CLL cases with distinct chromosomal abnormalities should be treated with chemotherapy of different intensities and/or stem cell transplantation. Through the control set of DNA fragments equally distributed over the genome, recurrent genomic imbalances were discovered: trisomy of chromosome 19 and gain of the MYCN oncogene correlating with an elevation of MYCN mRNA expression. PMID:14730057
QuickMap: a public tool for large-scale gene therapy vector insertion site mapping and analysis.

PubMed

Appelt, J-U; Giordano, F A; Ecker, M; Roeder, I; Grund, N; Hotz-Wagenblatt, A; Opelz, G; Zeller, W J; Allgayer, H; Fruehauf, S; Laufs, S

2009-07-01

Several events of insertional mutagenesis in pre-clinical and clinical gene therapy studies have created intense interest in assessing the genomic insertion profiles of gene therapy vectors. For the construction of such profiles, vector-flanking sequences detected by inverse PCR, linear amplification-mediated-PCR or ligation-mediated-PCR need to be mapped to the host cell's genome and compared to a reference set. Although remarkable progress has been achieved in mapping gene therapy vector insertion sites, public reference sets are lacking, as are the possibilities to quickly detect non-random patterns in experimental data. We developed a tool termed QuickMap, which uniformly maps and analyzes human and murine vector-flanking sequences within seconds (available at www.gtsg.org). Besides information about hits in chromosomes and fragile sites, QuickMap automatically determines insertion frequencies in +/- 250 kb adjacency to genes, cancer genes, pseudogenes, transcription factor and (post-transcriptional) miRNA binding sites, CpG islands and repetitive elements (short interspersed nuclear elements (SINE), long interspersed nuclear elements (LINE), Type II elements and LTR elements). Additionally, all experimental frequencies are compared with the data obtained from a reference set, containing 1 000 000 random integrations ('random set'). Thus, for the first time a tool allowing high-throughput profiling of gene therapy vector insertion sites is available. It provides a basis for large-scale insertion site analyses, which is now urgently needed to discover novel gene therapy vectors with 'safe' insertion profiles.
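The comparison described in this abstract, insertion frequency within +/- 250 kb of genes versus a random reference set, can be sketched as follows. This is a simplified illustration and not QuickMap's implementation: it assumes a single chromosome, represents each gene by one start coordinate, and uses a short fixed list in place of the 1 000 000 random integrations. All names and coordinates are hypothetical.

```python
import bisect

WINDOW = 250_000  # +/- 250 kb adjacency, as stated in the abstract


def fraction_near_genes(sites, gene_starts, window=WINDOW):
    """Fraction of sites with at least one gene start within +/- window bp."""
    starts = sorted(gene_starts)
    near = 0
    for pos in sites:
        # First gene start at or after the left edge of the window.
        i = bisect.bisect_left(starts, pos - window)
        if i < len(starts) and starts[i] <= pos + window:
            near += 1
    return near / len(sites)


gene_starts = [300_000, 2_000_000]
observed_sites = [100_000, 400_000, 5_000_000]
reference_sites = [1_000_000, 1_900_000, 3_000_000, 4_000_000]

observed = fraction_near_genes(observed_sites, gene_starts)
reference = fraction_near_genes(reference_sites, gene_starts)
print(observed, reference, observed / reference)
```

A ratio well above 1 would suggest that the vector integrates near genes more often than expected by chance; a real analysis would also need a significance test and per-chromosome handling.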
Genome-wide expression profiling of five mouse models identifies similarities and differences with human psoriasis.

PubMed

Swindell, William R; Johnston, Andrew; Carbajal, Steve; Han, Gangwen; Wohn, Christian; Lu, Jun; Xing, Xianying; Nair, Rajan P; Voorhees, John J; Elder, James T; Wang, Xiao-Jing; Sano, Shigetoshi; Prens, Errol P; DiGiovanni, John; Pittelkow, Mark R; Ward, Nicole L; Gudjonsson, Johann E

2011-04-04

Development of a suitable mouse model would facilitate the investigation of pathomechanisms underlying human psoriasis and would also assist in development of therapeutic treatments. However, while many psoriasis mouse models have been proposed, no single model recapitulates all features of the human disease, and standardized validation criteria for psoriasis mouse models have not been widely applied. In this study, whole-genome transcriptional profiling is used to compare gene expression patterns manifested by human psoriatic skin lesions with those that occur in five psoriasis mouse models (K5-Tie2, imiquimod, K14-AREG, K5-Stat3C and K5-TGFbeta1). While the cutaneous gene expression profiles associated with each mouse phenotype exhibited statistically significant similarity to the expression profile of psoriasis in humans, each model displayed distinctive sets of similarities and differences in comparison to human psoriasis. For all five models, correspondence to the human disease was strong with respect to genes involved in epidermal development and keratinization. Immune and inflammation-associated gene expression, in contrast, was more variable between models as compared to the human disease. These findings support the value of all five models as research tools, each with identifiable areas of convergence to and divergence from the human disease. Additionally, the approach used in this paper provides an objective and quantitative method for evaluation of proposed mouse models of psoriasis, which can be strategically applied in future studies to score strengths of mouse phenotypes relative to specific aspects of human psoriasis.
Genomic comparison of multi-drug resistant invasive and colonizing Acinetobacter baumannii isolated from diverse human body sites reveals genomic plasticity.

PubMed

Sahl, Jason W; Johnson, J Kristie; Harris, Anthony D; Phillippy, Adam M; Hsiao, William W; Thom, Kerri A; Rasko, David A

2011-06-04

Acinetobacter baumannii has recently emerged as a significant global pathogen, with a surprisingly rapid acquisition of antibiotic resistance and spread within hospitals and health care institutions. This study examines the genomic content of three A. baumannii strains isolated from distinct body sites. Isolates from blood, peri-anal, and wound sources were examined in an attempt to identify genetic features that could be correlated to each isolation source. Pulsed-field gel electrophoresis, multi-locus sequence typing and antibiotic resistance profiles demonstrated genotypic and phenotypic variation. Each isolate was sequenced to high-quality draft status, which allowed for comparative genomic analyses with existing A. baumannii genomes. A high resolution, whole genome alignment method detailed the phylogenetic relationships of sequenced A. baumannii and found no correlation between phylogeny and body site of isolation. This method identified genomic regions unique to both those isolates found on the surface of the skin or in wounds, termed colonization isolates, and those identified from body fluids, termed invasive isolates; these regions may play a role in the pathogenesis and spread of this important pathogen. A PCR-based screen of 74 A. baumannii isolates demonstrated that these unique genes are not exclusive to either phenotype or isolation source; however, a conserved genomic region exclusive to all sequenced A. baumannii was identified and verified. The results of the comparative genome analysis and PCR assay show that A. baumannii is a diverse and genomically variable pathogen that appears to have the potential to cause a range of human disease regardless of the isolation source.
Genomic profiles of low-grade murine gliomas evolve during progression to glioblastoma. | Office of Cancer Genomics

Cancer.gov

Background: Gliomas are diverse neoplasms with multiple molecular subtypes. How tumor-initiating mutations relate to molecular subtypes as these tumors evolve during malignant progression remains unclear. Methods: We used genetically engineered mouse models, histopathology, genetic lineage tracing, expression profiling, and copy number analyses to examine how genomic tumor diversity evolves during the course of malignant progression from low- to high-grade disease.
CRISPR-cas loci profiling of Cronobacter sakazakii pathovars.

PubMed

Ogrodzki, Pauline; Forsythe, Stephen James

2016-12-01

Cronobacter sakazakii sequence types 1, 4, 8 and 12 are associated with outbreaks of neonatal meningitis and necrotizing enterocolitis infections. However, clonality results in strains that are indistinguishable using conventional methods. This study investigated the use of clustered regularly interspaced short palindromic repeats (CRISPR)-cas loci profiling for epidemiological investigations. Seventy whole genomes of C. sakazakii strains from four clonal complexes, widely distributed by time, geography and source of origin, were profiled. All strains encoded the same type I-E subtype CRISPR-cas system with a total of 12 different CRISPR spacer arrays. This study demonstrated the greater discriminatory power of CRISPR spacer array profiling compared with multilocus sequence typing, which will be of use in source attribution during Cronobacter outbreak investigations.
Transcriptome profiling and expression analyses of genes critical to wheat adaptation to low temperature

USDA-ARS's Scientific Manuscript database

Background: To identify the genes involved in the development of low temperature (LT) tolerance in hexaploid wheat, we examined the global changes in expression in response to cold of the 55,052 potentially unique genes represented in the Affymetrix Wheat Genome microarray. We compared the expressi...
Novel mouse model recapitulates genome and transcriptome alterations in human colorectal carcinomas.

PubMed

McNeil, Nicole E; Padilla-Nash, Hesed M; Buishand, Floryne O; Hue, Yue; Ried, Thomas

2017-03-01

Human colorectal carcinomas are defined by a nonrandom distribution of genomic imbalances that are characteristic for this disease. Often, these imbalances affect entire chromosomes. Understanding the role of these aneuploidies for carcinogenesis is of utmost importance. Currently, established transgenic mice do not recapitulate the pathognomonic genome aberration profile of human colorectal carcinomas. We have developed a novel model based on the spontaneous transformation of murine colon epithelial cells. During this process, cells progress through stages of pre-immortalization, immortalization and, finally, transformation, and result in tumors when injected into immunocompromised mice. We analyzed our model for genome and transcriptome alterations using ArrayCGH, spectral karyotyping (SKY), and array based gene expression profiling. ArrayCGH revealed a recurrent pattern of genomic imbalances. These results were confirmed by SKY. Comparing these imbalances with orthologous maps of human chromosomes revealed a remarkable overlap. We observed focal deletions of the tumor suppressor genes Trp53 and Cdkn2a/p16. High-level focal genomic amplification included the locus harboring the oncogene Mdm2, which was confirmed by FISH in the form of double minute chromosomes. Array-based global gene expression revealed distinct differences between the sequential steps of spontaneous transformation. Gene expression changes showed significant similarities with human colorectal carcinomas. Pathways most prominently affected included genes involved in chromosomal instability and in epithelial to mesenchymal transition. Our novel mouse model therefore recapitulates the most prominent genome and transcriptome alterations in human colorectal cancer, and might serve as a valuable tool for understanding the dynamic process of tumorigenesis, and for preclinical drug testing. © 2016 Wiley Periodicals, Inc.
Comparative Genomics of a Plant-Parasitic Nematode Endosymbiont Suggest a Role in Nutritional Symbiosis

PubMed Central

Brown, Amanda M.V.; Howe, Dana K.; Wasala, Sulochana K.; Peetz, Amy B.; Zasada, Inga A.; Denver, Dee R.

2015-01-01

Bacterial mutualists can modulate the biochemical capacity of animals. Highly coevolved nutritional mutualists do this by synthesizing nutrients missing from the host’s diet. Genomics tools have advanced the study of these partnerships. Here we examined the endosymbiont Xiphinematobacter (phylum Verrucomicrobia) from the dagger nematode Xiphinema americanum, a migratory ectoparasite of numerous crops that also vectors nepovirus. Previously, this endosymbiont was identified in the gut, ovaries, and eggs, but its role was unknown. We explored the potential role of this symbiont using fluorescence in situ hybridization, genome sequencing, and comparative functional genomics. We report the first genome of an intracellular Verrucomicrobium and the first exclusively intracellular non-Wolbachia nematode symbiont. Results revealed that Xiphinematobacter had a small 0.916-Mb genome with only 817 predicted proteins, resembling genomes of other mutualist endosymbionts. Compared with free-living relatives, conserved proteins were shorter on average, and there was large-scale loss of regulatory pathways. Despite massive gene loss, more genes were retained for biosynthesis of amino acids predicted to be essential to the host. Gene ontology enrichment tests showed enrichment for biosynthesis of arginine, histidine, and aromatic amino acids, as well as thiamine and coenzyme A, diverging from the profiles of relatives Akkermansia muciniphila (in the human colon), Methylacidiphilum infernorum, and the mutualist Wolbachia from filarial nematodes. Together, these features and the location in the gut suggest that Xiphinematobacter functions as a nutritional mutualist, supplementing essential nutrients that are depleted in the nematode diet. This pattern points to evolutionary convergence with endosymbionts found in sap-feeding insects. PMID:26362082
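The abstract does not state which statistic was used for its gene ontology enrichment tests. A common choice, assumed here purely for illustration, is the one-sided hypergeometric test: given a background of N genes of which K carry a term, it gives the probability that a set of n genes contains at least k carrying that term. The function name and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(N, K, n, k):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the background set
    K: background genes annotated with the term
    n: genes in the set being tested
    k: tested genes annotated with the term
    """
    upper = min(K, n)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy example: 20 background genes, 5 annotated; 4 genes tested, 2 annotated.
p_value = hypergeom_enrichment_p(N=20, K=5, n=4, k=2)
print(round(p_value, 4))
```

In practice the p-values for all tested terms would also be corrected for multiple testing.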
A draft of the genome and four transcriptomes of a medicinal and pesticidal angiosperm Azadirachta indica

PubMed Central

2012-01-01

Background: The Azadirachta indica (neem) tree is a source of a wide number of natural products, including the potent biopesticide azadirachtin. In spite of its widespread applications in agriculture and medicine, the molecular aspects of the biosynthesis of neem terpenoids remain largely unexplored. The current report describes the draft genome and four transcriptomes of A. indica and attempts to contextualise the sequence information in terms of its molecular phylogeny, transcript expression and terpenoid biosynthesis pathways. A. indica is the first member of the family Meliaceae to be sequenced using a next-generation sequencing approach. Results: The genome and transcriptomes of A. indica were sequenced using multiple sequencing platforms and libraries. The A. indica genome is AT-rich, bears few repetitive DNA elements and comprises about 20,000 genes. The molecular phylogenetic analyses grouped A. indica together with Citrus sinensis from the Rutaceae family, validating its conventional taxonomic classification. Comparative transcript expression analysis showed either exclusive or enhanced expression of known genes involved in neem terpenoid biosynthesis pathways compared to other sequenced angiosperms. Genome and transcriptome analyses in A. indica led to the identification of repeat elements, nucleotide composition and expression profiles of genes in various organs. Conclusions: This study on A. indica genome and transcriptomes will provide a model for characterization of metabolic pathways involved in synthesis of bioactive compounds, comparative evolutionary studies among various Meliaceae family members and help annotate their genomes. A better understanding of molecular pathways involved in the azadirachtin synthesis in A. indica will pave the way for bulk production of environmentally friendly biopesticides. PMID:22958331
A Large-Scale Comparative Metagenomic Study Reveals the Functional Interactions in Six Bloom-Forming Microcystis-Epibiont Communities

PubMed Central

Li, Qi; Lin, Feibi; Yang, Chen; Wang, Juanping; Lin, Yan; Shen, Mengyuan; Park, Min S.; Li, Tao; Zhao, Jindong

2018-01-01

Cyanobacterial blooms are worldwide issues of societal concern and scientific interest. Lake Taihu and Lake Dianchi, two of the largest lakes in China, have been suffering from annual Microcystis-based blooms over the past two decades. These two eutrophic lakes differ in both nutrient load and environmental parameters, where Microcystis microbiota consisting of different Microcystis morphospecies and associated bacteria (epibionts) have dominated. We conducted a comprehensive metagenomic study that analyzed species diversity, community structure, functional components, metabolic pathways and networks to investigate functional interactions among the members of six Microcystis-epibiont communities in these two lakes. Our integrated metagenomic pipeline consisted of efficient assembly, binning, annotation, and quality assurance methods that ensured high-quality genome reconstruction. This study provides a total of 68 reconstructed genomes including six complete Microcystis genomes and 28 high quality bacterial genomes of epibionts belonging to 14 distinct taxa. This metagenomic dataset constitutes the largest reference genome catalog available for genome-centric studies of the Microcystis microbiome. Epibiont community composition appears to be dynamic rather than fixed, and the functional profiles of communities were related to the environment of origin. This study demonstrates mutualistic interactions between Microcystis and epibionts at genetic and metabolic levels. Metabolic pathway reconstruction provided evidence for functional complementation in nitrogen and sulfur cycles, fatty acid catabolism, vitamin synthesis, and aromatic compound degradation among community members. Thus, bacterial social interactions within Microcystis-epibiont communities not only shape species composition, but also stabilize the communities' functional profiles. These interactions appear to play an important role in environmental adaptation of Microcystis colonies. PMID:29731741
Integrative analysis and expression profiling of secondary cell wall genes in C4 biofuel model Setaria italica reveals targets for lignocellulose bioengineering

PubMed Central

Muthamilarasan, Mehanathan; Khan, Yusuf; Jaishankar, Jananee; Shweta, Shweta; Lata, Charu; Prasad, Manoj

2015-01-01

Several underutilized grasses have excellent potential for use as bioenergy feedstock due to their lignocellulosic biomass. Genomic tools have enabled identification of lignocellulose biosynthesis genes in several sequenced plants. However, the non-availability of whole genome sequence of bioenergy grasses hinders the study on bioenergy genomics and their genomics-assisted crop improvement. Foxtail millet (Setaria italica L.; Si) is a model crop for studying systems biology of bioenergy grasses. In the present study, a systematic approach has been used for identification of gene families involved in cellulose (CesA/Csl), callose (Gsl) and monolignol biosynthesis (PAL, C4H, 4CL, HCT, C3H, CCoAOMT, F5H, COMT, CCR, CAD) and construction of physical map of foxtail millet. Sequence alignment and phylogenetic analysis of identified proteins showed that monolignol biosynthesis proteins were highly diverse, whereas CesA/Csl and Gsl proteins were homologous to rice and Arabidopsis. Comparative mapping of foxtail millet lignocellulose biosynthesis genes with other C4 panicoid genomes revealed maximum homology with switchgrass, followed by sorghum and maize. Expression profiling of candidate lignocellulose genes in response to different abiotic stresses and hormone treatments showed their differential expression pattern, with significantly higher expression of SiGsl12, SiPAL2, SiHCT1, SiF5H2, and SiCAD6 genes. Further, due to the evolutionary conservation of grass genomes, the insights gained from the present study could be extrapolated for identifying genes involved in lignocellulose biosynthesis in other biofuel species for further characterization. PMID:26583030
Integrative analysis and expression profiling of secondary cell wall genes in C4 biofuel model Setaria italica reveals targets for lignocellulose bioengineering.

PubMed

Muthamilarasan, Mehanathan; Khan, Yusuf; Jaishankar, Jananee; Shweta, Shweta; Lata, Charu; Prasad, Manoj

2015-01-01

Several underutilized grasses have excellent potential for use as bioenergy feedstock due to their lignocellulosic biomass. Genomic tools have enabled identification of lignocellulose biosynthesis genes in several sequenced plants. However, the non-availability of whole genome sequence of bioenergy grasses hinders the study on bioenergy genomics and their genomics-assisted crop improvement. Foxtail millet (Setaria italica L.; Si) is a model crop for studying systems biology of bioenergy grasses. In the present study, a systematic approach has been used for identification of gene families involved in cellulose (CesA/Csl), callose (Gsl) and monolignol biosynthesis (PAL, C4H, 4CL, HCT, C3H, CCoAOMT, F5H, COMT, CCR, CAD) and construction of physical map of foxtail millet. Sequence alignment and phylogenetic analysis of identified proteins showed that monolignol biosynthesis proteins were highly diverse, whereas CesA/Csl and Gsl proteins were homologous to rice and Arabidopsis. Comparative mapping of foxtail millet lignocellulose biosynthesis genes with other C4 panicoid genomes revealed maximum homology with switchgrass, followed by sorghum and maize. Expression profiling of candidate lignocellulose genes in response to different abiotic stresses and hormone treatments showed their differential expression pattern, with significantly higher expression of SiGsl12, SiPAL2, SiHCT1, SiF5H2, and SiCAD6 genes. Further, due to the evolutionary conservation of grass genomes, the insights gained from the present study could be extrapolated for identifying genes involved in lignocellulose biosynthesis in other biofuel species for further characterization.

Gene Expression Profiling Reveals a Massive, Aneuploidy-Dependent Transcriptional Deregulation and Distinct Differences between Lymph Node–Negative and Lymph Node–Positive Colon Carcinomas

PubMed Central

Grade, Marian; Hörmann, Patrick; Becker, Sandra; Hummon, Amanda B.; Wangsa, Danny; Varma, Sudhir; Simon, Richard; Liersch, Torsten; Becker, Heinz; Difilippantonio, Michael J.; Ghadimi, B. Michael; Ried, Thomas

2016-01-01

To characterize patterns of global transcriptional deregulation in primary colon carcinomas, we did gene expression profiling of 73 tumors [Unio Internationale Contra Cancrum stage II (n = 33) and stage III (n = 40)] using oligonucleotide microarrays. For 30 of the tumors, expression profiles were compared with those from matched normal mucosa samples. We identified a set of 1,950 genes with highly significant deregulation between tumors and mucosa samples (P < 1e–7). A significant proportion of these genes mapped to chromosome 20 (P = 0.01). Seventeen genes had a >5-fold average expression difference between normal colon mucosa and carcinomas, including up-regulation of MYC and of HMGA1, a putative oncogene. Furthermore, we identified 68 genes that were significantly differentially expressed between lymph node–negative and lymph node–positive tumors (P < 0.001), the functional annotation of which revealed a preponderance of genes that play a role in cellular immune response and surveillance. The microarray-derived gene expression levels of 20 deregulated genes were validated using quantitative real-time reverse transcription-PCR in >40 tumor and normal mucosa samples with good concordance between the techniques. Finally, we established a relationship between specific genomic imbalances, which were mapped for 32 of the analyzed colon tumors by comparative genomic hybridization, and alterations of global transcriptional activity. Previously, we had conducted a similar analysis of primary rectal carcinomas. The systematic comparison of colon and rectal carcinomas revealed a significant overlap of genomic imbalances and transcriptional deregulation, including activation of the Wnt/β-catenin signaling cascade, suggesting similar pathogenic pathways. PMID:17210682
Gene expression profiling reveals a massive, aneuploidy-dependent transcriptional deregulation and distinct differences between lymph node-negative and lymph node-positive colon carcinomas.

PubMed

Grade, Marian; Hörmann, Patrick; Becker, Sandra; Hummon, Amanda B; Wangsa, Danny; Varma, Sudhir; Simon, Richard; Liersch, Torsten; Becker, Heinz; Difilippantonio, Michael J; Ghadimi, B Michael; Ried, Thomas

2007-01-01

To characterize patterns of global transcriptional deregulation in primary colon carcinomas, we did gene expression profiling of 73 tumors [Unio Internationale Contra Cancrum stage II (n = 33) and stage III (n = 40)] using oligonucleotide microarrays. For 30 of the tumors, expression profiles were compared with those from matched normal mucosa samples. We identified a set of 1,950 genes with highly significant deregulation between tumors and mucosa samples (P < 1e-7). A significant proportion of these genes mapped to chromosome 20 (P = 0.01). Seventeen genes had a >5-fold average expression difference between normal colon mucosa and carcinomas, including up-regulation of MYC and of HMGA1, a putative oncogene. Furthermore, we identified 68 genes that were significantly differentially expressed between lymph node-negative and lymph node-positive tumors (P < 0.001), the functional annotation of which revealed a preponderance of genes that play a role in cellular immune response and surveillance. The microarray-derived gene expression levels of 20 deregulated genes were validated using quantitative real-time reverse transcription-PCR in >40 tumor and normal mucosa samples with good concordance between the techniques. Finally, we established a relationship between specific genomic imbalances, which were mapped for 32 of the analyzed colon tumors by comparative genomic hybridization, and alterations of global transcriptional activity. Previously, we had conducted a similar analysis of primary rectal carcinomas. The systematic comparison of colon and rectal carcinomas revealed a significant overlap of genomic imbalances and transcriptional deregulation, including activation of the Wnt/beta-catenin signaling cascade, suggesting similar pathogenic pathways.
Quantitative image analysis of cellular heterogeneity in breast tumors complements genomic profiling.

PubMed

Yuan, Yinyin; Failmezger, Henrik; Rueda, Oscar M; Ali, H Raza; Gräf, Stefan; Chin, Suet-Feung; Schwarz, Roland F; Curtis, Christina; Dunning, Mark J; Bardwell, Helen; Johnson, Nicola; Doyle, Sarah; Turashvili, Gulisa; Provenzano, Elena; Aparicio, Sam; Caldas, Carlos; Markowetz, Florian

2012-10-24

Solid tumors are heterogeneous tissues composed of a mixture of cancer and normal cells, which complicates the interpretation of their molecular profiles. Furthermore, tissue architecture is generally not reflected in molecular assays, rendering this rich information underused. To address these challenges, we developed a computational approach based on standard hematoxylin and eosin-stained tissue sections and demonstrated its power in a discovery and validation cohort of 323 and 241 breast tumors, respectively. To deconvolute cellular heterogeneity and detect subtle genomic aberrations, we introduced an algorithm based on tumor cellularity to increase the comparability of copy number profiles between samples. We next devised a predictor for survival in estrogen receptor-negative breast cancer that integrated both image-based and gene expression analyses and significantly outperformed classifiers that use single data types, such as microarray expression signatures. Image processing also allowed us to describe and validate an independent prognostic factor based on quantitative analysis of spatial patterns between stromal cells, which are not detectable by molecular assays. Our quantitative, image-based method could benefit any large-scale cancer study by refining and complementing molecular assays of tumor samples.
Genome-wide chromatin and gene expression profiling during memory formation and maintenance in adult mice.

PubMed

Centeno, Tonatiuh Pena; Shomroni, Orr; Hennion, Magali; Halder, Rashi; Vidal, Ramon; Rahman, Raza-Ur; Bonn, Stefan

2016-10-11

Recent evidence suggests that the formation and maintenance of memory requires epigenetic changes. In an effort to understand the spatio-temporal extent of learning and memory-related epigenetic changes we have charted genome-wide histone and DNA methylation profiles, in two different brain regions, two cell types, and three time-points, before and after learning. In this data descriptor we provide detailed information on data generation, give insights into the rationale of experiments, highlight necessary steps to assess data quality, offer guidelines for future use of the data and supply ready-to-use code to replicate the analysis results. The data provides a blueprint of the gene regulatory network underlying short- and long-term memory formation and maintenance. This 'healthy' gene regulatory network of learning can now be compared to changes in neurological or psychiatric diseases, providing mechanistic insights into brain disorders and highlighting potential therapeutic avenues.
Comparative genome analysis of Pseudomonas genomes including Populus-associated isolates

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jun, Se Ran; Wassenaar, Trudy; Nookaew, Intawat

The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches including the rhizosphere and endosphere of many plants influencing phylogenetic diversity and heterogeneity. In this study, comparative genome analysis was performed on over one thousand Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides. Based on average amino acid identity, genomic clusters were identified within the Pseudomonas genus, which showed agreements with clades by NCBI and cliques by IMG. The P. fluorescens group was organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. The species P. aeruginosa showed clear distinction in their genomic relatedness compared to other Pseudomonas species groups based on the pan and core genome analysis. The 19 isolates of our 21 Populus-associated isolates formed three distinct subgroups within the P. fluorescens major group, supported by pathway profiles analysis, while two isolates were more closely related to P. chlororaphis and P. putida. The specific genes to Populus-associated subgroups were identified where genes specific to subgroup 1 include several sensory systems such as proteins which act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor; specific genes to subgroup 2 contain unique hypothetical genes; and genes specific to subgroup 3 organisms have a different hydrolase activity. IMPORTANCE The comparative genome analyses of the genus Pseudomonas that included Populus-associated isolates resulted in novel insights into high diversity of Pseudomonas. Consistent and robust genomic clusters with phylogenetic homogeneity were identified, which resolved species-clades that are not clearly defined by 16S rRNA gene sequence analysis alone. The genomic clusters may be reflective of distinct ecological niches to which the organisms have adapted, but this needs to be experimentally characterized with ecologically relevant phenotype properties. This study justifies the need to sequence multiple isolates, especially from P. fluorescens group in order to study functional capabilities from a pangenomic perspective. This information will prove useful when choosing Pseudomonas strains for use to promote growth and increase disease resistance in plants.
Comparative genome analysis of Pseudomonas genomes including Populus-associated isolates

DOE PAGES

Jun, Se Ran; Wassenaar, Trudy; Nookaew, Intawat; ...

2016-01-01

The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches including the rhizosphere and endosphere of many plants influencing phylogenetic diversity and heterogeneity. In this study, comparative genome analysis was performed on over one thousand Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides. Based on average amino acid identity, genomic clusters were identified within the Pseudomonas genus, which showed agreements with clades by NCBI and cliques by IMG. The P. fluorescens group was organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. The species P. aeruginosa showed clear distinction in their genomic relatedness compared to other Pseudomonas species groups based on the pan and core genome analysis. The 19 isolates of our 21 Populus-associated isolates formed three distinct subgroups within the P. fluorescens major group, supported by pathway profiles analysis, while two isolates were more closely related to P. chlororaphis and P. putida. The specific genes to Populus-associated subgroups were identified where genes specific to subgroup 1 include several sensory systems such as proteins which act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor; specific genes to subgroup 2 contain unique hypothetical genes; and genes specific to subgroup 3 organisms have a different hydrolase activity. IMPORTANCE The comparative genome analyses of the genus Pseudomonas that included Populus-associated isolates resulted in novel insights into high diversity of Pseudomonas. Consistent and robust genomic clusters with phylogenetic homogeneity were identified, which resolved species-clades that are not clearly defined by 16S rRNA gene sequence analysis alone. The genomic clusters may be reflective of distinct ecological niches to which the organisms have adapted, but this needs to be experimentally characterized with ecologically relevant phenotype properties. This study justifies the need to sequence multiple isolates, especially from P. fluorescens group in order to study functional capabilities from a pangenomic perspective. This information will prove useful when choosing Pseudomonas strains for use to promote growth and increase disease resistance in plants.
Impact of genomic profiling on the treatment and outcomes of patients with advanced gastrointestinal malignancies.

PubMed

Dhir, Mashaal; Choudry, Haroon A; Holtzman, Matthew P; Pingpank, James F; Ahrendt, Steven A; Zureikat, Amer H; Hogg, Melissa E; Bartlett, David L; Zeh, Herbert J; Singhi, Aatur D; Bahary, Nathan

2017-01-01

The impact of genomic profiling on the outcomes of patients with advanced gastrointestinal (GI) malignancies remains unknown. The primary objectives of the study were to investigate the clinical benefit of genomic-guided therapy, defined as complete response (CR), partial response (PR), or stable disease (SD) at 3 months, and its impact on progression-free survival (PFS) in patients with advanced GI malignancies. Clinical and genomic data of all consecutive GI tumor samples from April, 2013 to April, 2016 sequenced by FoundationOne were obtained and analyzed. A total of 101 samples from 97 patients were analyzed. Ninety-eight samples from 95 patients could be amplified making this approach feasible in 97% of the samples. After removing duplicates, 95 samples from 95 patients were included in the further analysis. Median time from specimen collection to reporting was 11 days. Genomic alteration-guided treatment recommendations were considered new and clinically relevant in 38% (36/95) of the patients. Rapid decline in functional status was noted in 25% (9/36) of these patients who could therefore not receive genomic-guided therapy. Genomic-guided therapy was utilized in 13 patients (13.7%) and 7 patients (7.4%) experienced clinical benefit (6 PR and 1 SD). Among these seven patients, median PFS was 10 months with some ongoing durable responses. Genomic profiling-guided therapy can lead to clinical benefit in a subset of patients with advanced GI malignancies. Attempting genomic profiling earlier in the course of treatment prior to functional decline may allow more patients to benefit from these therapies. © 2016 The Authors. Cancer Medicine published by John Wiley & Sons Ltd.
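The proportions quoted in this abstract can be rechecked from the raw counts it reports. The following is a minimal sketch, not part of the study; the helper name `pct` is hypothetical, and the counts are copied from the abstract text.

```python
def pct(numerator, denominator, digits=0):
    """Percentage of numerator over denominator, rounded to `digits` decimals."""
    return round(100 * numerator / denominator, digits)

# Counts as reported in the abstract
amplifiable = pct(98, 101)        # samples that could be amplified
new_relevant = pct(36, 95)        # patients with new, clinically relevant recommendations
rapid_decline = pct(9, 36)        # of those, patients with rapid functional decline
treated = pct(13, 95, digits=1)   # patients who received genomic-guided therapy
benefit = pct(7, 95, digits=1)    # patients with clinical benefit (6 PR + 1 SD)

print(amplifiable, new_relevant, rapid_decline, treated, benefit)
```

The recomputed values agree with the figures stated in the abstract (97%, 38%, 25%, 13.7%, and 7.4%).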
Effect of reference genome selection on the performance of computational methods for genome-wide protein-protein interaction prediction.

PubMed

Muley, Vijaykumar Yogesh; Ranjan, Akash

2012-01-01

Recent progress in computational methods for predicting physical and functional protein-protein interactions has provided new insights into the complexity of biological processes. Most of these methods assume that functionally interacting proteins are likely to have a shared evolutionary history. This history can be traced out for the protein pairs of a query genome by correlating different evolutionary aspects of their homologs in multiple genomes known as the reference genomes. These methods include phylogenetic profiling, gene neighborhood and co-occurrence of the orthologous protein coding genes in the same cluster or operon. These are collectively known as genomic context methods. On the other hand, a method called mirrortree is based on the similarity of phylogenetic trees between two interacting proteins. Comprehensive performance analyses of these methods have been frequently reported in the literature. However, very few studies provide insight into the effect of reference genome selection on detection of meaningful protein interactions. We analyzed the performance of four methods and their variants to understand the effect of reference genome selection on prediction efficacy. We used six sets of reference genomes, sampled in accordance with phylogenetic diversity and relationship between organisms from 565 bacteria. We used Escherichia coli as a model organism and the gold standard datasets of interacting proteins reported in DIP, EcoCyc and KEGG databases to compare the performance of the prediction methods. Higher performance for predicting protein-protein interactions was achievable even with 100-150 bacterial genomes out of 565 genomes. Inclusion of archaeal genomes in the reference genome set improves performance. We find that in order to obtain a good performance, it is better to sample a few genomes of related genera of prokaryotes from the large number of available genomes. Moreover, such a sampling allows for selecting 50-100 genomes for comparable accuracy of predictions when computational resources are limited.
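To make the idea of phylogenetic profiling concrete, the sketch below scores protein pairs by how similar their presence/absence patterns are across a set of reference genomes. It is an illustration under stated assumptions, not the method evaluated in the study: the abstract does not say which similarity measure was used, so Jaccard similarity is an assumption here, and the protein names and toy profiles are hypothetical.

```python
from itertools import combinations

def jaccard(profile_a, profile_b):
    """Jaccard similarity of two presence/absence profiles (sequences of 0/1)."""
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return both / either if either else 0.0

def rank_pairs(profiles):
    """Score every protein pair and return (score, name1, name2), best first."""
    scored = [
        (jaccard(profiles[p], profiles[q]), p, q)
        for p, q in combinations(sorted(profiles), 2)
    ]
    return sorted(scored, reverse=True)

# Hypothetical toy data: one row per query protein, one column per reference genome.
# 1 means a homolog is present in that genome, 0 means it is absent.
toy_profiles = {
    "protA": [1, 1, 0, 1, 0, 1],
    "protB": [1, 1, 0, 1, 0, 0],
    "protC": [0, 0, 1, 0, 1, 0],
}

ranked = rank_pairs(toy_profiles)
for score, p, q in ranked:
    print(f"{p}-{q}: {score:.2f}")
```

In this toy setting, changing the reference genome set amounts to selecting a different subset of columns, which is why the choice of reference genomes can shift the scores and therefore the predicted links.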
Comparative Genomics of Escherichia coli Isolated from Skin and Soft Tissue and Other Extraintestinal Infections.

PubMed

Ranjan, Amit; Shaik, Sabiha; Nandanwar, Nishant; Hussain, Arif; Tiwari, Sumeet K; Semmler, Torsten; Jadhav, Savita; Wieler, Lothar H; Alam, Munirul; Colwell, Rita R; Ahmed, Niyaz

2017-08-15

Escherichia coli, an intestinal Gram-negative bacterium, has been shown to be associated with a variety of diseases in addition to intestinal infections, such as urinary tract infections (UTIs), meningitis in neonates, septicemia, skin and soft tissue infections (SSTIs), and colisepticemia. Thus, for nonintestinal infections, it is categorized as extraintestinal pathogenic E. coli (ExPEC). It is also an opportunistic pathogen, causing cross infections, notably as an agent of zoonotic diseases. However, comparative genomic data providing functional and genetic coordinates for ExPEC strains associated with these different types of infections have not proven conclusive. In the study reported here, ExPEC E. coli isolated from SSTIs was characterized, including virulence and drug resistance profiles, and compared with isolates from patients suffering either pyelonephritis or septicemia. Results revealed that the majority of the isolates belonged to two pathogenic phylogroups, B2 and D. Approximately 67% of the isolates were multidrug resistant (MDR), with 85% producing extended-spectrum beta-lactamase (ESBL) and 6% producing metallo-beta-lactamase (MBL). The blaCTX-M-15 genotype was observed in at least 70% of the E. coli isolates in each category, conferring resistance to an extended range of beta-lactam antibiotics. Whole-genome sequencing and comparative genomics of the ExPEC isolates revealed that two of the four isolates from SSTIs, NA633 and NA643, belong to pandemic sequence type ST131, whereas functional characteristics of three of the ExPEC pathotypes revealed that they had equal capabilities to form biofilm and were resistant to human serum. Overall, the isolates from a variety of ExPEC infections demonstrated similar resistomes and virulomes and did not display any disease-specific functional or genetic coordinates. IMPORTANCE Infections caused by extraintestinal pathogenic E. coli (ExPEC) are of global concern as they result in significant costs to health care facilities management. The recent emergence of a multidrug-resistant pandemic clone, Escherichia coli ST131, is of primary concern as a global threat. In developing countries, such as India, skin and soft tissue infections (SSTIs) associated with E. coli are marginally addressed. In this study, we employed both genomic analysis and phenotypic assays to determine relationships, if any, among the ExPEC pathotypes. Similarity between antibiotic resistance and virulence profiles was observed, ST131 isolates from SSTIs were reported, and genomic similarities among strains isolated from different disease conditions were detected. This study provides functional molecular infection epidemiology insight into SSTI-associated E. coli compared with ExPEC pathotypes. Copyright © 2017 Ranjan et al.
The PiGeOn project: protocol for a longitudinal study examining psychosocial, behavioural and ethical issues and outcomes in cancer tumour genomic profiling.

PubMed

Best, Megan; Newson, Ainsley J; Meiser, Bettina; Juraskova, Ilona; Goldstein, David; Tucker, Kathy; Ballinger, Mandy L; Hess, Dominique; Schlub, Timothy E; Biesecker, Barbara; Vines, Richard; Vines, Kate; Thomas, David; Young, Mary-Anne; Savard, Jacqueline; Jacobs, Chris; Butow, Phyllis

2018-04-05

Genomic sequencing in cancer (both tumour and germline), and development of therapies targeted to tumour genetic status, hold great promise for improvement of patient outcomes. However, the imminent introduction of genomics into clinical practice calls for better understanding of how patients value, experience, and cope with this novel technology and its often complex results. Here we describe a protocol for a novel mixed-methods, prospective study (PiGeOn) that aims to examine patients' psychosocial, cognitive, affective and behavioural responses to tumour genomic profiling and to integrate a parallel critical ethical analysis of returning results. This is a cohort sub-study of a parent tumour genomic profiling programme enrolling patients with advanced cancer. One thousand patients will be recruited for the parent study in Sydney, Australia from 2016 to 2019. They will be asked to complete surveys at baseline, three, and five months. Primary outcomes are: knowledge, preferences, attitudes and values. A purposively sampled subset of patients will be asked to participate in three semi-structured interviews (at each time point) to provide deeper data interpretation. Relevant ethical themes will be critically analysed to iteratively develop or refine normative ethical concepts or frameworks currently used in the return of genetic information. This will be the first Australian study to collect longitudinal data on cancer patients' experience of tumour genomic profiling. Findings will be used to inform ongoing ethical debates on issues such as how to effectively obtain informed consent for genomic profiling, return results, distinguish between research and clinical practice and manage patient expectations. The combination of quantitative and qualitative methods will provide comprehensive and critical data on how patients cope with 'actionable' and 'non-actionable' results. This information is needed to ensure that when tumour genomic profiling becomes part of routine clinical care, ethical considerations are embedded, and patients are adequately prepared and supported during and after receiving results. Trial registration: not required for this sub-study; parent trial registration ACTRN12616000908437.
The future of genomics in polar and alpine cyanobacteria

PubMed Central

Anesio, Alexandre M; Sánchez-Baracaldo, Patricia

2018-01-01

In recent years, genomic analyses have arisen as an exciting way of investigating the functional capacity and environmental adaptations of numerous micro-organisms of global relevance, including cyanobacteria. In the extreme cold of Arctic, Antarctic and alpine environments, cyanobacteria are of fundamental ecological importance as primary producers and ecosystem engineers. While their role in biogeochemical cycles is well appreciated, little is known about the genomic makeup of polar and alpine cyanobacteria. In this article, we present ways that genomic techniques might be used to further our understanding of cyanobacteria in cold environments in terms of their evolution and ecology. Existing examples from other environments (e.g. marine/hot springs) are used to discuss how methods developed there might be used to investigate specific questions in the cryosphere. Phylogenomics, comparative genomics and population genomics are identified as methods for understanding the evolution and biogeography of polar and alpine cyanobacteria. Transcriptomics will allow us to investigate gene expression under extreme environmental conditions, and metagenomics can be used to complement traditional amplicon-based methods of community profiling. Finally, new techniques such as single cell genomics and metagenome assembled genomes will also help to expand our understanding of polar and alpine cyanobacteria that cannot readily be cultured. PMID:29506259
The complete and fully assembled genome sequence of Aeromonas salmonicida subsp. pectinolytica and its comparative analysis with other Aeromonas species: investigation of the mobilome in environmental and pathogenic strains.

PubMed

Pfeiffer, Friedhelm; Zamora-Lagos, Maria-Antonia; Blettinger, Martin; Yeroslaviz, Assa; Dahl, Andreas; Gruber, Stephan; Habermann, Bianca H

2018-01-05

Due to the predominant usage of short-read sequencing to date, most bacterial genome sequences reported in the last years remain at the draft level. This precludes certain types of analyses, such as the in-depth analysis of genome plasticity. Here we report the finalized genome sequence of the environmental strain Aeromonas salmonicida subsp. pectinolytica 34mel, for which only a draft genome with 253 contigs is currently available. Successful completion of the transposon-rich genome critically depended on the PacBio long read sequencing technology. Using finalized genome sequences of A. salmonicida subsp. pectinolytica and other Aeromonads, we report the detailed analysis of the transposon composition of these bacterial species. Mobilome evolution is exemplified by a complex transposon, which has shifted from pathogenicity-related to environmental-related gene content in A. salmonicida subsp. pectinolytica 34mel. Obtaining the complete, circular genome of A. salmonicida subsp. pectinolytica allowed us to perform an in-depth analysis of its mobilome. We demonstrate the mobilome-dependent evolution of this strain's genetic profile from pathogenic to environmental.
The Number of Point Mutations in Induced Pluripotent Stem Cells and Nuclear Transfer Embryonic Stem Cells Depends on the Method and Somatic Cell Type Used for Their Generation.

PubMed

Araki, Ryoko; Mizutani, Eiji; Hoki, Yuko; Sunayama, Misato; Wakayama, Sayaka; Nagatomo, Hiroaki; Kasama, Yasuji; Nakamura, Miki; Wakayama, Teruhiko; Abe, Masumi

2017-05-01

Induced pluripotent stem cells hold great promise for regenerative medicine but point mutations have been identified in these cells and have raised serious concerns about their safe use. We generated nuclear transfer embryonic stem cells (ntESCs) from both mouse embryonic fibroblasts (MEFs) and tail-tip fibroblasts (TTFs) and by whole genome sequencing found fewer mutations compared with iPSCs generated by retroviral gene transduction. Furthermore, TTF-derived ntESCs showed only a very small number of point mutations, approximately 80% less than the number observed in iPSCs generated using retrovirus. Base substitution profile analysis confirmed this greatly reduced number of point mutations. The point mutations in iPSCs are therefore not a Yamanaka factor-specific phenomenon but are intrinsic to genome reprogramming. Moreover, the dramatic reduction in point mutations in ntESCs suggests that most are not essential for genome reprogramming. Our results suggest that it is feasible to reduce the point mutation frequency in iPSCs by optimizing various genome reprogramming conditions. We conducted whole genome sequencing of ntES cells derived from MEFs or TTFs. We thereby succeeded in establishing TTF-derived ntES cell lines with far fewer point mutations. Base substitution profile analysis of these clones also indicated a reduced point mutation frequency, moving from a transversion-predominance to a transition-predominance. Stem Cells 2017;35:1189-1196. © 2017 AlphaMed Press.
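The base substitution profile mentioned above distinguishes transitions (purine to purine, or pyrimidine to pyrimidine) from transversions (purine to pyrimidine, or the reverse). The sketch below shows how such a profile could be tallied from a list of point mutations. It is only an illustration: the function names and the toy mutation list are hypothetical, and the abstract does not report the underlying counts.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES

def substitution_class(ref, alt):
    """Classify a single-base substitution as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a valid point substitution: {ref}>{alt}")
    # Same chemical class on both sides (purine/purine or pyrimidine/pyrimidine)
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"

def substitution_profile(substitutions):
    """Count transitions and transversions in an iterable of (ref, alt) pairs."""
    return Counter(substitution_class(ref, alt) for ref, alt in substitutions)

# Hypothetical toy mutation list, not data from the study
toy_mutations = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C")]
profile = substitution_profile(toy_mutations)
print(profile["transition"], profile["transversion"])
```

A shift "from a transversion-predominance to a transition-predominance", as described in the abstract, would show up here as the transition count overtaking the transversion count.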
GWIPS-viz: development of a ribo-seq genome browser

PubMed Central

Michel, Audrey M.; Fox, Gearoid; M. Kiran, Anmol; De Bo, Christof; O’Connor, Patrick B. F.; Heaphy, Stephen M.; Mullan, James P. A.; Donohue, Claire A.; Higgins, Desmond G.; Baranov, Pavel V.

2014-01-01

We describe the development of GWIPS-viz (http://gwips.ucc.ie), an online genome browser for viewing ribosome profiling data. Ribosome profiling (ribo-seq) is a recently developed technique that provides genome-wide information on protein synthesis (GWIPS) in vivo. It is based on the deep sequencing of ribosome-protected messenger RNA (mRNA) fragments, which allows the ribosome density along all mRNA transcripts present in the cell to be quantified. Since its inception, ribo-seq has been carried out in a number of eukaryotic and prokaryotic organisms. Owing to the increasing interest in ribo-seq, there is a pertinent demand for a dedicated ribo-seq genome browser. GWIPS-viz is based on The University of California Santa Cruz (UCSC) Genome Browser. Ribo-seq tracks, coupled with mRNA-seq tracks, are currently available for several genomes: human, mouse, zebrafish, nematode, yeast, bacteria (Escherichia coli K12, Bacillus subtilis), human cytomegalovirus and bacteriophage lambda. Our objective is to continue incorporating published ribo-seq data sets so that the wider community can readily view ribosome profiling information from multiple studies without the need to carry out computational processing. PMID:24185699
Cancer systems biology in the genome sequencing era: part 1, dissecting and modeling of tumor clones and their networks.

PubMed

Wang, Edwin; Zou, Jinfeng; Zaman, Naif; Beitel, Lenore K; Trifiro, Mark; Paliouras, Miltiadis

2013-08-01

Recent tumor genome sequencing confirmed that one tumor often consists of multiple cell subpopulations (clones) which bear different, but related, genetic profiles such as mutation and copy number variation profiles. Thus far, one tumor has been viewed as a whole entity in cancer functional studies. With the advances of genome sequencing and computational analysis, we are able to quantify and computationally dissect clones from tumors, and then conduct clone-based analysis. Emerging technologies such as single-cell genome sequencing and RNA-Seq could profile tumor clones. Thus, we should reconsider how to conduct cancer systems biology studies in the genome sequencing era. We will outline new directions for conducting cancer systems biology by considering that genome sequencing technology can be used for dissecting, quantifying and genetically characterizing clones from tumors. Topics discussed in Part 1 of this review include computational quantification of tumor subpopulations; clone-based network modeling; cancer hallmark-based networks and their high-order rewiring principles; and the principles of cell survival networks of fast-growing clones. Crown Copyright © 2013. Published by Elsevier Ltd. All rights reserved.
TEGS-CN: A Statistical Method for Pathway Analysis of Genome-wide Copy Number Profile.

PubMed

Huang, Yen-Tsung; Hsu, Thomas; Christiani, David C

2014-01-01

The effects of copy number alterations make up a significant part of the tumor genome profile, but pathway analyses of these alterations are still not well established. We proposed a novel method to analyze multiple copy numbers of genes within a pathway, termed Test for the Effect of a Gene Set with Copy Number data (TEGS-CN). TEGS-CN was adapted from TEGS, a method that we previously developed for gene expression data using a variance component score test. With additional development, we extend the method to analyze DNA copy number data, accounting for different sizes and thus various numbers of copy number probes in genes. The test statistic follows a mixture of χ² distributions that can be obtained using permutation with scaled χ² approximation. We conducted simulation studies to evaluate the size and the power of TEGS-CN and to compare its performance with TEGS. We analyzed genome-wide copy number data from 264 patients with non-small-cell lung cancer. With the Molecular Signatures Database (MSigDB) pathway database, the genome-wide copy number data can be classified into 1814 biological pathways or gene sets. We investigated associations of the copy number profile of the 1814 gene sets with pack-years of cigarette smoking. Our analysis revealed five pathways with significant P values after Bonferroni adjustment (<2.8 × 10⁻⁵), including the PTEN pathway (7.8 × 10⁻⁷), the gene set up-regulated under heat shock (3.6 × 10⁻⁶), the gene sets involved in the immune profile for rejection of kidney transplantation (9.2 × 10⁻⁶) and for transcriptional control of leukocytes (2.2 × 10⁻⁵), and the ganglioside biosynthesis pathway (2.7 × 10⁻⁵). In conclusion, we present a new method for pathway analyses of copy number data, and causal mechanisms of the five pathways require further study.
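The Bonferroni cutoff quoted in this abstract can be reproduced by dividing a family-wise significance level by the number of gene sets tested. The sketch below assumes a family-wise level of 0.05, which the abstract does not state explicitly but which is consistent with the reported cutoff; the dictionary keys are shorthand labels chosen here, and the P values are copied from the abstract.

```python
ALPHA = 0.05          # assumed family-wise significance level (not stated in the abstract)
N_GENE_SETS = 1814    # number of MSigDB gene sets tested, from the abstract

# Bonferroni adjustment: each test must pass alpha divided by the number of tests
threshold = ALPHA / N_GENE_SETS

# P values as reported in the abstract; labels are shorthand for illustration
reported_p = {
    "PTEN pathway": 7.8e-7,
    "heat shock up-regulated": 3.6e-6,
    "kidney transplant rejection": 9.2e-6,
    "leukocyte transcriptional control": 2.2e-5,
    "ganglioside biosynthesis": 2.7e-5,
}

significant = {name: p for name, p in reported_p.items() if p < threshold}

print(f"threshold = {threshold:.2e}")
for name, p in significant.items():
    print(f"{name}: {p:.1e}")
```

Under this assumption the cutoff is about 2.76 × 10⁻⁵, which rounds to the 2.8 × 10⁻⁵ quoted in the abstract, and all five reported pathways fall below it.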
Reprogramming the Maternal Zebrafish Genome after Fertilization to Match the Paternal Methylation Pattern

PubMed Central

Potok, Magdalena E.; Nix, David A.; Parnell, Timothy J.; Cairns, Bradley R.

2014-01-01

Early vertebrate embryos must achieve totipotency and prepare for zygotic genome activation (ZGA). To understand this process, we determined the DNA methylation (DNAme) profiles of zebrafish gametes, embryos at different stages, and somatic muscle and compared them to gene activity and histone modifications. Sperm chromatin patterns are virtually identical to those at ZGA. Unexpectedly, the DNA of many oocyte genes important for germ-line functions (e.g., piwil1) or early development (e.g., hox genes) is methylated, but the loci are demethylated during zygotic cleavage stages to precisely the state observed in sperm, even in parthenogenetic embryos lacking a replicating paternal genome. Furthermore, this cohort constitutes the genes and loci that acquire DNAme during development (i.e., ZGA to muscle). Finally, DNA methyltransferase inhibition experiments suggest that DNAme silences particular gene and chromatin cohorts at ZGA, preventing their precocious expression. Thus, zebrafish achieve a totipotent chromatin state at ZGA through paternal genome competency and maternal genome DNAme reprogramming. PMID:23663776
Draft Genome Sequences of Two Species of "Difficult-to-Identify" Human-Pathogenic Corynebacteria: Implications for Better Identification Tests.

PubMed

Pacheco, Luis G C; Mattos-Guaraldi, Ana L; Santos, Carolina S; Veras, Adonney A O; Guimarães, Luis C; Abreu, Vinícius; Pereira, Felipe L; Soares, Siomar C; Dorella, Fernanda A; Carvalho, Alex F; Leal, Carlos G; Figueiredo, Henrique C P; Ramos, Juliana N; Vieira, Veronica V; Farfour, Eric; Guiso, Nicole; Hirata, Raphael; Azevedo, Vasco; Silva, Artur; Ramos, Rommel T J

2015-01-01

Non-diphtheriae Corynebacterium species have been increasingly recognized as the causative agents of infections in humans. Differential identification of these bacteria in the clinical microbiology laboratory by the most commonly used biochemical tests is challenging, and normally requires additional molecular methods. Herein, we present the annotated draft genome sequences of two isolates of "difficult-to-identify" human-pathogenic corynebacterial species: C. xerosis and C. minutissimum. The genome sequences of ca. 2.7 Mbp, with a mean number of 2,580 protein encoding genes, were also compared with the publicly available genome sequences of strains of C. amycolatum and C. striatum. These results will aid the exploration of novel biochemical reactions to improve existing identification tests as well as the development of more accurate molecular identification methods through detection of species-specific target genes for isolate's identification or drug susceptibility profiling.
Genomic and Metabolomic Profile Associated to Microalbuminuria

PubMed Central

Marrachelli, Vannina G.; Monleon, Daniel; Rentero, Pilar; Mansego, María L.; Morales, Jose Manuel; Galan, Inma; Segura, Remedios; Martinez, Fernando; Martin-Escudero, Juan Carlos; Briongos, Laisa; Marin, Pablo; Lliso, Gloria; Chaves, Felipe Javier; Redon, Josep

2014-01-01

The aim was to identify factors related to the risk of developing microalbuminuria using combined genomic and metabolomic values from a general population study. One thousand five hundred and two Caucasian subjects older than 18 years, representative of the general population, were included. Blood pressure was measured, and the albumin/creatinine ratio (ACR) was measured in a urine sample. Using SNPlex, 1251 SNPs potentially associated with urinary albumin excretion (UAE) were analyzed. The serum metabolomic profile was assessed by 1H NMR spectra using a Bruker Avance DRX 600 spectrometer. From the total population, 1217 subjects (mean age 54±19, 50.6% men, ACR>30 mg/g in 81 subjects) with high genotyping call rate were analysed. A characteristic metabolomic profile, which included products from mitochondrial and extramitochondrial metabolism as well as branched amino acids and their derivative signals, was observed in microalbuminuric as compared with normoalbuminuric subjects. The comparison of the metabolomic profile between subjects with different UAE status for each of the genotypes associated with microalbuminuria revealed two SNPs, the rs10492025_TT of the RPH3A gene and the rs4359_CC of the ACE gene, with minimal or no statistically significant differences. Subjects with and without microalbuminuria, who shared the same genotype and metabolomic profile, differed in age. Microalbuminurics with the CC genotype of the rs4359 polymorphism and with the TT genotype of the rs10492025 polymorphism were seven years older and seventeen years younger, respectively, as compared with all microalbuminuric subjects. With the same metabolomic environment, characteristic of subjects with microalbuminuria, the TT genotype of the rs10492025 polymorphism seems to increase and the CC genotype of the rs4359 polymorphism seems to reduce the risk of developing microalbuminuria. PMID:24918908
Clinical Application of Genomic Profiling With Circulating Tumor DNA for Management of Advanced Non-Small-cell Lung Cancer in Asia.

PubMed

Loong, Herbert H; Raymond, Victoria M; Shiotsu, Yukimasa; Chua, Daniel T T; Teo, Peter M L; Yung, Tony; Skrzypczak, Stan; Lanman, Richard B; Mok, Tony S K

2018-05-07

Genomic profiling of cell-free circulating tumor DNA (ctDNA) is a potential alternative to repeat invasive biopsy in patients with advanced cancer. We report the first real-world cohort of comprehensive genomic assessments of patients with non-small-cell lung cancer (NSCLC) in a Chinese population. We performed a retrospective analysis of patients with advanced or metastatic NSCLC whose physician requested ctDNA-based genomic profiling using the Guardant360 platform from January 2016 to June 2017. Guardant360 includes all 4 major types of genomic alterations (point mutations, insertion-deletion alterations, fusions, and amplifications) in 73 genes. Genomic profiling was performed in 76 patients from Hong Kong during the 18-month study period (median age, 59.5 years; 41 men and 35 women). The histologic types included adenocarcinoma (n = 10), NSCLC, not otherwise specified (n = 58), and squamous cell carcinoma (n = 8). In the adenocarcinoma and NSCLC, not otherwise specified, combined group, 62 of the 68 patients (91%) had variants identified (range, 1-12; median, 3), of whom, 26 (42%) had ≥ 1 of the 7 National Comprehensive Cancer Network-recommended lung adenocarcinoma genomic targets. Concurrent detection of driver and resistance mutations were identified in 6 of 13 patients with EGFR driver mutations and in 3 of 5 patients with EML4-ALK fusions. All 8 patients with squamous cell carcinoma had multiple variants identified (range, 1-20; median, 6), including FGFR1 amplification and ERBB2 (HER2) amplification. PIK3CA amplification occurred in combination with either FGFR1 or ERBB2 (HER2) amplification or alone. Genomic profiling using ctDNA analysis detected alterations in most patients with advanced-stage NSCLC, with targetable aberrations and resistance mechanisms identified. This approach has demonstrated its feasibility in Asia. Copyright © 2018 Elsevier Inc. All rights reserved.

Integrating Genomics Into Clinical Pediatric Oncology Using the Molecular Tumor Board at the Memorial Sloan Kettering Cancer Center.

PubMed

Ortiz, Michael V; Kobos, Rachel; Walsh, Michael; Slotkin, Emily K; Roberts, Stephen; Berger, Michael F; Hameed, Meera; Solit, David; Ladanyi, Marc; Shukla, Neerav; Kentsis, Alex

2016-08-01

Pediatric oncologists have begun to leverage tumor genetic profiling to match patients with targeted therapies. At the Memorial Sloan Kettering Cancer Center (MSKCC), we developed the Pediatric Molecular Tumor Board (PMTB) to track, integrate, and interpret clinical genomic profiling and potential targeted therapeutic recommendations. This retrospective case series includes all patients reviewed by the MSKCC PMTB from July 2014 to June 2015. Cases were submitted by treating oncologists and potential treatment recommendations were based upon the modified guidelines of the Oxford Centre for Evidence-Based Medicine. There were 41 presentations of 39 individual patients during the study period. Gliomas, acute myeloid leukemia, and neuroblastoma were the most commonly reviewed cases. Thirty nine (87%) of the 45 molecular sequencing profiles utilized hybrid-capture targeted genome sequencing. In 30 (73%) of the 41 presentations, the PMTB provided therapeutic recommendations, of which 19 (46%) were implemented. Twenty-one (70%) of the recommendations involved targeted therapies. Three (14%) targeted therapy recommendations had published evidence to support the proposed recommendations (evidence levels 1-2), eight (36%) recommendations had preclinical evidence (level 3), and 11 (50%) recommendations were based upon hypothetical biological rationales (level 4). The MSKCC PMTB enabled a clinically relevant interpretation of genomic profiling. Effective use of clinical genomics is anticipated to require new and improved tools to ascribe pathogenic significance and therapeutic actionability. The development of specific rule-driven clinical protocols will be needed for the incorporation and evaluation of genomic and molecular profiling in interventional prospective clinical trials. © 2016 Wiley Periodicals, Inc.
Analysis of Protein-DNA Interaction by Chromatin Immunoprecipitation and DNA Tiling Microarray (ChIP-on-chip).

PubMed

Gao, Hui; Zhao, Chunyan

2018-01-01

Chromatin immunoprecipitation (ChIP) has become the most effective and widely used tool to study the interactions between specific proteins or modified forms of proteins and a genomic DNA region. Combined with genome-wide profiling technologies, such as microarray hybridization (ChIP-on-chip) or massively parallel sequencing (ChIP-seq), ChIP could provide a genome-wide mapping of in vivo protein-DNA interactions in various organisms. Here, we describe a protocol of ChIP-on-chip that uses tiling microarray to obtain a genome-wide profiling of ChIPed DNA.
Inferring genome-wide interplay landscape between DNA methylation and transcriptional regulation.

PubMed

Tang, Binhua; Wang, Xin

2015-01-01

DNA methylation and transcriptional regulation play important roles in cancer cell development and differentiation processes. Based on the currently available cell line profiling information from the ENCODE Consortium, we propose a Bayesian inference model to infer and construct genome-wide interaction landscape between DNA methylation and transcriptional regulation, which sheds light on the underlying complex functional mechanisms important within the human cancer and disease context. For the first time, we select all the currently available cell lines (>=20) and transcription factors (>=80) profiling information from the ENCODE Consortium portal. Through the integration of those genome-wide profiling sources, our genome-wide analysis detects multiple functional loci of interest, and indicates that DNA methylation is cell- and region-specific, due to the interplay mechanisms with transcription regulatory activities. We validate our analysis results with the corresponding RNA-sequencing technique for those detected genomic loci. Our results provide novel and meaningful insights for the interplay mechanisms of transcriptional regulation and gene expression for the human cancer and disease studies.
Genome-wide mapping and analysis of aryl hydrocarbon receptor (AHR)- and aryl hydrocarbon receptor repressor (AHRR)-binding sites in human breast cancer cells.

PubMed

Yang, Sunny Y; Ahmed, Shaimaa; Satheesh, Somisetty V; Matthews, Jason

2018-01-01

The aryl hydrocarbon receptor (AHR) mediates the toxic actions of environmental contaminants, such as 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), and also plays roles in vascular development, the immune response, and cell cycle regulation. The AHR repressor (AHRR) is an AHR-regulated gene and a negative regulator of AHR; however, the mechanisms of AHRR-dependent repression of AHR are unclear. In this study, we compared the genome-wide binding profiles of AHR and AHRR in MCF-7 human breast cancer cells treated for 24 h with TCDD using chromatin immunoprecipitation followed by next-generation sequencing (ChIP-Seq). We identified 3915 AHR- and 2811 AHRR-bound regions, of which 974 (35%) were common to both datasets. When these 24-h datasets were also compared with AHR-bound regions identified after 45 min of TCDD treatment, 67% (1884) of AHRR-bound regions overlapped with those of AHR. This analysis identified 994 unique AHRR-bound regions. AHRR-bound regions mapped closer to promoter regions when compared with AHR-bound regions. The AHRE was identified and overrepresented in AHR:AHRR-co-bound regions, AHR-only regions, and AHRR-only regions. Candidate unique AHR- and AHRR-bound regions were validated by ChIP-qPCR and their ability to regulate gene expression was confirmed by luciferase reporter gene assays. Overall, this study reveals that AHR and AHRR exhibit similar but also distinct genome-wide binding profiles, supporting the notion that AHRR is a context- and gene-specific repressor of AHR activity.
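
The overlap figures quoted in this abstract (regions common to the AHR and AHRR datasets) come down to counting genomic intervals in one set that intersect an interval in another. The Python sketch below illustrates that counting step only. The function name, the half-open coordinate convention, and the toy regions are assumptions made for illustration; they are not the authors' pipeline or data.

```python
from collections import defaultdict


def count_overlapping(query, reference):
    """Count query regions that overlap at least one reference region.

    Regions are (chrom, start, end) tuples; coordinates are assumed
    to be half-open, i.e. start is included and end is excluded.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in reference:
        by_chrom[chrom].append((start, end))

    hits = 0
    for chrom, start, end in query:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < r_end and r_start < end for r_start, r_end in by_chrom[chrom]):
            hits += 1
    return hits


# Hypothetical toy regions, not data from the study.
ahrr_regions = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
ahr_regions = [("chr1", 150, 250), ("chr2", 300, 400)]

shared = count_overlapping(ahrr_regions, ahr_regions)
fraction_shared = shared / len(ahrr_regions)
```

The linear scan per query region keeps the sketch short; for peak sets of several thousand regions a sorted or tree-based interval index would be the more usual choice.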
Blastomere biopsy influences epigenetic reprogramming during early embryo development, which impacts neural development and function in resulting mice.

PubMed

Wu, Yibo; Lv, Zhuo; Yang, Yang; Dong, Guoying; Yu, Yang; Cui, Yiqiang; Tong, Man; Wang, Liu; Zhou, Zuomin; Zhu, Hui; Zhou, Qi; Sha, Jiahao

2014-05-01

Blastomere biopsy is used in preimplantation genetic diagnosis; however, the long-term implications on the offspring are poorly characterized. We previously reported a high risk of memory defects in adult biopsied mice. Here, we assessed nervous function of aged biopsied mice and further investigated the mechanism of neural impairment after biopsy. We found that aged biopsied mice had poorer spatial learning ability, increased neuron degeneration, and altered expression of proteins involved in neural degeneration or dysfunction in the brain compared to aged control mice. Furthermore, the MeDIP assay indicated a genome-wide low methylation in the brains of adult biopsied mice when compared to the controls, and most of the genes containing differentially methylated loci in promoter regions were associated with neural disorders. When we further compared the genomic DNA methylation profiles of 7.5-days postconception (dpc) embryos between the biopsy and control group, we found the whole genome low methylation in the biopsied group, suggesting that blastomere biopsy was an obstacle to de novo methylation during early embryo development. Further analysis on mRNA profiles of 4.5-dpc embryos indicated that reduced expression of de novo methylation genes in biopsied embryos may impact de novo methylation. In conclusion, we demonstrate an abnormal neural development and function in mice generated after blastomere biopsy. The impaired epigenetic reprogramming during early embryo development may be the latent mechanism contributing to the impairment of the nervous system in the biopsied mice, which results in a hypomethylation status in their brains.
Genome-wide alterations of the DNA replication program during tumor progression

NASA Astrophysics Data System (ADS)

Arneodo, A.; Goldar, A.; Argoul, F.; Hyrien, O.; Audit, B.

2016-08-01

Oncogenic stress is a major driving force in the early stages of cancer development. Recent experimental findings reveal that, in precancerous lesions and cancers, activated oncogenes may induce stalling and dissociation of DNA replication forks resulting in DNA damage. Replication timing is emerging as an important epigenetic feature that recapitulates several genomic, epigenetic and functional specificities of even closely related cell types. There is increasing evidence that chromosome rearrangements, the hallmark of many cancer genomes, are intimately associated with the DNA replication program and that epigenetic replication timing changes often precede chromosomal rearrangements. The recent development of a novel methodology to map replication fork polarity using deep sequencing of Okazaki fragments has provided new and complementary genome-wide replication profiling data. We review the results of a wavelet-based multi-scale analysis of genomic and epigenetic data including replication profiles along human chromosomes. These results provide new insight into the spatio-temporal replication program and its dynamics during differentiation. Here our goal is to bring to cancer research the experimental protocols and computational methodologies for replication program profiling, and also the modeling of the spatio-temporal replication program. To illustrate our purpose, we report very preliminary results obtained for chronic myelogenous leukemia, the archetype model of cancer. Finally, we discuss promising perspectives on using genome-wide DNA replication profiling as a novel efficient tool for cancer diagnosis, prognosis and personalized treatment.
PARRoT- a homology-based strategy to quantify and compare RNA-sequencing from non-model organisms.

PubMed

Gan, Ruei-Chi; Chen, Ting-Wen; Wu, Timothy H; Huang, Po-Jung; Lee, Chi-Ching; Yeh, Yuan-Ming; Chiu, Cheng-Hsun; Huang, Hsien-Da; Tang, Petrus

2016-12-22

Next-generation sequencing promises the de novo genomic and transcriptomic analysis of samples of interest. However, only a few organisms have reference genomic sequences, and even fewer have well-defined or curated annotations. For transcriptome studies focusing on organisms lacking proper reference genomes, the common strategy is de novo assembly followed by functional annotation. However, things become even more complicated when multiple transcriptomes are compared. Here, we propose a new analysis strategy and quantification methods for expression levels that not only generate a virtual reference from sequencing data, but also provide comparisons between transcriptomes. First, all reads from the transcriptome datasets are pooled together for de novo assembly. The assembled contigs are searched against NCBI NR databases to find potential homolog sequences. Based on the searched result, a set of virtual transcripts is generated and serves as a reference transcriptome. By using the same reference, normalized quantification values including RC (read counts), eRPKM (estimated RPKM) and eTPM (estimated TPM) can be obtained that are comparable across transcriptome datasets. In order to demonstrate the feasibility of our strategy, we implement it in the web service PARRoT. PARRoT stands for Pipeline for Analyzing RNA Reads of Transcriptomes. It analyzes gene expression profiles for two transcriptome sequencing datasets. For better understanding of the biological meaning from the comparison among transcriptomes, PARRoT further provides linkage between these virtual transcripts and their potential function by showing best hits in SwissProt and the NR database and by assigning GO terms. Our demo datasets showed that PARRoT can analyze two paired-end transcriptomic datasets of approximately 100 million reads within just three hours.
In this study, we proposed and implemented a strategy to analyze transcriptomes from non-reference organisms which offers the opportunity to quantify and compare transcriptome profiles through a homolog-based virtual transcriptome reference. By using the homolog-based reference, our strategy effectively avoids the problems that may arise from inconsistencies among transcriptomes. This strategy will shed light on the field of comparative genomics for non-model organisms. We have implemented PARRoT as a web service which is freely available at http://parrot.cgu.edu.tw.
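
The abstract names RPKM- and TPM-style values computed against a shared reference but does not spell out the formulas. As a hedged illustration, the Python sketch below computes the standard RPKM and TPM definitions from per-transcript read counts and transcript lengths. It is not PARRoT's "estimated" eRPKM/eTPM procedure, whose details the abstract does not give, and the transcript names and numbers are hypothetical.

```python
def rpkm(counts, lengths):
    """Reads per kilobase of transcript per million mapped reads.

    counts: transcript -> mapped read count
    lengths: transcript -> transcript length in base pairs
    """
    total_reads = sum(counts.values())
    # 1e9 combines the per-kilobase (1e3) and per-million-reads (1e6) factors.
    return {t: counts[t] * 1e9 / (lengths[t] * total_reads) for t in counts}


def tpm(counts, lengths):
    """Transcripts per million: length-normalized rates rescaled to sum to 1e6."""
    rate = {t: counts[t] / lengths[t] for t in counts}
    scale = sum(rate.values())
    return {t: rate[t] / scale * 1e6 for t in counts}


# Hypothetical virtual transcripts with made-up counts and lengths.
counts = {"vt1": 100, "vt2": 300}
lengths = {"vt1": 1000, "vt2": 3000}

rpkm_values = rpkm(counts, lengths)
tpm_values = tpm(counts, lengths)
```

In this toy case the longer transcript has three times the reads and three times the length, so both measures assign the two transcripts equal values. TPM values always sum to one million within a sample, which is one reason they are often preferred when comparing samples.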
Genome-wide DNA methylation profiles and their relationship with mRNA and the microRNA transcriptome in bovine muscle tissue (Bos Taurine)

USDA-ARS's Scientific Manuscript database

DNA methylation is a key epigenetic modification in mammals, having essential and important roles in muscle development. We sample longissimus thoracis tissues from a well-known elite native breed of Chinese Qinchuan cattle living within comparable environments at fetal and adult stages, using methy...
Genomic characterization of recurrent high-grade astroblastoma.

PubMed

Bale, Tejus A; Abedalthagafi, Malak; Bi, Wenya Linda; Kang, Yun Jee; Merrill, Parker; Dunn, Ian F; Dubuc, Adrian; Charbonneau, Sarah K; Brown, Loreal; Ligon, Azra H; Ramkissoon, Shakti H; Ligon, Keith L

2016-01-01

Astroblastomas are rare primary brain tumors, diagnosed based on histologic features. Not currently assigned a WHO grade, they typically display indolent behavior, with occasional variants taking a more aggressive course. We characterized the immunohistochemical characteristics, copy number (high-resolution array comparative genomic hybridization, OncoCopy) and mutational profile (targeted next-generation exome sequencing, OncoPanel) of a cohort of seven biopsies from four patients to identify recurrent genomic events that may help distinguish astroblastomas from other more common high-grade gliomas. We found that tumor histology was variable across patients and between primary and recurrent tumor samples. No common molecular features were identified among the four tumors. Mutations commonly observed in astrocytic tumors (IDH1/2, TP53, ATRX, and PTEN) or ependymoma were not identified. However one case with rapid clinical progression displayed mutations more commonly associated with GBM (NF1(N1054H/K63)*, PIK3CA(R38H) and ERG(A403T)). Conversely, another case, originally classified as glioblastoma with nine-year survival before recurrence, lacked a GBM mutational profile. Other mutations frequently seen in lower grade gliomas (BCOR, BCORL1, ERBB3, MYB, ATM) were also present in several tumors. Copy number changes were variable across tumors. Our findings indicate that astroblastomas have variable growth patterns and morphologic features, posing significant challenges to accurate classification in the absence of diagnostically specific copy number alterations and molecular features. Their histopathologic overlap with glioblastoma will likely confound the observation of long-term GBM "survivors". Further genomic profiling is needed to determine whether these tumors represent a distinct entity and to guide management strategies. Copyright © 2016 Elsevier Inc. All rights reserved.
Genome-wide identification and characterisation of F-box family in maize.

PubMed

Jia, Fengjuan; Wu, Bingjiang; Li, Hui; Huang, Jinguang; Zheng, Chengchao

2013-11-01

F-box-containing proteins, as the key components of the protein degradation machinery, are widely distributed in higher plants and are considered as one of the largest known families of regulatory proteins. The F-box protein family plays a crucial role in plant growth and development and in response to biotic and abiotic stresses. However, systematic analysis of the F-box family in maize (Zea mays) has not been reported yet. In this paper, we identified and characterised the maize F-box genes in a genome-wide scale, including phylogenetic analysis, chromosome distribution, gene structure, promoter analysis and gene expression profiles. A total of 359 F-box genes were identified and divided into 15 subgroups by phylogenetic analysis. The F-box domain was relatively conserved, whereas additional motifs outside the F-box domain may indicate the functional diversification of maize F-box genes. These genes were unevenly distributed in ten maize chromosomes, suggesting that they expanded in the maize genome because of tandem and segmental duplication events. The expression profiles suggested that the maize F-box genes had temporal and spatial expression patterns. Putative cis-acting regulatory DNA elements involved in abiotic stresses were observed in maize F-box gene promoters. The gene expression profiles under abiotic stresses also suggested that some genes participated in stress responsive pathways. Furthermore, ten genes were chosen for quantitative real-time PCR analysis under drought stress and the results were consistent with the microarray data. This study has produced a comparative genomics analysis of the maize ZmFBX gene family that can be used in further studies to uncover their roles in maize growth and development.
Pattern analysis uncovers a chronic ethanol-induced disruption of the switch-like dynamics of C/EBP-β and C/EBP-α genome-wide binding during liver regeneration

PubMed Central

Kuttippurathu, Lakshmi; Patra, Biswanath; Cook, Daniel; Hoek, Jan B.

2017-01-01

Chronic ethanol intake impairs liver regeneration through a system-wide alteration in the regulatory networks driving the response to injury. Our study focused on the initial phase of response to 2/3rd partial hepatectomy (PHx) to investigate how adaptation to chronic ethanol intake affects the genome-wide binding profiles of the transcription factors C/EBP-β and C/EBP-α. These factors participate in complementary and often opposing functions for maintaining cellular differentiation, regulating metabolism, and governing cell growth during liver regeneration. We analyzed ChIP-seq data with a comparative pattern count (COMPACT) analysis, which exhaustively enumerates temporal patterns of discretized binding profiles to identify dominant as well as subtle patterns that may not be apparent from conventional clustering analyses. We found that adaptation to chronic ethanol intake significantly alters the genome-wide binding profile of C/EBP-β and C/EBP-α before and following PHx. A subset of these ethanol-induced changes include C/EBP-β binding to promoters of genes involved in the profibrogenic transforming growth factor-β pathway, and both C/EBP-β and C/EBP-α binding to promoters of genes involved in the cell cycle, apoptosis, homeostasis, and metabolic processes. The shift in C/EBP binding loci, coupled with an ethanol-induced increase in C/EBP-β binding at 6 h post-resection, indicates that ethanol adaptation may change both the amount and nature of C/EBP binding postresection. Taken together, our results suggest that chronic ethanol consumption leads to a spatially and temporally reorganized activity at many genomic loci, resulting in a shift in the dynamic balance and coordination of cellular processes underlying regenerative response. PMID:27815535
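
The abstract describes the pattern-count analysis as exhaustively enumerating temporal patterns of discretized binding profiles. The Python sketch below shows one way that idea could look: each profile is discretized into -1/0/+1 per time point, and every possible pattern is tallied, including patterns that never occur. The threshold, the three-level discretization, and the example values are assumptions for illustration and do not reproduce the published COMPACT implementation.

```python
from collections import Counter
from itertools import product


def discretize(profile, threshold=1.0):
    """Map each value to +1 (up), -1 (down) or 0 (unchanged).

    The threshold is an assumed cutoff on, for example, a log fold change.
    """
    return tuple(
        1 if v >= threshold else -1 if v <= -threshold else 0 for v in profile
    )


def pattern_counts(profiles, n_timepoints, threshold=1.0):
    """Count genes per discretized pattern over all 3**n_timepoints patterns."""
    observed = Counter(discretize(p, threshold) for p in profiles.values())
    # Enumerate every possible pattern so that unobserved ones appear with count 0.
    return {
        pattern: observed.get(pattern, 0)
        for pattern in product((-1, 0, 1), repeat=n_timepoints)
    }


# Hypothetical binding changes at two time points for three genes.
profiles = {"geneA": [1.5, 0.2], "geneB": [2.0, -0.1], "geneC": [-1.2, 1.3]}
counts = pattern_counts(profiles, n_timepoints=2)
```

Listing all patterns, rather than only the observed ones, is what lets rare or absent patterns be compared directly between conditions, which a clustering approach would tend to merge or drop.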
NGSmethDB 2017: enhanced methylomes and differential methylation.

PubMed

Lebrón, Ricardo; Gómez-Martín, Cristina; Carpena, Pedro; Bernaola-Galván, Pedro; Barturen, Guillermo; Hackenberg, Michael; Oliver, José L

2017-01-04

The 2017 update of NGSmethDB stores whole genome methylomes generated from short-read data sets obtained by bisulfite sequencing (WGBS) technology. To generate high-quality methylomes, stringent quality controls were integrated with third-party software, adding also a two-step mapping process to exploit the advantages of the new genome assembly models. The samples were all profiled under constant parameter settings, thus enabling comparative downstream analyses. Besides a significant increase in the number of samples, NGSmethDB now includes two additional data-types, which are a valuable resource for the discovery of methylation epigenetic biomarkers: (i) differentially methylated single-cytosines; and (ii) methylation segments (i.e. genome regions of homogeneous methylation). The NGSmethDB back-end is now based on MongoDB, a NoSQL hierarchical database using JSON-formatted documents and dynamic schemas, thus accelerating sample comparative analyses. Besides conventional database dumps, track hubs were implemented, which improved database access, visualization in genome browsers and comparative analyses to third-party annotations. In addition, the database can also be accessed through a RESTful API. Lastly, a Python client and a multiplatform virtual machine allow for program-driven access from the user's desktop. This way, private methylation data can be compared to NGSmethDB without the need to upload them to public servers. Database website: http://bioinfo2.ugr.es/NGSmethDB. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Comparative genomics of enterohemorrhagic Escherichia coli O145:H28 demonstrates a common evolutionary lineage with Escherichia coli O157:H7

PubMed Central

2014-01-01

Background: Although serotype O157:H7 is the predominant enterohemorrhagic Escherichia coli (EHEC), outbreaks of non-O157 EHEC that cause severe foodborne illness, including hemolytic uremic syndrome, have increased worldwide. In fact, non-O157 serotypes are now estimated to cause over half of all the Shiga toxin-producing Escherichia coli (STEC) cases, and outbreaks of non-O157 EHEC infections are frequently associated with serotypes O26, O45, O103, O111, O121, and O145. Currently, there are no complete genomes for O145 in public databases. Results: We determined the complete genome sequences of two O145 strains (EcO145), one linked to a US lettuce-associated outbreak (RM13514) and one to a Belgium ice-cream-associated outbreak (RM13516). Both strains contain one chromosome and two large plasmids, with genome sizes of 5,737,294 bp for RM13514 and 5,559,008 bp for RM13516. Comparative analysis of the two EcO145 genomes revealed a large core (5,173 genes) and a considerable amount of strain-specific genes. Additionally, the two EcO145 genomes display distinct chromosomal architecture, virulence gene profile, phylogenetic origin of Stx2a prophage, and methylation profile (methylome). Comparative analysis of EcO145 genomes to other completely sequenced STEC and other E. coli and Shigella genomes revealed that, unlike any other known non-O157 EHEC strain, EcO145 ascended from a common lineage with EcO157/EcO55. This evolutionary relationship was further supported by the pangenome analysis of the 10 EHEC strains. Of the 4,192 EHEC core genes, EcO145 shares more genes with EcO157 than with any other non-O157 EHEC strains. Conclusions: Our data provide evidence that EcO145 and EcO157 evolved from a common lineage, but ultimately each serotype evolves via a lineage-independent nature to EHEC by acquisition of the core set of EHEC virulence factors, including the genes encoding Shiga toxin and the large virulence plasmid.
The large variation between the two EcO145 genomes suggests a distinctive evolutionary path between the two outbreak strains. The distinct methylome between the two EcO145 strains is likely due to the presence of a BsuBI/PstI methyltransferase gene cassette in the Stx2a prophage of the strain RM13514, suggesting a role of horizontal gene transfer-mediated epigenetic alteration in the evolution of individual EHEC strains. PMID:24410921
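
The "core" and "strain-specific" gene counts reported in this abstract are, at heart, set operations over per-genome gene inventories. The Python sketch below shows those operations on toy data. It assumes genes have already been grouped into shared family identifiers (a step that in practice requires ortholog clustering), and the strain labels and gene names are placeholders, not the study's data.

```python
def core_and_specific(gene_sets):
    """Split gene families into the shared core and per-strain unique sets.

    gene_sets: strain name -> set of gene-family identifiers
    Returns (core, specific), where core is the intersection of all strains
    and specific[strain] holds families found in that strain only.
    """
    core = set.intersection(*gene_sets.values())
    specific = {}
    for name, genes in gene_sets.items():
        # Union of every other strain's families.
        others = set().union(*(g for n, g in gene_sets.items() if n != name))
        specific[name] = genes - others
    return core, specific


# Hypothetical inventories with placeholder gene-family names.
gene_sets = {
    "strainA": {"g1", "g2", "g3", "g4"},
    "strainB": {"g1", "g2", "g5"},
    "strainC": {"g1", "g2", "g3", "g6"},
}

core, specific = core_and_specific(gene_sets)
```

Families present in some but not all strains (here g3) fall into neither the core nor a strain-specific set; pangenome studies usually report them as the accessory or dispensable genome.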
Inter-individual variability and genetic influences on cytokine responses against bacterial and fungal pathogens

PubMed Central

Li, Yang; Oosting, Marije; Deelen, Patrick; Ricaño-Ponce, Isis; Smeekens, Sanne; Jaeger, Martin; Matzaraki, Vasiliki; Swertz, Morris A.; Xavier, Ramnik J.; Franke, Lude; Wijmenga, Cisca; Joosten, Leo A.B.; Kumar, Vinod; Netea, Mihai G.

2016-01-01

Little is known about the inter-individual variation of cytokine responses to different pathogens in healthy individuals. To systematically describe cytokine responses elicited by distinct pathogens, and to determine the impact of genetic variation on cytokine production, we profiled cytokines produced by peripheral blood mononuclear cells from 197 individuals of European origin from the 200 Functional Genomics (200FG) cohort within the Human Functional Genomics Study (www.humanfunctionalgenomics.org), obtained over three different years. By comparing bacteria- and fungi-induced cytokine profiles, we show that most cytokine responses are organized around a physiological response to specific pathogens, rather than around a particular immune pathway or cytokine. We then correlated genome-wide SNP genotypes with cytokine abundance and identified six cytokine QTLs. Among them, a cytokine QTL at NAA35-GOLM1 locus markedly modulates IL-6 production in response to multiple pathogens, and associated with susceptibility to candidemia. Furthermore, the cytokine QTLs we identified are enriched among SNPs previously associated with infectious diseases and heart diseases. These data reveal and begin to explain the variability in cytokine production by human immune cells in response to pathogens. PMID:27376574
Genome-Wide Expression Profiling of Five Mouse Models Identifies Similarities and Differences with Human Psoriasis

PubMed Central

Swindell, William R.; Johnston, Andrew; Carbajal, Steve; Han, Gangwen; Wohn, Christian; Lu, Jun; Xing, Xianying; Nair, Rajan P.; Voorhees, John J.; Elder, James T.; Wang, Xiao-Jing; Sano, Shigetoshi; Prens, Errol P.; DiGiovanni, John; Pittelkow, Mark R.; Ward, Nicole L.; Gudjonsson, Johann E.

2011-01-01

Development of a suitable mouse model would facilitate the investigation of pathomechanisms underlying human psoriasis and would also assist in development of therapeutic treatments. However, while many psoriasis mouse models have been proposed, no single model recapitulates all features of the human disease, and standardized validation criteria for psoriasis mouse models have not been widely applied. In this study, whole-genome transcriptional profiling is used to compare gene expression patterns manifested by human psoriatic skin lesions with those that occur in five psoriasis mouse models (K5-Tie2, imiquimod, K14-AREG, K5-Stat3C and K5-TGFbeta1). While the cutaneous gene expression profiles associated with each mouse phenotype exhibited statistically significant similarity to the expression profile of psoriasis in humans, each model displayed distinctive sets of similarities and differences in comparison to human psoriasis. For all five models, correspondence to the human disease was strong with respect to genes involved in epidermal development and keratinization. Immune and inflammation-associated gene expression, in contrast, was more variable between models as compared to the human disease. These findings support the value of all five models as research tools, each with identifiable areas of convergence to and divergence from the human disease. Additionally, the approach used in this paper provides an objective and quantitative method for evaluation of proposed mouse models of psoriasis, which can be strategically applied in future studies to score strengths of mouse phenotypes relative to specific aspects of human psoriasis. PMID:21483750
Comparative Genomics of a Plant-Parasitic Nematode Endosymbiont Suggest a Role in Nutritional Symbiosis.

PubMed

Brown, Amanda M V; Howe, Dana K; Wasala, Sulochana K; Peetz, Amy B; Zasada, Inga A; Denver, Dee R

2015-09-10

Bacterial mutualists can modulate the biochemical capacity of animals. Highly coevolved nutritional mutualists do this by synthesizing nutrients missing from the host's diet. Genomics tools have advanced the study of these partnerships. Here we examined the endosymbiont Xiphinematobacter (phylum Verrucomicrobia) from the dagger nematode Xiphinema americanum, a migratory ectoparasite of numerous crops that also vectors nepovirus. Previously, this endosymbiont was identified in the gut, ovaries, and eggs, but its role was unknown. We explored the potential role of this symbiont using fluorescence in situ hybridization, genome sequencing, and comparative functional genomics. We report the first genome of an intracellular Verrucomicrobium and the first exclusively intracellular non-Wolbachia nematode symbiont. Results revealed that Xiphinematobacter had a small 0.916-Mb genome with only 817 predicted proteins, resembling genomes of other mutualist endosymbionts. Compared with free-living relatives, conserved proteins were shorter on average, and there was large-scale loss of regulatory pathways. Despite massive gene loss, more genes were retained for biosynthesis of amino acids predicted to be essential to the host. Gene ontology enrichment tests showed enrichment for biosynthesis of arginine, histidine, and aromatic amino acids, as well as thiamine and coenzyme A, diverging from the profiles of relatives Akkermansia muciniphila (in the human colon), Methylacidiphilum infernorum, and the mutualist Wolbachia from filarial nematodes. Together, these features and the location in the gut suggest that Xiphinematobacter functions as a nutritional mutualist, supplementing essential nutrients that are depleted in the nematode diet. This pattern points to evolutionary convergence with endosymbionts found in sap-feeding insects. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
The chimeric nature of the genomes of marine magnetotactic coccoid-ovoid bacteria defines a novel group of Proteobacteria.

PubMed

Ji, Boyang; Zhang, Sheng-Da; Zhang, Wei-Jia; Rouy, Zoe; Alberto, François; Santini, Claire-Lise; Mangenot, Sophie; Gagnot, Séverine; Philippe, Nadège; Pradel, Nathalie; Zhang, Lichen; Tempel, Sébastien; Li, Ying; Médigue, Claudine; Henrissat, Bernard; Coutinho, Pedro M; Barbe, Valérie; Talla, Emmanuel; Wu, Long-Fei

2017-03-01

Magnetotactic bacteria (MTB) are a group of phylogenetically and physiologically diverse Gram-negative bacteria that synthesize intracellular magnetic crystals named magnetosomes. MTB are affiliated with three classes of the phylum Proteobacteria, with the phyla Nitrospirae and Omnitrophica, and probably with the candidate phylum Latescibacteria. The evolutionary origin and physiological diversity of MTB compared with other bacterial taxonomic groups remain to be elucidated. Here, we analysed the genome of the marine magneto-ovoid strain MO-1 and found that it is closely related to Magnetococcus marinus MC-1. Detailed analyses of the ribosomal proteins and whole proteomes of 390 genomes reveal that, among the Proteobacteria analysed, only MO-1 and MC-1 have coding sequences (CDSs) with a similarly high proportion of origins from Alphaproteobacteria, Betaproteobacteria, Deltaproteobacteria and Gammaproteobacteria. Interestingly, a comparative metabolic network analysis with anoxic network enzymes from sequenced MTB and non-MTB successfully allows the eventual prediction of an organism with a metabolic profile compatible with magnetosome production. Altogether, our genomic analysis reveals multiple origins of the MO-1 and M. marinus MC-1 genomes and suggests a metabolism-restriction model for explaining whether a bacterium could become an MTB upon acquisition of magnetosome-encoding genes. © 2016 Society for Applied Microbiology and John Wiley & Sons Ltd.
The First Genomic and Proteomic Characterization of a Deep-Sea Sulfate Reducer: Insights into the Piezophilic Lifestyle of Desulfovibrio piezophilus

PubMed Central

Pradel, Nathalie; Ji, Boyang; Gimenez, Grégory; Talla, Emmanuel; Lenoble, Patricia; Garel, Marc; Tamburini, Christian; Fourquet, Patrick; Lebrun, Régine; Bertin, Philippe; Denis, Yann; Pophillat, Matthieu; Barbe, Valérie; Ollivier, Bernard; Dolla, Alain

2013-01-01

Desulfovibrio piezophilus strain C1TLV30T is a piezophilic anaerobe that was isolated from wood falls in the Mediterranean deep-sea. D. piezophilus represents a unique model for studying the adaptation of sulfate-reducing bacteria to hydrostatic pressure. Here, we report the 3.6 Mbp genome sequence of this piezophilic bacterium. An analysis of the genome revealed the presence of seven genomic islands as well as gene clusters that are most likely linked to life at a high hydrostatic pressure. Comparative genomics and differential proteomics identified the transport of solutes and amino acids as well as amino acid metabolism as major cellular processes for the adaptation of this bacterium to hydrostatic pressure. In addition, the proteome profiles showed that the abundance of key enzymes that are involved in sulfate reduction was dependent on hydrostatic pressure. A comparative analysis of orthologs from the non-piezophilic marine bacterium D. salexigens and D. piezophilus identified aspartic acid, glutamic acid, lysine, asparagine, serine and tyrosine as the amino acids preferentially replaced by arginine, histidine, alanine and threonine in the piezophilic strain. This work reveals the adaptation strategies developed by a sulfate reducer to a deep-sea lifestyle. PMID:23383081
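The ortholog comparison described in this abstract (residues in D. salexigens that are replaced by other residues in D. piezophilus) can be illustrated with a small tally. This is a minimal sketch, not the authors' pipeline: it assumes the orthologs have already been pairwise aligned with "-" as the gap character, and the function names and toy sequences are hypothetical.

```python
from collections import Counter


def substitution_counts(aligned_pairs):
    """Count residue replacements (reference -> piezophile) over aligned ortholog pairs."""
    counts = Counter()
    for ref_seq, piezo_seq in aligned_pairs:
        if len(ref_seq) != len(piezo_seq):
            raise ValueError("aligned sequences must have equal length")
        for ref_aa, piezo_aa in zip(ref_seq, piezo_seq):
            # Skip gap columns and conserved positions.
            if ref_aa == "-" or piezo_aa == "-" or ref_aa == piezo_aa:
                continue
            counts[(ref_aa, piezo_aa)] += 1
    return counts


def net_change(counts):
    """Net gain (positive) or loss (negative) of each residue in the piezophile."""
    net = Counter()
    for (ref_aa, piezo_aa), n in counts.items():
        net[ref_aa] -= n
        net[piezo_aa] += n
    return dict(net)


# Toy alignments (made up for illustration, not real protein sequences).
toy_pairs = [("MDEKS", "MRHKA"), ("NY-T", "RTAT")]
counts = substitution_counts(toy_pairs)
net = net_change(counts)
```

Residues with a strongly negative net value are the ones preferentially replaced, and those with a positive value are the ones preferentially gained; a real analysis would also need a statistical test against background composition.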
Small RNA-based prediction of hybrid performance in maize.

PubMed

Seifert, Felix; Thiemann, Alexander; Schrag, Tobias A; Rybka, Dominika; Melchinger, Albrecht E; Frisch, Matthias; Scholten, Stefan

2018-05-21

Small RNA (sRNA) sequences are known to have a broad impact on gene regulation by various mechanisms. Their performance for the prediction of hybrid traits has not yet been analyzed. Our objective was to analyze the relation of parental sRNA expression with the performance of their hybrids, to develop an sRNA-based prediction approach, and to compare it to more common SNP and mRNA transcript based predictions using a factorial mating scheme of a maize hybrid breeding program. Correlation of genomic differences and messenger RNA (mRNA) or sRNA expression differences between parental lines with the performance of their hybrids revealed that sRNAs showed an inverse relationship in contrast to the other two data types. We associated differences for SNPs, mRNA and sRNA expression between parental inbred lines with the performance of their hybrid combinations and developed two prediction approaches using distance measures based on associated markers. Cross-validations revealed parental differences in sRNA expression to be strong predictors for hybrid performance for grain yield in maize, comparable to genomic and mRNA data. The integration of both positively and negatively associated markers in the prediction approaches enhanced the prediction accuracy. The associated sRNAs belong predominantly to the canonical size classes of 22- and 24-nt that show specific genomic mapping characteristics. Expression profiles of sRNA are a promising alternative to SNPs or mRNA expression profiles for hybrid prediction, especially for plant species without reference genome or transcriptome information. The characteristics of the sRNAs we identified suggest that association studies based on breeding populations facilitate the identification of sRNAs involved in hybrid performance.
Expansion of banana (Musa acuminata) gene families involved in ethylene biosynthesis and signalling after lineage-specific whole-genome duplications.

PubMed

Jourda, Cyril; Cardi, Céline; Mbéguié-A-Mbéguié, Didier; Bocs, Stéphanie; Garsmeur, Olivier; D'Hont, Angélique; Yahiaoui, Nabila

2014-05-01

Whole-genome duplications (WGDs) are widespread in plants, and three lineage-specific WGDs occurred in the banana (Musa acuminata) genome. Here, we analysed the impact of WGDs on the evolution of banana gene families involved in ethylene biosynthesis and signalling, a key pathway for banana fruit ripening. Banana ethylene pathway genes were identified using comparative genomics approaches and their duplication modes and expression profiles were analysed. Seven out of 10 banana ethylene gene families evolved through WGD and four of them (1-aminocyclopropane-1-carboxylate synthase (ACS), ethylene-insensitive 3-like (EIL), ethylene-insensitive 3-binding F-box (EBF) and ethylene response factor (ERF)) were preferentially retained. Banana orthologues of AtEIN3 and AtEIL1, two major genes for ethylene signalling in Arabidopsis, were particularly expanded. This expansion was paralleled by that of EBF genes which are responsible for control of EIL protein levels. Gene expression profiles in banana fruits suggested functional redundancy for several MaEBF and MaEIL genes derived from WGD and subfunctionalization for some of them. We propose that EIL and EBF genes were co-retained after WGD in banana to maintain balanced control of EIL protein levels and thus avoid detrimental effects of constitutive ethylene signalling. In the course of evolution, subfunctionalization was favoured to promote finer control of ethylene signalling. © 2014 CIRAD New Phytologist © 2014 New Phytologist Trust.

Genome-wide alteration of 5-hydroxymethylcytosine in a mouse model of fragile X-associated tremor/ataxia syndrome.

PubMed

Yao, Bing; Lin, Li; Street, R Craig; Zalewski, Zachary A; Galloway, Jocelyn N; Wu, Hao; Nelson, David L; Jin, Peng

2014-02-15

Fragile X-associated tremor/ataxia syndrome (FXTAS) is a late-onset neurodegenerative disorder in which patients carry premutation alleles of 55-200 CGG repeats in the FMR1 gene. To date, whether alterations in epigenetic regulation modulate FXTAS has gone unexplored. 5-Hydroxymethylcytosine (5hmC) converted from 5-methylcytosine (5mC) by the ten-eleven translocation (TET) family of proteins has been found recently to play key roles in neuronal functions. Here, we undertook genome-wide profiling of cerebellar 5hmC in a FXTAS mouse model (rCGG mice) and found that rCGG mice at 16 weeks showed overall reduced 5hmC levels genome-wide compared with age-matched wild-type littermates. However, we also observed gain-of-5hmC regions in repetitive elements, as well as in cerebellum-specific enhancers, but not in general enhancers. Genomic annotation and motif prediction of wild-type- and rCGG-specific differential 5-hydroxymethylated regions (DhMRs) revealed their high correlation with genes and transcription factors that are important in neuronal developmental and functional pathways. DhMR-associated genes partially overlapped with genes that were differentially associated with ribosomes in CGG mice identified by bacTRAP ribosomal profiling. Taken together, our data strongly indicate a functional role for 5hmC-mediated epigenetic modulation in the etiology of FXTAS, possibly through the regulation of transcription.
Chemical genomic profiling via barcode sequencing to predict compound mode of action

PubMed Central

Piotrowski, Jeff S.; Simpkins, Scott W.; Li, Sheena C.; Deshpande, Raamesh; McIlwain, Sean; Ong, Irene; Myers, Chad L.; Boone, Charlie; Andersen, Raymond J.

2015-01-01

Summary Chemical genomics is an unbiased, whole-cell approach to characterizing novel compounds to determine mode of action and cellular target. Our version of this technique is built upon barcoded deletion mutants of Saccharomyces cerevisiae and has been adapted to a high-throughput methodology using next-generation sequencing. Here we describe the steps to generate a chemical genomic profile from a compound of interest, and how to use this information to predict molecular mechanism and targets of bioactive compounds. PMID:25618354
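To make the idea of a chemical genomic profile concrete, the sketch below turns barcode read counts into one score per deletion mutant. It is only an illustration under stated assumptions: the scoring (log2 ratio of depth-normalized counts with a pseudocount) is a common generic choice and is not taken from the protocol above, and the strain names and counts are hypothetical.

```python
import math


def log2_fold_changes(treated, control, pseudocount=1.0):
    """Return log2(treated fraction / control fraction) for every barcoded strain."""
    strains = sorted(set(treated) | set(control))
    # Normalize each condition to its own sequencing depth (pseudocount included).
    t_total = sum(treated.get(s, 0) + pseudocount for s in strains)
    c_total = sum(control.get(s, 0) + pseudocount for s in strains)
    profile = {}
    for s in strains:
        t_frac = (treated.get(s, 0) + pseudocount) / t_total
        c_frac = (control.get(s, 0) + pseudocount) / c_total
        profile[s] = math.log2(t_frac / c_frac)
    return profile


# Hypothetical barcode counts for three deletion mutants.
control_counts = {"mutant_A": 99, "mutant_B": 99, "mutant_C": 99}
treated_counts = {"mutant_A": 99, "mutant_B": 99, "mutant_C": 9}
profile = log2_fold_changes(treated_counts, control_counts)
most_sensitive = min(profile, key=profile.get)
```

Strongly negative scores mark mutants depleted by the compound; the vector of scores across all mutants is the profile that would then be compared against reference profiles to suggest a mode of action.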
Phenotypic and genomic comparison of Mycobacterium aurum and surrogate model species to Mycobacterium tuberculosis: implications for drug discovery.

PubMed

Namouchi, Amine; Cimino, Mena; Favre-Rochex, Sandrine; Charles, Patricia; Gicquel, Brigitte

2017-07-13

Tuberculosis (TB) is caused by Mycobacterium tuberculosis and represents one of the major challenges facing drug discovery initiatives worldwide. The considerable rise in bacterial drug resistance in recent years has led to the need for new drugs and drug regimens. Model systems are regularly used to speed up the drug discovery process and circumvent biosafety issues associated with manipulating M. tuberculosis. These include the use of strains such as Mycobacterium smegmatis and Mycobacterium marinum that can be handled in biosafety level 2 facilities, making high-throughput screening feasible. However, each of these model species has its own limitations. We report and describe the first complete genome sequence of Mycobacterium aurum ATCC23366, an environmental mycobacterium that can also grow in the gut of humans and animals as part of the microbiota. This species shows a comparable resistance profile to that of M. tuberculosis for several anti-TB drugs. The aims of this study were to (i) determine the drug resistance profile of a recently proposed model species, Mycobacterium aurum, strain ATCC23366, for anti-TB drug discovery as well as Mycobacterium smegmatis and Mycobacterium marinum; (ii) sequence and annotate the complete genome sequence of this species obtained using Pacific Biosciences technology; (iii) perform comparative genomics analyses of the various surrogate strains with M. tuberculosis; and (iv) discuss how the choice of the surrogate model used for drug screening can affect the drug discovery process. We describe the complete genome sequence of M. aurum, a surrogate model for anti-tuberculosis drug discovery. Most of the genes already reported to be associated with drug resistance are shared between all the surrogate strains and M. tuberculosis. We consider that M. aurum might be used in high-throughput screening for tuberculosis drug discovery. We also highly recommend the use of different model species during the drug discovery screening process.
Pulmonary Sarcomatoid Carcinomas Commonly Harbor Either Potentially Targetable Genomic Alterations or High Tumor Mutational Burden as Observed by Comprehensive Genomic Profiling.

PubMed

Schrock, Alexa B; Li, Shuyu D; Frampton, Garrett M; Suh, James; Braun, Eduardo; Mehra, Ranee; Buck, Steven C; Bufill, Jose A; Peled, Nir; Karim, Nagla Abdel; Hsieh, K Cynthia; Doria, Manuel; Knost, James; Chen, Rong; Ou, Sai-Hong Ignatius; Ross, Jeffrey S; Stephens, Philip J; Fishkin, Paul; Miller, Vincent A; Ali, Siraj M; Halmos, Balazs; Liu, Jane J

2017-06-01

Pulmonary sarcomatoid carcinoma (PSC) is a high-grade NSCLC characterized by poor prognosis and resistance to chemotherapy. Development of targeted therapeutic strategies for PSC has been hampered because of limited and inconsistent molecular characterization. Hybrid capture-based comprehensive genomic profiling was performed on DNA from formalin-fixed paraffin-embedded sections of 15,867 NSCLCs, including 125 PSCs (0.8%). Tumor mutational burden (TMB) was calculated from 1.11 megabases (Mb) of sequenced DNA. The median age of the patients with PSC was 67 years (range 32-87), 58% were male, and 78% had stage IV disease. Tumor protein p53 gene (TP53) genomic alterations (GAs) were identified in 74% of cases, which had genomics distinct from TP53 wild-type cases, and 62% featured a GA in KRAS (34%) or one of seven genes currently recommended for testing in the National Comprehensive Cancer Network NSCLC guidelines, including the following: hepatocyte growth factor receptor gene (MET) (13.6%), EGFR (8.8%), BRAF (7.2%), erb-b2 receptor tyrosine kinase 2 gene (HER2) (1.6%), and ret proto-oncogene (RET) (0.8%). MET exon 14 alterations were enriched in PSC (12%) compared with non-PSC NSCLCs (∼3%) (p < 0.0001) and were more prevalent in PSC cases with an adenocarcinoma component. The fraction of PSC with a high TMB (>20 mutations per Mb) was notably higher than in non-PSC NSCLC (20% versus 14%, p = 0.056). Of nine patients with PSC treated with targeted or immunotherapies, three had partial responses and three had stable disease. Potentially targetable GAs in National Comprehensive Cancer Network NSCLC genes (30%) or intermediate or high TMB (43%, >10 mutations per Mb) were identified in most of the PSC cases. Thus, the use of comprehensive genomic profiling in clinical care may provide important treatment options for a historically poorly characterized and difficult to treat disease. Copyright © 2017 International Association for the Study of Lung Cancer. 
Published by Elsevier Inc. All rights reserved.
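The tumor mutational burden figures in this abstract reduce to simple arithmetic: mutations divided by the 1.11 Mb of sequenced DNA, with cut-offs at more than 10 and more than 20 mutations per Mb. The sketch below assumes that "intermediate" means above 10 and up to 20, and the "low" label for everything else is an assumption of this example; which variants count toward the numerator is not specified here.

```python
def tumor_mutational_burden(mutation_count, sequenced_mb=1.11):
    """Mutations per megabase of sequenced DNA (1.11 Mb is the value stated in the abstract)."""
    if sequenced_mb <= 0:
        raise ValueError("sequenced_mb must be positive")
    return mutation_count / sequenced_mb


def classify_tmb(tmb):
    """Bin a TMB value using the >20 and >10 mutations/Mb thresholds from the abstract."""
    if tmb > 20:
        return "high"
    if tmb > 10:
        return "intermediate"
    return "low"  # label assumed for this sketch


# Hypothetical mutation counts for three tumors.
labels = [classify_tmb(tumor_mutational_burden(n)) for n in (25, 12, 5)]
```

With these toy counts the three tumors fall into the high, intermediate, and low bins respectively (about 22.5, 10.8, and 4.5 mutations per Mb).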
Systems perspectives on erythromycin biosynthesis by comparative genomic and transcriptomic analyses of S. erythraea E3 and NRRL23338 strains

PubMed Central

2013-01-01

Background S. erythraea is a Gram-positive filamentous bacterium used for the industrial-scale production of erythromycin A, which is of high clinical importance. In this work, we sequenced the whole genome of a high-producing strain (E3) obtained by random mutagenesis and screening from the wild-type strain NRRL23338, and examined time-series expression profiles of both E3 and NRRL23338. Based on the genomic and transcriptomic data of these two strains, we carried out a comparative analysis of the high-producing and wild-type strains at both the genomic and the transcriptomic level. Results We observed a large number of genetic variants, including 60 insertions, 46 deletions and 584 single nucleotide variations (SNV), in E3 in comparison with NRRL23338, and the analysis of time-series transcriptomic data indicated that the genes involved in erythromycin biosynthesis and feeder pathways were significantly up-regulated during the 60-hour time course. According to our data, BldD, a previously identified ery cluster regulator, did not show any positive correlation with the expression of the ery cluster, suggesting the existence of alternative regulation mechanisms of erythromycin synthesis in S. erythraea. Several potential regulators were then proposed by integrated analysis of genomic and transcriptomic data. Conclusion This is a demonstration of functional comparative genomics between an industrial S. erythraea strain and the wild-type strain. These findings help to understand the global regulation mechanisms of erythromycin biosynthesis in S. erythraea, providing useful clues for genetic and metabolic engineering in the future. PMID:23902230
Genomic profiling of dedifferentiated liposarcoma compared to matched well-differentiated liposarcoma reveals higher genomic complexity and a common origin

PubMed Central

Beird, Hannah C.; Wu, Chia-Chin; Ingram, Davis R.; Wang, Wei-Lien; Alimohamed, Asrar; Gumbs, Curtis; Little, Latasha; Song, Xingzhi; Feig, Barry W.; Roland, Christina L.; Zhang, Jianhua; Benjamin, Robert S.; Hwu, Patrick; Lazar, Alexander J.; Futreal, P. Andrew; Somaiah, Neeta

2018-01-01

Well-differentiated (WD) liposarcoma is a low-grade mesenchymal tumor with features of mature adipocytes and high propensity for local recurrence. Often, WD patients present with or later progress to a higher-grade nonlipogenic form known as dedifferentiated (DD) liposarcoma. These DD tumors behave more aggressively and can metastasize. Both WD and DD liposarcomas harbor neochromosomes formed from amplifications and rearrangements of Chr 12q that encode oncogenes (MDM2, CDK4, and YEATS2) and adipocytic differentiation factors (HMGA2 and CPM). However, genomic changes associated with progression from WD to DD have not been well-defined. Therefore, we selected patients with matched WD and DD tumors for extensive genomic profiling in order to understand their clonal relationships and to delineate any defining alterations for each entity. Exome and transcriptomic sequencing was performed for 17 patients with both WD and DD diagnoses. Somatic point and copy-number alterations were integrated with transcriptional analyses to determine subtype-associated genomic features and pathways. The results were, on average, that only 8.3% of somatic mutations in WD liposarcoma were shared with their cognate DD component. DD tumors had higher numbers of somatic copy-number losses, amplifications involving Chr 12q, and fusion transcripts than WD tumors. HMGA2 and CPM rearrangements occur more frequently in DD components. The shared somatic mutations indicate a clonal origin for matched WD and DD tumors and show early divergence with ongoing genomic instability due to continual generation and selection of neochromosomes. Stochastic generation and subsequent expression of fusion transcripts from the neochromosome that involve adipogenesis genes such as HMGA2 and CPM may influence the differentiation state of the subsequent tumor. PMID:29610390
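The "8.3% shared on average" statistic in this abstract is a per-patient set overlap followed by a mean. The sketch below shows that calculation under assumptions of this example: each somatic mutation is keyed as a (chromosome, position, ref, alt) tuple, and the patient identifiers and variants are invented.

```python
def shared_fraction(wd_mutations, dd_mutations):
    """Fraction of WD somatic mutations that are also found in the matched DD tumor."""
    wd = set(wd_mutations)
    if not wd:
        raise ValueError("WD mutation set is empty")
    return len(wd & set(dd_mutations)) / len(wd)


def mean_shared_fraction(patients):
    """Average the per-patient shared fraction over all matched WD/DD pairs."""
    fractions = [shared_fraction(wd, dd) for wd, dd in patients.values()]
    return sum(fractions) / len(fractions)


# Hypothetical matched pairs: patient -> (WD mutations, DD mutations).
patients = {
    "P1": (
        [("12", 100, "A", "T"), ("1", 5, "G", "C"), ("3", 9, "C", "T"), ("7", 2, "T", "G")],
        [("12", 100, "A", "T"), ("9", 44, "G", "A")],
    ),
    "P2": (
        [("2", 10, "C", "A"), ("5", 77, "G", "T")],
        [("8", 3, "A", "G")],
    ),
}
average = mean_shared_fraction(patients)
```

Here P1 shares one of four WD mutations (0.25) and P2 shares none, so the average is 0.125; averaging per-patient fractions weights each patient equally regardless of mutation count.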
Independent evolution of neurotoxin and flagellar genetic loci in proteolytic Clostridium botulinum

PubMed Central

Carter, Andrew T; Paul, Catherine J; Mason, David R; Twine, Susan M; Alston, Mark J; Logan, Susan M; Austin, John W; Peck, Michael W

2009-01-01

Background Proteolytic Clostridium botulinum is the causative agent of botulism, a severe neuroparalytic illness. Given the severity of botulism, surprisingly little is known of the population structure, biology, phylogeny or evolution of C. botulinum. The recent determination of the genome sequence of C. botulinum has allowed comparative genomic indexing using a DNA microarray. Results Whole genome microarray analysis revealed that 63% of the coding sequences (CDSs) present in reference strain ATCC 3502 were common to all 61 widely-representative strains of proteolytic C. botulinum and the closely related C. sporogenes tested. This indicates a relatively stable genome. There was, however, evidence for recombination and genetic exchange, in particular within the neurotoxin gene and cluster (including transfer of neurotoxin genes to C. sporogenes), and the flagellar glycosylation island (FGI). These two loci appear to have evolved independently from each other, and from the remainder of the genetic complement. A number of strains were atypical; for example, while 10 out of 14 strains that formed type A1 toxin gave almost identical profiles in whole genome, neurotoxin cluster and FGI analyses, the other four strains showed divergent properties. Furthermore, a new neurotoxin sub-type (A5) has been discovered in strains from heroin-associated wound botulism cases. For the first time, differences in glycosylation profiles of the flagella could be linked to differences in the gene content of the FGI. Conclusion Proteolytic C. botulinum has a stable genome backbone containing specific regions of genetic heterogeneity. These include the neurotoxin gene cluster and the FGI, each having evolved independently of each other and the remainder of the genetic complement. Analysis of these genetic components provides a high degree of discrimination of strains of proteolytic C. botulinum, and is suitable for clinical and forensic investigations of botulism outbreaks. PMID:19298644
Independent evolution of neurotoxin and flagellar genetic loci in proteolytic Clostridium botulinum.

PubMed

Carter, Andrew T; Paul, Catherine J; Mason, David R; Twine, Susan M; Alston, Mark J; Logan, Susan M; Austin, John W; Peck, Michael W

2009-03-19

Proteolytic Clostridium botulinum is the causative agent of botulism, a severe neuroparalytic illness. Given the severity of botulism, surprisingly little is known of the population structure, biology, phylogeny or evolution of C. botulinum. The recent determination of the genome sequence of C. botulinum has allowed comparative genomic indexing using a DNA microarray. Whole genome microarray analysis revealed that 63% of the coding sequences (CDSs) present in reference strain ATCC 3502 were common to all 61 widely-representative strains of proteolytic C. botulinum and the closely related C. sporogenes tested. This indicates a relatively stable genome. There was, however, evidence for recombination and genetic exchange, in particular within the neurotoxin gene and cluster (including transfer of neurotoxin genes to C. sporogenes), and the flagellar glycosylation island (FGI). These two loci appear to have evolved independently from each other, and from the remainder of the genetic complement. A number of strains were atypical; for example, while 10 out of 14 strains that formed type A1 toxin gave almost identical profiles in whole genome, neurotoxin cluster and FGI analyses, the other four strains showed divergent properties. Furthermore, a new neurotoxin sub-type (A5) has been discovered in strains from heroin-associated wound botulism cases. For the first time, differences in glycosylation profiles of the flagella could be linked to differences in the gene content of the FGI. Proteolytic C. botulinum has a stable genome backbone containing specific regions of genetic heterogeneity. These include the neurotoxin gene cluster and the FGI, each having evolved independently of each other and the remainder of the genetic complement. Analysis of these genetic components provides a high degree of discrimination of strains of proteolytic C. botulinum, and is suitable for clinical and forensic investigations of botulism outbreaks.
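The "63% of reference CDSs common to all 61 strains" figure is a core-genome fraction: the share of reference coding sequences detected in every tested strain. The sketch below assumes the microarray calls have already been reduced to a set of present CDS identifiers per strain; the identifiers and strain names are hypothetical.

```python
def core_fraction(reference_cds, strain_cds):
    """Return (fraction, core set) of reference CDSs present in every tested strain."""
    ref = set(reference_cds)
    if not ref:
        raise ValueError("reference CDS set is empty")
    core = set(ref)
    for present in strain_cds.values():
        # Keep only CDSs that this strain also carries.
        core &= set(present)
    return len(core) / len(ref), core


# Toy presence calls for two strains against a four-CDS reference.
reference = ["cds1", "cds2", "cds3", "cds4"]
strains = {
    "strain_A": {"cds1", "cds2", "cds3"},
    "strain_B": {"cds1", "cds3", "cds4", "extra9"},
}
fraction, core = core_fraction(reference, strains)
```

In the toy data only cds1 and cds3 are present everywhere, giving a core fraction of 0.5; CDSs absent from the reference (such as "extra9") are ignored because the array can only report on reference probes.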
CNS germinomas are characterized by global demethylation, chromosomal instability and mutational activation of the Kit-, Ras/Raf/Erk- and Akt-pathways

PubMed Central

Schulte, Simone Laura; Waha, Andreas; Steiger, Barbara; Denkhaus, Dorota; Dörner, Evelyn; Calaminus, Gabriele; Leuschner, Ivo; Pietsch, Torsten

2016-01-01

CNS germinomas represent a unique germ cell tumor entity characterized by undifferentiated tumor cells and a high response rate to current treatment protocols. Limited information is available on their underlying genomic, epigenetic and biological alterations. We performed a genome-wide analysis of genomic copy number alterations in 49 CNS germinomas by molecular inversion profiling. In addition, CpG dinucleotide methylation was studied by immunohistochemistry for methylated cytosine residues. Mutational analysis was performed by resequencing of candidate genes including KIT and RAS family members. Ras/Erk and Akt pathway activation was analyzed by immunostaining with antibodies against phospho-Erk, phospho-Akt, phospho-mTOR and phospho-S6. All germinomas coexpressed Oct4 and Kit but showed an extensive global DNA demethylation compared to other tumors and normal tissues. Molecular inversion profiling showed predominant genomic instability in all tumors with a high frequency of regional gains and losses including high level gene amplifications. Activating mutations of KIT exons 11, 13, and 17 as well as a case with genomic KIT amplification and activating mutations or amplifications of RAS gene family members including KRAS, NRAS and RRAS2 indicated mutational activation of crucial signaling pathways. Co-activation of Ras/Erk and Akt pathways was present in 83% of germinomas. These data suggest that CNS germinoma cells display a demethylated nuclear DNA similar to primordial germ cells in early development. This finding coincides strikingly with extensive genomic instability. In addition, mutational activation of Kit-, Ras/Raf/Erk- and Akt- pathways indicates the biological importance of these pathways and their components as potential targets for therapy. PMID:27391150
Canine urothelial carcinoma: genomically aberrant and comparatively relevant

PubMed Central

Shapiro, S. G.; Raghunath, S.; Williams, C.; Motsinger-Reif, A. A.; Cullen, J. M.; Liu, T.; Albertson, D.; Ruvolo, M.; Lucas, A. Bergstrom; Jin, J.; Knapp, D. W.; Schiffman, J. D.

2015-01-01

Urothelial carcinoma (UC), also referred to as transitional cell carcinoma (TCC), is the most common bladder malignancy in both human and canine populations. In human UC, numerous studies have demonstrated the prevalence of chromosomal imbalances. Although the histopathology of the disease is similar in both species, studies evaluating the genomic profile of canine UC are lacking, limiting the discovery of key comparative molecular markers associated with driving UC pathogenesis. In the present study, we evaluated 31 primary canine UC biopsies by oligonucleotide array comparative genomic hybridization (oaCGH). Results highlighted the presence of three highly recurrent numerical aberrations: gain of dog chromosome (CFA) 13 and 36 and loss of CFA 19. Regional gains of CFA 13 and 36 were present in 97% and 84% of cases, respectively, and losses on CFA 19 were present in 77% of cases. Fluorescence in situ hybridization (FISH), using targeted bacterial artificial chromosome (BAC) clones and custom Agilent SureFISH probes, was performed to detect and quantify these regions in paraffin-embedded biopsy sections and urine-derived urothelial cells. The data indicate that these three aberrations are potentially diagnostic of UC. Comparison of our canine oaCGH data with that of 285 human cases identified a series of shared copy number aberrations. Using an informatics approach to interrogate the frequency of copy number aberrations across both species, we identified those that had the highest joint probability of association with UC. The most significant joint region contained the gene PABPC1, which should be considered further for its role in UC progression. In addition, cross-species filtering of genome-wide copy number data highlighted several genes as high-profile candidates for further analysis, including CDKN2A, S100A8/9, and LRP1B. 
We propose that these common aberrations are indicative of an evolutionarily conserved mechanism of pathogenesis and harbor genes key to urothelial neoplasia, warranting investigation for diagnostic, prognostic, and therapeutic applications. PMID:25783786
Different DNA methylation patterns detected by the Amplified Methylation Polymorphism Polymerase Chain Reaction (AMP PCR) technique among various cell types of bulls.

PubMed

Phutikanit, Nawapen; Suwimonteerabutr, Junpen; Harrison, Dion; D'Occhio, Michael; Carroll, Bernie; Techakumphu, Mongkol

2010-03-05

The purpose of this study was to apply an arbitrarily primed methylation-sensitive polymerase chain reaction (PCR) assay called Amplified Methylation Polymorphism Polymerase Chain Reaction (AMP PCR) to investigate the methylation profiles of somatic and germ cells obtained from Holstein bulls. Genomic DNA was extracted from sperm, leukocytes and fibroblasts obtained from three bulls and digested with a methylation-sensitive endonuclease (HpaII). The native genomic and enzyme-treated DNA samples were used as templates in an arbitrarily primed PCR assay with 30 sets of single short oligonucleotide primers. The PCR products were separated on silver-stained denaturing polyacrylamide gels. Three types of PCR markers (digestion-resistant, digestion-sensitive, and digestion-dependent) were analyzed based on the presence/absence polymorphism of the markers between the two templates. Approximately 1,000 PCR markers per sample were produced from 27 sets of primers, and most of them (>90%) were digestion-resistant markers. The highest percentage of digestion-resistant markers was found in leukocytic DNA (94.8%) and the lowest in fibroblastic DNA (92.3%, P ≤ 0.05). Spermatozoa contained a higher number of digestion-sensitive markers when compared with the others (3.6% vs. 2.2% and 2.6% in leukocytes and fibroblasts respectively, P ≤ 0.05). The strength of the AMP PCR assay was its generation of methylation-associated markers without any prior knowledge of the genomic sequence. The data obtained from different primers provided an overview of genome-wide DNA methylation content in different cell types. By using this technique, we found that the DNA methylation profile is tissue-specific. Male germ cells were hypomethylated at the HpaII locations when compared with somatic cells, while the chromatin of the well-characterized somatic cells was heavily methylated when compared with that of the versatile somatic cells.
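The three marker classes in this abstract are defined by whether a band appears with the native template, the HpaII-digested template, or both. The sketch below encodes one plausible reading of those definitions, which is an assumption of this example rather than something the abstract spells out: resistant means present in both, sensitive means present only in native DNA, and dependent means present only after digestion. The band calls are invented.

```python
from collections import Counter
from typing import Optional


def classify_marker(in_native: bool, in_digested: bool) -> Optional[str]:
    """Assign a band to a marker class from its presence in the two templates."""
    if in_native and in_digested:
        return "digestion-resistant"
    if in_native:
        return "digestion-sensitive"   # band lost after HpaII digestion
    if in_digested:
        return "digestion-dependent"   # band appears only after digestion
    return None                        # no band in either lane, so not a marker


def marker_percentages(bands):
    """Percentage of scored markers falling into each class."""
    labels = (classify_marker(native, digested) for native, digested in bands)
    counts = Counter(label for label in labels if label is not None)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


# Hypothetical band calls as (present in native, present in digested).
bands = [(True, True)] * 8 + [(True, False), (False, True), (False, False)]
percentages = marker_percentages(bands)
```

With the toy calls, 8 of 10 scored bands are digestion-resistant (80%), and one each is sensitive and dependent (10% each); the position with no band in either lane is excluded from the denominator.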
Canine urothelial carcinoma: genomically aberrant and comparatively relevant.

PubMed

Shapiro, S G; Raghunath, S; Williams, C; Motsinger-Reif, A A; Cullen, J M; Liu, T; Albertson, D; Ruvolo, M; Bergstrom Lucas, A; Jin, J; Knapp, D W; Schiffman, J D; Breen, M

2015-06-01

Urothelial carcinoma (UC), also referred to as transitional cell carcinoma (TCC), is the most common bladder malignancy in both human and canine populations. In human UC, numerous studies have demonstrated the prevalence of chromosomal imbalances. Although the histopathology of the disease is similar in both species, studies evaluating the genomic profile of canine UC are lacking, limiting the discovery of key comparative molecular markers associated with driving UC pathogenesis. In the present study, we evaluated 31 primary canine UC biopsies by oligonucleotide array comparative genomic hybridization (oaCGH). Results highlighted the presence of three highly recurrent numerical aberrations: gain of dog chromosome (CFA) 13 and 36 and loss of CFA 19. Regional gains of CFA 13 and 36 were present in 97 % and 84 % of cases, respectively, and losses on CFA 19 were present in 77 % of cases. Fluorescence in situ hybridization (FISH), using targeted bacterial artificial chromosome (BAC) clones and custom Agilent SureFISH probes, was performed to detect and quantify these regions in paraffin-embedded biopsy sections and urine-derived urothelial cells. The data indicate that these three aberrations are potentially diagnostic of UC. Comparison of our canine oaCGH data with that of 285 human cases identified a series of shared copy number aberrations. Using an informatics approach to interrogate the frequency of copy number aberrations across both species, we identified those that had the highest joint probability of association with UC. The most significant joint region contained the gene PABPC1, which should be considered further for its role in UC progression. In addition, cross-species filtering of genome-wide copy number data highlighted several genes as high-profile candidates for further analysis, including CDKN2A, S100A8/9, and LRP1B. 
We propose that these common aberrations are indicative of an evolutionarily conserved mechanism of pathogenesis and harbor genes key to urothelial neoplasia, warranting investigation for diagnostic, prognostic, and therapeutic applications.
Comparative Transcriptomic Analysis of Two Brassica napus Near-Isogenic Lines Reveals a Network of Genes That Influences Seed Oil Accumulation.

PubMed

Wang, Jingxue; Singh, Sanjay K; Du, Chunfang; Li, Chen; Fan, Jianchun; Pattanaik, Sitakanta; Yuan, Ling

2016-01-01

Rapeseed (Brassica napus) is an important oil seed crop, providing more than 13% of the world's supply of edible oils. An in-depth knowledge of the gene network involved in biosynthesis and accumulation of seed oil is critical for the improvement of B. napus. Using available genomic and transcriptomic resources, we identified 1,750 acyl-lipid metabolism (ALM) genes that are distributed over 19 chromosomes in the B. napus genome. B. rapa and B. oleracea, two diploid progenitors of B. napus, contributed almost equally to the ALM genes. Genome collinearity analysis demonstrated that the majority of the ALM genes have arisen due to genome duplication or segmental duplication events. In addition, we profiled the expression patterns of the ALM genes in four different developmental stages. Furthermore, we developed two B. napus near isogenic lines (NILs). The high oil NIL, YC13-559, accumulates significantly more (∼10%) seed oil than the other, YC13-554. Comparative gene expression analysis revealed upregulation of lipid biosynthesis-related regulatory genes in YC13-559, including SHOOTMERISTEMLESS, LEAFY COTYLEDON 1 (LEC1), LEC2, FUSCA3, ABSCISIC ACID INSENSITIVE 3 (ABI3), ABI4, ABI5, and WRINKLED1, as well as structural genes, such as ACETYL-CoA CARBOXYLASE, ACYL-CoA DIACYLGLYCEROL ACYLTRANSFERASE, and LONG-CHAIN ACYL-CoA SYNTHETASES. We observed that several genes related to the phytohormones gibberellins, jasmonate, and indole acetic acid were differentially expressed in the NILs. Our findings provide a broad account of the numbers, distribution, and expression profiles of acyl-lipid metabolism genes, as well as gene networks that potentially control oil accumulation in B. napus seeds. The upregulation of key regulatory and structural genes related to lipid biosynthesis likely plays a major role in the increased seed oil in YC13-559.
Genome-Wide Characterization of Major Intrinsic Proteins in Four Grass Plants and Their Non-Aqua Transport Selectivity Profiles with Comparative Perspective

PubMed Central

Azad, Abul Kalam; Ahmed, Jahed; Alum, Md. Asraful; Hasan, Md. Mahbub; Ishikawa, Takahiro; Sawa, Yoshihiro; Katsuhara, Maki

2016-01-01

Major intrinsic proteins (MIPs), commonly known as aquaporins, transport not only water in plants but also other substrates of physiological significance and heavy metals. In most of the higher plants, MIPs are divided into five subfamilies (PIPs, TIPs, NIPs, SIPs and XIPs). Herein, we identified 68, 42, 38 and 28 full-length MIPs, respectively in the genomes of four monocot grass plants, specifically Panicum virgatum, Setaria italica, Sorghum bicolor and Brachypodium distachyon. Phylogenetic analysis showed that the grass plants had only four MIP subfamilies including PIPs, TIPs, NIPs and SIPs without XIPs. Based on structural analysis of the homology models and comparing the primary selectivity-related motifs [two NPA regions, aromatic/arginine (ar/R) selectivity filter and Froger's positions (FPs)] of all plant MIPs that have been experimentally proven to transport non-aqua substrates, we predicted the transport profiles of all MIPs in the four grass plants and also in eight other plants. Groups of MIP subfamilies based on ar/R selectivity filter and FPs were linked to the non-aqua transport profiles. We further deciphered the substrate selectivity profiles of the MIPs in the four grass plants and compared them with their counterparts in rice, maize, soybean, poplar, cotton, Arabidopsis thaliana, Physcomitrella patens and Selaginella moellendorffii. In addition to two NPA regions, ar/R filter and FPs, certain residues, especially in loops B and C, contribute to the functional distinctiveness of MIP groups. Expression analysis of transcripts in different organs indicated that non-aqua transport was related to expression of MIPs since most of the unexpressed MIPs were not predicted to facilitate the transport of non-aqua molecules. Among all MIPs in every plant, TIP (BdTIP1;1, SiTIP1;2, SbTIP2;1 and PvTIP1;2) had the overall highest mean expression. 
Our study generates significant information for understanding the diversity, evolution, non-aqua transport profiles and insight into comparative transport selectivity of plant MIPs, and provides tools for the development of transgenic plants. PMID:27327960
Network-assisted target identification for haploinsufficiency and homozygous profiling screens

PubMed Central

Wang, Sheng

2017-01-01

Chemical genomic screens have recently emerged as a systematic approach to drug discovery on a genome-wide scale. Drug target identification and elucidation of the mechanism of action (MoA) of hits from these noisy high-throughput screens remain difficult. Here, we present GIT (Genetic Interaction Network-Assisted Target Identification), a network analysis method for drug target identification in haploinsufficiency profiling (HIP) and homozygous profiling (HOP) screens. In addition to the drug-induced phenotypic fitness defect caused by deleting a gene, GIT incorporates the fitness defects of the gene’s neighbors in the genetic interaction network. On three genome-scale yeast chemical genomic screens, GIT substantially outperforms previous scoring methods at target identification in both HIP and HOP assays. Finally, we show that by combining HIP and HOP assays, GIT further improves target identification and reveals the potential mechanism of action of drugs. PMID:28574983
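The neighbor-aware scoring idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the GIT method as published: the function name `network_assisted_score`, the mixing weight `alpha`, and the use of a plain mean over neighbor defects are all hypothetical choices made for the example.

```python
from typing import Dict, List


def network_assisted_score(
    fitness_defect: Dict[str, float],
    neighbors: Dict[str, List[str]],
    alpha: float = 0.5,
) -> Dict[str, float]:
    """Add a weighted mean of the neighbors' fitness defects to each gene's own.

    alpha is a hypothetical weight; the published method may combine
    neighbor information differently.
    """
    scores = {}
    for gene, own in fitness_defect.items():
        nbr = [fitness_defect[n] for n in neighbors.get(gene, [])
               if n in fitness_defect]
        nbr_mean = sum(nbr) / len(nbr) if nbr else 0.0
        scores[gene] = own + alpha * nbr_mean
    return scores


# Toy fitness defects and genetic interaction network (invented)
defects = {"A": 2.0, "B": 1.9, "C": 0.1, "D": 1.0}
graph = {"A": ["C"], "B": ["D"], "C": ["A"], "D": ["B"]}

scores = network_assisted_score(defects, graph)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In the toy data, gene A has the largest defect on its own, but gene B moves ahead once its neighbor's defect is taken into account. This is the kind of re-ranking a network-assisted score is meant to produce in a noisy screen.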
Genomics DNA Profiling in Elite Professional Soccer Players: A Pilot Study

PubMed Central

Kambouris, M; Del Buono, A; Maffulli, N

2014-01-01

Functional variants in exonic regions have been associated with development of cardiovascular disease, diabetes and cancer. Athletic performance can be considered a multi-factorial complex phenotype. Genomic DNA was extracted from buccal swabs of seven soccer players from the Fulham football team. Single nucleotide polymorphism (SNP) genotyping was undertaken. To achieve optimal athletic performance, predictive genomics DNA profiling for sports performance can be used to aid in sport selection and the elaboration of personalized training and nutrition programs. Predictive DNA profiling may be able to detect athletes with potential or frank injuries, support the screening and selection of future athletes, and help them to maximize utilization of their potential and improve performance in sports. The aim of this study is to provide a broad picture of the specific genomic variants that an athlete carries, in order to determine which measures should be taken to maximize the athlete’s potential. PMID:24809029
Leveraging non-targeted metabolite profiling via statistical genomics

USDA-ARS's Scientific Manuscript database

One of the challenges of systems biology is to integrate multiple sources of data in order to build a cohesive view of the system of study. Here we describe the mass spectrometry based profiling of maize kernels, a model system for genomic studies and a cornerstone of the agroeconomy. Using a networ...
Identification of a unique library of complex, but ordered, arrays of repetitive elements in the human genome and implication of their potential involvement in pathobiology.

PubMed

Lee, Kang-Hoon; Lee, Young-Kwan; Kwon, Deug-Nam; Chiu, Sophia; Chew, Victoria; Rah, Hyungchul; Kujawski, Gregory; Melhem, Ramzi; Hsu, Karen; Chung, Cecilia; Greenhalgh, David G; Cho, Kiho

2011-06-01

Approximately 2% of the human genome is reported to be occupied by genes. Various forms of repetitive elements (REs), both characterized and uncharacterized, are presumed to make up the vast majority of the rest of the genomes of human and other species. In conjunction with a comprehensive annotation of genes, information regarding components of genome biology, such as gene polymorphisms, non-coding RNAs, and certain REs, is found in human genome databases. However, the genome-wide profile of unique RE arrangements formed by different groups of REs has not yet been fully characterized. In this study, the entire human genome was subjected to an unbiased RE survey to establish a whole-genome profile of REs and their arrangements. Due to the limitation in query size within the bl2seq alignment program (National Center for Biotechnology Information [NCBI]) utilized for the RE survey, the entire NCBI reference human genome was fragmented into 6206 units of 0.5M nucleotides. A number of RE arrangements with varying complexities and patterns were identified throughout the genome. Each chromosome had unique profiles of RE arrangements and density, and high levels of RE density were measured near the centromere regions. Subsequently, 175 complex RE arrangements, which were selected throughout the genome, were subjected to a comparison analysis using five different human genome sequences. Interestingly, three of the five human genome databases shared exactly the same arrangement patterns and sequences for all 175 RE arrangement regions (a total of 12,765,625 nucleotides). The findings from this study demonstrate that a substantial fraction of REs in the human genome are clustered into various forms of ordered structures. Further investigations are needed to examine whether some of these ordered RE arrangements contribute to human pathobiology as functional genome units. Copyright © 2011 Elsevier Inc. All rights reserved.
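The fragmentation step used to work around the alignment program's query-size limit can be sketched as fixed-size windowing. The abstract does not say whether units overlapped or how chromosome ends were handled, so this example assumes non-overlapping units with a shorter final unit. The function name `fragment` is hypothetical.

```python
from typing import Iterator, Tuple


def fragment(sequence: str, unit_size: int = 500_000) -> Iterator[Tuple[int, str]]:
    """Yield (start_offset, chunk) pairs of at most unit_size nucleotides.

    Assumption: non-overlapping windows; the last unit may be shorter.
    """
    for start in range(0, len(sequence), unit_size):
        yield start, sequence[start:start + unit_size]


# Tiny stand-in for a chromosome sequence, with a small unit size for display
toy = "ACGT" * 3  # 12 nucleotides
units = list(fragment(toy, unit_size=5))
print(units)
```

Keeping the start offset with each unit lets any repetitive-element arrangement found inside a unit be mapped back to its coordinate on the original sequence.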
Genome-Wide Expression Profiling of Complex Regional Pain Syndrome

PubMed Central

Jin, Eun-Heui; Zhang, Enji; Ko, Youngkwon; Sim, Woo Seog; Moon, Dong Eon; Yoon, Keon Jung; Hong, Jang Hee; Lee, Won Hyung

2013-01-01

Complex regional pain syndrome (CRPS) is a chronic, progressive, and devastating pain syndrome characterized by spontaneous pain, hyperalgesia, allodynia, altered skin temperature, and motor dysfunction. Although previous gene expression profiling studies have been conducted in animal pain models, genome-wide expression profiling in the whole blood of CRPS patients has not yet been reported. Here, we successfully identified certain pain-related genes through genome-wide expression profiling in the blood from CRPS patients. We found that 80 genes were differentially expressed between 4 CRPS patients (2 CRPS I and 2 CRPS II) and 5 controls (cut-off value: 1.5-fold change and p<0.05). Most of those genes were associated with signal transduction, developmental processes, cell structure and motility, and immunity and defense. The expression levels of major histocompatibility complex class I A subtype (HLA-A29.1), matrix metalloproteinase 9 (MMP9), alanine aminopeptidase N (ANPEP), l-histidine decarboxylase (HDC), granulocyte colony-stimulating factor 3 receptor (G-CSF3R), and signal transducer and activator of transcription 3 (STAT3) genes selected from the microarray were confirmed in 24 CRPS patients and 18 controls by quantitative reverse transcription-polymerase chain reaction (qRT-PCR). We focused on the MMP9 gene, which, by qRT-PCR, showed a statistically significant difference in expression in CRPS patients compared to controls with the highest relative fold change (4.0±1.23 times and p = 1.4×10⁻⁴). The up-regulation of the MMP9 gene in the blood may be related to pain progression in CRPS patients. Our findings, which offer a valuable contribution to the understanding of differential gene expression in CRPS, may help in the understanding of the pathophysiology of CRPS pain progression. PMID:24244504
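The stated cut-off (1.5-fold change and p<0.05) amounts to a two-condition filter, sketched below. The example assumes linear-scale, positive expression means and p-values that have already been computed. Whether the study used a strict or inclusive fold threshold is not stated, so the inclusive comparison here is an assumption, as is the function name.

```python
def is_differentially_expressed(mean_case: float, mean_control: float,
                                p_value: float,
                                fold_cutoff: float = 1.5,
                                p_cutoff: float = 0.05) -> bool:
    """Return True when the gene passes both the fold-change and p-value cut-offs.

    Fold change is taken in either direction, so a 2-fold decrease counts
    the same as a 2-fold increase. Inputs must be positive, linear-scale means.
    """
    ratio = mean_case / mean_control
    fold = ratio if ratio >= 1.0 else 1.0 / ratio
    return fold >= fold_cutoff and p_value < p_cutoff


# Invented values for illustration only
print(is_differentially_expressed(4.0, 1.0, 1.4e-4))  # large change, small p
print(is_differentially_expressed(1.2, 1.0, 0.01))    # change too small
print(is_differentially_expressed(3.0, 1.0, 0.20))    # p too large
```

With only 4 patients and 5 controls in the discovery set, a filter like this produces candidates rather than conclusions, which is why the abstract reports confirmation by qRT-PCR in a larger group.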

Xenopus microRNA genes are predominantly located within introns and are differentially expressed in adult frog tissues via post-transcriptional regulation

PubMed Central

Tang, Guo-Qing; Maxwell, E. Stuart

2008-01-01

The amphibian Xenopus provides a model organism for investigating microRNA expression during vertebrate embryogenesis and development. Searching available Xenopus genome databases using known human pre-miRNAs as query sequences, more than 300 genes encoding 142 Xenopus tropicalis miRNAs were identified. Analysis of Xenopus tropicalis miRNA genes revealed a predominant positioning within introns of protein-coding and nonprotein-coding RNA Pol II-transcribed genes. MiRNA genes were also located in pre-mRNA exons and positioned intergenically between known protein-coding genes. Many miRNA species were found in multiple locations and in more than one genomic context. MiRNA genes were also clustered throughout the genome, indicating the potential for the cotranscription and coordinate expression of miRNAs located in a given cluster. Northern blot analysis confirmed the expression of many identified miRNAs in both X. tropicalis and X. laevis. Comparison of X. tropicalis and X. laevis blots revealed comparable expression profiles, although several miRNAs exhibited species-specific expression in different tissues. More detailed analysis revealed that for some miRNAs, the tissue-specific expression profile of the pri-miRNA precursor was distinctly different from that of the mature miRNA. Differential miRNA precursor processing in both the nucleus and cytoplasm was implicated in the observed tissue-specific differences. These observations indicated that post-transcriptional processing plays an important role in regulating miRNA expression in the amphibian Xenopus. PMID:18032731
Refinement of light-responsive transcript lists using rice oligonucleotide arrays: evaluation of gene-redundancy.

PubMed

Jung, Ki-Hong; Dardick, Christopher; Bartley, Laura E; Cao, Peijian; Phetsom, Jirapa; Canlas, Patrick; Seo, Young-Su; Shultz, Michael; Ouyang, Shu; Yuan, Qiaoping; Frank, Bryan C; Ly, Eugene; Zheng, Li; Jia, Yi; Hsia, An-Ping; An, Kyungsook; Chou, Hui-Hsien; Rocke, David; Lee, Geun Cheol; Schnable, Patrick S; An, Gynheung; Buell, C Robin; Ronald, Pamela C

2008-10-06

Studies of gene function are often hampered by gene-redundancy, especially in organisms with large genomes such as rice (Oryza sativa). We present an approach for using transcriptomics data to focus functional studies and address redundancy. To this end, we have constructed and validated an inexpensive and publicly available rice oligonucleotide near-whole genome array, called the rice NSF45K array. We generated expression profiles for light- vs. dark-grown rice leaf tissue and validated the biological significance of the data by analyzing sources of variation and confirming expression trends with reverse transcription polymerase chain reaction. We examined trends in the data by evaluating enrichment of gene ontology terms at multiple false discovery rate thresholds. To compare data generated with the NSF45K array with published results, we developed publicly available, web-based tools (www.ricearray.org). The Oligo and EST Anatomy Viewer enables visualization of EST-based expression profiling data for all genes on the array. The Rice Multi-platform Microarray Search Tool facilitates comparison of gene expression profiles across multiple rice microarray platforms. Finally, we incorporated gene expression and biochemical pathway data to reduce the number of candidate gene products putatively participating in the eight steps of the photorespiration pathway from 52 to 10, based on expression levels of putatively functionally redundant genes. We confirmed the efficacy of this method to cope with redundancy by correctly predicting participation in photorespiration of a gene with five paralogs. Applying these methods will accelerate rice functional genomics.
Prognostic Classifier Based on Genome-Wide DNA Methylation Profiling in Well-Differentiated Thyroid Tumors.

PubMed

Bisarro Dos Reis, Mariana; Barros-Filho, Mateus Camargo; Marchi, Fábio Albuquerque; Beltrami, Caroline Moraes; Kuasne, Hellen; Pinto, Clóvis Antônio Lopes; Ambatipudi, Srikant; Herceg, Zdenko; Kowalski, Luiz Paulo; Rogatto, Silvia Regina

2017-11-01

Even though the majority of well-differentiated thyroid carcinoma (WDTC) is indolent, a number of cases display aggressive behavior. Cumulative evidence suggests that the deregulation of DNA methylation has the potential to point out molecular markers associated with worse prognosis. The objective of this study was to identify a prognostic epigenetic signature in thyroid cancer. Genome-wide DNA methylation assays (450k platform, Illumina) were performed in a cohort of 50 nonneoplastic thyroid tissues (NTs), 17 benign thyroid lesions (BTLs), and 74 thyroid carcinomas (60 papillary, 8 follicular, 2 Hürthle cell, 1 poorly differentiated, and 3 anaplastic). A prognostic classifier for WDTC was developed via diagonal linear discriminant analysis. The results were compared with The Cancer Genome Atlas (TCGA) database. A specific epigenetic profile was detected according to each histological subtype. BTLs and follicular carcinomas showed a greater number of methylated CpGs in comparison with NTs, whereas hypomethylation was predominant in papillary and undifferentiated carcinomas. A prognostic classifier based on 21 DNA methylation probes was able to predict poor outcome in patients with WDTC (sensitivity 63%, specificity 92% for internal data; sensitivity 64%, specificity 88% for TCGA data). A high-risk score based on the classifier was considered an independent factor of poor outcome (Cox regression, P < 0.001). The methylation profile of thyroid lesions exhibited a specific signature according to the histological subtype. A meaningful algorithm composed of 21 probes was capable of predicting recurrence in WDTC. Copyright © 2017 Endocrine Society
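Diagonal linear discriminant analysis, the classifier type named above, can be sketched in a few lines. This is a generic textbook form under stated assumptions (equal class priors, pooled per-feature variance, no covariance between features), not the authors' 21-probe model. The function names, class labels, and methylation values are invented for the example.

```python
from typing import Dict, List, Sequence, Tuple

Model = Tuple[Dict[str, List[float]], List[float]]


def fit_dlda(X: Sequence[Sequence[float]], y: Sequence[str]) -> Model:
    """Estimate per-class feature means and pooled per-feature variances."""
    classes = sorted(set(y))
    n_features = len(X[0])
    means = {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        means[c] = [sum(r[j] for r in rows) / len(rows) for j in range(n_features)]
    dof = len(X) - len(classes)  # degrees of freedom for the pooled variance
    pooled = [
        sum((x[j] - means[label][j]) ** 2 for x, label in zip(X, y)) / dof
        for j in range(n_features)
    ]
    return means, pooled


def predict_dlda(model: Model, x: Sequence[float]) -> str:
    """Assign x to the class with the smallest variance-scaled squared distance."""
    means, pooled = model

    def distance(c: str) -> float:
        return sum((x[j] - means[c][j]) ** 2 / pooled[j] for j in range(len(x)))

    return min(means, key=distance)


# Invented methylation beta values for two hypothetical probes
X = [[0.10, 0.80], [0.20, 0.90], [0.80, 0.20], [0.90, 0.10]]
y = ["low_risk", "low_risk", "high_risk", "high_risk"]
model = fit_dlda(X, y)
print(predict_dlda(model, [0.25, 0.70]))
print(predict_dlda(model, [0.70, 0.30]))
```

Ignoring covariances between probes keeps the number of estimated parameters small, which is a common reason to prefer the diagonal variant when there are few samples relative to the number of features.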
Genetic Progression of High Grade Prostatic Intraepithelial Neoplasia to Prostate Cancer.

PubMed

Jung, Seung-Hyun; Shin, Sun; Kim, Min Sung; Baek, In-Pyo; Lee, Ji Youl; Lee, Sung Hak; Kim, Tae-Min; Lee, Sug Hyung; Chung, Yeun-Jun

2016-05-01

Although high grade prostatic intraepithelial neoplasia (HGPIN) is considered a neoplastic lesion that precedes prostate cancer (PCA), the genomic structures of HGPIN remain unknown. Our objective was to identify the genomic landscape of HGPIN and the genomic differences between HGPIN and PCA that may drive the progression to PCA. We analyzed 20 regions of paired HGPIN and PCA from six patients using whole-exome sequencing and array-comparative genomic hybridization. Somatic mutation and copy number alteration (CNA) profiles of paired HGPIN and PCA were measured and compared. The total numbers of mutations and CNAs in HGPINs were significantly lower than those in PCAs. Mutations in FOXA1 and CNAs (1q and 8q gains) were detected in both HGPIN and PCA ('common'), suggesting their roles in early PCA development. Mutations in SPOP, KDM6A, and KMT2D were 'PCA-specific', suggesting their roles in HGPIN progression to PCA. The 8p loss was either 'common' or 'PCA-specific'. In-silico estimation of evolutionary ages predicted that HGPIN genomes were much younger than PCA genomes. Our data show that, in most cases, PCAs are direct descendants of HGPINs, which require more genomic alterations to progress to PCA. The nature of the heterogeneous HGPIN population, which might attenuate genomic signals, should be studied further. HGPIN genomes harbor relatively fewer mutations and CNAs than PCA but require additional hits for progression. In this study, we suggest a systemic diagram from high grade prostatic intraepithelial neoplasia (HGPIN) to prostate cancer (PCA). Our results provide a clue to explain the long latency from HGPIN to PCA and provide useful information for the genetic diagnosis of HGPIN and PCA. Copyright © 2015 European Association of Urology. Published by Elsevier B.V. All rights reserved.
Signatures of cytoplasmic proteins in the exoproteome distinguish community- and hospital-associated methicillin-resistant Staphylococcus aureus USA300 lineages.

PubMed

Mekonnen, Solomon A; Palma Medina, Laura M; Glasner, Corinna; Tsompanidou, Eleni; de Jong, Anne; Grasso, Stefano; Schaffer, Marc; Mäder, Ulrike; Larsen, Anders R; Gumpert, Heidi; Westh, Henrik; Völker, Uwe; Otto, Andreas; Becher, Dörte; van Dijl, Jan Maarten

2017-08-18

Methicillin-resistant Staphylococcus aureus (MRSA) is the common name for a heterogeneous group of highly drug-resistant staphylococci. Two major MRSA classes are distinguished based on epidemiology, namely community-associated (CA) and hospital-associated (HA) MRSA. Notably, the distinction of CA- and HA-MRSA based on molecular traits remains difficult due to the high genomic plasticity of S. aureus. Here we sought to pinpoint global distinguishing features of CA- and HA-MRSA through a comparative genome and proteome analysis of the notorious MRSA lineage USA300. We show for the first time that CA- and HA-MRSA isolates can be distinguished by 2 distinct extracellular protein abundance clusters that are predictive not only for epidemiologic behavior, but also for their growth and survival within epithelial cells. This 'exoproteome profiling' also groups more distantly related HA-MRSA isolates into the HA exoproteome cluster. Comparative genome analysis suggests that these distinctive features of CA- and HA-MRSA isolates relate predominantly to the accessory genome. Intriguingly, the identified exoproteome clusters differ in the relative abundance of typical cytoplasmic proteins, suggesting that signatures of cytoplasmic proteins in the exoproteome represent a new distinguishing feature of CA- and HA-MRSA. Our comparative genome and proteome analysis focuses attention on potentially distinctive roles of 'liberated' cytoplasmic proteins in the epidemiology and intracellular survival of CA- and HA-MRSA isolates. Such extracellular cytoplasmic proteins were recently invoked in staphylococcal virulence, but their implication in the epidemiology of MRSA is unprecedented.
An Inducible Operon Is Involved in Inulin Utilization in Lactobacillus plantarum Strains, as Revealed by Comparative Proteogenomics and Metabolic Profiling.

PubMed

Buntin, Nirunya; Hongpattarakere, Tipparat; Ritari, Jarmo; Douillard, François P; Paulin, Lars; Boeren, Sjef; Shetty, Sudarshan A; de Vos, Willem M

2017-01-15

The draft genomes of Lactobacillus plantarum strains isolated from Asian fermented foods, infant feces, and shrimp intestines were sequenced and compared to those of well-studied strains. Among 28 strains of L. plantarum, variations in the genomic features involved in ecological adaptation were elucidated. The genome sizes ranged from approximately 3.1 to 3.5 Mb, of which about 2,932 to 3,345 protein-coding sequences (CDS) were predicted. The food-derived isolates contained a higher number of carbohydrate metabolism-associated genes than those from infant feces. This observation correlated to their phenotypic carbohydrate metabolic profile, indicating their ability to metabolize the largest range of sugars. Surprisingly, two strains (P14 and P76) isolated from fermented fish utilized inulin. β-Fructosidase, the inulin-degrading enzyme, was detected in the supernatants and cell wall extracts of both strains. No activity was observed in the cytoplasmic fraction, indicating that this key enzyme was either membrane-bound or extracellularly secreted. From genomic mining analysis, a predicted inulin operon of fosRABCDXE, which encodes β-fructosidase and many fructose transporting proteins, was found within the genomes of strains P14 and P76. Moreover, pts1BCA genes, encoding sucrose-specific IIBCA components involved in sucrose transport, were also identified. The proteomic analysis revealed the mechanism and functional characteristic of the fosRABCDXE operon involved in the inulin utilization of L. plantarum. The expression levels of the fos operon and pst genes were upregulated at mid-log phase. FosE and the LPXTG-motif cell wall anchored β-fructosidase were induced to a high abundance when inulin was present as a carbon source. Inulin is a long-chain carbohydrate that may act as a prebiotic, which provides many health benefits to the host by selectively stimulating the growth and activity of beneficial bacteria in the colon. 
While certain lactobacilli can catabolize inulin, this has not yet been described for Lactobacillus plantarum, and an associated putative inulin operon has not been reported in this species. By using comparative and functional genomics, we showed that two L. plantarum strains utilized inulin and identified functional inulin operons in their genomes. The proteogenomic data revealed that inulin degradation and uptake routes, which related to the fosRABCDXE operon and pstBCA genes, were widely expressed among L. plantarum strains. The present work provides a novel understanding of gene regulation and mechanisms of inulin utilization in probiotic L. plantarum generating opportunities for synbiotic product development. Copyright © 2016 American Society for Microbiology.
An Inducible Operon Is Involved in Inulin Utilization in Lactobacillus plantarum Strains, as Revealed by Comparative Proteogenomics and Metabolic Profiling

PubMed Central

Buntin, Nirunya; Hongpattarakere, Tipparat; Ritari, Jarmo; Douillard, François P.; Paulin, Lars; Boeren, Sjef; Shetty, Sudarshan A.

2016-01-01

ABSTRACT The draft genomes of Lactobacillus plantarum strains isolated from Asian fermented foods, infant feces, and shrimp intestines were sequenced and compared to those of well-studied strains. Among 28 strains of L. plantarum, variations in the genomic features involved in ecological adaptation were elucidated. The genome sizes ranged from approximately 3.1 to 3.5 Mb, of which about 2,932 to 3,345 protein-coding sequences (CDS) were predicted. The food-derived isolates contained a higher number of carbohydrate metabolism-associated genes than those from infant feces. This observation correlated to their phenotypic carbohydrate metabolic profile, indicating their ability to metabolize the largest range of sugars. Surprisingly, two strains (P14 and P76) isolated from fermented fish utilized inulin. β-Fructosidase, the inulin-degrading enzyme, was detected in the supernatants and cell wall extracts of both strains. No activity was observed in the cytoplasmic fraction, indicating that this key enzyme was either membrane-bound or extracellularly secreted. From genomic mining analysis, a predicted inulin operon of fosRABCDXE, which encodes β-fructosidase and many fructose transporting proteins, was found within the genomes of strains P14 and P76. Moreover, pts1BCA genes, encoding sucrose-specific IIBCA components involved in sucrose transport, were also identified. The proteomic analysis revealed the mechanism and functional characteristic of the fosRABCDXE operon involved in the inulin utilization of L. plantarum. The expression levels of the fos operon and pst genes were upregulated at mid-log phase. FosE and the LPXTG-motif cell wall anchored β-fructosidase were induced to a high abundance when inulin was present as a carbon source. IMPORTANCE Inulin is a long-chain carbohydrate that may act as a prebiotic, which provides many health benefits to the host by selectively stimulating the growth and activity of beneficial bacteria in the colon. 
While certain lactobacilli can catabolize inulin, this has not yet been described for Lactobacillus plantarum, and an associated putative inulin operon has not been reported in this species. By using comparative and functional genomics, we showed that two L. plantarum strains utilized inulin and identified functional inulin operons in their genomes. The proteogenomic data revealed that inulin degradation and uptake routes, which related to the fosRABCDXE operon and pstBCA genes, were widely expressed among L. plantarum strains. The present work provides a novel understanding of gene regulation and mechanisms of inulin utilization in probiotic L. plantarum generating opportunities for synbiotic product development. PMID:27815279
Real-Time Pathogen Detection in the Era of Whole-Genome Sequencing and Big Data: Comparison of k-mer and Site-Based Methods for Inferring the Genetic Distances among Tens of Thousands of Salmonella Samples.

PubMed

Pettengill, James B; Pightling, Arthur W; Baugher, Joseph D; Rand, Hugh; Strain, Errol

2016-01-01

The adoption of whole-genome sequencing within the public health realm for molecular characterization of bacterial pathogens has been followed by an increased emphasis on real-time detection of emerging outbreaks (e.g., food-borne Salmonellosis). In turn, large databases of whole-genome sequence data are being populated. These databases currently contain tens of thousands of samples and are expected to grow to hundreds of thousands within a few years. For these databases to be of optimal use, one must be able to quickly interrogate them to accurately determine the genetic distances among a set of samples. Being able to do so is challenging due to both biological (evolutionarily diverse samples) and computational (petabytes of sequence data) issues. We evaluated seven measures of genetic distance, which were estimated from either k-mer profiles (Jaccard, Euclidean, Manhattan, Mash Jaccard, and Mash distances) or nucleotide sites (NUCmer and an extended multi-locus sequence typing (MLST) scheme). When analyzing empirical data (whole-genome sequence data from 18,997 Salmonella isolates), there are features (e.g., genomic, assembly, and contamination) that cause distances inferred from k-mer profiles, which treat absent data as informative, to fail to accurately capture the distance between samples when compared to distances inferred from differences in nucleotide sites. Thus, site-based distances, like NUCmer and extended MLST, are superior in performance, but accessing the computing resources necessary to perform them may be challenging when analyzing large databases.
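Three of the k-mer profile distances named above (Jaccard, Euclidean, and Manhattan) can be sketched directly from their standard definitions. This is a minimal illustration on toy strings and is not the pipeline used in the study. The Mash variants rely on MinHash sketching and are not reproduced here. The function names and sequences are chosen for the example.

```python
from collections import Counter


def kmer_profile(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def jaccard_distance(a: Counter, b: Counter) -> float:
    """1 - |shared k-mers| / |all k-mers|, using presence/absence only."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


def manhattan_distance(a: Counter, b: Counter) -> int:
    """Sum of absolute count differences; a missing k-mer counts as 0."""
    return sum(abs(a[m] - b[m]) for m in set(a) | set(b))


def euclidean_distance(a: Counter, b: Counter) -> float:
    """Square root of the summed squared count differences."""
    return sum((a[m] - b[m]) ** 2 for m in set(a) | set(b)) ** 0.5


# Toy sequences (invented); real inputs would be reads or assemblies
p1 = kmer_profile("ACGTAC", k=3)
p2 = kmer_profile("ACGTTT", k=3)
print(jaccard_distance(p1, p2), manhattan_distance(p1, p2), euclidean_distance(p1, p2))
```

In the toy pair, the k-mers GTA and TAC are absent from the second sequence and still add to every distance. This is the sense in which k-mer profiles treat absent data as informative: a k-mer missing because of an assembly gap is indistinguishable from one missing because of true divergence.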
Genomic Landscape of Colorectal Mucosa and Adenomas in Familial Adenomatous Polyposis

PubMed Central

Borras, Ester; San Lucas, F. Anthony; Chang, Kyle; Zhou, Ruoji; Masand, Gita; Fowler, Jerry; Mork, Maureen E.; You, Y. Nancy; Taggart, Melissa W.; McAllister, Florencia; Jones, David A.; Davies, Gareth E.; Edelmann, Winfried; Ehli, Erik A.; Lynch, Patrick M.; Hawk, Ernest T.; Capella, Gabriel; Scheet, Paul; Vilar, Eduardo

2016-01-01

Purpose The molecular basis of the adenoma to carcinoma transition has been deduced using comparative analysis of genetic alterations observed through the sequential steps of intestinal carcinogenesis. However, comprehensive genomic analyses of adenomas and at-risk mucosa are still lacking. Therefore, our aim was to characterize the genomic landscape of colonic at-risk mucosa and adenomas. Experimental Design We analyzed the mutation profile and copy number changes of 25 adenomas and adjacent mucosa from 12 familial adenomatous polyposis patients using whole-exome sequencing and validated allelic imbalances in 37 adenomas using SNP arrays. We assessed for evidence of clonality and performed estimations on the proportions of driver and passenger mutations using a systems biology approach. Results Adenomas had lower mutational rates than did colorectal cancers and showed recurrent alterations in known cancer-driver genes (APC, KRAS, FBXW7, TCF7L2) and allelic imbalances in chromosomes 5, 7 and 13. Moreover, 80% of adenomas had somatic alterations in WNT pathway genes. Adenomas displayed evidence of multiclonality similar to stage I carcinomas. Strong correlations between mutational rate and patient age were observed in at-risk mucosa and adenomas. Our data indicate that at least 23% of somatic mutations are present in at-risk mucosa prior to adenoma initiation. Conclusions The genomic profiles of at-risk mucosa and adenomas illustrate the evolution from normal tissue to carcinoma via greater resolution of molecular changes at the inflection point of premalignant lesions. Furthermore, substantial genomic variation exists in at-risk mucosa before adenoma formation, and deregulation of the WNT pathway is required to foster carcinogenesis. PMID:27221540
Colwellia psychrerythraea strains from distant deep sea basins show adaptation to local conditions

DOE PAGES

Techtmann, Stephen M.; Fitzgerald, Kathleen S.; Stelling, Savannah C.; ...

2016-05-09

Many studies have shown that microbes that share nearly identical 16S rRNA genes can have highly divergent genomes. Microbes from distinct parts of the ocean also exhibit biogeographic patterning. In this study we seek to better understand how certain microbes from the same species have adapted for growth under local conditions. The phenotypic and genomic heterogeneity of three strains of Colwellia psychrerythraea was investigated in order to understand adaptations to local environments. Colwellia are psychrophilic heterotrophic marine bacteria ubiquitous in cold marine ecosystems. We have recently isolated two Colwellia strains: ND2E from the Eastern Mediterranean and GAB14E from the Great Australian Bight. The 16S rRNA sequences of these two strains were greater than 98.2% identical to that of the well-characterized C. psychrerythraea 34H, which was isolated from arctic sediments. Salt tolerance and carbon source utilization profiles for these strains were determined using Biolog Phenotype MicroArrays. These strains exhibited distinct salt tolerance, which was not associated with the salinity of sites of isolation. The carbon source utilization profiles were distinct, with less than half of the tested carbon sources being metabolized by all three strains. Whole genome sequencing revealed that the genomes of these three strains were quite diverse, with some genomes having up to 1600 strain-specific genes. Many genes involved in degrading strain-specific carbon sources were identified. Finally, there appears to be a link between carbon source utilization and location of isolation, with distinctions observed between the Colwellia isolate recovered from sediment and the water column isolates.
Identification of Candidate B-Lymphoma Genes by Cross-Species Gene Expression Profiling

PubMed Central

Tompkins, Van S.; Han, Seong-Su; Olivier, Alicia; Syrbu, Sergei; Bair, Thomas; Button, Anna; Jacobus, Laura; Wang, Zebin; Lifton, Samuel; Raychaudhuri, Pradip; Morse, Herbert C.; Weiner, George; Link, Brian; Smith, Brian J.; Janz, Siegfried

2013-01-01

Comparative genome-wide expression profiling of malignant tumor counterparts across the human-mouse species barrier has a successful track record as a gene discovery tool in liver, breast, lung, prostate and other cancers, but has been largely neglected in studies on neoplasms of mature B-lymphocytes such as diffuse large B cell lymphoma (DLBCL) and Burkitt lymphoma (BL). We used global gene expression profiles of DLBCL-like tumors that arose spontaneously in Myc-transgenic C57BL/6 mice as a phylogenetically conserved filter for analyzing the human DLBCL transcriptome. The human and mouse lymphomas were found to have 60 concordantly deregulated genes in common, including 8 genes that Cox hazard regression analysis associated with overall survival in a published landmark dataset of DLBCL. Genetic network analysis of the 60 genes followed by biological validation studies indicate FOXM1 as a candidate DLBCL and BL gene, supporting a number of studies contending that FOXM1 is a therapeutic target in mature B cell tumors. Our findings demonstrate the value of the “mouse filter” for genomic studies of human B-lineage neoplasms for which a vast knowledge base already exists. PMID:24130802
CGDM: collaborative genomic data model for molecular profiling data using NoSQL.

PubMed

Wang, Shicai; Mares, Mihaela A; Guo, Yi-Ke

2016-12-01

High-throughput molecular profiling has greatly improved patient stratification and mechanistic understanding of diseases. With the increasing amount of data used in translational medicine studies in recent years, there is a need to improve the performance of data warehouses in terms of data retrieval and statistical processing. Both relational and Key Value models have been used for managing molecular profiling data. Key Value models such as SeqWare have been shown to be particularly advantageous in terms of query processing speed for large datasets. However, more improvement can be achieved, particularly through better indexing techniques for the Key Value models that take advantage of the types of queries specific to high-throughput molecular profiling data. In this article, we introduce a Collaborative Genomic Data Model (CGDM), aimed at significantly increasing the query processing speed for the main classes of queries on genomic databases. CGDM creates three Collaborative Global Clustering Index Tables (CGCITs) to solve the velocity and variety issues at the cost of limited extra volume. Several benchmarking experiments were carried out, comparing CGDM implemented on HBase to the traditional SQL data model (TDM) implemented on both HBase and MySQL Cluster, using large publicly available molecular profiling datasets taken from NCBI and HapMap. In the microarray case, CGDM on HBase performed up to 246 times faster than TDM on HBase and 7 times faster than TDM on MySQL Cluster. In the single nucleotide polymorphism case, CGDM on HBase outperformed TDM on HBase by up to 351 times and TDM on MySQL Cluster by up to 9 times. The CGDM source code is available at https://github.com/evanswang/CGDM. © The Author 2016. Published by Oxford University Press.
Comprehensive Genomic Profiling of Esthesioneuroblastoma Reveals Additional Treatment Options.

PubMed

Gay, Laurie M; Kim, Sungeun; Fedorchak, Kyle; Kundranda, Madappa; Odia, Yazmin; Nangia, Chaitali; Battiste, James; Colon-Otero, Gerardo; Powell, Steven; Russell, Jeffery; Elvin, Julia A; Vergilio, Jo-Anne; Suh, James; Ali, Siraj M; Stephens, Philip J; Miller, Vincent A; Ross, Jeffrey S

2017-07-01

Esthesioneuroblastoma (ENB), also known as olfactory neuroblastoma, is a rare malignant neoplasm of the olfactory mucosa. Despite surgical resection combined with radiotherapy and adjuvant chemotherapy, ENB often relapses with rapid progression. Current multimodality, nontargeted therapy for relapsed ENB is of limited clinical benefit. We queried whether comprehensive genomic profiling (CGP) of relapsed or refractory ENB can uncover genomic alterations (GA) that could identify potential targeted therapies for these patients. CGP was performed on formalin-fixed, paraffin-embedded sections from 41 consecutive clinical cases of ENBs using a hybrid-capture, adaptor ligation based next-generation sequencing assay to a mean coverage depth of 593X. The results were analyzed for base substitutions, insertions and deletions, select rearrangements, and copy number changes (amplifications and homozygous deletions). Clinically relevant GA (CRGA) were defined as GA linked to drugs on the market or under evaluation in clinical trials. A total of 28 ENBs harbored GA, with a mean of 1.5 GA per sample. Approximately half of the ENBs (21, 51%) featured at least one CRGA, with an average of 1 CRGA per sample. The most commonly altered gene was TP53 (17%), with GA in PIK3CA, NF1, CDKN2A, and CDKN2C occurring in 7% of samples. We report comprehensive genomic profiles for 41 ENB tumors. CGP revealed potential new therapeutic targets, including targetable GA in the mTOR, CDK and growth factor signaling pathways, highlighting the clinical value of genomic profiling in ENB. Comprehensive genomic profiling of 41 relapsed or refractory ENBs reveals recurrent alterations or classes of mutation, including amplification of tyrosine kinases encoded on chromosome 5q and mutations affecting genes in the mTOR/PI3K pathway. Approximately half of the ENBs (21, 51%) featured at least one clinically relevant genomic alteration (CRGA), with an average of 1 CRGA per sample.
The most commonly altered gene was TP53 (17%), and alterations in PIK3CA, NF1, CDKN2A, or CDKN2C were identified in 7% of samples. Responses to treatment with the kinase inhibitors sunitinib, everolimus, and pazopanib are presented in conjunction with tumor genomics. © AlphaMed Press 2017.
Accurate detection for a wide range of mutation and editing sites of microRNAs from small RNA high-throughput sequencing profiles

PubMed Central

Zheng, Yun; Ji, Bo; Song, Renhua; Wang, Shengpeng; Li, Ting; Zhang, Xiaotuo; Chen, Kun; Li, Tianqing; Li, Jinyan

2016-01-01

Various types of mutation and editing (M/E) events in microRNAs (miRNAs) can change the stabilities of pre-miRNAs and/or complementarities between miRNAs and their targets. Small RNA (sRNA) high-throughput sequencing (HTS) profiles can contain many mutated and edited miRNAs. Systematic detection of miRNA mutation and editing sites from the huge volume of sRNA HTS profiles is computationally difficult, as high sensitivity and low false positive rate (FPR) are both required. We propose a novel method (named MiRME) for an accurate and fast detection of miRNA M/E sites using a progressive sequence alignment approach which refines sensitivity and improves FPR step-by-step. From 70 sRNA HTS profiles with over 1.3 billion reads, MiRME has detected thousands of statistically significant M/E sites, including 3′-editing sites, 57 A-to-I editing sites (of which 32 are novel), as well as some putative non-canonical editing sites. We demonstrated that a few non-canonical editing sites were not resulted from mutations in genome by integrating the analysis of genome HTS profiles of two human cell lines, suggesting the existence of new editing types to further diversify the functions of miRNAs. Compared with six existing studies or methods, MiRME has shown much superior performance for the identification and visualization of the M/E sites of miRNAs from the ever-increasing sRNA HTS profiles. PMID:27229138
Classification of Phylogenetic Profiles for Protein Function Prediction: An SVM Approach

NASA Astrophysics Data System (ADS)

Kotaru, Appala Raju; Joshi, Ramesh C.

Predicting the function of an uncharacterized protein is a major challenge in the post-genomic era because of the problem's complexity and scale. Knowledge of protein function is a crucial link in the development of new drugs, better crops, and even biochemicals such as biofuels. Recently, numerous high-throughput experimental procedures have been invented to investigate the mechanisms leading to the accomplishment of a protein’s function, and the phylogenetic profile is one of them. A phylogenetic profile is a way of representing a protein that encodes its evolutionary history. In this paper we propose a method for classifying phylogenetic profiles using a supervised machine learning method, support vector machine (SVM) classification with a radial basis function kernel, to identify functionally linked proteins. We experimentally evaluated the performance of the classifier with the linear and polynomial kernels and compared the results with the existing tree kernel. In our study we used proteins of the budding yeast Saccharomyces cerevisiae genome. We generated the phylogenetic profiles of 2465 yeast genes and used the functional annotations available in the MIPS database. Our experiments show that the radial basis kernel performs similarly to the polynomial kernel in some functional classes, that both are better than the linear and tree kernels, and that overall the radial basis kernel outperformed the polynomial, linear, and tree kernels. In analyzing these results we show that it is feasible to use an SVM classifier with a radial basis function kernel to predict gene function from phylogenetic profiles.
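
As a rough illustration of the kernel this abstract names, the sketch below evaluates a radial basis function (RBF) kernel, K(x, y) = exp(-gamma * ||x - y||^2), on binary presence/absence phylogenetic profiles. It is not the authors' classifier: the gene names, the profiles, and the gamma value are hypothetical, and a full SVM would use such kernel values inside a trained model.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel exp(-gamma * ||x - y||^2) for two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Hypothetical presence (1) / absence (0) profiles across five genomes
profiles = {
    "geneA": [1, 0, 1, 1, 0],
    "geneB": [1, 0, 1, 1, 0],  # same profile as geneA
    "geneC": [0, 1, 0, 0, 1],  # complementary profile
}

# Print the kernel (similarity) matrix, one row per gene
names = sorted(profiles)
for p in names:
    row = [f"{rbf_kernel(profiles[p], profiles[q]):.3f}" for q in names]
    print(p, " ".join(row))
```

For binary profiles the squared Euclidean distance equals the number of genomes in which the two profiles differ, so identical profiles score 1.0 and the score decays as the profiles diverge.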
Comprehensive Genomic Profiling Identifies a Subset of Crizotinib-Responsive ALK-Rearranged Non-Small Cell Lung Cancer Not Detected by Fluorescence In Situ Hybridization

PubMed Central

Hensing, Thomas; Schrock, Alexa B.; Allen, Justin; Sanford, Eric; Gowen, Kyle; Kulkarni, Atul; He, Jie; Suh, James H.; Lipson, Doron; Elvin, Julia A.; Yelensky, Roman; Chalmers, Zachary; Chmielecki, Juliann; Peled, Nir; Klempner, Samuel J.; Firozvi, Kashif; Frampton, Garrett M.; Molina, Julian R.; Menon, Smitha; Brahmer, Julie R.; MacMahon, Heber; Nowak, Jan; Ou, Sai-Hong Ignatius; Zauderer, Marjorie; Ladanyi, Marc; Zakowski, Maureen; Fischbach, Neil; Ross, Jeffrey S.; Stephens, Phil J.; Miller, Vincent A.; Wakelee, Heather

2016-01-01

Introduction. For patients with non-small cell lung cancer (NSCLC) to benefit from ALK inhibitors, sensitive and specific detection of ALK genomic rearrangements is needed. ALK break-apart fluorescence in situ hybridization (FISH) is the U.S. Food and Drug Administration approved and standard-of-care diagnostic assay, but identification of ALK rearrangements by other methods reported in NSCLC cases that tested negative for ALK rearrangements by FISH suggests a significant false-negative rate. We report here a large series of NSCLC cases assayed by hybrid-capture-based comprehensive genomic profiling (CGP) in the course of clinical care. Materials and Methods. Hybrid-capture-based CGP using next-generation sequencing was performed in the course of clinical care of 1,070 patients with advanced lung cancer. Each tumor sample was evaluated for all classes of genomic alterations, including base-pair substitutions, insertions/deletions, copy number alterations and rearrangements, as well as fusions/rearrangements. Results. A total of 47 patients (4.4%) were found to harbor ALK rearrangements, of whom 41 had an EML4-ALK fusion, and 6 had other fusion partners, including 3 previously unreported rearrangement events: EIF2AK-ALK, PPM1B-ALK, and PRKAR1A-ALK. Of 41 patients harboring ALK rearrangements, 31 had prior FISH testing results available. Of these, 20 were ALK FISH positive, and 11 (35%) were ALK FISH negative. Of the latter 11 patients, 9 received crizotinib based on the CGP results, and 7 achieved a response with median duration of 17 months. Conclusion. Comprehensive genomic profiling detected canonical ALK rearrangements and ALK rearrangements with noncanonical fusion partners in a subset of patients with NSCLC with previously negative ALK FISH results. In this series, such patients had durable responses to ALK inhibitors, comparable to historical response rates for ALK FISH-positive cases. 
Implications for Practice: Comprehensive genomic profiling (CGP) that includes hybrid capture and specific baiting of intron 19 of ALK is a highly sensitive, alternative method for identification of drug-sensitive ALK fusions in patients with non-small cell lung cancer (NSCLC) who had previously tested negative using standard ALK fluorescence in situ hybridization (FISH) diagnostic assays. Given the proven benefit of treatment with crizotinib and second-generation ALK inhibitors in patients with ALK fusions, CGP should be considered in patients with NSCLC, including those who have tested negative for other alterations, including negative results using ALK FISH testing. PMID:27245569
Comprehensive Genomic Profiling Identifies a Subset of Crizotinib-Responsive ALK-Rearranged Non-Small Cell Lung Cancer Not Detected by Fluorescence In Situ Hybridization.

PubMed

Ali, Siraj M; Hensing, Thomas; Schrock, Alexa B; Allen, Justin; Sanford, Eric; Gowen, Kyle; Kulkarni, Atul; He, Jie; Suh, James H; Lipson, Doron; Elvin, Julia A; Yelensky, Roman; Chalmers, Zachary; Chmielecki, Juliann; Peled, Nir; Klempner, Samuel J; Firozvi, Kashif; Frampton, Garrett M; Molina, Julian R; Menon, Smitha; Brahmer, Julie R; MacMahon, Heber; Nowak, Jan; Ou, Sai-Hong Ignatius; Zauderer, Marjorie; Ladanyi, Marc; Zakowski, Maureen; Fischbach, Neil; Ross, Jeffrey S; Stephens, Phil J; Miller, Vincent A; Wakelee, Heather; Ganesan, Shridar; Salgia, Ravi

2016-06-01

For patients with non-small cell lung cancer (NSCLC) to benefit from ALK inhibitors, sensitive and specific detection of ALK genomic rearrangements is needed. ALK break-apart fluorescence in situ hybridization (FISH) is the U.S. Food and Drug Administration approved and standard-of-care diagnostic assay, but identification of ALK rearrangements by other methods reported in NSCLC cases that tested negative for ALK rearrangements by FISH suggests a significant false-negative rate. We report here a large series of NSCLC cases assayed by hybrid-capture-based comprehensive genomic profiling (CGP) in the course of clinical care. Hybrid-capture-based CGP using next-generation sequencing was performed in the course of clinical care of 1,070 patients with advanced lung cancer. Each tumor sample was evaluated for all classes of genomic alterations, including base-pair substitutions, insertions/deletions, copy number alterations and rearrangements, as well as fusions/rearrangements. A total of 47 patients (4.4%) were found to harbor ALK rearrangements, of whom 41 had an EML4-ALK fusion, and 6 had other fusion partners, including 3 previously unreported rearrangement events: EIF2AK-ALK, PPM1B-ALK, and PRKAR1A-ALK. Of 41 patients harboring ALK rearrangements, 31 had prior FISH testing results available. Of these, 20 were ALK FISH positive, and 11 (35%) were ALK FISH negative. Of the latter 11 patients, 9 received crizotinib based on the CGP results, and 7 achieved a response with median duration of 17 months. Comprehensive genomic profiling detected canonical ALK rearrangements and ALK rearrangements with noncanonical fusion partners in a subset of patients with NSCLC with previously negative ALK FISH results. In this series, such patients had durable responses to ALK inhibitors, comparable to historical response rates for ALK FISH-positive cases. 
Comprehensive genomic profiling (CGP) that includes hybrid capture and specific baiting of intron 19 of ALK is a highly sensitive, alternative method for identification of drug-sensitive ALK fusions in patients with non-small cell lung cancer (NSCLC) who had previously tested negative using standard ALK fluorescence in situ hybridization (FISH) diagnostic assays. Given the proven benefit of treatment with crizotinib and second-generation ALK inhibitors in patients with ALK fusions, CGP should be considered in patients with NSCLC, including those who have tested negative for other alterations, including negative results using ALK FISH testing. ©AlphaMed Press.
Functional phylogenomics analysis of bacteria and archaea using consistent genome annotation with UniFam

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chai, Juanjuan; Kora, Guruprasad; Ahn, Tae-Hyuk

2014-10-09

To supply some background, phylogenetic studies have provided detailed knowledge on the evolutionary mechanisms of genes and species in Bacteria and Archaea. However, the evolution of cellular functions, represented by metabolic pathways and biological processes, has not been systematically characterized. Many clades in the prokaryotic tree of life have now been covered by sequenced genomes in GenBank. This enables a large-scale functional phylogenomics study of many computationally inferred cellular functions across all sequenced prokaryotes. Our results show a total of 14,727 GenBank prokaryotic genomes were re-annotated using a new protein family database, UniFam, to obtain consistent functional annotations for accurate comparison. The functional profile of a genome was represented by the biological process Gene Ontology (GO) terms in its annotation. The GO term enrichment analysis differentiated the functional profiles between selected archaeal taxa. 706 prokaryotic metabolic pathways were inferred from these genomes using Pathway Tools and MetaCyc. The consistency between the distribution of metabolic pathways in the genomes and the phylogenetic tree of the genomes was measured using parsimony scores and retention indices. The ancestral functional profiles at the internal nodes of the phylogenetic tree were reconstructed to track the gains and losses of metabolic pathways in evolutionary history. In conclusion, our functional phylogenomics analysis shows divergent functional profiles of taxa and clades. Such function-phylogeny correlation stems from a set of clade-specific cellular functions with low parsimony scores. On the other hand, many cellular functions are sparsely dispersed across many clades with high parsimony scores. These different types of cellular functions have distinct evolutionary patterns reconstructed from the prokaryotic tree.
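
The parsimony scores and retention indices mentioned in this abstract can be sketched for a single presence/absence character. The example below uses the Fitch algorithm on a rooted binary tree and the standard retention index RI = (g - s) / (g - m); whether the study used exactly this procedure is an assumption, and the four-taxon tree and pathway states are invented for illustration.

```python
def fitch_score(tree, states):
    """Minimum number of state changes for one character on a rooted binary tree.

    tree   : a leaf name (str) or a (left, right) tuple of subtrees
    states : dict mapping leaf name -> observed state (e.g. 0 or 1)
    """
    def visit(node):
        if isinstance(node, str):
            return {states[node]}, 0
        left_set, left_cost = visit(node[0])
        right_set, right_cost = visit(node[1])
        common = left_set & right_set
        if common:
            return common, left_cost + right_cost
        # No shared state: one change is required at this node
        return left_set | right_set, left_cost + right_cost + 1
    return visit(tree)[1]

def retention_index(score, states):
    """RI = (g - s) / (g - m); None when the character is uninformative."""
    values = list(states.values())
    counts = {v: values.count(v) for v in set(values)}
    m = len(counts) - 1                      # fewest changes on any tree
    g = len(values) - max(counts.values())   # most changes on any tree
    if g == m:
        return None
    return (g - score) / (g - m)

# Hypothetical four-genome tree and two pathway distributions
tree = (("A", "B"), ("C", "D"))
clade_specific = {"A": 1, "B": 1, "C": 0, "D": 0}
scattered = {"A": 1, "B": 0, "C": 1, "D": 0}

for label, states in [("clade-specific", clade_specific), ("scattered", scattered)]:
    s = fitch_score(tree, states)
    print(label, "parsimony score:", s, "RI:", retention_index(s, states))
```

A pathway confined to one clade needs a single gain or loss (low parsimony score, RI of 1), while a pathway scattered across clades needs more changes and has a lower RI, matching the contrast the abstract draws.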
Targetable kinase-activating lesions in Ph-like acute lymphoblastic leukemia | Office of Cancer Genomics

Cancer.gov

Publication Abstract: Philadelphia chromosome-like acute lymphoblastic leukemia (Ph-like ALL) is characterized by a gene-expression profile similar to that of BCR-ABL1-positive ALL, alterations of lymphoid transcription factor genes, and a poor outcome. The frequency and spectrum of genetic alterations in Ph-like ALL and its responsiveness to tyrosine kinase inhibition are undefined, especially in adolescents and adults. We performed genomic profiling of 1725 patients with precursor B-cell ALL and detailed genomic analysis of 154 patients with Ph-like ALL.
Using Genome-Wide Expression Profiling to Define Gene Networks Relevant to the Study of Complex Traits: From RNA Integrity to Network Topology

PubMed Central

O'Brien, M.A.; Costin, B.N.; Miles, M.F.

2014-01-01

Postgenomic studies of the function of genes and their role in disease have now become an area of intense study since efforts to define the raw sequence material of the genome have largely been completed. The use of whole-genome approaches such as microarray expression profiling and, more recently, RNA-sequence analysis of transcript abundance has allowed an unprecedented look at the workings of the genome. However, the accurate derivation of such high-throughput data and their analysis in terms of biological function has been critical to truly leveraging the postgenomic revolution. This chapter will describe an approach that focuses on the use of gene networks to both organize and interpret genomic expression data. Such networks, derived from statistical analysis of large genomic datasets and the application of multiple bioinformatics data resources, potentially allow the identification of key control elements for networks associated with human disease, and thus may lead to derivation of novel therapeutic approaches. However, as discussed in this chapter, the leveraging of such networks cannot occur without a thorough understanding of the technical and statistical factors influencing the derivation of genomic expression data. Thus, while the catch phrase may be “it's the network … stupid,” the understanding of factors extending from RNA isolation to genomic profiling technique, multivariate statistics, and bioinformatics are all critical to defining fully useful gene networks for study of complex biology. PMID:23195313

New Era of Studying RNA Secondary Structure and Its Influence on Gene Regulation in Plants.

PubMed

Yang, Xiaofei; Yang, Minglei; Deng, Hongjing; Ding, Yiliang

2018-01-01

The dynamic structure of RNA plays a central role in post-transcriptional regulation of gene expression such as RNA maturation, degradation, and translation. With the rise of next-generation sequencing, the study of RNA structure has been transformed from in vitro low-throughput RNA structure probing methods to in vivo high-throughput RNA structure profiling. The development of these methods enables incremental studies on the function of RNA structure to be performed, revealing new insights of novel regulatory mechanisms of RNA structure in plants. Genome-wide scale RNA structure profiling allows us to investigate general RNA structural features over 10s of 1000s of mRNAs and to compare RNA structuromes between plant species. Here, we provide a comprehensive and up-to-date overview of: (i) RNA structure probing methods; (ii) the biological functions of RNA structure; (iii) genome-wide RNA structural features corresponding to their regulatory mechanisms; and (iv) RNA structurome evolution in plants.
SpinachDB: A Well-Characterized Genomic Database for Gene Family Classification and SNP Information of Spinach.

PubMed

Yang, Xue-Dong; Tan, Hua-Wei; Zhu, Wei-Min

2016-01-01

Spinach (Spinacia oleracea L.), which originated in central and western Asia, belongs to the family Amaranthaceae. Spinach is one of the most important leafy vegetables, with a high nutritional value, and is also an excellent research material for plant sex chromosome models. Following the completion of genome assembly and gene prediction for spinach, we developed SpinachDB (http://222.73.98.124/spinachdb) to store, annotate, mine and analyze genomics and genetics datasets efficiently. In this study, all 21702 spinach genes were annotated. A total of 15741 spinach genes were catalogued into 4351 families, including identification of a substantial number of transcription factors. To construct a high-density genetic map, a total of 131592 SSRs and 1125743 potential SNPs located in 548801 loci of the spinach genome were identified in 11 cultivated and wild spinach cultivars. Expression profiles were also computed from RNA-seq data using the FPKM method, which allows expression to be compared between genes. Paralogs in spinach and the orthologous genes in Arabidopsis, grape, sugar beet and rice were identified for comparative genome analysis. Finally, the SpinachDB website contains seven main sections, including the homepage; the GBrowse map that integrates genome, genes, SSR and SNP marker information; the Blast alignment service; the gene family classification search tool; the orthologous and paralogous gene pairs search tool; and the download and useful contact information. SpinachDB will be continually expanded to include newly generated robust genomics and genetics data sets along with the associated data mining and analysis tools.
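
The FPKM method mentioned in this abstract normalizes a gene's fragment count by both gene length and sequencing depth. A minimal sketch of the standard formula follows; the function name and the example numbers are invented for illustration and are not taken from SpinachDB.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    FPKM = fragments * 1e9 / (gene_length_bp * total_mapped_fragments)
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)

# Hypothetical gene: 500 fragments, 2,000 bp long, 20 million mapped fragments
value = fpkm(500, 2000, 20_000_000)
print(f"FPKM = {value:.2f}")
```

Dividing by length (in kilobases) and by library size (in millions) is what makes values comparable between genes of different lengths and between samples of different depths.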
Merkel Cell Polyomavirus Exhibits Dominant Control of the Tumor Genome and Transcriptome in Virus-Associated Merkel Cell Carcinoma.

PubMed

Starrett, Gabriel J; Marcelus, Christina; Cantalupo, Paul G; Katz, Joshua P; Cheng, Jingwei; Akagi, Keiko; Thakuria, Manisha; Rabinowits, Guilherme; Wang, Linda C; Symer, David E; Pipas, James M; Harris, Reuben S; DeCaprio, James A

2017-01-03

Merkel cell polyomavirus is the primary etiological agent of the aggressive skin cancer Merkel cell carcinoma (MCC). Recent studies have revealed that UV radiation is the primary mechanism for somatic mutagenesis in nonviral forms of MCC. Here, we analyze the whole transcriptomes and genomes of primary MCC tumors. Our study reveals that virus-associated tumors have minimally altered genomes compared to non-virus-associated tumors, which are dominated by UV-mediated mutations. Although virus-associated tumors contain relatively small mutation burdens, they exhibit a distinct mutation signature with observable transcriptionally biased kataegic events. In addition, viral integration sites overlap focal genome amplifications in virus-associated tumors, suggesting a potential mechanism for these events. Collectively, our studies indicate that Merkel cell polyomavirus is capable of hijacking cellular processes and driving tumorigenesis to the same severity as tens of thousands of somatic genome alterations. A variety of mutagenic processes that shape the evolution of tumors are critical determinants of disease outcome. Here, we sequenced the entire genome of virus-positive and virus-negative primary Merkel cell carcinomas (MCCs), revealing distinct mutation spectra and corresponding expression profiles. Our studies highlight the strong effect that Merkel cell polyomavirus has on the divergent development of viral MCC compared to the somatic alterations that typically drive nonviral tumorigenesis. A more comprehensive understanding of the distinct mutagenic processes operative in viral and nonviral MCCs has implications for the effective treatment of these tumors. Copyright © 2017 Starrett et al.
Genome wide identification, phylogeny, and expression of bone morphogenetic protein genes in tetraploidized common carp (Cyprinus carpio).

PubMed

Chen, Lin; Dong, Chuanju; Kong, Shengnan; Zhang, Jiangfan; Li, Xuejun; Xu, Peng

2017-09-05

Bone morphogenetic proteins (Bmps) are a group of signaling molecules known to play important roles during formation and maintenance of various organs, not only bone, but also muscle, blood and so on. Common carp (Cyprinus carpio) is one of the most intensively studied fish due to its economic and environmental importance. In addition, common carp has undergone an additional round of whole genome duplication (WGD) compared with many closely related diploid teleosts, which makes it one of the most important models for genome evolutionary studies in teleosts. Comprehensive genome resources of common carp have been developed recently, which facilitate the thorough characterization of the bmp gene family in the tetraploidized common carp genome. We identified a total of 44 bmps from the common carp genome, twice as many as in zebrafish. Phylogenetic analysis revealed that most of the bmps are highly conserved. Comparative analysis was performed across six typical vertebrate genomes. It appeared that all the bmp genes in common carp were duplicated. The expansion of the bmp gene family in common carp was due to the latest additional round of whole genome duplication and made it more abundant than in other diploid teleosts. Expression signatures were assessed in major tissues, including gill, intestine, liver, spleen, skin, heart, gonad, muscle, kidney, head kidney, brain and blood, which demonstrated the comprehensive expression profiles of bmp genes in the tetraploidized genome. Significant gene expression divergences were observed, which revealed substantial functional divergences of those duplicated bmp genes after the latest WGD event. The conserved synteny blocks of bmp5s revealed the genome rearrangement of common carp after the 4R WGD. The whole set of the bmp gene family in common carp provides insight into gene fate in the tetraploidized common carp genome after the recent WGD. Copyright © 2017. Published by Elsevier B.V.
Genomic analysis and clinical management of adolescent cutaneous melanoma.

PubMed

Rabbie, Roy; Rashid, Mamunur; Arance, Ana M; Sánchez, Marcelo; Tell-Marti, Gemma; Potrony, Miriam; Conill, Carles; van Doorn, Remco; Dentro, Stefan; Gruis, Nelleke A; Corrie, Pippa; Iyer, Vivek; Robles-Espinoza, Carla Daniela; Puig-Butille, Joan A; Puig, Susana; Adams, David J

2017-05-01

Melanoma in young children is rare; however, its incidence in adolescents and young adults is rising. We describe the clinical course of a 15-year-old female diagnosed with AJCC stage IB non-ulcerated primary melanoma, who died from metastatic disease 4 years after diagnosis despite three lines of modern systemic therapy. We also present the complete genomic profile of her tumour and compare this to a further series of 13 adolescent melanomas and 275 adult cutaneous melanomas. A somatic BRAF V600E mutation and a high mutational load equivalent to that found in adult melanoma and composed primarily of C>T mutations were observed. A germline genomic analysis alongside a series of 23 children and adolescents with melanoma revealed no mutations in known germline melanoma-predisposing genes. Adolescent melanomas appear to have genomes that are as complex as those arising in adulthood and their clinical course can, as with adults, be unpredictable. © 2017 The Authors. Pigment Cell & Melanoma Research published by John Wiley & Sons Ltd.
Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea.

PubMed

Makarova, Kira S; Sorokin, Alexander V; Novichkov, Pavel S; Wolf, Yuri I; Koonin, Eugene V

2007-11-27

An evolutionary classification of genes from sequenced genomes that distinguishes between orthologs and paralogs is indispensable for genome annotation and evolutionary reconstruction. Shortly after multiple genome sequences of bacteria, archaea, and unicellular eukaryotes became available, an attempt at such a classification was implemented in Clusters of Orthologous Groups of proteins (COGs). Rapid accumulation of genome sequences creates opportunities for refining COGs but also represents a challenge because of error amplification. One of the practical strategies involves construction of refined COGs for phylogenetically compact subsets of genomes. New Archaeal Clusters of Orthologous Genes (arCOGs) were constructed for 41 archaeal genomes (13 Crenarchaeota, 27 Euryarchaeota and one Nanoarchaeon) using an improved procedure that employs a similarity tree between smaller, group-specific clusters, semi-automatically partitions orthology domains in multidomain proteins, and uses profile searches for identification of remote orthologs. The annotation of arCOGs is a consensus between three assignments based on the COGs, the CDD database, and the annotations of homologs in the NR database. The 7538 arCOGs, on average, cover approximately 88% of the genes in a genome compared to approximately 76% coverage in COGs. The finer granularity of ortholog identification in the arCOGs is apparent from the fact that 4538 arCOGs correspond to 2362 COGs; approximately 40% of the arCOGs are new. The archaeal gene core (protein-coding genes found in all 41 genomes) consists of 166 arCOGs. The arCOGs were used to reconstruct gene loss and gene gain events during archaeal evolution and gene sets of ancestral forms. The Last Archaeal Common Ancestor (LACA) is conservatively estimated to possess 996 genes compared to 1245 and 1335 genes for the last common ancestors of Crenarchaeota and Euryarchaeota, respectively. 
It is inferred that LACA was a chemoautotrophic hyperthermophile that, in addition to the core archaeal functions, encoded more idiosyncratic systems, e.g., the CASS systems of antivirus defense and some toxin-antitoxin systems. The arCOGs provide a convenient, flexible framework for functional annotation of archaeal genomes, comparative genomics and evolutionary reconstructions. Genomic reconstructions suggest that the last common ancestor of archaea might have been (nearly) as advanced as the modern archaeal hyperthermophiles. ArCOGs and related information are available at: ftp://ftp.ncbi.nih.gov/pub/koonin/arCOGs/.
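The two set-based ideas in this abstract, a gene core shared by every genome and the share of clusters with no earlier counterpart, can be illustrated with a minimal Python sketch. The genome and cluster names below are hypothetical toy data and `core_clusters` is an illustrative helper, not part of the arCOG pipeline; only the counts 7538 and 4538 come from the abstract.

```python
def core_clusters(genome_to_clusters):
    """Return the clusters present in every genome (the 'core')."""
    sets = [set(clusters) for clusters in genome_to_clusters.values()]
    return set.intersection(*sets) if sets else set()

# Hypothetical toy input: which ortholog clusters each genome contains.
toy_genomes = {
    "genomeA": {"c1", "c2", "c3"},
    "genomeB": {"c1", "c3", "c4"},
    "genomeC": {"c1", "c3"},
}
core = core_clusters(toy_genomes)

# Figures quoted in the abstract: 7538 arCOGs, of which 4538 correspond to COGs.
total_arcogs = 7538
arcogs_matching_cogs = 4538
new_fraction = (total_arcogs - arcogs_matching_cogs) / total_arcogs

print(sorted(core))
print(f"{new_fraction:.1%} of arCOGs have no corresponding COG")
```

The second calculation is consistent with the abstract's statement that approximately 40% of the arCOGs are new.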
The Genome Sequence of the North-European Cucumber (Cucumis sativus L.) Unravels Evolutionary Adaptation Mechanisms in Plants

PubMed Central

Wóycicki, Rafał; Witkowicz, Justyna; Gawroński, Piotr; Dąbrowska, Joanna; Lomsadze, Alexandre; Pawełkowicz, Magdalena; Siedlecka, Ewa; Yagi, Kohei; Pląder, Wojciech; Seroczyńska, Anna; Śmiech, Mieczysław; Gutman, Wojciech; Niemirowicz-Szczytt, Katarzyna; Bartoszewski, Grzegorz; Tagashira, Norikazu; Hoshi, Yoshikazu; Borodovsky, Mark; Karpiński, Stanisław; Malepszy, Stefan; Przybecki, Zbigniew

2011-01-01

Cucumber (Cucumis sativus L.), a widely cultivated crop, originated in the Eastern Himalayas, and its secondary domestication regions include highly divergent climate conditions, e.g. temperate and subtropical. We wanted to uncover adaptive genome differences between cucumber cultivars and to determine what sort of evolutionary molecular mechanisms regulate the genetic adaptation of plants to different ecosystems and organism biodiversity. Here we present the draft genome sequence of the North-European Borszczagowski cultivar of Cucumis sativus (line B10) and comparative genomics studies with the known genomes of C. sativus (Chinese cultivar – Chinese Long (line 9930)), Arabidopsis thaliana, Populus trichocarpa and Oryza sativa. Cucumber genomes show extensive chromosomal rearrangements and distinct differences in the quantity of particular genes (e.g. those involved in photosynthesis, respiration, sugar metabolism, chlorophyll degradation, regulation of gene expression, photooxidative stress tolerance, tolerance of higher non-optimal temperatures and ammonium ion assimilation), as well as in the distributions of abscisic acid-, dehydration- and ethylene-responsive cis-regulatory elements (CREs) in promoters of orthologous groups of genes, which lead to specific adaptation features. Abscisic acid treatment of non-acclimated Arabidopsis and C. sativus seedlings induced moderate freezing tolerance in Arabidopsis but not in C. sativus. This experiment, together with analysis of abscisic acid-specific CRE distributions, gives a clue as to why C. sativus is much more susceptible to moderate freezing stresses than A. thaliana. Comparative analysis of all five genomes showed that each species and/or cultivar has a specific profile of CRE content in promoters of orthologous genes. 
Our results constitute a substantial and original resource for basic and applied research on environmental adaptations of plants, which could facilitate the creation of new crops with improved growth and yield in divergent conditions. PMID:21829493
The genome sequence of the North-European cucumber (Cucumis sativus L.) unravels evolutionary adaptation mechanisms in plants.

PubMed

Wóycicki, Rafał; Witkowicz, Justyna; Gawroński, Piotr; Dąbrowska, Joanna; Lomsadze, Alexandre; Pawełkowicz, Magdalena; Siedlecka, Ewa; Yagi, Kohei; Pląder, Wojciech; Seroczyńska, Anna; Śmiech, Mieczysław; Gutman, Wojciech; Niemirowicz-Szczytt, Katarzyna; Bartoszewski, Grzegorz; Tagashira, Norikazu; Hoshi, Yoshikazu; Borodovsky, Mark; Karpiński, Stanisław; Malepszy, Stefan; Przybecki, Zbigniew

2011-01-01

Cucumber (Cucumis sativus L.), a widely cultivated crop, originated in the Eastern Himalayas, and its secondary domestication regions include highly divergent climate conditions, e.g. temperate and subtropical. We wanted to uncover adaptive genome differences between cucumber cultivars and to determine what sort of evolutionary molecular mechanisms regulate the genetic adaptation of plants to different ecosystems and organism biodiversity. Here we present the draft genome sequence of the North-European Borszczagowski cultivar of Cucumis sativus (line B10) and comparative genomics studies with the known genomes of C. sativus (Chinese cultivar--Chinese Long (line 9930)), Arabidopsis thaliana, Populus trichocarpa and Oryza sativa. Cucumber genomes show extensive chromosomal rearrangements and distinct differences in the quantity of particular genes (e.g. those involved in photosynthesis, respiration, sugar metabolism, chlorophyll degradation, regulation of gene expression, photooxidative stress tolerance, tolerance of higher non-optimal temperatures and ammonium ion assimilation), as well as in the distributions of abscisic acid-, dehydration- and ethylene-responsive cis-regulatory elements (CREs) in promoters of orthologous groups of genes, which lead to specific adaptation features. Abscisic acid treatment of non-acclimated Arabidopsis and C. sativus seedlings induced moderate freezing tolerance in Arabidopsis but not in C. sativus. This experiment, together with analysis of abscisic acid-specific CRE distributions, gives a clue as to why C. sativus is much more susceptible to moderate freezing stresses than A. thaliana. Comparative analysis of all five genomes showed that each species and/or cultivar has a specific profile of CRE content in promoters of orthologous genes. 
Our results constitute a substantial and original resource for basic and applied research on environmental adaptations of plants, which could facilitate the creation of new crops with improved growth and yield in divergent conditions.
Comparative DNA Methylation Profiling Reveals an Immunoepigenetic Signature of HIV-related Cognitive Impairment

PubMed Central

Corley, Michael J.; Dye, Christian; D’Antoni, Michelle L.; Byron, Mary Margaret; Yo, Kaahukane Leite-Ah; Lum-Jones, Annette; Nakamoto, Beau; Valcour, Victor; SahBandar, Ivo; Shikuma, Cecilia M.; Ndhlovu, Lishomwa C.; Maunakea, Alika K.

2016-01-01

Monocytes/macrophages contribute to the neuropathogenesis of HIV-related cognitive impairment (CI); however, considerable gaps in our understanding of the precise mechanisms driving this relationship remain. Furthermore, whether a distinct biological profile associated with HIV-related CI resides in immune cell populations remains unknown. Here, we profiled DNA methylomes and transcriptomes of monocytes derived from HIV-infected individuals with and without CI using genome-wide DNA methylation and gene expression profiling. We identified 1,032 CI-associated differentially methylated loci in monocytes. These loci related to gene networks linked to the central nervous system (CNS) and interactions with HIV. Most (70.6%) of these loci exhibited higher DNA methylation states in the CI group and were preferentially distributed over gene bodies and intergenic regions of the genome. CI-associated DNA methylation states at 12 CpG sites were associated with neuropsychological testing performance scores. CI-associated DNA methylation was also associated with gene expression differences, including in the CNS genes CSRNP1 (P = 0.017), DISC1 (P = 0.012), and NR4A2 (P = 0.005), and in a gene known to relate to HIV viremia, THBS1 (P = 0.003). These discovery cohort data unveil cell type-specific DNA methylation patterns related to HIV-associated CI and provide an immunoepigenetic DNA methylation “signature” potentially useful for corroborating clinical assessments, informing pathogenic mechanisms, and revealing new therapeutic targets against CI. PMID:27629381
Peripheral blood gene expression signature differentiates children with autism from unaffected siblings

PubMed Central

Kong, SW; Shimizu-Motohashi, Y; Campbell, MG; Lee, IH; Collins, CD; Brewster, SJ; Holm, IA; Rappaport, L

2013-01-01

Autism spectrum disorder (ASD) is one of the most prevalent neurodevelopmental disorders with high heritability, yet a majority of genetic contribution to pathophysiology is not known. Siblings of individuals with ASD are at increased risk for ASD and autistic traits, but the genetic contribution for simplex families is estimated to be less when compared to multiplex families. To explore the genomic (dis-) similarity between proband and unaffected sibling in simplex families, we used genome-wide gene expression profiles of blood from 20 proband-unaffected sibling pairs and 18 unrelated control individuals. The global gene expression profiles of unaffected siblings were more similar to those from probands as they shared genetic and environmental background. One hundred eighty nine genes were significantly differentially expressed between proband-sib pairs (nominal p-value < 0.01) after controlling for age, sex, and family effects. Probands and siblings were distinguished into two groups by cluster analysis with these genes. Overall, unaffected siblings were equally distant from the centroid of probands and from that of unrelated controls with the differentially expressed genes. Interestingly, 5 of 20 siblings had gene expression profiles that were more similar to unrelated controls than to their matched probands. In summary, we found a set of genes that distinguished probands from the unaffected siblings, and a subgroup of unaffected siblings who were more similar to probands. The pathways that characterized probands compared to siblings using peripheral blood gene expression profiles were the up-regulation of ribosomal, spliceosomal, and mitochondrial pathways, and the down-regulation of neuroreceptor-ligand, immune response and calcium signaling pathways. 
Further integrative study with structural genetic variations such as de novo mutations, rare variants, and copy number variations would clarify whether these transcriptomic changes are structural or environmental in origin. PMID:23625158
Methylation-sensitive amplified polymorphism-based genome-wide analysis of cytosine methylation profiles in Nicotiana tabacum cultivars.

PubMed

Jiao, J; Wu, J; Lv, Z; Sun, C; Gao, L; Yan, X; Cui, L; Tang, Z; Yan, B; Jia, Y

2015-11-26

This study aimed to investigate cytosine methylation profiles in different tobacco (Nicotiana tabacum) cultivars grown in China. Methylation-sensitive amplified polymorphism was used to analyze genome-wide global methylation profiles in four tobacco cultivars (Yunyan 85, NC89, K326, and Yunyan 87). Amplicons with methylated C motifs were cloned by reamplified polymerase chain reaction, sequenced, and analyzed. The results show that geographical location had a greater effect on methylation patterns in the tobacco genome than did sampling time. Analysis of the CG dinucleotide distribution in methylation-sensitive polymorphic restriction fragments suggested that a CpG dinucleotide cluster-enriched area is a possible site of cytosine methylation in the tobacco genome. The sequence alignments of the Nia1 gene (that encodes nitrate reductase) in Yunyan 87 in different regions indicate that a C-T transition might be responsible for the tobacco phenotype. T-C nucleotide replacement might also be responsible for the tobacco phenotype and may be influenced by geographical location.
Comprehensive Genome Profiling of Single Sperm Cells by Multiple Annealing and Looping-Based Amplification Cycles and Next-Generation Sequencing from Carriers of Robertsonian Translocation.

PubMed

Sha, Yanwei; Sha, Yankun; Ji, Zhiyong; Ding, Lu; Zhang, Qing; Ouyang, Honggen; Lin, Shaobin; Wang, Xu; Shao, Lin; Shi, Chong; Li, Ping; Song, Yueqiang

2017-03-01

Robertsonian translocation (RT) is a common cause for male infertility, recurrent pregnancy loss, and birth defects. Studying meiotic recombination in RT-carrier patients helps decipher the mechanism and improve the clinical management of infertility and birth defects caused by RT. Here we present a new method to study spermatogenesis on a single-gamete basis from two RT carriers. By using a combined single-cell whole-genome amplification and sequencing protocol, we comprehensively profiled the chromosomal copy number of 88 single sperms from two RT-carrier patients. With the profiled information, chromosomal aberrations were identified on a whole-genome, per-sperm basis. We found that the previously reported interchromosomal effect might not exist with RT carriers. It is suggested that single-cell genome sequencing enables comprehensive chromosomal aneuploidy screening and provides a powerful tool for studying gamete generation from patients carrying chromosomal diseases. © 2017 John Wiley & Sons Ltd/University College London.
Stratification of co-evolving genomic groups using ranked phylogenetic profiles

PubMed Central

Freilich, Shiri; Goldovsky, Leon; Gottlieb, Assaf; Blanc, Eric; Tsoka, Sophia; Ouzounis, Christos A

2009-01-01

Background: Previous methods of detecting the taxonomic origins of arbitrary sequence collections, with a significant impact to genome analysis and in particular metagenomics, have primarily focused on compositional features of genomes. The evolutionary patterns of phylogenetic distribution of genes or proteins, represented by phylogenetic profiles, provide an alternative approach for the detection of taxonomic origins, but typically suffer from low accuracy. Herein, we present rank-BLAST, a novel approach for the assignment of protein sequences into genomic groups of the same taxonomic origin, based on the ranking order of phylogenetic profiles of target genes or proteins across the reference database. Results: The rank-BLAST approach is validated by computing the phylogenetic profiles of all sequences for five distinct microbial species of varying degrees of phylogenetic proximity, against a reference database of 243 fully sequenced genomes. The approach - a combination of sequence searches, statistical estimation and clustering - analyses the degree of sequence divergence between sets of protein sequences and allows the classification of protein sequences according to the species of origin with high accuracy, allowing taxonomic classification of 64% of the proteins studied. In most cases, a main cluster is detected, representing the corresponding species. Secondary, functionally distinct and species-specific clusters exhibit different patterns of phylogenetic distribution, thus flagging gene groups of interest. Detailed analyses of such cases are provided as examples. Conclusion: Our results indicate that the rank-BLAST approach can capture the taxonomic origins of sequence collections in an accurate and efficient manner. The approach can be useful both for the analysis of genome evolution and the detection of species groups in metagenomics samples. PMID:19860884
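The idea of comparing proteins by the ranking order of their phylogenetic profiles can be illustrated with a small Python sketch. This is not the rank-BLAST implementation, whose statistical estimation and clustering steps are not described in the abstract. It only shows, under assumed toy similarity scores, how each protein's hits against reference genomes can be turned into a ranked profile and how two ranked profiles can be compared with Spearman's footrule distance, a measure chosen here purely for illustration. All protein names, genome names and scores are hypothetical.

```python
from itertools import combinations

def rank_profile(scores):
    """Order reference genomes from best to worst hit score (ties broken by name)."""
    return [g for g, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]

def footrule_distance(order_a, order_b):
    """Spearman's footrule: sum of absolute rank differences between two orderings."""
    position_b = {genome: i for i, genome in enumerate(order_b)}
    return sum(abs(i - position_b[genome]) for i, genome in enumerate(order_a))

# Hypothetical similarity scores of three query proteins against four reference genomes.
hits = {
    "protA": {"g1": 900, "g2": 450, "g3": 120, "g4": 60},
    "protB": {"g1": 870, "g2": 500, "g3": 90, "g4": 75},
    "protC": {"g1": 50, "g2": 110, "g3": 640, "g4": 910},
}

profiles = {protein: rank_profile(scores) for protein, scores in hits.items()}
distances = {
    (a, b): footrule_distance(profiles[a], profiles[b])
    for a, b in combinations(sorted(profiles), 2)
}

for pair, d in distances.items():
    print(pair, d)
```

In this toy input, protA and protB rank the reference genomes identically and would fall into the same group, while protC ranks them in the opposite order, which is the kind of signal a clustering step could use to separate sequences of different taxonomic origin.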
Soybean seed extracts preferentially express genomic loci of Bradyrhizobium japonicum in the initial interaction with soybean, Glycine max (L.) Merr.

PubMed

Wei, Min; Yokoyama, Tadashi; Minamisawa, Kiwamu; Mitsui, Hisayuki; Itakura, Manabu; Kaneko, Takakazu; Tabata, Satoshi; Saeki, Kazuhiko; Omori, Hirofumi; Tajima, Shigeyuki; Uchiumi, Toshiki; Abe, Mikiko; Ohwada, Takuji

2008-08-01

Initial interaction between rhizobia and legumes starts with encounters between the two partners in the rhizosphere. In this study, the global expression profiles of Bradyrhizobium japonicum USDA 110 in response to soybean (Glycine max) seed extracts (SSE) and to genistein, a major soybean-released isoflavone that induces nod genes in B. japonicum, were compared. SSE induced more genomic loci than genistein (5.0 µM), although the SSE-supplemented medium contained 4.7 µM genistein. SSE markedly induced four predominant genomic regions within a large symbiosis island (681 kb), which include tts genes (type III secretion system) and various nod genes. In addition, SSE-treated cells expressed many genomic loci, even outside the symbiosis island, containing genes for polygalacturonase (cell-wall degradation), exopolysaccharide synthesis, 1-aminocyclopropane-1-carboxylate deaminase, the ribosomal protein family and energy metabolism. In contrast, genistein-treated cells showed only one expression cluster, including the common nod gene operon, within the symbiosis island and six expression loci, including multidrug resistance, which were shared with SSE-treated cells. Twelve putatively regulated genes were validated by quantitative RT-PCR. Several SSE-induced genomic loci likely participate in the initial interaction with legumes. Thus, these results provide basic knowledge for screening novel genes relevant to the B. japonicum-soybean symbiosis.
Comparative analysis of field-isolate and monkey-adapted Plasmodium vivax genomes.

PubMed

Chan, Ernest R; Barnwell, John W; Zimmerman, Peter A; Serre, David

2015-03-01

Significant insights into the biology of Plasmodium vivax have been gained from the ability to successfully adapt human infections to non-human primates. P. vivax strains grown in monkeys serve as a renewable source of parasites for in vitro and ex vivo experimental studies and functional assays, or for studying in vivo the relapse characteristics, mosquito species compatibilities, drug susceptibility profiles or immune responses towards potential vaccine candidates. Despite the importance of these studies, little is known as to how adaptation to a different host species may influence the genome of P. vivax. In addition, it is unclear whether these monkey-adapted strains consist of a single clonal population of parasites or if they retain the multiclonal complexity commonly observed in field isolates. Here we compare the genome sequences of seven P. vivax strains adapted to New World monkeys with those of six human clinical isolates collected directly in the field. We show that the adaptation of P. vivax parasites to monkey hosts, and their subsequent propagation, did not result in significant modifications of their genome sequence and that these monkey-adapted strains recapitulate the genomic diversity of field isolates. Our analyses also reveal that these strains are not always genetically homogeneous and should be analyzed cautiously. Overall, our study provides a framework to better leverage this important research material and fully utilize this resource for improving our understanding of P. vivax biology.
Comparative Analysis of Field-Isolate and Monkey-Adapted Plasmodium vivax Genomes

PubMed Central

Chan, Ernest R.; Barnwell, John W.; Zimmerman, Peter A.; Serre, David

2015-01-01

Significant insights into the biology of Plasmodium vivax have been gained from the ability to successfully adapt human infections to non-human primates. P. vivax strains grown in monkeys serve as a renewable source of parasites for in vitro and ex vivo experimental studies and functional assays, or for studying in vivo the relapse characteristics, mosquito species compatibilities, drug susceptibility profiles or immune responses towards potential vaccine candidates. Despite the importance of these studies, little is known as to how adaptation to a different host species may influence the genome of P. vivax. In addition, it is unclear whether these monkey-adapted strains consist of a single clonal population of parasites or if they retain the multiclonal complexity commonly observed in field isolates. Here we compare the genome sequences of seven P. vivax strains adapted to New World monkeys with those of six human clinical isolates collected directly in the field. We show that the adaptation of P. vivax parasites to monkey hosts, and their subsequent propagation, did not result in significant modifications of their genome sequence and that these monkey-adapted strains recapitulate the genomic diversity of field isolates. Our analyses also reveal that these strains are not always genetically homogeneous and should be analyzed cautiously. Overall, our study provides a framework to better leverage this important research material and fully utilize this resource for improving our understanding of P. vivax biology. PMID:25768941
Source attribution of human campylobacteriosis at the point of exposure by combining comparative exposure assessment and subtype comparison based on comparative genomic fingerprinting.

PubMed

Ravel, André; Hurst, Matt; Petrica, Nicoleta; David, Julie; Mutschall, Steven K; Pintar, Katarina; Taboada, Eduardo N; Pollari, Frank

2017-01-01

Human campylobacteriosis is a common zoonosis with a significant burden in many countries. Its prevention is difficult because humans can be exposed to Campylobacter through various exposures: foodborne, waterborne or by contact with animals. This study aimed at attributing campylobacteriosis to sources at the point of exposure. It combined comparative exposure assessment and microbial subtype comparison with subtypes defined by comparative genomic fingerprinting (CGF). It used isolates from clinical cases and from eight potential exposure sources (chicken, cattle and pig manure, retail chicken, beef, pork and turkey meat, and surface water) collected within a single sentinel site of an integrated surveillance system for enteric pathogens in Canada. Overall, 1518 non-human isolates and 250 isolates from domestically-acquired human cases were subtyped and their subtype profiles analyzed for source attribution using two attribution models modified to include exposure. Exposure values were obtained from a concurrent comparative exposure assessment study undertaken in the same area. Based on CGF profiles, attribution was possible for 198 (79%) human cases. Both models provide comparable figures: chicken meat was the most important source (65-69% of attributable cases) whereas exposure to cattle (manure) ranked second (14-19% of attributable cases), the other sources being minor (including beef meat). In comparison with other attributions conducted at the point of production, the study highlights the fact that Campylobacter transmission from cattle to humans is rarely meat borne, calling for a closer look at local transmission from cattle to prevent campylobacteriosis, in addition to increasing safety along the chicken supply chain.
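The general idea of weighting subtype matches by exposure can be sketched in Python as follows. This is a deliberately simplified illustration, not either of the two attribution models used in the study, which the abstract does not specify. It assumes that the share of a human case assigned to a source is proportional to the frequency of the case's subtype in that source multiplied by a relative exposure value. The source names, subtype labels, frequencies and exposure values are hypothetical; only the counts 198 and 250 come from the abstract.

```python
def attribute_case(subtype, source_subtype_freq, exposure):
    """Split one human case across sources in proportion to
    (subtype frequency in source) x (relative exposure to source).
    Returns None when the subtype was not found in any source."""
    weights = {
        source: freqs.get(subtype, 0.0) * exposure[source]
        for source, freqs in source_subtype_freq.items()
    }
    total = sum(weights.values())
    if total == 0:
        return None  # not attributable
    return {source: w / total for source, w in weights.items()}

# Hypothetical subtype frequencies per source and relative exposure values.
source_subtype_freq = {
    "chicken_meat": {"st1": 0.30, "st2": 0.10},
    "cattle_manure": {"st1": 0.10, "st3": 0.20},
}
exposure = {"chicken_meat": 2.0, "cattle_manure": 1.0}

shares_st1 = attribute_case("st1", source_subtype_freq, exposure)
shares_unknown = attribute_case("st9", source_subtype_freq, exposure)

# Figure quoted in the abstract: 198 of 250 human cases were attributable.
attributable_fraction = 198 / 250

print(shares_st1)
print(shares_unknown)
print(f"{attributable_fraction:.0%} of human cases attributable")
```

A case whose subtype is absent from every sampled source returns `None`, which mirrors the abstract's distinction between attributable and non-attributable cases.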
ICTV Virus Taxonomy Profile: Virgaviridae

USDA-ARS's Scientific Manuscript database

The family Virgaviridae is comprised of plant-infecting viruses with rod-shaped particles, single stranded RNA genomes with 3' terminal tRNA-like structures, and replication proteins typical of alphalike viruses. Differences in the number of genome components, genome organization and transmission m...
Comparative sequencing analysis reveals high genomic concordance between matched primary and metastatic colorectal cancer lesions.

PubMed

Brannon, A Rose; Vakiani, Efsevia; Sylvester, Brooke E; Scott, Sasinya N; McDermott, Gregory; Shah, Ronak H; Kania, Krishan; Viale, Agnes; Oschwald, Dayna M; Vacic, Vladimir; Emde, Anne-Katrin; Cercek, Andrea; Yaeger, Rona; Kemeny, Nancy E; Saltz, Leonard B; Shia, Jinru; D'Angelica, Michael I; Weiser, Martin R; Solit, David B; Berger, Michael F

2014-08-28

Colorectal cancer is the second leading cause of cancer death in the United States, with over 50,000 deaths estimated in 2014. Molecular profiling for somatic mutations that predict absence of response to anti-EGFR therapy has become standard practice in the treatment of metastatic colorectal cancer; however, the quantity and type of tissue available for testing is frequently limited. Further, the degree to which the primary tumor is a faithful representation of metastatic disease has been questioned. As next-generation sequencing technology becomes more widely available for clinical use and additional molecularly targeted agents are considered as treatment options in colorectal cancer, it is important to characterize the extent of tumor heterogeneity between primary and metastatic tumors. We performed deep coverage, targeted next-generation sequencing of 230 key cancer-associated genes for 69 matched primary and metastatic tumors and normal tissue. Mutation profiles were 100% concordant for KRAS, NRAS, and BRAF, and were highly concordant for recurrent alterations in colorectal cancer. Additionally, whole genome sequencing of four patient trios did not reveal any additional site-specific targetable alterations. Colorectal cancer primary tumors and metastases exhibit high genomic concordance. As current clinical practices in colorectal cancer revolve around KRAS, NRAS, and BRAF mutation status, diagnostic sequencing of either primary or metastatic tissue as available is acceptable for most patients. Additionally, consistency between targeted sequencing and whole genome sequencing results suggests that targeted sequencing may be a suitable strategy for clinical diagnostic applications.
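A per-gene concordance figure of the kind reported here can be computed as the fraction of matched patients whose mutation status agrees between the primary tumor and the metastasis. The Python sketch below assumes simple mutated/not-mutated calls per patient; the patient identifiers, the calls, and the placeholder gene `other_gene` are hypothetical and do not reproduce the study's data or its variant-calling pipeline.

```python
def concordance(primary_calls, metastasis_calls):
    """Fraction of matched patients with the same mutation status in both samples."""
    patients = primary_calls.keys() & metastasis_calls.keys()
    if not patients:
        raise ValueError("no matched patients")
    agreeing = sum(primary_calls[p] == metastasis_calls[p] for p in patients)
    return agreeing / len(patients)

# Hypothetical calls (True = mutated) for four matched patients.
kras_primary = {"pt1": True, "pt2": False, "pt3": True, "pt4": False}
kras_metastasis = {"pt1": True, "pt2": False, "pt3": True, "pt4": False}

other_gene_primary = {"pt1": True, "pt2": True, "pt3": False, "pt4": False}
other_gene_metastasis = {"pt1": True, "pt2": False, "pt3": False, "pt4": False}

kras_concordance = concordance(kras_primary, kras_metastasis)
other_concordance = concordance(other_gene_primary, other_gene_metastasis)

print(f"KRAS: {kras_concordance:.0%}, other_gene: {other_concordance:.0%}")
```

Only patients present in both dictionaries are compared, so an unmatched sample does not count as discordant.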
The rubber tree genome shows expansion of gene family associated with rubber biosynthesis.

PubMed

Lau, Nyok-Sean; Makita, Yuko; Kawashima, Mika; Taylor, Todd D; Kondo, Shinji; Othman, Ahmad Sofiman; Shu-Chien, Alexander Chong; Matsui, Minami

2016-06-24

Hevea brasiliensis Muell. Arg, a member of the family Euphorbiaceae, is the sole natural resource exploited for commercial production of high-quality natural rubber. The properties of natural rubber latex are almost irreplaceable by synthetic counterparts for many industrial applications. A paucity of knowledge on the molecular mechanisms of rubber biosynthesis in high yield traits still persists. Here we report the comprehensive genome-wide analysis of the widely planted H. brasiliensis clone, RRIM 600. The genome was assembled based on ~155-fold combined coverage with Illumina and PacBio sequence data and has a total length of 1.55 Gb with 72.5% comprising repetitive DNA sequences. A total of 84,440 high-confidence protein-coding genes were predicted. Comparative genomic analysis revealed strong synteny between H. brasiliensis and other Euphorbiaceae genomes. Our data suggest that H. brasiliensis's capacity to produce high levels of latex can be attributed to the expansion of rubber biosynthesis-related genes in its genome and the high expression of these genes in latex. Using cap analysis gene expression data, we illustrate the tissue-specific transcription profiles of rubber biosynthesis-related genes, revealing alternative means of transcriptional regulation. Our study adds to the understanding of H. brasiliensis biology and provides valuable genomic resources for future agronomic-related improvement of the rubber tree.

Genomic predictive model for recurrence and metastasis development in head and neck squamous cell carcinoma patients.

PubMed

Ribeiro, Ilda Patrícia; Caramelo, Francisco; Esteves, Luísa; Menoita, Joana; Marques, Francisco; Barroso, Leonor; Miguéis, Jorge; Melo, Joana Barbosa; Carreira, Isabel Marques

2017-10-24

The head and neck squamous cell carcinoma (HNSCC) population consists mainly of patients at high risk of recurrence and with locally advanced-stage disease. Increased knowledge of the HNSCC genomic profile can improve early diagnosis and treatment outcomes. The development of models to identify consistent genomic patterns that distinguish HNSCC patients who will recur and/or develop metastasis after treatment is of utmost importance to decrease mortality and improve survival rates. In this study, we used array comparative genomic hybridization data from HNSCC patients to implement a robust model to predict HNSCC recurrence/metastasis. This predictive model showed good accuracy (>80%) and was validated in an independent population from the TCGA data portal. This predictive genomic model comprises chromosomal regions from 5p, 6p, 8p, 9p, 11q, 12q, 15q and 17p, where several upstream and downstream members of signaling pathways that lead to an increase in cell proliferation and invasion are mapped. The introduction of genomic predictive models in clinical practice might contribute to a more individualized clinical management of HNSCC patients, reducing recurrences and improving patients' quality of life. The power of this genomic model to predict recurrence and the development of metastases should be evaluated in other HNSCC populations.
Impact of ambient and supplemental ultraviolet-B stress on kidney bean plants: an insight into oxidative stress management.

PubMed

Singh, Suruchi; Sarkar, Abhijit; Agrawal, S B; Agrawal, Madhoolika

2014-11-01

In the present study, the response of kidney bean (Phaseolus vulgaris L. cv. Pusa Komal) plants was evaluated under three different levels of ultraviolet-B (UV-B), i.e., excluded UV-B (eUV-B), ambient UV-B (aUV-B; 5.8 kJ m(-2) day(-1)), and supplemental UV-B (sUV-B; 280-315 nm; ambient + 7.2 kJ m(-2) day(-1)), under near-natural conditions. eUV-B treatment clearly demonstrated that both aUV-B and sUV-B are capable of causing significant changes in the plant's growth, metabolism, economic yield, genome template stability, total protein, and antioxidative enzyme profiles. The experimental findings showed maximum plant height at eUV-B, but biomass accumulation was minimum. Significant reductions in quantum yield (Fv/Fm) were observed under both aUV-B and sUV-B, as compared to eUV-B. UV-B-absorbing flavonoids increased under higher UV-B exposures with consequent increments in phenylalanine ammonia lyase (PAL) activities. The final yield was significantly higher in plants grown under eUV-B, compared to those under aUV-B and sUV-B. Total protein profile through sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) and analysis of isoenzymes, like superoxide dismutase (SOD), peroxidase (POX), catalase (CAT), ascorbate peroxidase (APX), guaiacol peroxidase (GPX), and glutathione reductase (GR), through native PAGE revealed major changes in the leaf proteome under aUV-B and sUV-B, depicting induction of some major stress-related proteins. The random amplified polymorphic DNA (RAPD) profile of genomic DNA also indicated a significant reduction of genome template stability under UV-B exposure. Thus, it can be inferred that more energy is diverted for inducing protection mechanisms rather than utilizing it for growth under high UV-B level.
DNA methylation profiles of donor nuclei cells and tissues of cloned bovine fetuses.

PubMed

Kremenskoy, Maksym; Kremenska, Yuliya; Suzuki, Masako; Imai, Kei; Takahashi, Seiya; Hashizume, Kazuyoshi; Yagi, Shintaro; Shiota, Kunio

2006-04-01

Methylation of DNA in CpG islands plays an important role during fetal development and differentiation because CpG islands are preferentially located in upstream regions of mammalian genomic DNA, including the transcription start site of housekeeping genes and are also associated with tissue-specific genes. Somatic nuclear transfer (NT) technology has been used to generate live clones in numerous mammalian species, but only a low percentage of nuclear transferred animals develop to term. Abnormal epigenetic changes in the CpG islands of donor nuclei after nuclear transfer could contribute to a high rate of abortion during early gestation and increase perinatal death. These changes have yet to be explored. Thus, we investigated the genome-wide DNA methylation profiles of CpG islands in nuclei donor cells and NT animals. Using Restriction Landmark Genomic Scanning (RLGS), we showed, for the first time, the epigenetic profile formation of tissues from NT bovine fetuses produced from cumulus cells. From approximately 2600 unmethylated NotI sites visualized on the RLGS profile, at least 35 NotI sites showed different methylation statuses. Moreover, we proved that fetal and placental tissues from artificially inseminated and cloned cattle have tissue-specific differences in the genome-wide methylation profiles of the CpG islands. We also found that possible abnormalities occurred in the fetal brain and placental tissues of cloned animals.
Precision medicine for advanced prostate cancer

PubMed Central

Mullane, Stephanie A.; Van Allen, Eliezer M.

2016-01-01

Purpose of review Precision cancer medicine, the use of genomic profiling of patient tumors at the point-of-care to inform treatment decisions, is rapidly changing treatment strategies across cancer types. Precision medicine for advanced prostate cancer may identify new treatment strategies and change clinical practice. In this review, we discuss the potential and challenges of precision medicine in advanced prostate cancer. Recent findings Although primary prostate cancers do not harbor highly recurrent targetable genomic alterations, recent reports on the genomics of metastatic castration-resistant prostate cancer have shown multiple targetable alterations in castration-resistant prostate cancer metastatic biopsies. Therapeutic implications include targeting prevalent DNA repair pathway alterations with PARP-1 inhibition in genomically defined subsets of patients, among other genomically stratified targets. In addition, multiple recent efforts have demonstrated the promise of liquid tumor profiling (e.g., profiling circulating tumor cells or cell-free tumor DNA) and highlighted the necessary steps to scale these approaches in prostate cancer. Summary Although still in the initial phase of precision medicine for prostate cancer, there is extraordinary potential for clinical impact. Efforts to overcome current scientific and clinical barriers will enable widespread use of precision medicine approaches for advanced prostate cancer patients. PMID:26909474
Precision medicine for advanced prostate cancer.

PubMed

Mullane, Stephanie A; Van Allen, Eliezer M

2016-05-01

Precision cancer medicine, the use of genomic profiling of patient tumors at the point-of-care to inform treatment decisions, is rapidly changing treatment strategies across cancer types. Precision medicine for advanced prostate cancer may identify new treatment strategies and change clinical practice. In this review, we discuss the potential and challenges of precision medicine in advanced prostate cancer. Although primary prostate cancers do not harbor highly recurrent targetable genomic alterations, recent reports on the genomics of metastatic castration-resistant prostate cancer have shown multiple targetable alterations in castration-resistant prostate cancer metastatic biopsies. Therapeutic implications include targeting prevalent DNA repair pathway alterations with PARP-1 inhibition in genomically defined subsets of patients, among other genomically stratified targets. In addition, multiple recent efforts have demonstrated the promise of liquid tumor profiling (e.g., profiling circulating tumor cells or cell-free tumor DNA) and highlighted the necessary steps to scale these approaches in prostate cancer. Although still in the initial phase of precision medicine for prostate cancer, there is extraordinary potential for clinical impact. Efforts to overcome current scientific and clinical barriers will enable widespread use of precision medicine approaches for advanced prostate cancer patients.
Reconstruction of Tissue-Specific Metabolic Networks Using CORDA

PubMed Central

Schultz, André; Qutub, Amina A.

2016-01-01

Human metabolism involves thousands of reactions and metabolites. To interpret this complexity, computational modeling becomes an essential experimental tool. One of the most popular techniques to study human metabolism as a whole is genome scale modeling. A key challenge to applying genome scale modeling is identifying critical metabolic reactions across diverse human tissues. Here we introduce a novel algorithm called Cost Optimization Reaction Dependency Assessment (CORDA) to build genome scale models in a tissue-specific manner. CORDA performs more efficiently computationally, shows better agreement to experimental data, and displays better model functionality and capacity when compared to previous algorithms. CORDA also returns reaction associations that can greatly assist in any manual curation to be performed following the automated reconstruction process. Using CORDA, we developed a library of 76 healthy and 20 cancer tissue-specific reconstructions. These reconstructions identified which metabolic pathways are shared across diverse human tissues. Moreover, we identified changes in reactions and pathways that are differentially included and present different capacity profiles in cancer compared to healthy tissues, including up-regulation of folate metabolism, the down-regulation of thiamine metabolism, and tight regulation of oxidative phosphorylation. PMID:26942765
Transcriptome profiling analysis of Vibrio vulnificus during human infection.

PubMed

Bisharat, Naiel; Bronstein, Michal; Korner, Mira; Schnitzer, Temima; Koton, Yael

2013-09-01

Vibrio vulnificus is a waterborne pathogen that was responsible for an outbreak of severe soft-tissue infections among fish farmers and fish consumers in Israel. Several factors have been shown to be associated with virulence. However, the transcriptome profile of the pathogen during human infection has not been determined yet. We compared the transcriptome profile, using RNA sequencing, of a human-pathogenic strain harvested directly from tissue of a patient suffering from severe soft-tissue infection with necrotizing fasciitis, with the same strain and three other environmental strains grown in vitro. The five sequenced libraries were aligned to the reference genomes of V. vulnificus strains CMCP6 and YJ016. Approximately 47.8 to 62.3 million paired-end raw reads were generated from the five runs. Nearly 84 % of the genome was covered by reads from at least one of the five runs, suggesting that nearly 16 % of the genome is not transcribed or is transcribed at low levels. We identified 123 genes that were differentially expressed during the acute phase of infection. Sixty-three genes were mapped to the large chromosome, 47 genes mapped to the small chromosome and 13 genes mapped to the YJ016 plasmid. The 123 genes fell into a variety of functional categories including transcription, signal transduction, cell motility, carbohydrate metabolism, intracellular trafficking and cell envelope biogenesis. Among the genes differentially expressed during human infection we identified genes encoding bacterial toxin (RtxA1) and genes involved in flagellar components, Flp-coding region, GGDEF family protein, iron acquisition system and sialic acid metabolism.
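The "covered by reads from at least one of the five runs" figure in the abstract above is a union-of-coverage calculation. The following is a minimal sketch of that idea, not the authors' pipeline: it assumes each run has already been reduced to a list of covered half-open intervals, and the function names, toy intervals, and genome length are all hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open intervals (start, end)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous block: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def fraction_covered(runs, genome_length):
    """Fraction of the genome covered by at least one run."""
    all_intervals = [iv for run in runs for iv in run]
    covered = sum(end - start for start, end in merge_intervals(all_intervals))
    return covered / genome_length


# Hypothetical toy data: three runs on a 1,000 bp "genome".
toy_runs = [
    [(0, 300), (500, 600)],
    [(250, 400)],
    [(580, 700), (900, 940)],
]
print(fraction_covered(toy_runs, genome_length=1000))  # 0.64
```

Pooling intervals before merging avoids double-counting positions covered in more than one run; the uncovered remainder (here 36%) corresponds to the untranscribed or low-expression fraction described in the abstract.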
Company Profile: AKESOgen, Inc.

PubMed

Bouzyk, Mark; Boisjoli, Robert

2012-07-01

Rapid advancement of genomic, genetic and bioinformatic technologies has paved the way for an explosion of opportunities in pharmacogenomics, which is reflected by the growing number of biomarkers in the 'personalized medicine cabinet'. AKESOgen, Inc. (GA, USA) has been established to meet and champion these needs. AKESOgen, Inc. is a biomarker, genomics and pharmacogenomics contract research organization that services the academic, pharmaceutical, biotechnology and agricultural sectors. AKESOgen, Inc. performs biomarker profiling and genomics services utilizing different types of markers (e.g., DNA, RNA and methylation) for the research and development market. AKESOgen, Inc. establishes and validates biomarkers in the clinical trials arena and provides expertise in biobanking.
Copy number gain at 8q12.1-q22.1 is associated with a malignant tumor phenotype in salivary gland myoepitheliomas.

PubMed

Vékony, Hedy; Röser, Kerstin; Löning, Thomas; Ylstra, Bauke; Meijer, Gerrit A; van Wieringen, Wessel N; van de Wiel, Mark A; Carvalho, Beatriz; Kok, Klaas; Leemans, C René; van der Waal, Isaäc; Bloemena, Elisabeth

2009-02-01

Salivary gland myoepithelial tumors are relatively uncommon tumors with an unpredictable clinical course. More knowledge about their genetic profiles is necessary to identify novel predictors of disease. In this study, we subjected 27 primary tumors (15 myoepitheliomas and 12 myoepithelial carcinomas) to genome-wide microarray-based comparative genomic hybridization (array CGH). We set out to delineate known chromosomal aberrations in more detail and to unravel chromosomal differences between benign myoepitheliomas and myoepithelial carcinomas. Patterns of DNA copy number aberrations were analyzed by unsupervised hierarchical cluster analysis. Both benign and malignant tumors revealed a limited number of chromosomal alterations (median of 5 and 7.5, respectively). In both tumor groups, high frequency gains (≥20%) were found mainly at loci of growth factors and growth factor receptors (e.g., PDGF, FGF(R)s, and EGFR). In myoepitheliomas, high frequency losses (≥20%) were detected at regions of proto-cadherins. Cluster analysis of the array CGH data identified three clusters. Differential copy numbers on chromosome arm 8q and chromosome 17 set the clusters apart. Cluster 1 contained a mixture of the two phenotypes (n = 10), cluster 2 included mostly benign tumors (n = 10), and cluster 3 only contained carcinomas (n = 7). Supervised analysis between malignant and benign tumors revealed a 36 Mbp-region at 8q being more frequently gained in malignant tumors (P = 0.007, FDR = 0.05). This is the first study investigating genomic differences between benign and malignant myoepithelial tumors of the salivary glands at a genomic level. Both unsupervised and supervised analysis of the genomic profiles revealed chromosome arm 8q to be involved in the malignant phenotype of salivary gland myoepitheliomas.
Chromatin Landscapes of Retroviral and Transposon Integration Profiles

PubMed Central

Badhai, Jitendra; Rust, Alistair G.; Rad, Roland; Hilkens, John; Berns, Anton; van Lohuizen, Maarten; Wessels, Lodewyk F. A.; de Ridder, Jeroen

2014-01-01

The ability of retroviruses and transposons to insert their genetic material into host DNA makes them widely used tools in molecular biology, cancer research and gene therapy. However, these systems have biases that may strongly affect research outcomes. To address this issue, we generated very large datasets consisting of to unselected integrations in the mouse genome for the Sleeping Beauty (SB) and piggyBac (PB) transposons, and the Mouse Mammary Tumor Virus (MMTV). We analyzed (epi)genomic features to generate bias maps at both local and genome-wide scales. MMTV showed a remarkably uniform distribution of integrations across the genome. More distinct preferences were observed for the two transposons, with PB showing remarkable resemblance to bias profiles of the Murine Leukemia Virus. Furthermore, we present a model where target site selection is directed at multiple scales. At a large scale, target site selection is similar across systems, and defined by domain-oriented features, namely expression of proximal genes, proximity to CpG islands and to genic features, chromatin compaction and replication timing. Notable differences between the systems are mainly observed at smaller scales, and are directed by a diverse range of features. To study the effect of these biases on integration sites occupied under selective pressure, we turned to insertional mutagenesis (IM) screens. In IM screens, putative cancer genes are identified by finding frequently targeted genomic regions, or Common Integration Sites (CISs). Within three recently completed IM screens, we identified 7%–33% putative false positive CISs, which are likely not the result of the oncogenic selection process. Moreover, results indicate that PB, compared to SB, is more suited to tag oncogenes. PMID:24721906
Interplay between DNA methylation, histone modification and chromatin remodeling in stem cells and during development.

PubMed

Ikegami, Kohta; Ohgane, Jun; Tanaka, Satoshi; Yagi, Shintaro; Shiota, Kunio

2009-01-01

Genes constitute only a small proportion of the mammalian genome, the majority of which is composed of non-genic repetitive elements including interspersed repeats and satellites. A unique feature of the mammalian genome is that there are numerous tissue-dependent, differentially methylated regions (T-DMRs) in the non-repetitive sequences, which include genes and their regulatory elements. The epigenetic status of T-DMRs varies from that of repetitive elements and constitutes the DNA methylation profile genome-wide. Since the DNA methylation profile is specific to each cell and tissue type, much like a fingerprint, it can be used as a means of identification. The formation of DNA methylation profiles is the basis for cell differentiation and development in mammals. The epigenetic status of each T-DMR is regulated by the interplay between DNA methyltransferases, histone modification enzymes, histone subtypes, non-histone nuclear proteins and non-coding RNAs. In this review, we will discuss how these epigenetic factors cooperate to establish cell- and tissue-specific DNA methylation profiles.
Differential DNA methylation profiles of coding and non-coding genes define hippocampal sclerosis in human temporal lobe epilepsy

PubMed Central

Miller-Delaney, Suzanne F.C.; Bryan, Kenneth; Das, Sudipto; McKiernan, Ross C.; Bray, Isabella M.; Reynolds, James P.; Gwinn, Ryder; Stallings, Raymond L.

2015-01-01

Temporal lobe epilepsy is associated with large-scale, wide-ranging changes in gene expression in the hippocampus. Epigenetic changes to DNA are attractive mechanisms to explain the sustained hyperexcitability of chronic epilepsy. Here, through methylation analysis of all annotated C-phosphate-G islands and promoter regions in the human genome, we report a pilot study of the methylation profiles of temporal lobe epilepsy with or without hippocampal sclerosis. Furthermore, by comparative analysis of expression and promoter methylation, we identify methylation sensitive non-coding RNA in human temporal lobe epilepsy. A total of 146 protein-coding genes exhibited altered DNA methylation in temporal lobe epilepsy hippocampus (n = 9) when compared to control (n = 5), with 81.5% of the promoters of these genes displaying hypermethylation. Unique methylation profiles were evident in temporal lobe epilepsy with or without hippocampal sclerosis, in addition to a common methylation profile regardless of pathology grade. Gene ontology terms associated with development, neuron remodelling and neuron maturation were over-represented in the methylation profile of Watson Grade 1 samples (mild hippocampal sclerosis). In addition to genes associated with neuronal, neurotransmitter/synaptic transmission and cell death functions, differential hypermethylation of genes associated with transcriptional regulation was evident in temporal lobe epilepsy, but overall few genes previously associated with epilepsy were among the differentially methylated. Finally, a panel of 13, methylation-sensitive microRNA were identified in temporal lobe epilepsy including MIR27A, miR-193a-5p (MIR193A) and miR-876-3p (MIR876), and the differential methylation of long non-coding RNA documented for the first time. The present study therefore reports select, genome-wide DNA methylation changes in human temporal lobe epilepsy that may contribute to the molecular architecture of the epileptic brain. 
PMID:25552301
Watson for Genomics: Moving Personalized Medicine Forward.

PubMed

Rhrissorrakrai, Kahn; Koyama, Takahiko; Parida, Laxmi

2016-08-01

The confluence of genomic technologies and cognitive computing has brought us to the doorstep of widespread usage of personalized medicine. Cognitive systems, such as Watson for Genomics (WG), integrate massive amounts of new omic data with the current body of knowledge to assist physicians in analyzing and acting on patients' genomic profiles.
Modular assembly of transposable element arrays by microsatellite targeting in the guayule and rice genomes.

PubMed

Valdes Franco, José A; Wang, Yi; Huo, Naxin; Ponciano, Grisel; Colvin, Howard A; McMahan, Colleen M; Gu, Yong Q; Belknap, William R

2018-04-19

Guayule (Parthenium argentatum A. Gray) is a rubber-producing desert shrub native to Mexico and the United States. Guayule represents an alternative to Hevea brasiliensis as a source for commercial natural rubber. The efficient application of modern molecular/genetic tools to guayule improvement requires characterization of its genome. The 1.6 Gb guayule genome was sequenced, assembled and annotated. The final 1.5 Gb assembly, while fragmented (N50 = 22 kb), maps > 95% of the shotgun reads and is essentially complete. Approximately 40,000 transcribed, protein encoding genes were annotated on the assembly. Further characterization of this genome revealed 15 families of small, microsatellite-associated, transposable elements (TEs) with unexpected chromosomal distribution profiles. These SaTar (Satellite Targeted) elements, which are non-autonomous Mu-like elements (MULEs), were frequently observed in multimeric linear arrays of unrelated individual elements within which no individual element is interrupted by another. This uniformly non-nested TE multimer architecture has not been previously described in either eukaryotic or prokaryotic genomes. Five families of similarly distributed non-autonomous MULEs (microsatellite associated, modularly assembled) were characterized in the rice genome. Families of TEs with similar structures and distribution profiles were identified in sorghum and citrus. The sequencing and assembly of the guayule genome provides a foundation for application of current crop improvement technologies to this plant. In addition, characterization of this genome revealed SaTar elements with distribution profiles unique among TEs. SaTar targeting appears based on an alternative MULE recombination mechanism with the potential to impact gene evolution.
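The N50 value quoted in the abstract above is a standard assembly contiguity statistic: the contig length at which the contigs of that length or longer together account for at least half of the total assembled bases. The sketch below illustrates the calculation only; the function name and the toy contig lengths are hypothetical and are not taken from the guayule assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the cumulative sum to half of
        # the assembly defines the N50.
        if running >= half_total:
            return length


# Hypothetical toy assembly: total 300 bp, so half is 150 bp.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 70
```

An N50 of 22 kb on a 1.5 Gb assembly therefore means that half of the assembled sequence lies in contigs or scaffolds of 22 kb or longer, which is why the abstract describes the assembly as fragmented.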
Integrated molecular portrait of non-small cell lung cancers

PubMed Central

2013-01-01

Background Non-small cell lung cancer (NSCLC), a leading cause of cancer deaths, represents a heterogeneous group of neoplasms, mostly comprising squamous cell carcinoma (SCC), adenocarcinoma (AC) and large-cell carcinoma (LCC). The objectives of this study were to utilize integrated genomic data including copy-number alteration, mRNA, microRNA expression and candidate-gene full sequencing data to characterize the molecular distinctions between AC and SCC. Methods Comparative genomic hybridization followed by mutational analysis, gene expression and miRNA microarray profiling were performed on 123 paired tumor and non-tumor tissue samples from patients with NSCLC. Results At DNA, mRNA and miRNA levels we could identify molecular markers that discriminated significantly between the various histopathological entities of NSCLC. We identified 34 genomic clusters using aCGH data; several genes exhibited a different profile of aberrations between AC and SCC, including PIK3CA, SOX2, THPO, TP63, PDGFB genes. Gene expression profiling analysis identified SPP1, CTHRC1 and GREM1 as potential biomarkers for early diagnosis of the cancer, and SPINK1 and BMP7 to distinguish between AC and SCC in small biopsies or in blood samples. Using an integrated genomics approach we found in recurrently altered regions a list of three potential driver genes, MRPS22, NDRG1 and RNF7, which were consistently over-expressed in amplified regions, had wide-spread correlation with an average of ~800 genes throughout the genome and were highly associated with histological types. Using a network enrichment analysis, the targets of these potential drivers were seen to be involved in DNA replication, cell cycle, mismatch repair, p53 signalling pathway and other lung cancer related signalling pathways, and many immunological pathways. Furthermore, we also identified one potential driver miRNA hsa-miR-944.
Conclusions Integrated molecular characterization of AC and SCC helped identify clinically relevant markers and potential drivers, which are recurrent and stable changes at DNA level that have functional implications at RNA level and have strong association with histological subtypes. PMID:24299561
Transcriptome profiling of the demosponge Amphimedon queenslandica reveals genome-wide events that accompany major life cycle transitions

PubMed Central

2012-01-01

Background The biphasic life cycle with pelagic larva and benthic adult stages is widely observed in the animal kingdom, including the Porifera (sponges), which are the earliest branching metazoans. The demosponge, Amphimedon queenslandica, undergoes metamorphosis from a free-swimming larva into a sessile adult that bears no morphological resemblance to other animals. While the genome of A. queenslandica contains an extensive repertoire of genes very similar to that of complex bilaterians, it is as yet unclear how this is drawn upon to coordinate changing morphological features and ecological demands throughout the sponge life cycle. Results To identify genome-wide events that accompany the pelagobenthic transition in A. queenslandica, we compared global gene expression profiles at four key developmental stages by sequencing the poly(A) transcriptome using SOLiD technology. Large-scale changes in transcription were observed as sponge larvae settled on the benthos and began metamorphosis. Although previous systematics suggest that the only clear homology between Porifera and other animals is in the embryonic and larval stages, we observed extensive use of genes involved in metazoan-associated cellular processes throughout the sponge life cycle. Sponge-specific transcripts are not over-represented in the morphologically distinct adult; rather, many genes that encode typical metazoan features, such as cell adhesion and immunity, are upregulated. Our analysis further revealed gene families with candidate roles in competence, settlement, and metamorphosis in the sponge, including transcription factors, G-protein coupled receptors and other signaling molecules. Conclusions This first genome-wide study of the developmental transcriptome in an early branching metazoan highlights major transcriptional events that accompany the pelagobenthic transition and point to a network of regulatory mechanisms that coordinate changes in morphology with shifting environmental demands. 
Metazoan developmental and structural gene orthologs are well-integrated into the expression profiles at every stage of sponge development, including the adult. The utilization of genes involved in metazoan-associated processes throughout sponge development emphasizes the potential of the genome of the last common ancestor of animals to generate phenotypic complexity. PMID:22646746
Genome-enabled prediction models for yield related traits in chickpea

USDA-ARS's Scientific Manuscript database

Genomic selection (GS) unlike marker-assisted backcrossing (MABC) predicts breeding values of lines using genome-wide marker profiling and allows selection of lines prior to field-phenotyping, thereby shortening the breeding cycle. A collection of 320 elite breeding lines was selected and phenotyped...
From genes to genomes: a new paradigm for studying fungal pathogenesis in Magnaporthe oryzae.

PubMed

Xu, Jin-Rong; Zhao, Xinhua; Dean, Ralph A

2007-01-01

Magnaporthe oryzae is the most destructive fungal pathogen of rice worldwide and because of its amenability to classical and molecular genetic manipulation, availability of a genome sequence, and other resources it has emerged as a leading model system to study host-pathogen interactions. This chapter reviews recent progress toward elucidation of the molecular basis of infection-related morphogenesis, host penetration, invasive growth, and host-pathogen interactions. Related information on genome analysis and genomic studies of plant infection processes is summarized under specific topics where appropriate. Particular emphasis is placed on the role of MAP kinase and cAMP signal transduction pathways and unique features in the genome such as repetitive sequences and expanded gene families. Emerging developments in functional genome analysis through large-scale insertional mutagenesis and gene expression profiling are detailed. The chapter concludes with new prospects in the area of systems biology, such as protein expression profiling, and highlighting remaining crucial information needed to fully appreciate host-pathogen interactions.
Clonal selection in xenografted human T cell acute lymphoblastic leukemia recapitulates gain of malignancy at relapse.

PubMed

Clappier, Emmanuelle; Gerby, Bastien; Sigaux, François; Delord, Marc; Touzri, Farah; Hernandez, Lucie; Ballerini, Paola; Baruchel, André; Pflumio, Françoise; Soulier, Jean

2011-04-11

Genomic studies in human acute lymphoblastic leukemia (ALL) have revealed clonal heterogeneity at diagnosis and clonal evolution at relapse. In this study, we used genome-wide profiling to compare human T cell ALL samples at the time of diagnosis and after engraftment (xenograft) into immunodeficient recipient mice. Compared with paired diagnosis samples, the xenograft leukemia often contained additional genomic lesions in established human oncogenes and/or tumor suppressor genes. Mimicking such genomic lesions by short hairpin RNA-mediated knockdown in diagnosis samples conferred a selective advantage in competitive engraftment experiments, demonstrating that additional lesions can be drivers of increased leukemia-initiating activity. In addition, the xenograft leukemias appeared to arise from minor subclones existing in the patient at diagnosis. Comparison of paired diagnosis and relapse samples showed that, with regard to genetic lesions, xenograft leukemias were more often closer to relapse samples than to bulk diagnosis samples. Moreover, a cell cycle- and mitosis-associated gene expression signature was present in xenograft and relapse samples, and xenograft leukemia exhibited diminished sensitivity to drugs. Thus, the establishment of human leukemia in immunodeficient mice selects and expands a more aggressive malignancy, recapitulating the process of relapse in patients. These findings may contribute to the design of novel strategies to prevent or treat relapse.
FDA Escherichia coli Identification (FDA-ECID) Microarray: a Pangenome Molecular Toolbox for Serotyping, Virulence Profiling, Molecular Epidemiology, and Phylogeny

PubMed Central

Patel, Isha R.; Gangiredla, Jayanthi; Lacher, David W.; Mammel, Mark K.; Jackson, Scott A.; Lampel, Keith A.

2016-01-01

ABSTRACT Most Escherichia coli strains are nonpathogenic. However, for clinical diagnosis and food safety analysis, current identification methods for pathogenic E. coli either are time-consuming and/or provide limited information. Here, we utilized a custom DNA microarray with informative genetic features extracted from 368 sequence sets for rapid and high-throughput pathogen identification. The FDA Escherichia coli Identification (FDA-ECID) platform contains three sets of molecularly informative features that together stratify strain identification and relatedness. First, 53 known flagellin alleles, 103 alleles of wzx and wzy, and 5 alleles of wzm provide molecular serotyping utility. Second, 41,932 probe sets representing the pan-genome of E. coli provide strain-level gene content information. Third, approximately 125,000 single nucleotide polymorphisms (SNPs) of available whole-genome sequences (WGS) were distilled to 9,984 SNPs capable of recapitulating the E. coli phylogeny. We analyzed 103 diverse E. coli strains with available WGS data, including those associated with past foodborne illnesses, to determine robustness and accuracy. The array was able to accurately identify the molecular O and H serotypes, potentially correcting serological failures and providing better resolution for H-nontypeable/nonmotile phenotypes. In addition, molecular risk assessment was possible with key virulence marker identifications. Epidemiologically, each strain had a unique comparative genomic fingerprint that was extended to an additional 507 food and clinical isolates. Finally, a 99.7% phylogenetic concordance was established between microarray analysis and WGS using SNP-level data for advanced genome typing. Our study demonstrates FDA-ECID as a powerful tool for epidemiology and molecular risk assessment with the capacity to profile the global landscape and diversity of E. coli. 
IMPORTANCE This study describes a robust, state-of-the-art platform developed from available whole-genome sequences of E. coli and Shigella spp. by distilling useful signatures for epidemiology and molecular risk assessment into one assay. The FDA-ECID microarray contains features that enable comprehensive molecular serotyping and virulence profiling along with genome-scale genotyping and SNP analysis. Hence, it is a molecular toolbox that stratifies strain identification and pathogenic potential in the contexts of epidemiology and phylogeny. We applied this tool to strains from food, environmental, and clinical sources, resulting in significantly greater phylogenetic and strain-specific resolution than previously reported for available typing methods. PMID:27037122

Seed-effect modeling improves the consistency of genome-wide loss-of-function screens and identifies synthetic lethal vulnerabilities in cancer cells.

PubMed

Jaiswal, Alok; Peddinti, Gopal; Akimov, Yevhen; Wennerberg, Krister; Kuznetsov, Sergey; Tang, Jing; Aittokallio, Tero

2017-06-01

Genome-wide loss-of-function profiling is widely used for systematic identification of genetic dependencies in cancer cells; however, the poor reproducibility of RNA interference (RNAi) screens has been a major concern due to frequent off-target effects. Currently, a detailed understanding of the key factors contributing to the sub-optimal consistency is still lacking, especially on how to improve the reliability of future RNAi screens by controlling for factors that determine their off-target propensity. We performed a systematic, quantitative analysis of the consistency between two genome-wide shRNA screens conducted on a compendium of cancer cell lines, and also compared several gene summarization methods for inferring gene essentiality from shRNA level data. We then devised novel concepts of seed essentiality and shRNA family, based on seed region sequences of shRNAs, to study in-depth the contribution of seed-mediated off-target effects to the consistency of the two screens. We further investigated two seed-sequence properties, seed pairing stability and target abundance, in terms of their capability to minimize the off-target effects in post-screening data analysis. Finally, we applied this novel methodology to identify genetic interactions and synthetic lethal partners of cancer drivers, and confirmed differential essentiality phenotypes by detailed CRISPR/Cas9 experiments. Using the novel concepts of seed essentiality and shRNA family, we demonstrate how genome-wide loss-of-function profiling of a common set of cancer cell lines can be actually made fairly reproducible when considering seed-mediated off-target effects. Importantly, by excluding shRNAs having higher propensity for off-target effects, based on their seed-sequence properties, one can remove noise from the genome-wide shRNA datasets.
As a translational application case, we demonstrate enhanced reproducibility of genetic interaction partners of common cancer drivers, as well as identify novel synthetic lethal partners of a major oncogenic driver, PIK3CA, supported by a complementary CRISPR/Cas9 experiment. We provide practical guidelines for improved design and analysis of genome-wide loss-of-function profiling and demonstrate how this novel strategy can be applied towards improved mapping of genetic dependencies of cancer cells to aid development of targeted anticancer treatments.
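The shRNA family idea in the record above can be illustrated with a minimal sketch. It assumes that the seed is positions 2-8 of the guide sequence (a common convention, not stated in the abstract) and that seed essentiality is the mean score of all shRNAs sharing a seed; the function names, sequences and scores are hypothetical and are not the authors' implementation.

```python
from collections import defaultdict
from statistics import mean

def seed_of(guide, start=1, length=7):
    """Return the seed region; positions 2-8 (1-based) are an assumed convention."""
    return guide[start:start + length]

def group_by_seed(scores):
    """Group shRNA guide sequences into families that share the same seed."""
    families = defaultdict(list)
    for guide in scores:
        families[seed_of(guide)].append(guide)
    return dict(families)

def seed_essentiality(scores):
    """Mean score of every shRNA in a seed family (assumed definition)."""
    return {seed: mean(scores[g] for g in guides)
            for seed, guides in group_by_seed(scores).items()}

# Hypothetical guide sequences and depletion scores, for illustration only.
toy_scores = {
    "UAGCUUAUCAGACUGAUGUUG": -2.0,
    "CAGCUUAUGGGACUGAUCCAA": -1.0,
    "AUUGCACGGUAUCCAUCUGUA": 0.5,
}
families = group_by_seed(toy_scores)
essentiality = seed_essentiality(toy_scores)
```

In this toy input the first two guides share the seed AGCUUAU and so form one family, whose averaged score could then be compared with the gene-level score to flag likely seed-mediated effects.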
Genome-wide comparative analysis of NBS-encoding genes between Brassica species and Arabidopsis thaliana.

PubMed

Yu, Jingyin; Tehrim, Sadia; Zhang, Fengqi; Tong, Chaobo; Huang, Junyan; Cheng, Xiaohui; Dong, Caihua; Zhou, Yanqiu; Qin, Rui; Hua, Wei; Liu, Shengyi

2014-01-03

Plant disease resistance (R) genes with the nucleotide binding site (NBS) play an important role in offering resistance to pathogens. The availability of complete genome sequences of Brassica oleracea and Brassica rapa provides an important opportunity for researchers to identify and characterize NBS-encoding R genes in Brassica species and to compare with analogues in Arabidopsis thaliana based on a comparative genomics approach. However, little is known about the evolutionary fate of NBS-encoding genes in the Brassica lineage after its split from A. thaliana. Here we present genome-wide analysis of NBS-encoding genes in B. oleracea, B. rapa and A. thaliana. Through the employment of HMM search and manual curation, we identified 157, 206 and 167 NBS-encoding genes in B. oleracea, B. rapa and A. thaliana genomes, respectively. Phylogenetic analysis among 3 species classified NBS-encoding genes into 6 subgroups. Tandem duplication and whole genome triplication (WGT) analyses revealed that after WGT of the Brassica ancestor, NBS-encoding homologous gene pairs on triplicated regions in Brassica ancestor were deleted or lost quickly, but NBS-encoding genes in Brassica species experienced species-specific gene amplification by tandem duplication after divergence of B. rapa and B. oleracea. Expression profiling of NBS-encoding orthologous gene pairs indicated the differential expression pattern of retained orthologous gene copies in B. oleracea and B. rapa. Furthermore, evolutionary analysis of CNL type NBS-encoding orthologous gene pairs among 3 species suggested that orthologous genes in B. rapa have undergone stronger negative selection than those in B. oleracea. But for TNL type, there are no significant differences in the orthologous gene pairs between the two species. This study is the first identification and characterization of NBS-encoding genes in B. rapa and B. oleracea based on whole genome sequences.
Through tandem duplication and whole genome triplication analysis in B. oleracea, B. rapa and A. thaliana genomes, our study provides insight into the evolutionary history of NBS-encoding genes after divergence of A. thaliana and the Brassica lineage. These results together with expression pattern analysis of NBS-encoding orthologous genes provide useful resource for functional characterization of these genes and genetic improvement of relevant crops.
Genomic signatures predict migration and spawning failure in wild Canadian salmon.

PubMed

Miller, Kristina M; Li, Shaorong; Kaukinen, Karia H; Ginther, Norma; Hammill, Edd; Curtis, Janelle M R; Patterson, David A; Sierocinski, Thomas; Donnison, Louise; Pavlidis, Paul; Hinch, Scott G; Hruska, Kimberly A; Cooke, Steven J; English, Karl K; Farrell, Anthony P

2011-01-14

Long-term population viability of Fraser River sockeye salmon (Oncorhynchus nerka) is threatened by unusually high levels of mortality as they swim to their spawning areas before they spawn. Functional genomic studies on biopsied gill tissue from tagged wild adults that were tracked through ocean and river environments revealed physiological profiles predictive of successful migration and spawning. We identified a common genomic profile that was correlated with survival in each study. In ocean-tagged fish, a mortality-related genomic signature was associated with a 13.5-fold greater chance of dying en route. In river-tagged fish, the same genomic signature was associated with a 50% increase in mortality before reaching the spawning grounds in one of three stocks tested. At the spawning grounds, the same signature was associated with 3.7-fold greater odds of dying without spawning. Functional analysis raises the possibility that the mortality-related signature reflects a viral infection.
Integrative genomic profiling reveals conserved genetic mechanisms for tumorigenesis in common entities of non-Hodgkin's lymphoma.

PubMed

Green, Michael R; Aya-Bonilla, Carlos; Gandhi, Maher K; Lea, Rod A; Wellwood, Jeremy; Wood, Peter; Marlton, Paula; Griffiths, Lyn R

2011-05-01

Recent developments in genomic technologies have resulted in increased understanding of pathogenic mechanisms and emphasized the importance of central survival pathways. Here, we use a novel bioinformatic based integrative genomic profiling approach to elucidate conserved mechanisms of lymphomagenesis in the three commonest non-Hodgkin's lymphoma (NHL) entities: diffuse large B-cell lymphoma, follicular lymphoma, and B-cell chronic lymphocytic leukemia. By integrating genome-wide DNA copy number analysis and transcriptome profiling of tumor cohorts, we identified genetic lesions present in each entity and highlighted their likely target genes. This revealed a significant enrichment of components of both the apoptosis pathway and the mitogen activated protein kinase pathway, including amplification of the MAP3K12 locus in all three entities, within the set of genes targeted by genetic alterations in these diseases. Furthermore, amplification of 12p13.33 was identified in all three entities and found to target the FOXM1 oncogene. Amplification of FOXM1 was subsequently found to be associated with an increased MYC oncogenic signaling signature, and siRNA-mediated knock-down of FOXM1 resulted in decreased MYC expression and induced G2 arrest. Together, these findings underscore genetic alteration of the MAPK and apoptosis pathways, and genetic amplification of FOXM1 as conserved mechanisms of lymphomagenesis in common NHL entities. Integrative genomic profiling identifies common central survival mechanisms and highlights them as attractive targets for directed therapy. 2011 Wiley-Liss, Inc.
Comparative genomic analysis of Mycobacterium tuberculosis clinical isolates.

PubMed

Liu, Fei; Hu, Yongfei; Wang, Qi; Li, Hong Min; Gao, George F; Liu, Cui Hua; Zhu, Baoli

2014-06-13

Due to excessive antibiotic use, drug-resistant Mycobacterium tuberculosis has become a serious public health threat and a major obstacle to disease control in many countries. To better understand the evolution of drug-resistant M. tuberculosis strains, we performed whole genome sequencing for 7 M. tuberculosis clinical isolates with different antibiotic resistance profiles and conducted comparative genomic analysis of gene variations among them. We observed that all 7 M. tuberculosis clinical isolates with different levels of drug resistance harbored similar numbers of SNPs, ranging from 1409 to 1464. The numbers of insertion/deletions (Indels) identified in the 7 isolates were also similar, ranging from 56 to 101. A total of 39 types of mutations were identified in drug resistance-associated loci, including 14 previously reported ones and 25 newly identified ones. Sixteen of the identified large Indels spanned PE-PPE-PGRS genes, which represents a major source of antigenic variability. Aside from SNPs and Indels, a CRISPR locus with varied spacers was observed in all 7 clinical isolates, suggesting that they might play an important role in plasticity of the M. tuberculosis genome. The nucleotide diversity (π value) and selection intensity (dN/dS value) of the whole genome sequences of the 7 isolates were similar. The dN/dS values were less than 1 for all 7 isolates (range from 0.608885 to 0.637365), supporting the notion that M. tuberculosis genomes undergo purifying selection. The π values and dN/dS values were comparable between drug-susceptible and drug-resistant strains. In this study, we show that clinical M. tuberculosis isolates exhibit distinct variations in terms of the distribution of SNP, Indels, CRISPR-cas locus, as well as the nucleotide diversity and selection intensity, but there are no generalizable differences between drug-susceptible and drug-resistant isolates on the genomic scale.
Our study provides evidence strengthening the notion that the evolution of drug resistance among clinical M. tuberculosis isolates is clearly a complex and diversified process.
Statistical potential-based amino acid similarity matrices for aligning distantly related protein sequences.

PubMed

Tan, Yen Hock; Huang, He; Kihara, Daisuke

2006-08-15

Aligning distantly related protein sequences is a long-standing problem in bioinformatics, and a key for successful protein structure prediction. Its importance is increasing recently in the context of structural genomics projects because more and more experimentally solved structures are available as templates for protein structure modeling. Toward this end, recent structure prediction methods employ profile-profile alignments, and various ways of aligning two profiles have been developed. More fundamentally, a better amino acid similarity matrix can improve a profile itself; thereby resulting in more accurate profile-profile alignments. Here we have developed novel amino acid similarity matrices from knowledge-based amino acid contact potentials. Contact potentials are used because the contact propensity to the other amino acids would be one of the most conserved features of each position of a protein structure. The derived amino acid similarity matrices are tested on benchmark alignments at three different levels, namely, the family, the superfamily, and the fold level. Compared to BLOSUM45 and the other existing matrices, the contact potential-based matrices perform comparably in the family level alignments, but clearly outperform in the fold level alignments. The contact potential-based matrices perform even better when suboptimal alignments are considered. Comparing the matrices themselves with each other revealed that the contact potential-based matrices are very different from BLOSUM45 and the other matrices, indicating that they are located in a different basin in the amino acid similarity matrix space.
Chemical genomic guided engineering of gamma-valerolactone tolerant yeast.

PubMed

Bottoms, Scott; Dickinson, Quinn; McGee, Mick; Hinchman, Li; Higbee, Alan; Hebert, Alex; Serate, Jose; Xie, Dan; Zhang, Yaoping; Coon, Joshua J; Myers, Chad L; Landick, Robert; Piotrowski, Jeff S

2018-01-12

Gamma valerolactone (GVL) treatment of lignocellulosic biomass is a promising technology for degradation of biomass for biofuel production; however, GVL is toxic to fermentative microbes. Using a combination of chemical genomics with the yeast (Saccharomyces cerevisiae) deletion collection to identify sensitive and resistant mutants, and chemical proteomics to monitor protein abundance in the presence of GVL, we sought to understand the mechanism of toxicity and resistance to GVL with the goal of engineering a GVL-tolerant, xylose-fermenting yeast. Chemical genomic profiling of GVL predicted that this chemical affects membranes and membrane-bound processes. We show that GVL causes rapid, dose-dependent cell permeability, and is synergistic with ethanol. Chemical genomic profiling of GVL revealed that deletion of the functionally related enzymes Pad1p and Fdc1p, which act together to decarboxylate cinnamic acid and its derivatives to vinyl forms, increases yeast tolerance to GVL. Further, overexpression of Pad1p sensitizes cells to GVL toxicity. To improve GVL tolerance, we deleted PAD1 and FDC1 in a xylose-fermenting yeast strain. The modified strain exhibited increased anaerobic growth, sugar utilization, and ethanol production in synthetic hydrolysate with 1.5% GVL, and under other conditions. Chemical proteomic profiling of the engineered strain revealed that enzymes involved in ergosterol biosynthesis were more abundant in the presence of GVL compared to the background strain. The engineered GVL strain contained greater amounts of ergosterol than the background strain. We found that GVL exerts toxicity to yeast by compromising cellular membranes, and that this toxicity is synergistic with ethanol. Deletion of PAD1 and FDC1 conferred GVL resistance to a xylose-fermenting yeast strain by increasing ergosterol accumulation in aerobically grown cells. The GVL-tolerant strain fermented sugars in the presence of GVL levels that were inhibitory to the unmodified strain.
This strain represents a xylose-fermenting yeast specifically tailored to GVL-produced hydrolysates.
Comprehensive genomic profiling reveals inactivating SMARCA4 mutations and low tumor mutational burden in small cell carcinoma of the ovary, hypercalcemic-type.

PubMed

Lin, Douglas I; Chudnovsky, Yakov; Duggan, Bridget; Zajchowski, Deborah; Greenbowe, Joel; Ross, Jeffrey S; Gay, Laurie M; Ali, Siraj M; Elvin, Julia A

2017-12-01

Small cell carcinoma of the ovary, hypercalcemic-type (SCCOHT) is a rare, extremely aggressive neoplasm that usually occurs in young women and is characterized by deleterious germline or somatic SMARCA4 mutations. We performed comprehensive genomic profiling (CGP) to potentially identify additional clinically and pathophysiologically relevant genomic alterations in SCCOHT. CGP assessment of all classes of coding alterations in up to 406 genes commonly altered in cancer and intronic regions for up to 31 genes commonly rearranged in cancer was performed on 18 SCCOHT cases (16 exhibiting classic morphology and 2 cases exhibiting exclusively a large cell variant morphology). In addition, a retrospective database search for clinically advanced ovarian tumors with genomic profiles similar to SCCOHT yielded 3 additional cases originally diagnosed as non-SCCOHT. CGP revealed inactivating SMARCA4 alterations and low tumor mutational burden (TMB) (<6 mutations/Mb) in 94% (15/16) of SCCOHT with classic morphology. In contrast, both (2/2) cases exhibiting only large cell variant morphology were hypermutated (TMB scores of 90 and 360 mutations/Mb) and were wildtype for SMARCA4. In our retrospective search, an index ovarian cancer patient harboring inactivating SMARCA4 alterations, initially diagnosed as endometrioid carcinoma, was re-classified as SCCOHT and responded to an SCCOHT chemotherapy regimen. The vast majority of SCCOHT demonstrate genomic SMARCA4 loss with only rare co-occurring alterations. Our data support a role for CGP in the diagnosis and management of SCCOHT and of other lesions with overlapping histological and clinical features, since identifying the former by genomic profile suggests benefit from an appropriate regimen and treatment decisions, as illustrated by an index patient. Copyright © 2017 Elsevier Inc. All rights reserved.
Genomic profiling of pelvic genital type leiomyosarcoma in a woman with a germline CHEK2:c.1100delC mutation and a concomitant diagnosis of metastatic invasive ductal breast carcinoma

PubMed Central

Reisle, Caralyn; Martin, Lee Ann; Alwelaie, Yazeed; Mungall, Karen L.; Ch'ng, Carolyn; Thomas, Ruth; Ng, Tony; Yip, Stephen; J. Lim, Howard; Sun, Sophie; Young, Sean S.; Karsan, Aly; Zhao, Yongjun; Mungall, Andrew J.; Moore, Richard A.; J. Renouf, Daniel; Gelmon, Karen; Ma, Yussanne P.; Hayes, Malcolm; Laskin, Janessa; Marra, Marco A.; Schrader, Kasmintan A.; Jones, Steven J. M.

2017-01-01

We describe a woman with the known pathogenic germline variant CHEK2:c.1100delC and synchronous diagnoses of both pelvic genital type leiomyosarcoma (LMS) and metastatic invasive ductal breast carcinoma. CHEK2 (checkpoint kinase 2) is a tumor-suppressor gene encoding a serine/threonine-protein kinase (CHEK2) involved in double-strand DNA break repair and cell cycle arrest. The CHEK2:c.1100delC variant is a moderate penetrance allele resulting in an approximately twofold increase in breast cancer risk. Whole-genome and whole-transcriptome sequencing were performed on the leiomyosarcoma and matched blood-derived DNA. Despite the presence of several genomic hits within the double-strand DNA damage pathway (CHEK2 germline variant and multiple RAD51B somatic structural variants), tumor profiling did not show an obvious DNA repair deficiency signature. However, even though the LMS displayed clear malignant features, its genomic profiling revealed several characteristics classically associated with leiomyomas including a translocation, t(12;14), with one breakpoint disrupting RAD51B and the other breakpoint upstream of HMGA2 with very high expression of HMGA2 and PLAG1. This is the first report of LMS genomic profiling in a patient with the germline CHEK2:c.1100delC variant and an additional diagnosis of metastatic invasive ductal breast carcinoma. We also describe a possible mechanistic relationship between leiomyoma and LMS based on genomic and transcriptome data. Our findings suggest that RAD51B translocation and HMGA2 overexpression may play an important role in LMS oncogenesis. PMID:28514723
Genomic profiling of pelvic genital type leiomyosarcoma in a woman with a germline CHEK2:c.1100delC mutation and a concomitant diagnosis of metastatic invasive ductal breast carcinoma.

PubMed

Thibodeau, My Linh; Reisle, Caralyn; Zhao, Eric; Martin, Lee Ann; Alwelaie, Yazeed; Mungall, Karen L; Ch'ng, Carolyn; Thomas, Ruth; Ng, Tony; Yip, Stephen; J Lim, Howard; Sun, Sophie; Young, Sean S; Karsan, Aly; Zhao, Yongjun; Mungall, Andrew J; Moore, Richard A; J Renouf, Daniel; Gelmon, Karen; Ma, Yussanne P; Hayes, Malcolm; Laskin, Janessa; Marra, Marco A; Schrader, Kasmintan A; Jones, Steven J M

2017-09-01

We describe a woman with the known pathogenic germline variant CHEK2:c.1100delC and synchronous diagnoses of both pelvic genital type leiomyosarcoma (LMS) and metastatic invasive ductal breast carcinoma. CHEK2 (checkpoint kinase 2) is a tumor-suppressor gene encoding a serine/threonine-protein kinase (CHEK2) involved in double-strand DNA break repair and cell cycle arrest. The CHEK2:c.1100delC variant is a moderate penetrance allele resulting in an approximately twofold increase in breast cancer risk. Whole-genome and whole-transcriptome sequencing were performed on the leiomyosarcoma and matched blood-derived DNA. Despite the presence of several genomic hits within the double-strand DNA damage pathway (CHEK2 germline variant and multiple RAD51B somatic structural variants), tumor profiling did not show an obvious DNA repair deficiency signature. However, even though the LMS displayed clear malignant features, its genomic profiling revealed several characteristics classically associated with leiomyomas including a translocation, t(12;14), with one breakpoint disrupting RAD51B and the other breakpoint upstream of HMGA2 with very high expression of HMGA2 and PLAG1. This is the first report of LMS genomic profiling in a patient with the germline CHEK2:c.1100delC variant and an additional diagnosis of metastatic invasive ductal breast carcinoma. We also describe a possible mechanistic relationship between leiomyoma and LMS based on genomic and transcriptome data. Our findings suggest that RAD51B translocation and HMGA2 overexpression may play an important role in LMS oncogenesis. © 2017 Thibodeau et al.; Published by Cold Spring Harbor Laboratory Press.
Molecular Targeted Therapies of Childhood Choroid Plexus Carcinoma

DTIC Science & Technology

2013-10-01

Microarray intensities were analyzed in PGS, using the benign human choroid plexus papilloma (CPP) samples as an expression baseline reference. This...additional human and mouse CPC genomic profiles (timeframe: months 1-5). The goal of these studies is to expand our number of genomic profiles (DNA and...mRNA arrays) of both human and mouse CPCs to provide a comprehensive dataset with which to identify key candidate oncogenes, tumor suppressor genes
Molecular Targeted Therapies of Childhood Choroid Plexus Carcinoma

DTIC Science & Technology

2012-10-01

Microarray intensities were analyzed in PGS, using the benign human choroid plexus papilloma (CPP) samples as an expression baseline reference...identify candidate drug targets of CPC. Task 1: Generation of additional human and mouse CPC genomic profiles (timeframe: months 1-5). The goal...of these studies is to expand our number of genomic profiles (DNA and mRNA arrays) of both human and mouse CPCs to provide a comprehensive dataset
Molecular Targeted Therapies of Childhood Choroid Plexus Carcinoma

DTIC Science & Technology

2011-10-01

were analyzed in PGS, using the benign human choroid plexus papilloma (CPP) samples as an expression baseline reference. This analysis highlights...Task 1: Generation of additional human and mouse CPC genomic profiles (timeframe: months 1-5). The goal of these studies is to expand our...number of genomic profiles (DNA and mRNA arrays) of both human and mouse CPCs to provide a comprehensive dataset with which to identify key candidate
Genetic Profile of Adenoid Cystic Carcinomas (ACC) with High-Grade Transformation versus Solid Type

PubMed Central

Costa, Ana Flávia; Altemani, Albina; Vékony, Hedy; Bloemena, Elisabeth; Fresno, Florentino; Suárez, Carlos; Llorente, José Luis; Hermsen, Mario

2010-01-01

Background: ACC can occasionally undergo dedifferentiation also referred to as high-grade transformation (ACC-HGT). However, ACC-HGT can also undergo transformation to adenocarcinomas which are not poorly differentiated. ACC-HGT is generally considered to be an aggressive variant of ACC, even more than solid ACC. This study was aimed to describe the genetic changes of ACC-HGT in relation to clinico-pathological features and to compare results to solid ACC. Methods: Genome-wide DNA copy number changes were analyzed by microarray CGH in ACC-HGT, 4 with transformation into moderately differentiated adenocarcinoma (MDA) and two into poorly differentiated carcinoma (PDC), 5 solid ACC. In addition, Ki-67 index and p53 immunopositivity was assessed. Results: ACC-HGT carried fewer copy number changes compared to solid ACC. Two ACC-HGT cases harboured a breakpoint at 6q23, near the cMYB oncogene. The complexity of the genomic profile concurred with the clinical course of the patient. Among the ACC-HGT, p53 positivity significantly increased from the conventional to the transformed (both MDA and PDC) component. Conclusion: ACC-HGT may not necessarily reflect a more advanced stage of tumor progression, but rather a transformation to another histological form in which the poorly differentiated forms (PDC) presents a genetic complexity similar to the solid ACC. PMID:20978318
Genetic profile of adenoid cystic carcinomas (ACC) with high-grade transformation versus solid type.

PubMed

Costa, Ana Flávia; Altemani, Albina; Vékony, Hedy; Bloemena, Elisabeth; Fresno, Florentino; Suárez, Carlos; Llorente, José Luis; Hermsen, Mario

2010-01-01

ACC can occasionally undergo dedifferentiation also referred to as high-grade transformation (ACC-HGT). However, ACC-HGT can also undergo transformation to adenocarcinomas which are not poorly differentiated. ACC-HGT is generally considered to be an aggressive variant of ACC, even more than solid ACC. This study was aimed to describe the genetic changes of ACC-HGT in relation to clinico-pathological features and to compare results to solid ACC. Genome-wide DNA copy number changes were analyzed by microarray CGH in ACC-HGT, 4 with transformation into moderately differentiated adenocarcinoma (MDA) and two into poorly differentiated carcinoma (PDC), and 5 solid ACC. In addition, Ki-67 index and p53 immunopositivity were assessed. ACC-HGT carried fewer copy number changes compared to solid ACC. Two ACC-HGT cases harboured a breakpoint at 6q23, near the cMYB oncogene. The complexity of the genomic profile concurred with the clinical course of the patient. Among the ACC-HGT, p53 positivity significantly increased from the conventional to the transformed (both MDA and PDC) component. ACC-HGT may not necessarily reflect a more advanced stage of tumor progression, but rather a transformation to another histological form in which the poorly differentiated forms (PDC) presents a genetic complexity similar to the solid ACC.
Genetic profile of adenoid cystic carcinomas (ACC) with high-grade transformation versus solid type.

PubMed

Costa, Ana Flávia; Altemani, Albina; Vékony, Hedy; Bloemena, Elisabeth; Fresno, Florentino; Suárez, Carlos; Llorente, José Luis; Hermsen, Mario

2011-08-01

ACC can occasionally undergo dedifferentiation also referred to as high-grade transformation (ACC-HGT). However, ACC-HGT can also undergo transformation to adenocarcinomas which are not poorly differentiated. ACC-HGT is generally considered to be an aggressive variant of ACC, even more than solid ACC. This study was aimed to describe the genetic changes of ACC-HGT in relation to clinico-pathological features, and to compare results to solid ACC. Genome wide DNA copy number changes were analyzed by microarray CGH in ACC-HGT, four with transformation into moderately differentiated adenocarcinoma (MDA) and two into poorly differentiated carcinoma (PDC), and five solid ACC. In addition, Ki67 index and p53 immunopositivity was assessed. ACC-HGT carried fewer copy number changes compared to solid ACC. Two ACC-HGT cases harboured a breakpoint at 6q23, near the cMYB oncogene. The complexity of the genomic profile concurred with the clinical course of the patient. Among the ACC-HGT, p53 positivity significantly increased from the conventional to the transformed (both MDA and PDC) component. ACC-HGT may not necessarily reflect a more advanced stage of tumor progression, but rather a transformation to another histological form in which the poorly differentiated forms (PDC) presents a genetic complexity similar to the solid ACC.
Bifidobacterium animalis subsp. lactis ATCC 27673 Is a Genomically Unique Strain within Its Conserved Subspecies

PubMed Central

Loquasto, Joseph R.; Barrangou, Rodolphe; Dudley, Edward G.; Stahl, Buffy; Chen, Chun

2013-01-01

Many strains of Bifidobacterium animalis subsp. lactis are considered health-promoting probiotic microorganisms and are commonly formulated into fermented dairy foods. Analyses of previously sequenced genomes of B. animalis subsp. lactis have revealed little genetic diversity, suggesting that it is a monomorphic subspecies. However, during a multilocus sequence typing survey of Bifidobacterium, it was revealed that B. animalis subsp. lactis ATCC 27673 gave a profile distinct from that of the other strains of the subspecies. As part of an ongoing study designed to understand the genetic diversity of this subspecies, the genome of this strain was sequenced and compared to other sequenced genomes of B. animalis subsp. lactis and B. animalis subsp. animalis. The complete genome of ATCC 27673 was 1,963,012 bp, contained 1,616 genes and 4 rRNA operons, and had a G+C content of 61.55%. Comparative analyses revealed that the genome of ATCC 27673 contained six distinct genomic islands encoding 83 open reading frames not found in other strains of the same subspecies. In four islands, either phage or mobile genetic elements were identified. In island 6, a novel clustered regularly interspaced short palindromic repeat (CRISPR) locus which contained 81 unique spacers was identified. This type I-E CRISPR-cas system differs from the type I-C systems previously identified in this subspecies, representing the first identification of a different system in B. animalis subsp. lactis. This study revealed that ATCC 27673 is a strain of B. animalis subsp. lactis with novel genetic content and suggests that the lack of genetic variability observed is likely due to the repeated sequencing of a limited number of widely distributed commercial strains. PMID:23995933
Comprehensive molecular, genomic and phenotypic analysis of a major clone of Enterococcus faecalis MLST ST40.

PubMed

Zischka, Melanie; Künne, Carsten T; Blom, Jochen; Wobser, Dominique; Sakınç, Türkân; Schmidt-Hohagen, Kerstin; Dabrowski, P Wojtek; Nitsche, Andreas; Hübner, Johannes; Hain, Torsten; Chakraborty, Trinad; Linke, Burkhard; Goesmann, Alexander; Voget, Sonja; Daniel, Rolf; Schomburg, Dietmar; Hauck, Rüdiger; Hafez, Hafez M; Tielen, Petra; Jahn, Dieter; Solheim, Margrete; Sadowy, Ewa; Larsen, Jesper; Jensen, Lars B; Ruiz-Garbajosa, Patricia; Quiñones Pérez, Dianelys; Mikalsen, Theresa; Bender, Jennifer; Steglich, Matthias; Nübel, Ulrich; Witte, Wolfgang; Werner, Guido

2015-03-12

Enterococcus faecalis is a multifaceted microorganism known to act as a beneficial intestinal commensal bacterium. It is also a dreaded nosocomial pathogen causing life-threatening infections in hospitalised patients. Isolates of a distinct MLST type ST40 represent the most frequent strain type of this species, distributed worldwide and originating from various sources (animal, human, environmental) and different conditions (colonisation/infection). Since enterococci are known to be highly recombinogenic we determined to analyse the microevolution and niche adaptation of this highly distributed clonal type. We compared a set of 42 ST40 isolates by assessing key molecular determinants, performing whole genome sequencing (WGS) and a number of phenotypic assays including resistance profiling, formation of biofilm and utilisation of carbon sources. We generated the first circular closed reference genome of an E. faecalis isolate D32 of animal origin and compared it with the genomes of other reference strains. D32 was used as a template for detailed WGS comparisons of high-quality draft genomes of 14 ST40 isolates. Genomic and phylogenetic analyses suggest a high level of similarity regarding the core genome, also demonstrated by similar carbon utilisation patterns. Distribution of known and putative virulence-associated genes did not differentiate between ST40 strains from a commensal and clinical background or an animal or human source. Further analyses of mobile genetic elements (MGE) revealed genomic diversity owed to: (1) a modularly structured pathogenicity island; (2) a site-specifically integrated and previously unknown genomic island of 138 kb in two strains putatively involved in exopolysaccharide synthesis; and (3) isolate-specific plasmid and phage patterns. Moreover, we used different cell-biological and animal experiments to compare the isolate D32 with a closely related ST40 endocarditis isolate whose draft genome sequence was also generated. 
D32 generally showed a greater capacity of adherence to human cell lines and an increased pathogenic potential in various animal models in combination with an even faster growth in vivo (not in vitro). Molecular, genomic and phenotypic analysis of representative isolates of a major clone of E. faecalis MLST ST40 revealed new insights into the microbiology of a commensal bacterium which can turn into a conditional pathogen.
JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework.

PubMed

Khan, Aziz; Fornes, Oriol; Stigliani, Arnaud; Gheorghe, Marius; Castro-Mondragon, Jaime A; van der Lee, Robin; Bessy, Adrien; Chèneby, Jeanne; Kulkarni, Shubhada R; Tan, Ge; Baranasic, Damir; Arenillas, David J; Sandelin, Albin; Vandepoele, Klaas; Lenhard, Boris; Ballester, Benoît; Wasserman, Wyeth W; Parcy, François; Mathelier, Anthony

2018-01-04

JASPAR (http://jaspar.genereg.net) is an open-access database of curated, non-redundant transcription factor (TF)-binding profiles stored as position frequency matrices (PFMs) and TF flexible models (TFFMs) for TFs across multiple species in six taxonomic groups. In the 2018 release of JASPAR, the CORE collection has been expanded with 322 new PFMs (60 for vertebrates and 262 for plants) and 33 PFMs were updated (24 for vertebrates, 8 for plants and 1 for insects). These new profiles represent a 30% expansion compared to the 2016 release. In addition, we have introduced 316 TFFMs (95 for vertebrates, 218 for plants and 3 for insects). This release incorporates clusters of similar PFMs in each taxon and each TF class per taxon. The JASPAR 2018 CORE vertebrate collection of PFMs was used to predict TF-binding sites in the human genome. The predictions are made available to the scientific community through a UCSC Genome Browser track data hub. Finally, this update comes with a new web framework with an interactive and responsive user-interface, along with new features. All the underlying data can be retrieved programmatically using a RESTful API and through the JASPAR 2018 R/Bioconductor package. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
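A position frequency matrix of the kind the JASPAR record describes is typically turned into log-odds scores before it is used to scan sequence. The following is a minimal sketch of that conversion, assuming a uniform background of 0.25 per base and a pseudocount of 1 added to every cell; the function names and the toy counts are hypothetical, and nothing here reflects JASPAR's own API or tooling.

```python
import math

def pfm_to_pwm(pfm, background=0.25, pseudocount=1.0):
    """Convert per-base count lists into log2-odds scores, column by column."""
    bases = sorted(pfm)
    width = len(pfm[bases[0]])
    pwm = {base: [] for base in bases}
    for i in range(width):
        # Column total after adding the pseudocount to each base.
        total = sum(pfm[b][i] for b in bases) + pseudocount * len(bases)
        for b in bases:
            freq = (pfm[b][i] + pseudocount) / total
            pwm[b].append(math.log2(freq / background))
    return pwm

def score_site(pwm, site):
    """Sum the log-odds of the observed base at each position of a site."""
    return sum(pwm[base][i] for i, base in enumerate(site))

# Hypothetical 3-column matrix built from 8 aligned sites (illustration only).
toy_pfm = {"A": [8, 0, 0], "C": [0, 8, 0], "G": [0, 0, 8], "T": [0, 0, 0]}
toy_pwm = pfm_to_pwm(toy_pfm)
best = score_site(toy_pwm, "ACG")
worst = score_site(toy_pwm, "TTT")
```

The pseudocount keeps unobserved bases from producing a logarithm of zero; other smoothing schemes and non-uniform backgrounds are equally plausible choices.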
Embryonic stem cell-like features of testicular carcinoma in situ revealed by genome-wide gene expression profiling.

PubMed

Almstrup, Kristian; Hoei-Hansen, Christina E; Wirkner, Ute; Blake, Jonathon; Schwager, Christian; Ansorge, Wilhelm; Nielsen, John E; Skakkebaek, Niels E; Rajpert-De Meyts, Ewa; Leffers, Henrik

2004-07-15

Carcinoma in situ (CIS) is the common precursor of histologically heterogeneous testicular germ cell tumors (TGCTs), which in recent decades have markedly increased and now are the most common malignancy of young men. Using genome-wide gene expression profiling, we identified >200 genes highly expressed in testicular CIS, including many never reported in testicular neoplasms. Expression was further verified by semiquantitative reverse transcription-PCR and in situ hybridization. Among the highest expressed genes were NANOG and POU5F1, and reverse transcription-PCR revealed possible changes in their stoichiometry on progression into embryonic carcinoma. We compared the CIS expression profile with patterns reported in embryonic stem cells (ESCs), which revealed a substantial overlap that may be as high as 50%. We also demonstrated an over-representation of expressed genes in regions of 17q and 12, reported as unstable in cultured ESCs. The close similarity between CIS and ESCs explains the pluripotency of CIS. Moreover, the findings are consistent with an early prenatal origin of TGCTs and thus suggest that etiologic factors operating in utero are of primary importance for the incidence trends of TGCTs. Finally, some of the highly expressed genes identified in this study are promising candidates for new diagnostic markers for CIS and/or TGCTs.

Minimal evidence for consistent changes in maize DNA methylation patterns following environmental stress.

PubMed

Eichten, Steven R; Springer, Nathan M

2015-01-01

DNA methylation is a chromatin modification that is sometimes associated with epigenetic regulation of gene expression. As DNA methylation can be reversible at some loci, it is possible that methylation patterns may change within an organism that is subjected to environmental stress. In order to assess the effects of abiotic stress on DNA methylation patterns in maize (Zea mays), seedling plants were subjected to heat, cold, and UV stress treatments. Tissue was later collected from individual adult plants that had been subjected to stress or control treatments and used to perform DNA methylation profiling to determine whether there were consistent changes in DNA methylation triggered by specific stress treatments. DNA methylation profiling was performed by immunoprecipitation of methylated DNA followed by microarray hybridization to allow for quantitative estimates of DNA methylation abundance throughout the low-copy portion of the maize genome. By comparing the DNA methylation profiles of each individual plant to the average of the control plants it was possible to identify regions of the genome with variable DNA methylation. However, we did not find evidence of consistent DNA methylation changes resulting from the stress treatments used in this study. Instead, the data suggest that there is a low rate of stochastic variation that is present in both control and stressed plants.
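The comparison described above, each plant's profile against the average of the control plants, can be sketched as follows. This is only an illustration in Python: the per-region values, the sample names, and the cutoff of 1.0 are hypothetical and do not come from the study, which may have used a different statistic.

```python
from statistics import mean

# Hypothetical per-region methylation estimates (e.g. log2 IP/input); not study data.
controls = {
    "ctrl_1": [0.1, 2.0, -1.0, 0.5],
    "ctrl_2": [0.0, 2.2, -1.1, 0.4],
    "ctrl_3": [0.2, 1.8, -0.9, 0.6],
}
plants = {
    "heat_1": [0.1, 2.1, 1.5, 0.5],
    "cold_1": [0.0, -0.5, -1.0, 0.5],
}
THRESHOLD = 1.0  # assumed cutoff for calling a region variable


def control_average(controls):
    """Average the control plants region by region."""
    return [mean(values) for values in zip(*controls.values())]


def variable_regions(profile, baseline, threshold=THRESHOLD):
    """Return indices of regions that differ from the control average."""
    return [
        i
        for i, (value, base) in enumerate(zip(profile, baseline))
        if abs(value - base) >= threshold
    ]


baseline = control_average(controls)
calls = {name: variable_regions(p, baseline) for name, p in plants.items()}
print(calls)
```

In this toy input each plant differs from the controls at a different region, so no region would count as a consistent treatment effect; with real data one would look for regions flagged across all replicates of a treatment.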
Adherent Human Alveolar Macrophages Exhibit a Transient Pro-Inflammatory Profile That Confounds Responses to Innate Immune Stimulation

PubMed Central

Tomlinson, Gillian S.; Booth, Helen; Petit, Sarah J.; Potton, Elspeth; Towers, Greg J.; Miller, Robert F.; Chain, Benjamin M.; Noursadeghi, Mahdad

2012-01-01

Alveolar macrophages (AM) are thought to have a key role in the immunopathogenesis of respiratory diseases. We sought to test the hypothesis that human AM exhibit an anti-inflammatory bias by making genome-wide comparisons with monocyte derived macrophages (MDM). Adherent AM obtained by bronchoalveolar lavage of patients under investigation for haemoptysis, but found to have no respiratory pathology, were compared to MDM from healthy volunteers by whole genome transcriptional profiling before and after innate immune stimulation. We found that freshly isolated AM exhibited a marked pro-inflammatory transcriptional signature. High levels of basal pro-inflammatory gene expression gave the impression of attenuated responses to lipopolysaccharide (LPS) and the RNA analogue, poly IC, but in rested cells pro-inflammatory gene expression declined and transcriptional responsiveness to these stimuli was restored. In comparison to MDM, both freshly isolated and rested AM showed upregulation of MHC class II molecules. In most experimental paradigms ex vivo adherent AM are used immediately after isolation. Therefore, the confounding effects of their pro-inflammatory profile at baseline need careful consideration. Moreover, despite the prevailing view that AM have an anti-inflammatory bias, our data clearly show that they can adopt a striking pro-inflammatory phenotype, and may have greater capacity for presentation of exogenous antigens than MDM. PMID:22768282
JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework

PubMed Central

Fornes, Oriol; Stigliani, Arnaud; Gheorghe, Marius; Castro-Mondragon, Jaime A; Bessy, Adrien; Chèneby, Jeanne; Kulkarni, Shubhada R; Tan, Ge; Baranasic, Damir; Arenillas, David J; Vandepoele, Klaas; Parcy, François

2018-01-01

JASPAR (http://jaspar.genereg.net) is an open-access database of curated, non-redundant transcription factor (TF)-binding profiles stored as position frequency matrices (PFMs) and TF flexible models (TFFMs) for TFs across multiple species in six taxonomic groups. In the 2018 release of JASPAR, the CORE collection has been expanded with 322 new PFMs (60 for vertebrates and 262 for plants) and 33 PFMs were updated (24 for vertebrates, 8 for plants and 1 for insects). These new profiles represent a 30% expansion compared to the 2016 release. In addition, we have introduced 316 TFFMs (95 for vertebrates, 218 for plants and 3 for insects). This release incorporates clusters of similar PFMs in each taxon and each TF class per taxon. The JASPAR 2018 CORE vertebrate collection of PFMs was used to predict TF-binding sites in the human genome. The predictions are made available to the scientific community through a UCSC Genome Browser track data hub. Finally, this update comes with a new web framework with an interactive and responsive user-interface, along with new features. All the underlying data can be retrieved programmatically using a RESTful API and through the JASPAR 2018 R/Bioconductor package. PMID:29140473
Systems genetics: a paradigm to improve discovery of candidate genes and mechanisms underlying complex traits.

PubMed

Feltus, F Alex

2014-06-01

Understanding the control of any trait optimally requires the detection of causal genes, gene interaction, and mechanism of action to discover and model the biochemical pathways underlying the expressed phenotype. Functional genomics techniques, including RNA expression profiling via microarray and high-throughput DNA sequencing, allow for the precise genome localization of biological information. Powerful genetic approaches, including quantitative trait locus (QTL) and genome-wide association study mapping, link phenotype with genome positions, yet genetics is less precise in localizing the relevant mechanistic information encoded in DNA. The coupling of salient functional genomic signals with genetically mapped positions is an appealing approach to discover meaningful gene-phenotype relationships. Techniques used to define this genetic-genomic convergence comprise the field of systems genetics. This short review will address an application of systems genetics where RNA profiles are associated with genetically mapped genome positions of individual genes (eQTL mapping) or as gene sets (co-expression network modules). Both approaches can be applied for knowledge independent selection of candidate genes (and possible control mechanisms) underlying complex traits where multiple, likely unlinked, genomic regions might control specific complex traits. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
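The eQTL idea mentioned above, associating an RNA profile with genetically mapped positions, can be illustrated with a very small Python sketch. It assumes genotypes coded as allele dosages (0, 1, 2) and uses a plain Pearson correlation; the marker names and values are hypothetical, and real eQTL analyses typically use linear models with covariates and multiple-testing correction.

```python
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical allele dosages for six individuals at two mapped markers.
genotypes = {
    "marker_1": [0, 0, 1, 1, 2, 2],
    "marker_2": [0, 2, 1, 0, 2, 1],
}
# Hypothetical expression level of one gene in the same six individuals.
expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.1]

assoc = {m: pearson_r(g, expression) for m, g in genotypes.items()}
# The marker with the strongest association is the candidate eQTL position.
best_marker = max(assoc, key=lambda m: abs(assoc[m]))
print(best_marker, round(assoc[best_marker], 3))
```

The same correlation function applied between pairs of genes, rather than between a marker and a gene, is one simple starting point for the co-expression modules the review refers to.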
Deppdb--DNA electrostatic potential properties database: electrostatic properties of genome DNA.

PubMed

Osypov, Alexander A; Krutinin, Gleb G; Kamzolova, Svetlana G

2010-06-01

The electrostatic properties of genome DNA influence its interactions with different proteins, in particular, the regulation of transcription by RNA-polymerases. DEPPDB--DNA Electrostatic Potential Properties Database--was developed to hold and provide all available information on the electrostatic properties of genome DNA combined with its sequence and annotation of biological and structural properties of genome elements and whole genomes. Genomes in DEPPDB are organized on a taxonomical basis. Currently, the database contains all the completely sequenced bacterial and viral genomes according to NCBI RefSeq. General properties of the genome DNA electrostatic potential profile and principles of its formation are revealed. This potential correlates with the GC content but does not correspond to it exactly and strongly depends on both the sequence arrangement and its context (flanking regions). Analysis of the promoter regions for bacterial and viral RNA polymerases revealed a correspondence between the scale of these proteins' physical properties and electrostatic profile patterns. We also discovered a direct correlation between the potential value and the binding frequency of RNA polymerase to DNA, supporting the idea of the role of electrostatics in these interactions. This matches a pronounced tendency of the promoter regions to possess higher values of the electrostatic potential.
An imbalanced parental genome ratio affects the development of rice zygotes.

PubMed

Toda, Erika; Ohnishi, Yukinosuke; Okamoto, Takashi

2018-04-27

Upon double fertilization, one sperm cell fuses with the egg cell to form a zygote with a 1:1 maternal-to-paternal genome ratio (1m:1p), and another sperm cell fuses with the central cell to form a triploid primary endosperm cell with a 2m:1p ratio, resulting in formation of the embryo and the endosperm, respectively. The endosperm is known to be considerably sensitive to the ratio of the parental genomes. However, the effect of an imbalance of the parental genomes on zygotic development and embryogenesis has not been well studied, because it is difficult to reproduce the parental genome-imbalanced situation in zygotes and to monitor the developmental profile of zygotes without external effects from the endosperm. In this study, we produced polyploid zygotes with an imbalanced parental genome ratio by electro-fusion of isolated rice gametes and observed their developmental profiles. Polyploid zygotes with an excess maternal gamete/genome developed normally, whereas approximately half to three-quarters of polyploid zygotes with a paternal excess showed developmental arrests. These results indicate that paternal and maternal genomes synergistically serve zygote development with distinct functions, and that genes with monoallelic expression play important roles during zygotic development and embryogenesis.
The Basic/Helix-Loop-Helix Protein Family in Gossypium: Reference Genes and Their Evolution during Tetraploidization

PubMed Central

Yan, Qian; Liu, Hou-Sheng; Yao, Dan; Li, Xin; Chen, Han; Dou, Yang; Wang, Yi; Pei, Yan; Xiao, Yue-Hua

2015-01-01

Basic/helix-loop-helix (bHLH) proteins comprise one of the largest transcription factor families and play important roles in diverse cellular and molecular processes. Comprehensive analyses of the composition and evolution of the bHLH family in cotton are essential to elucidate their functions and the molecular basis of cotton development. By searching bHLH homologous genes in sequenced diploid cotton genomes (Gossypium raimondii and G. arboreum), a set of cotton bHLH reference genes containing 289 paralogs were identified and named as GobHLH001-289. Based on their phylogenetic relationships, these cotton bHLH proteins were clustered into 27 subfamilies. Compared to those in Arabidopsis and cacao, cotton bHLH proteins generally increased in number, but unevenly in different subfamilies. To further uncover evolutionary changes of bHLH genes during tetraploidization of cotton, all genes of S5a and S5b subfamilies in upland cotton and its diploid progenitors were cloned and compared, and their transcript profiles were determined in upland cotton. A total of 10 genes of S5a and S5b subfamilies (doubled from A- and D-genome progenitors) were maintained in tetraploid cottons. The major sequence changes in upland cotton included a 15-bp in-frame deletion in GhbHLH130D and a long terminal repeat retrotransposon inserted in GhbHLH062A, which eliminated GhbHLH062A expression in various tissues. The S5a and S5b bHLH genes of A and D genomes (except GobHLH062) showed similar transcription patterns in various tissues including roots, stems, leaves, petals, ovules, and fibers, while the A- and D-genome genes of GobHLH110 and GobHLH130 displayed clearly different transcript profiles during fiber development. In total, this study represented a genome-wide analysis of cotton bHLH family, and revealed significant changes in sequence and expression of these genes in tetraploid cottons, which paved the way for further functional analyses of bHLH genes in the cotton genus. PMID:25992947
The Basic/Helix-Loop-Helix Protein Family in Gossypium: Reference Genes and Their Evolution during Tetraploidization.

PubMed

Yan, Qian; Liu, Hou-Sheng; Yao, Dan; Li, Xin; Chen, Han; Dou, Yang; Wang, Yi; Pei, Yan; Xiao, Yue-Hua

2015-01-01

Basic/helix-loop-helix (bHLH) proteins comprise one of the largest transcription factor families and play important roles in diverse cellular and molecular processes. Comprehensive analyses of the composition and evolution of the bHLH family in cotton are essential to elucidate their functions and the molecular basis of cotton development. By searching bHLH homologous genes in sequenced diploid cotton genomes (Gossypium raimondii and G. arboreum), a set of cotton bHLH reference genes containing 289 paralogs were identified and named as GobHLH001-289. Based on their phylogenetic relationships, these cotton bHLH proteins were clustered into 27 subfamilies. Compared to those in Arabidopsis and cacao, cotton bHLH proteins generally increased in number, but unevenly in different subfamilies. To further uncover evolutionary changes of bHLH genes during tetraploidization of cotton, all genes of S5a and S5b subfamilies in upland cotton and its diploid progenitors were cloned and compared, and their transcript profiles were determined in upland cotton. A total of 10 genes of S5a and S5b subfamilies (doubled from A- and D-genome progenitors) were maintained in tetraploid cottons. The major sequence changes in upland cotton included a 15-bp in-frame deletion in GhbHLH130D and a long terminal repeat retrotransposon inserted in GhbHLH062A, which eliminated GhbHLH062A expression in various tissues. The S5a and S5b bHLH genes of A and D genomes (except GobHLH062) showed similar transcription patterns in various tissues including roots, stems, leaves, petals, ovules, and fibers, while the A- and D-genome genes of GobHLH110 and GobHLH130 displayed clearly different transcript profiles during fiber development. In total, this study represented a genome-wide analysis of cotton bHLH family, and revealed significant changes in sequence and expression of these genes in tetraploid cottons, which paved the way for further functional analyses of bHLH genes in the cotton genus.
Cytoplasmic genome substitution in wheat affects the nuclear-cytoplasmic cross-talk leading to transcript and metabolite alterations

PubMed Central

2013-01-01

Background Alloplasmic lines provide a unique tool to study nuclear-cytoplasmic interactions. Three alloplasmic lines, with nuclear genomes from Triticum aestivum and harboring cytoplasm from Aegilops uniaristata, Aegilops tauschii and Hordeum chilense, were investigated by transcript and metabolite profiling to identify the effects of cytoplasmic substitution on nuclear-cytoplasmic signaling mechanisms. Results In combining the wheat nuclear genome with a cytoplasm of H. chilense, 540 genes were significantly altered, whereas 11 and 28 genes were significantly changed in the alloplasmic lines carrying the cytoplasm of Ae. uniaristata or Ae. tauschii, respectively. We identified the RNA maturation-related process as one of the most sensitive to a perturbation of the nuclear-cytoplasmic interaction. Several key components of the ROS chloroplast retrograde signaling, together with the up-regulation of the ROS scavenging system, showed that changes in the chloroplast genome have a direct impact on nuclear-cytoplasmic cross-talk. Remarkably, the H. chilense alloplasmic line down-regulated some genes involved in the determination of cytoplasmic male sterility without expressing the male sterility phenotype. Metabolic profiling showed a comparable response of the central metabolism of the alloplasmic and euplasmic lines to light, while exposing larger metabolite alterations in the H. chilense alloplasmic line as compared with the Aegilops lines, in agreement with the transcriptomic data. Several stress-related metabolites, remarkably raffinose, were altered in content in the H. chilense alloplasmic line when exposed to high light, while amino acids, as well as organic acids were significantly decreased. Alterations in the levels of transcript, related to raffinose, and the photorespiration-related metabolisms were associated with changes in the level of related metabolites. 
Conclusion The replacement of a wheat cytoplasm with the cytoplasm of a related species affects the nuclear-cytoplasmic cross-talk leading to transcript and metabolite alterations. The extent of these modifications was limited in the alloplasmic lines with Aegilops cytoplasm, and more evident in the alloplasmic line with H. chilense cytoplasm. We consider that this finding might be linked to the phylogenetic distance of the genomes. PMID:24320731
A Gene Gravity Model for the Evolution of Cancer Genomes: A Study of 3,000 Cancer Genomes across 9 Cancer Types.

PubMed

Cheng, Feixiong; Liu, Chuang; Lin, Chen-Ching; Zhao, Junfei; Jia, Peilin; Li, Wen-Hsiung; Zhao, Zhongming

2015-09-01

Cancer development and progression result from somatic evolution by an accumulation of genomic alterations. The effects of those alterations on the fitness of somatic cells lead to evolutionary adaptations such as increased cell proliferation, angiogenesis, and altered anticancer drug responses. However, there are few general mathematical models to quantitatively examine how perturbations of a single gene shape subsequent evolution of the cancer genome. In this study, we proposed the gene gravity model to study the evolution of cancer genomes by incorporating the genome-wide transcription and somatic mutation profiles of ~3,000 tumors across 9 cancer types from The Cancer Genome Atlas into a broad gene network. We found that somatic mutations of a cancer driver gene may drive cancer genome evolution by inducing mutations in other genes. This functional consequence is often generated by the combined effect of genetic and epigenetic (e.g., chromatin regulation) alterations. By quantifying cancer genome evolution using the gene gravity model, we identified six putative cancer genes (AHNAK, COL11A1, DDX3X, FAT4, STAG2, and SYNE1). The tumor genomes harboring the nonsynonymous somatic mutations in these genes had a higher mutation density at the genome level compared to the wild-type groups. Furthermore, we provided statistical evidence that hypermutation of cancer driver genes on inactive X chromosomes is a general feature in female cancer genomes. In summary, this study sheds light on the functional consequences and evolutionary characteristics of somatic mutations during tumorigenesis by propelling adaptive cancer genome evolution, which would provide new perspectives for cancer research and therapeutics.
A Gene Gravity Model for the Evolution of Cancer Genomes: A Study of 3,000 Cancer Genomes across 9 Cancer Types

PubMed Central

Lin, Chen-Ching; Zhao, Junfei; Jia, Peilin; Li, Wen-Hsiung; Zhao, Zhongming

2015-01-01

Cancer development and progression result from somatic evolution by an accumulation of genomic alterations. The effects of those alterations on the fitness of somatic cells lead to evolutionary adaptations such as increased cell proliferation, angiogenesis, and altered anticancer drug responses. However, there are few general mathematical models to quantitatively examine how perturbations of a single gene shape subsequent evolution of the cancer genome. In this study, we proposed the gene gravity model to study the evolution of cancer genomes by incorporating the genome-wide transcription and somatic mutation profiles of ~3,000 tumors across 9 cancer types from The Cancer Genome Atlas into a broad gene network. We found that somatic mutations of a cancer driver gene may drive cancer genome evolution by inducing mutations in other genes. This functional consequence is often generated by the combined effect of genetic and epigenetic (e.g., chromatin regulation) alterations. By quantifying cancer genome evolution using the gene gravity model, we identified six putative cancer genes (AHNAK, COL11A1, DDX3X, FAT4, STAG2, and SYNE1). The tumor genomes harboring the nonsynonymous somatic mutations in these genes had a higher mutation density at the genome level compared to the wild-type groups. Furthermore, we provided statistical evidence that hypermutation of cancer driver genes on inactive X chromosomes is a general feature in female cancer genomes. In summary, this study sheds light on the functional consequences and evolutionary characteristics of somatic mutations during tumorigenesis by propelling adaptive cancer genome evolution, which would provide new perspectives for cancer research and therapeutics. PMID:26352260
G-cimp status prediction of glioblastoma samples using mRNA expression data.

PubMed

Baysan, Mehmet; Bozdag, Serdar; Cam, Margaret C; Kotliarova, Svetlana; Ahn, Susie; Walling, Jennifer; Killian, Jonathan K; Stevenson, Holly; Meltzer, Paul; Fine, Howard A

2012-01-01

Glioblastoma Multiforme (GBM) is a tumor with high mortality and no known cure. The dramatic molecular and clinical heterogeneity seen in this tumor has led to attempts to define genetically similar subgroups of GBM with the hope of developing tumor specific therapies targeted to the unique biology within each of these subgroups. Recently, a subset of relatively favorable prognosis GBMs has been identified. These glioma CpG island methylator phenotype, or G-CIMP tumors, have distinct genomic copy number aberrations, DNA methylation patterns, and (mRNA) expression profiles compared to other GBMs. While the standard method for identifying G-CIMP tumors is based on genome-wide DNA methylation data, such data is often not available compared to the more widely available gene expression data. In this study, we have developed and evaluated a method to predict the G-CIMP status of GBM samples based solely on gene expression data.
G-Cimp Status Prediction Of Glioblastoma Samples Using mRNA Expression Data

PubMed Central

Baysan, Mehmet; Bozdag, Serdar; Cam, Margaret C.; Kotliarova, Svetlana; Ahn, Susie; Walling, Jennifer; Killian, Jonathan K.; Stevenson, Holly; Meltzer, Paul; Fine, Howard A.

2012-01-01

Glioblastoma Multiforme (GBM) is a tumor with high mortality and no known cure. The dramatic molecular and clinical heterogeneity seen in this tumor has led to attempts to define genetically similar subgroups of GBM with the hope of developing tumor specific therapies targeted to the unique biology within each of these subgroups. Recently, a subset of relatively favorable prognosis GBMs has been identified. These glioma CpG island methylator phenotype, or G-CIMP tumors, have distinct genomic copy number aberrations, DNA methylation patterns, and (mRNA) expression profiles compared to other GBMs. While the standard method for identifying G-CIMP tumors is based on genome-wide DNA methylation data, such data is often not available compared to the more widely available gene expression data. In this study, we have developed and evaluated a method to predict the G-CIMP status of GBM samples based solely on gene expression data. PMID:23139755
Oncologist use and perception of large panel next-generation tumor sequencing.

PubMed

Schram, A M; Reales, D; Galle, J; Cambria, R; Durany, R; Feldman, D; Sherman, E; Rosenberg, J; D'Andrea, G; Baxi, S; Janjigian, Y; Tap, W; Dickler, M; Baselga, J; Taylor, B S; Chakravarty, D; Gao, J; Schultz, N; Solit, D B; Berger, M F; Hyman, D M

2017-09-01

Genomic profiling is increasingly incorporated into oncology research and the clinical care of cancer patients. We sought to determine physician perception and use of enterprise-scale clinical sequencing at our center, including whether testing changed management and the reasoning behind this decision-making. All physicians who consented patients to MSK-IMPACT, a next-generation hybridization capture assay, in tumor types where molecular profiling is not routinely performed were asked to complete a questionnaire for each patient. Physician determination of genomic 'actionability' was compared to an expertly curated knowledgebase of somatic variants. Reported management decisions were compared to chart review. Responses were received from 146 physicians pertaining to 1932 patients diagnosed with 1 of 49 cancer types. Physicians indicated that sequencing altered management in 21% (331/1593) of patients in need of a treatment change. Among those in whom treatment was not altered, physicians indicated the presence of an actionable alteration in 55% (805/1474); however, only 45% (362/805) of these cases had a genomic variant annotated as actionable by expert curators. Further evaluation of the remaining patients revealed that 66% (291/443) had a variant in a gene associated with biologic but not clinical evidence of actionability or a variant of unknown significance in a gene with at least one known actionable alteration. Of the cases annotated as actionable by experts, physicians identified an actionable alteration in 81% (362/445). In total, 13% (245/1932) of patients were enrolled to a genomically matched trial. Although physician and expert assessment differed, clinicians demonstrate substantial awareness of the genes associated with potential actionability and report using this knowledge to inform management in one in five patients. NCT01775072. © The Author 2017. Published by Oxford University Press on behalf of the European Society for Medical Oncology.
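The percentages reported in the abstract above can be rechecked from the counts it gives. The short Python sketch below does only that arithmetic; the labels are paraphrases written for this example, and rounding to the nearest whole percent is an assumption about how the figures were reported.

```python
# (label, numerator, denominator, percentage reported in the abstract)
reported = [
    ("management altered, patients needing a treatment change", 331, 1593, 21),
    ("physician-identified actionable alteration, treatment not altered", 805, 1474, 55),
    ("of those, annotated actionable by expert curators", 362, 805, 45),
    ("of the rest, biologic-only evidence or VUS in an actionable gene", 291, 443, 66),
    ("expert-actionable cases also identified by physicians", 362, 445, 81),
    ("patients enrolled in a genomically matched trial", 245, 1932, 13),
]


def pct(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


checks = [(label, pct(n, d), r) for label, n, d, r in reported]
for label, computed, stated in checks:
    print(f"{label}: computed {computed}%, reported {stated}%")

# The 443 cases in the fourth figure are the 805 physician-flagged cases
# minus the 362 that expert curators also annotated as actionable.
remaining = 805 - 362
print(remaining)
```

Each computed value agrees with the reported one, and the subtraction shows where the denominator of 443 comes from.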
Initiative for Molecular Profiling and Advanced Cancer Therapy (IMPACT): An MD Anderson Precision Medicine Study

PubMed Central

Tsimberidou, Apostolia-Maria; Hong, David S.; Ye, Yang; Cartwright, Carrie; Wheler, Jennifer J.; Falchook, Gerald S.; Naing, Aung; Fu, Siqing; Piha-Paul, Sarina; Janku, Filip; Meric-Bernstam, Funda; Hwu, Patrick; Kee, Bryan; Kies, Merrill S.; Broaddus, Russell; Mendelsohn, John; Hess, Kenneth R.; Kurzrock, Razelle

2017-01-01

Purpose Genomic profiling is increasingly used in the management of cancer. We have previously reported preliminary results of our precision medicine program. Here, we present response and survival outcomes for 637 additional patients who were referred for phase I trials and were treated with matched targeted therapy (MTT) when available. Patients and Methods Patients with advanced cancer who underwent tumor genomic analyses were treated with MTT when available. Results Overall, 1,179 (82.1%) of 1,436 patients had one or more alterations (median age, 59.7 years; men, 41.2%); 637 had one or more actionable aberrations and were treated with MTT (n = 390) or non-MTT (n = 247). Patients who were treated with MTT had higher rates of complete and partial response (11% v 5%; P = .0099), longer failure-free survival (FFS; 3.4 v 2.9 months; P = .0015), and longer overall survival (OS; 8.4 v 7.3 months; P = .041) than did unmatched patients. Two-month landmark analyses showed that, for MTT patients, FFS for responders versus nonresponders was 7.6 versus 4.3 months (P < .001) and OS was 23.4 versus 8.5 months (P < .001), whereas for non-MTT patients (responders v nonresponders), FFS was 6.6 versus 4.1 months (P = .001) and OS was 15.2 versus 7.5 months (P = .43). Patients with phosphatidylinositol 3-kinase (PI3K) and mitogen-activated protein kinase pathway alterations matched to PI3K/Akt/mammalian target of rapamycin axis inhibitors alone demonstrated outcomes comparable to unmatched patients. Conclusion Our results support the use of genomic matching. Subset analyses indicate that matching patients who harbor a PI3K and mitogen-activated protein kinase pathway alteration to only a PI3K pathway inhibitor does not improve outcome. We have initiated IMPACT2, a randomized trial to compare treatment with and without genomic selection. PMID:29082359
Comparative Genomics Unravels the Functional Roles of Co-occurring Acidophilic Bacteria in Bioleaching Heaps

PubMed Central

Zhang, Xian; Liu, Xueduan; Liang, Yili; Xiao, Yunhua; Ma, Liyuan; Guo, Xue; Miao, Bo; Liu, Hongwei; Peng, Deliang; Huang, Wenkun; Yin, Huaqun

2017-01-01

The spatial-temporal distribution of populations in various econiches is thought to be potentially related to individual differences in the utilization of nutrients or other resources, but their functional roles in the microbial communities remain elusive. We compared differentiation in gene repertoire and metabolic profiles, with a focus on the potential functional traits of three commonly recognized members (Acidithiobacillus caldus, Leptospirillum ferriphilum, and Sulfobacillus thermosulfidooxidans) in bioleaching heaps. Comparative genomics revealed that intra-species divergence might be driven by horizontal gene transfer. These co-occurring bacteria shared a few homologous genes, which significantly suggested the genomic differences between these organisms. Notably, relatively more genes assigned to the Clusters of Orthologous Groups category [G] (carbohydrate transport and metabolism) were identified in Sulfobacillus thermosulfidooxidans compared to the two other species, which probably indicated their mixotrophic capabilities that assimilate both organic and inorganic forms of carbon. Further inspection revealed distinctive metabolic capabilities involving carbon assimilation, nitrogen uptake, and iron-sulfur cycling, providing robust evidence for functional differences with respect to nutrient utilization. Therefore, we proposed that the mutual compensation of functionalities among these co-occurring organisms might provide a selective advantage for efficiently utilizing the limited resources in their habitats. Furthermore, it might be favorable to chemoautotrophs' lifestyles to form mutualistic interactions with these heterotrophic and/or mixotrophic acidophiles, whereby the latter could degrade organic compounds to effectively detoxify the environments. 
Collectively, the findings shed light on the genetic traits and potential metabolic activities of these organisms, and enable us to make some inferences about genomic and functional differences that might allow them to co-exist. PMID:28529505
Magnesium supplementation, metabolic and inflammatory markers, and global genomic and proteomic profiling: a randomized, double-blind, controlled, crossover trial in overweight individuals.

PubMed

Chacko, Sara A; Sul, James; Song, Yiqing; Li, Xinmin; LeBlanc, James; You, Yuko; Butch, Anthony; Liu, Simin

2011-02-01

Dietary magnesium intake has been favorably associated with reduced risk of metabolic outcomes in observational studies; however, few randomized trials have introduced a systems-biology approach to explore molecular mechanisms of pleiotropic metabolic actions of magnesium supplementation. We examined the effects of oral magnesium supplementation on metabolic biomarkers and global genomic and proteomic profiling in overweight individuals. We undertook this randomized, crossover, pilot trial in 14 healthy, overweight volunteers [body mass index (in kg/m²) ≥25] who were randomly assigned to receive magnesium citrate (500 mg elemental Mg/d) or a placebo for 4 wk with a 1-mo washout period. Fasting blood and urine specimens were collected according to standardized protocols. Biochemical assays were conducted on blood specimens. RNA was extracted and subsequently hybridized with the Human Gene ST 1.0 array (Affymetrix, Santa Clara, CA). Urine proteomic profiling was analyzed with the CM10 ProteinChip array (Bio-Rad Laboratories, Hercules, CA). We observed that magnesium treatment significantly decreased fasting C-peptide concentrations (change: -0.4 ng/mL after magnesium treatment compared with +0.05 ng/mL after placebo treatment; P = 0.004) and appeared to decrease fasting insulin concentrations (change: -2.2 μU/mL after magnesium treatment compared with 0.0 μU/mL after placebo treatment; P = 0.25). No consistent patterns were observed across inflammatory biomarkers. Gene expression profiling revealed up-regulation of 24 genes and down-regulation of 36 genes including genes related to metabolic and inflammatory pathways such as C1q and tumor necrosis factor-related protein 9 (C1QTNF9) and pro-platelet basic protein (PPBP). Urine proteomic profiling showed significant differences in the expression amounts of several peptides and proteins after treatment. Magnesium supplementation for 4 wk in overweight individuals led to distinct changes in gene expression and proteomic profiling consistent with favorable effects on several metabolic pathways. This trial was registered at clinicaltrials.gov as NCT00737815.
Do online prognostication tools represent a valid alternative to genomic profiling in the context of adjuvant treatment of early breast cancer? A systematic review of the literature.

PubMed

El Hage Chehade, Hiba; Wazir, Umar; Mokbel, Kinan; Kasem, Abdul; Mokbel, Kefah

2018-01-01

Decision-making regarding adjuvant chemotherapy has been based on clinical and pathological features. However, such decisions are seldom consistent. Web-based predictive models have been developed using data from cancer registries to help determine the need for adjuvant therapy. More recently, with the recognition of the heterogeneous nature of breast cancer, genomic assays have been developed to aid in therapeutic decision-making. We have carried out a comprehensive literature review regarding online prognostication tools and genomic assays to assess whether online tools could be used as valid alternatives to genomic profiling in decision-making regarding adjuvant therapy in early breast cancer. Breast cancer has been recently recognized as a heterogeneous disease based on variations in molecular characteristics. Online tools are valuable in guiding adjuvant treatment, especially in resource-constrained countries. However, in the era of personalized therapy, molecular profiling appears to be superior in predicting clinical outcome and guiding therapy. Copyright © 2017 Elsevier Inc. All rights reserved.
The SUPERFAMILY database in 2004: additions and improvements.

PubMed

Madera, Martin; Vogel, Christine; Kummerfeld, Sarah K; Chothia, Cyrus; Gough, Julian

2004-01-01

The SUPERFAMILY database provides structural assignments to protein sequences and a framework for analysis of the results. At the core of the database is a library of profile Hidden Markov Models that represent all proteins of known structure. The library is based on the SCOP classification of proteins: each model corresponds to a SCOP domain and aims to represent an entire superfamily. We have applied the library to predicted proteins from all completely sequenced genomes (currently 154), the Swiss-Prot and TrEMBL databases and other sequence collections. Close to 60% of all proteins have at least one match, and one half of all residues are covered by assignments. All models and full results are available for download and online browsing at http://supfam.org. Users can study the distribution of their superfamily of interest across all completely sequenced genomes, investigate with which other superfamilies it combines and retrieve proteins in which it occurs. Alternatively, concentrating on a particular genome as a whole, it is possible first, to find out its superfamily composition, and secondly, to compare it with that of other genomes to detect superfamilies that are over- or under-represented. In addition, the webserver provides the following standard services: sequence search; keyword search for genomes, superfamilies and sequence identifiers; and multiple alignment of genomic, PDB and custom sequences.
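The comparison of one genome's superfamily composition against other genomes, as described in the abstract above, can be illustrated with a simple log-ratio score. This is a minimal sketch under stated assumptions, not the SUPERFAMILY server's actual method: the function name `log2_enrichment`, the pseudocount, and the toy counts are all hypothetical.

```python
from math import log2

def log2_enrichment(target, background, pseudocount=1.0):
    """Score each superfamily by log2(fraction in target / fraction in background).

    target, background: dicts mapping a superfamily label to the number of
    domain assignments in the genome of interest and in the pooled
    comparison genomes. Positive scores suggest over-representation in the
    target genome, negative scores under-representation.
    The pseudocount (an assumption of this sketch) avoids division by zero
    for superfamilies missing from one side.
    """
    names = set(target) | set(background)
    t_total = sum(target.values()) + pseudocount * len(names)
    b_total = sum(background.values()) + pseudocount * len(names)
    scores = {}
    for sf in names:
        t_frac = (target.get(sf, 0) + pseudocount) / t_total
        b_frac = (background.get(sf, 0) + pseudocount) / b_total
        scores[sf] = log2(t_frac / b_frac)
    return scores

# Toy counts with made-up labels, for illustration only.
genome = {"sf_a": 30, "sf_b": 10}
others = {"sf_a": 20, "sf_b": 20}
for name, score in sorted(log2_enrichment(genome, others).items()):
    print(name, round(score, 3))
```

A raw ratio like this says nothing about statistical significance; a real analysis would likely add a test and a multiple-testing correction.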
Transitioning from genotypes to epigenotypes: why the time has come for medulloblastoma epigenomics.

PubMed

Batora, N V; Sturm, D; Jones, D T W; Kool, M; Pfister, S M; Northcott, P A

2014-04-04

Recent advances in genomic technologies have allowed for tremendous progress in our understanding of the biology underlying medulloblastoma, a malignant childhood brain tumor. Consensus molecular subgroups have been put forth by the pediatric neuro-oncology community and next-generation genomic studies have led to an improved description of driver genes and pathways somatically altered in these subgroups. In contrast to the impressive pace at which advances have been made at the level of the medulloblastoma genome, comparable studies of the epigenome have lagged behind. Complementary data yielded from genomic sequencing and copy number profiling have verified frequent targeting of chromatin modifiers in medulloblastoma, highly suggestive of prominent epigenetic deregulation in the disease. Past studies of DNA methylation-dependent gene silencing and microRNA expression analyses further support the concept of medulloblastoma as an epigenetic disease. In this Review, we aim to summarize the key findings of past reports pertaining to medulloblastoma epigenetics as well as recent and ongoing genomic efforts linking somatic alterations of the genome with inferred deregulation of the epigenome. In addition, we predict what is on the horizon for medulloblastoma epigenetics and how aberrant changes in the medulloblastoma epigenome might serve as an attractive target for future therapies. Copyright © 2013 IBRO. Published by Elsevier Ltd. All rights reserved.

FULL-GENOME ANALYSIS OF ALTERNATIVE SPLICING IN MOUSE LIVER AFTER HEPATOTOXICANT EXPOSURE

EPA Science Inventory

Alternative splicing plays a role in determining gene function and protein diversity. We have employed whole genome exon profiling using Affymetrix Mouse Exon 1.0 ST arrays to understand the significance of alternative splicing on a genome-wide scale in response to multiple toxic...
Simultaneous isolation of high-quality DNA, RNA, miRNA and proteins from tissues for genomic applications

PubMed Central

Peña-Llopis, Samuel; Brugarolas, James

2014-01-01

Genomic technologies have revolutionized our understanding of complex Mendelian diseases and cancer. Solid tumors present several challenges for genomic analyses, such as tumor heterogeneity and tumor contamination with surrounding stroma and infiltrating lymphocytes. We developed a protocol to (i) select tissues of high cellular purity on the basis of histological analyses of immediately flanking sections and (ii) simultaneously extract genomic DNA (gDNA), messenger RNA (mRNA), noncoding RNA (ncRNA; enriched in microRNA (miRNA)) and protein from the same tissues. After tissue selection, about 12–16 extractions of DNA/RNA/protein can be obtained per day. Compared with other similar approaches, this fast and reliable methodology allowed us to identify mutations in tumors with remarkable sensitivity and to perform integrative analyses of whole-genome and exome data sets, DNA copy numbers (by single-nucleotide polymorphism (SNP) arrays), gene expression data (by transcriptome profiling and quantitative PCR (qPCR)) and protein levels (by western blotting and immunohistochemical analysis) from the same samples. Although we focused on renal cell carcinoma, this protocol may be adapted with minor changes to any human or animal tissue to obtain high-quality and high-yield nucleic acids and proteins. PMID:24136348
Genome-wide analysis of copper, iron and zinc transporters in the arbuscular mycorrhizal fungus Rhizophagus irregularis.

PubMed

Tamayo, Elisabeth; Gómez-Gallego, Tamara; Azcón-Aguilar, Concepción; Ferrol, Nuria

2014-01-01

Arbuscular mycorrhizal fungi (AMF), belonging to the Glomeromycota, are soil microorganisms that establish mutualistic symbioses with the majority of higher plants. The efficient uptake of low mobility mineral nutrients by the fungal symbiont and their further transfer to the plant is a major feature of this symbiosis. Besides improving plant mineral nutrition, AMF can alleviate heavy metal toxicity to their host plants and are able to tolerate high metal concentrations in the soil. Nevertheless, we are far from understanding the key molecular determinants of metal homeostasis in these organisms. To get some insights into these mechanisms, a genome-wide analysis of Cu, Fe and Zn transporters was undertaken, making use of the recently published whole genome of the AMF Rhizophagus irregularis. This in silico analysis allowed identification of 30 open reading frames in the R. irregularis genome, which potentially encode metal transporters. Phylogenetic comparisons with the genomes of a set of reference fungi showed an expansion of some metal transporter families. Analysis of the published transcriptomic profiles of R. irregularis revealed that a set of genes were up-regulated in mycorrhizal roots compared to germinated spores and extraradical mycelium, which suggests that metals are important for plant colonization.
[Research progress in neuropsychopharmacology updated for the post-genomic era].

PubMed

Nakanishi, Toru

2009-11-01

Neuropsychopharmacological research in the post-genomic (genomic sequence) era has been developing rapidly through the use of novel techniques including DNA chips. We have applied these techniques to investigate the anti-tumor effect of NSAIDs, isolate novel genes specifically expressed in rheumatoid arthritis, and analyze gene expression profiles in mesenchymal stem cells. Recently, we have developed a novel quantitative PCR system for detection of BDNF mRNA isoforms. By using this system, we identified the exon-specific mode of expression in acute and chronic pain. In addition, we have generated gene expression profiles of mice with a knockout of the beta2 subunit of acetylcholine receptors.
The genome analysis of Oleiphilus messinensis ME102 (DSM 13489T) reveals backgrounds of its obligate alkane-devouring marine lifestyle.

PubMed

Toshchakov, Stepan V; Korzhenkov, Alexei A; Chernikova, Tatyana N; Ferrer, Manuel; Golyshina, Olga V; Yakimov, Michail M; Golyshin, Peter N

2017-12-01

The marine bacterium Oleiphilus messinensis ME102 (DSM 13489T), isolated from the sediments of the harbor of Messina (Italy), is a member of the order Oceanospirillales, class Gammaproteobacteria, representing the physiological group of marine obligate hydrocarbonoclastic bacteria (OHCB) alongside the members of the genera Alcanivorax, Oleispira, Thalassolituus, Cycloclasticus and Neptunomonas. These organisms play a crucial role in the natural environmental cleanup in marine systems. Despite having the largest genome (6,379,281 bp) among OHCB, O. messinensis exhibits a very narrow substrate profile. The alkane metabolism is pre-determined by three loci encoding two P450 family monooxygenases, one of which forms a cassette with ferredoxin- and alcohol dehydrogenase-encoding genes, and an alkane monooxygenase (AlkB) gene clustered with two genes for rubredoxins and NAD+-dependent rubredoxin reductase. Its genome contains the largest numbers of genomic islands (15) and mobile genetic elements (140), as compared with more streamlined genomes of its OHCB counterparts. Among hydrocarbon-degrading Oceanospirillales, O. messinensis encodes the largest array of proteins involved in signal transduction for sensing and responding to environmental stimuli (345 vs 170 in Oleispira antarctica, the bacterium with the second highest number). This must be an important trait to adapt to the conditions in marine sediments with a high physico-chemical patchiness and heterogeneity as compared to those in the water column. Copyright © 2017. Published by Elsevier B.V.
Molecular and Genomic Alterations in Glioblastoma Multiforme.

PubMed

Crespo, Ines; Vital, Ana Louisa; Gonzalez-Tablas, María; Patino, María del Carmen; Otero, Alvaro; Lopes, María Celeste; de Oliveira, Catarina; Domingues, Patricia; Orfao, Alberto; Tabernero, Maria Dolores

2015-07-01

In recent years, important advances have been achieved in the understanding of the molecular biology of glioblastoma multiforme (GBM); thus, complex genetic alterations and genomic profiles, which recurrently involve multiple signaling pathways, have been defined, leading to the first molecular/genetic classification of the disease. In this regard, different genetic alterations and genetic pathways appear to distinguish primary (eg, EGFR amplification) versus secondary (eg, IDH1/2 or TP53 mutation) GBM. Such genetic alterations target distinct combinations of the growth factor receptor-ras signaling pathways, as well as the phosphatidylinositol 3-kinase/phosphatase and tensin homolog/AKT, retinoblastoma/cyclin-dependent kinase (CDK) N2A-p16(INK4A), and TP53/mouse double minute (MDM) 2/MDM4/CDKN2A-p14(ARF) pathways, in cells that present features associated with key stages of normal neurogenesis and (normal) central nervous system cell types. This translates into well-defined genomic profiles that have been recently classified by The Cancer Genome Atlas Consortium into four subtypes: classic, mesenchymal, proneural, and neural GBM. Herein, we review the most relevant genetic alterations of primary versus secondary GBM, the specific signaling pathways involved, and the overall genomic profile of this genetically heterogeneous group of malignant tumors. Copyright © 2015 American Society for Investigative Pathology. Published by Elsevier Inc. All rights reserved.
Transcriptome and methylome profiling reveals relics of genome dominance in the mesopolyploid Brassica oleracea

PubMed Central

2014-01-01

Background: Brassica oleracea is a valuable vegetable species that has contributed to human health and nutrition for hundreds of years and comprises multiple distinct cultivar groups with diverse morphological and phytochemical attributes. In addition to this phenotypic wealth, B. oleracea offers unique insights into polyploid evolution, as it results from multiple ancestral polyploidy events and a final Brassiceae-specific triplication event. Further, B. oleracea represents one of the diploid genomes that formed the economically important allopolyploid oilseed, Brassica napus. A deeper understanding of B. oleracea genome architecture provides a foundation for crop improvement strategies throughout the Brassica genus. Results: We generate an assembly representing 75% of the predicted B. oleracea genome using a hybrid Illumina/Roche 454 approach. Two dense genetic maps are generated to anchor almost 92% of the assembled scaffolds to nine pseudo-chromosomes. Over 50,000 genes are annotated and 40% of the genome predicted to be repetitive, thus contributing to the increased genome size of B. oleracea compared to its close relative B. rapa. A snapshot of both the leaf transcriptome and methylome allows comparisons to be made across the triplicated sub-genomes, which resulted from the most recent Brassiceae-specific polyploidy event. Conclusions: Differential expression of the triplicated syntelogs and cytosine methylation levels across the sub-genomes suggest residual marks of the genome dominance that led to the current genome architecture. Although cytosine methylation does not correlate with individual gene dominance, the independent methylation patterns of triplicated copies suggest epigenetic mechanisms play a role in the functional diversification of duplicate genes. PMID:24916971
GeneBreak: detection of recurrent DNA copy number aberration-associated chromosomal breakpoints within genes.

PubMed

van den Broek, Evert; van Lieshout, Stef; Rausch, Christian; Ylstra, Bauke; van de Wiel, Mark A; Meijer, Gerrit A; Fijneman, Remond J A; Abeln, Sanne

2016-01-01

Development of cancer is driven by somatic alterations, including numerical and structural chromosomal aberrations. Currently, several computational methods are available and are widely applied to detect numerical copy number aberrations (CNAs) of chromosomal segments in tumor genomes. However, there is a lack of computational methods that systematically detect structural chromosomal aberrations by virtue of the genomic location of CNA-associated chromosomal breaks and identify genes that appear non-randomly affected by chromosomal breakpoints across (large) series of tumor samples. 'GeneBreak' was developed to systematically identify genes recurrently affected by the genomic location of chromosomal CNA-associated breaks by a genome-wide approach, which can be applied to DNA copy number data obtained by array-Comparative Genomic Hybridization (CGH) or by (low-pass) whole genome sequencing (WGS). First, 'GeneBreak' collects the genomic locations of chromosomal CNA-associated breaks that were previously pinpointed by the segmentation algorithm that was applied to obtain CNA profiles. Next, a tailored annotation approach for breakpoint-to-gene mapping is implemented. Finally, dedicated cohort-based statistics are incorporated with correction for covariates that influence the probability of being a breakpoint gene. In addition, multiple testing correction is integrated to reveal recurrent breakpoint events. This easy-to-use algorithm, 'GeneBreak', is implemented in R (www.cran.r-project.org) and is available from Bioconductor (www.bioconductor.org/packages/release/bioc/html/GeneBreak.html).
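The first two steps described above (collecting breakpoints from segmented copy number profiles, then mapping them to genes) can be sketched in a few lines. This is an illustrative Python sketch only: GeneBreak itself is an R/Bioconductor package, and the function names, the tuple layout of a segment, and the toy coordinates below are assumptions made for the example. The cohort statistics, covariate correction, and multiple testing correction mentioned in the abstract are deliberately left out.

```python
from collections import defaultdict

def breakpoints(segments):
    """Return (chrom, position) for every copy number change point.

    segments: list of (chrom, start, end, state) tuples, sorted by
    chromosome and start, as produced by some segmentation step.
    A breakpoint is placed at the start of a segment whose state differs
    from the preceding segment on the same chromosome.
    """
    found = []
    for prev, cur in zip(segments, segments[1:]):
        if prev[0] == cur[0] and prev[3] != cur[3]:
            found.append((cur[0], cur[1]))
    return found

def genes_hit(bps, genes):
    """Return the set of gene names whose interval contains a breakpoint.

    genes: dict mapping gene name -> (chrom, start, end).
    """
    return {
        name
        for name, (chrom, start, end) in genes.items()
        for bp_chrom, pos in bps
        if bp_chrom == chrom and start <= pos <= end
    }

def recurrence(cohort, genes):
    """Count, per gene, the number of samples with a breakpoint in it."""
    counts = defaultdict(int)
    for segments in cohort.values():
        for name in genes_hit(breakpoints(segments), genes):
            counts[name] += 1
    return dict(counts)

# Toy example with hypothetical genes and coordinates.
genes = {"GENE_A": ("chr1", 100, 500), "GENE_B": ("chr1", 800, 1200)}
cohort = {
    "s1": [("chr1", 1, 300, 0), ("chr1", 301, 1500, 1)],
    "s2": [("chr1", 1, 400, 0), ("chr1", 401, 900, -1), ("chr1", 901, 1500, 0)],
    "s3": [("chr1", 1, 1500, 0)],
}
print(recurrence(cohort, genes))
```

Counting each gene at most once per sample keeps one heavily fragmented tumor from dominating the recurrence count; whether a per-gene count is significant is the job of the statistics this sketch omits.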
The rubber tree genome shows expansion of gene family associated with rubber biosynthesis

PubMed Central

Lau, Nyok-Sean; Makita, Yuko; Kawashima, Mika; Taylor, Todd D.; Kondo, Shinji; Othman, Ahmad Sofiman; Shu-Chien, Alexander Chong; Matsui, Minami

2016-01-01

Hevea brasiliensis Muell. Arg, a member of the family Euphorbiaceae, is the sole natural resource exploited for commercial production of high-quality natural rubber. The properties of natural rubber latex are almost irreplaceable by synthetic counterparts for many industrial applications. A paucity of knowledge on the molecular mechanisms of rubber biosynthesis in high yield traits still persists. Here we report the comprehensive genome-wide analysis of the widely planted H. brasiliensis clone, RRIM 600. The genome was assembled based on ~155-fold combined coverage with Illumina and PacBio sequence data and has a total length of 1.55 Gb with 72.5% comprising repetitive DNA sequences. A total of 84,440 high-confidence protein-coding genes were predicted. Comparative genomic analysis revealed strong synteny between H. brasiliensis and other Euphorbiaceae genomes. Our data suggest that H. brasiliensis’s capacity to produce high levels of latex can be attributed to the expansion of rubber biosynthesis-related genes in its genome and the high expression of these genes in latex. Using cap analysis gene expression data, we illustrate the tissue-specific transcription profiles of rubber biosynthesis-related genes, revealing alternative means of transcriptional regulation. Our study adds to the understanding of H. brasiliensis biology and provides valuable genomic resources for future agronomic-related improvement of the rubber tree. PMID:27339202
Genomic and transcriptomic analysis of the AP2/ERF superfamily in Vitis vinifera

PubMed Central

2010-01-01

Background: The AP2/ERF protein family contains transcription factors that play a crucial role in plant growth and development and in response to biotic and abiotic stress conditions in plants. Grapevine (Vitis vinifera) is the only woody crop whose genome has been fully sequenced. So far, no detailed expression profile of AP2/ERF-like genes is available for grapevine. Results: An exhaustive search for AP2/ERF genes was carried out on the Vitis vinifera genome and their expression profile was analyzed by Real-Time quantitative PCR (qRT-PCR) in different vegetative and reproductive tissues and at two different ripening stages. One hundred and forty-nine sequences, containing at least one ERF domain, were identified. Specific clusters within the AP2 and ERF families showed conserved expression patterns reminiscent of other species and grapevine-specific trends related to berry ripening. Moreover, putative targets of group IX ERFs were identified by co-expression and protein similarity comparisons. Conclusions: The grapevine genome contains a number of AP2/ERF genes comparable to that of other dicot species analyzed so far. We observed an increase in the size of specific groups within the ERF family, probably due to recent duplication events. Expression analyses in different aerial tissues display common features previously described in other plant systems and introduce possible new roles for members of some ERF groups during fruit ripening. The presented analysis of AP2/ERF genes in grapevine provides the basis for studying the molecular regulation of berry development and the ripening process. PMID:21171999
Real-Time Pathogen Detection in the Era of Whole-Genome Sequencing and Big Data: Comparison of k-mer and Site-Based Methods for Inferring the Genetic Distances among Tens of Thousands of Salmonella Samples

DOE PAGES

Pettengill, James B.; Pightling, Arthur W.; Baugher, Joseph D.; ...

2016-11-10

The adoption of whole-genome sequencing within the public health realm for molecular characterization of bacterial pathogens has been followed by an increased emphasis on real-time detection of emerging outbreaks (e.g., food-borne Salmonellosis). In turn, large databases of whole-genome sequence data are being populated. These databases currently contain tens of thousands of samples and are expected to grow to hundreds of thousands within a few years. For these databases to be of optimal use one must be able to quickly interrogate them to accurately determine the genetic distances among a set of samples. Being able to do so is challenging due to both biological (evolutionarily diverse samples) and computational (petabytes of sequence data) issues. We evaluated seven measures of genetic distance, which were estimated from either k-mer profiles (Jaccard, Euclidean, Manhattan, Mash Jaccard, and Mash distances) or nucleotide sites (NUCmer and an extended multi-locus sequence typing (MLST) scheme). Finally, when analyzing empirical data (whole-genome sequence data from 18,997 Salmonella isolates) there are features (e.g., genomic, assembly, and contamination) that cause distances inferred from k-mer profiles, which treat absent data as informative, to fail to accurately capture the distance between samples when compared to distances inferred from differences in nucleotide sites. Thus, site-based distances, like NUCmer and extended MLST, are superior in performance, but accessing the computing resources necessary to perform them may be challenging when analyzing large databases.
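Three of the k-mer profile distances named above (Jaccard, Manhattan, and Euclidean) can be written down directly. The following is a minimal sketch on short toy strings, assuming exact k-mer counting held in memory; it is not the study's pipeline, the function names are hypothetical, and the Mash variants (which estimate Jaccard from MinHash sketches) are not covered.

```python
from collections import Counter
from math import sqrt

def kmer_counts(seq, k):
    """Count every overlapping k-mer in a DNA string."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| on k-mer presence/absence."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)

def manhattan_distance(a, b):
    """Sum of absolute differences in k-mer counts."""
    return sum(abs(a[kmer] - b[kmer]) for kmer in set(a) | set(b))

def euclidean_distance(a, b):
    """Square root of the summed squared differences in k-mer counts."""
    return sqrt(sum((a[kmer] - b[kmer]) ** 2 for kmer in set(a) | set(b)))

# Toy sequences, for illustration only.
p = kmer_counts("ACGTACGT", 3)
q = kmer_counts("ACGTTCGT", 3)
print(jaccard_distance(p, q), manhattan_distance(p, q), euclidean_distance(p, q))
```

A k-mer missing from one profile counts as a difference in all three measures, which is the sense in which the abstract says k-mer distances treat absent data as informative: a gap in an assembly inflates the distance just as a real sequence difference would.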
Genomic Epidemiology of Hypervirulent Serogroup W, ST-11 Neisseria meningitidis

PubMed Central

Mustapha, Mustapha M.; Marsh, Jane W.; Krauland, Mary G.; Fernandez, Jorge O.; de Lemos, Ana Paula S.; Dunning Hotopp, Julie C.; Wang, Xin; Mayer, Leonard W.; Lawrence, Jeffrey G.; Hiller, N. Luisa; Harrison, Lee H.

2015-01-01

Neisseria meningitidis is a leading bacterial cause of sepsis and meningitis globally with dynamic strain distribution over time. Beginning with an epidemic among Hajj pilgrims in 2000, serogroup W (W) sequence type (ST) 11 emerged as a leading cause of epidemic meningitis in the African ‘meningitis belt’ and endemic cases in South America, Europe, Middle East and China. Previous genotyping studies were unable to reliably discriminate sporadic W ST-11 strains in circulation since 1970 from the Hajj outbreak strain (Hajj clone). It is also unclear what proportion of more recent W ST-11 disease clusters are caused by direct descendants of the Hajj clone. Whole genome sequences of 270 meningococcal strains isolated from patients with invasive meningococcal disease globally from 1970 to 2013 were compared using whole genome phylogenetic and major antigen-encoding gene sequence analyses. We found that all W ST-11 strains were descendants of an ancestral strain that had undergone unique capsular switching events. The Hajj clone and its descendants were distinct from other W ST-11 strains in that they shared a common antigen gene profile and had undergone recombination involving virulence genes encoding factor H binding protein, nitric oxide reductase, and nitrite reductase. These data demonstrate that recent acquisition of a distinct antigen-encoding gene profile and variations in meningococcal virulence genes was associated with the emergence of the Hajj clone. Importantly, W ST-11 strains unrelated to the Hajj outbreak contribute a significant proportion of W ST-11 cases globally. This study helps illuminate genomic factors associated with meningococcal strain emergence and evolution. PMID:26629539
Comparative transcriptome profiling of upland (VS16) and lowland (AP13) ecotypes of switchgrass.

PubMed

Ayyappan, Vasudevan; Saha, Malay C; Thimmapuram, Jyothi; Sripathi, Venkateswara R; Bhide, Ketaki P; Fiedler, Elizabeth; Hayford, Rita K; Kalavacharla, Venu Kal

2017-01-01

Transcriptomes of two switchgrass genotypes representing the upland and lowland ecotypes will be key tools in switchgrass genome annotation and biotic and abiotic stress functional genomics. Switchgrass (Panicum virgatum L.) is an important bioenergy feedstock for cellulosic ethanol production. We report genome-wide transcriptome profiling of two contrasting tetraploid switchgrass genotypes, VS16 and AP13, representing the upland and lowland ecotypes, respectively. A total of 268 million Illumina short reads (50 nt) were generated, of which 133 million were obtained in AP13 and the remaining 135 million in VS16. More than 90% of these reads were mapped to the switchgrass reference genome (V1.1). We identified 6619 and 5369 differentially expressed genes in VS16 and AP13, respectively. Gene ontology and KEGG pathway analysis identified key genes that regulate important pathways including C4 photosynthesis, photorespiration and phenylpropanoid metabolism. A series of genes (33) involved in the photosynthetic pathway were up-regulated in AP13 but only two genes showed higher expression in VS16. We identified three dicarboxylate transporter homologs that were highly expressed in AP13. Additionally, genes that mediate drought, heat, and salinity tolerance were also identified. Vesicular transport proteins, syntaxin and signal recognition particles were seen to be up-regulated in VS16. Analyses of selected genes involved in biosynthesis of secondary metabolites, plant-pathogen interaction, membrane transporters, heat, drought and salinity stress responses confirmed significant variation in the relative expression reflected in RNA-Seq data between VS16 and AP13 genotypes. The phenylpropanoid pathway genes identified here are potential targets for biofuel conversion.
An analysis of sponge genomes.

PubMed

Costantini, Maria

2004-11-24

The genome of sponges has only been investigated so far by Bartmann-Lindholm et al. [Progr. Colloid. Polym. Sci. 107 (1997) 122-126] who reported a multimodal CsCl profile which could be resolved into five peaks for Geodia cydonium. This problem was reinvestigated here on both G. cydonium and Suberites domuncula. It was shown that DNAs from both sponges are characterized by unimodal CsCl profiles, additional peaks being due to contaminating prokaryotic and eukaryotic microorganisms.
Genome-wide Gene Expression Profiling of Acute Metal Exposures in Male Zebrafish

DTIC Science & Technology

2014-10-23

Data in Brief Genome-wide gene expression profiling of acute metal exposures in male zebrafish Christine E. Baer a,⁎, Danielle L. Ippolito b, Naissan... Zebrafish Whole organism Nickel Chromium Cobalt Toxicogenomics To capture global responses to metal poisoning and mechanistic insights into metal...toxicity, gene expression changes were evaluated in whole adult male zebrafish following acute 24 h high dose exposure to three metals with known human
Entropic Profiler – detection of conservation in genomes using information theory

PubMed Central

Fernandes, Francisco; Freitas, Ana T; Almeida, Jonas S; Vinga, Susana

2009-01-01

Background: In the last decades, with the successive availability of whole genome sequences, many research efforts have been made to mathematically model DNA. Entropic Profiles (EP) were proposed recently as a new measure of continuous entropy of genome sequences. EP represent local information plots related to DNA randomness and are based on information theory and statistical concepts. They express the weighted relative abundance of motifs for each position in genomes. Their study is very relevant because under- or over-represented segments are often associated with significant biological meaning. Findings: The Entropic Profiler application presented here is a new tool designed to detect and extract under- and over-represented DNA segments in genomes by using EP. It computes EP very efficiently by resorting to improved algorithms and data structures, which include modified suffix trees. Available through a web interface and as downloadable source code, it allows users to study positions and to search for motifs inside the whole sequence or within a specified range. DNA sequences can be entered from different sources, including FASTA files, pre-loaded examples or by resuming previously saved work. Besides the EP value plots, p-values and z-scores for each motif are also computed, along with the Chaos Game Representation of the sequence. Conclusion: EP are directly related to the statistical significance of motifs and can be considered as a new method to extract and classify significant regions in genomes and estimate local scales in DNA. The present implementation establishes an efficient and useful tool for whole genome analysis. PMID:19416538
An ensemble model of competitive multi-factor binding of the genome

PubMed Central

Wasson, Todd; Hartemink, Alexander J.

2009-01-01

Hundreds of different factors adorn the eukaryotic genome, binding to it in large number. These DNA binding factors (DBFs) include nucleosomes, transcription factors (TFs), and other proteins and protein complexes, such as the origin recognition complex (ORC). DBFs compete with one another for binding along the genome, yet many current models of genome binding do not consider different types of DBFs together simultaneously. Additionally, binding is a stochastic process that results in a continuum of binding probabilities at any position along the genome, but many current models tend to consider positions as being either binding sites or not. Here, we present a model that allows a multitude of DBFs, each at different concentrations, to compete with one another for binding sites along the genome. The result is an “occupancy profile,” a probabilistic description of the DNA occupancy of each factor at each position. We implement our model efficiently as the software package COMPETE. We demonstrate genome-wide and at specific loci how modeling nucleosome binding alters TF binding, and vice versa, and illustrate how factor concentration influences binding occupancy. Binding cooperativity between nearby TFs arises implicitly via mutual competition with nucleosomes. Our method applies not only to TFs, but also recapitulates known occupancy profiles of a well-studied replication origin with and without ORC binding. Importantly, the sequence preferences our model takes as input are derived from in vitro experiments. This ensures that the calculated occupancy profiles are the result of the forces of competition represented explicitly in our model and the inherent sequence affinities of the constituent DBFs. PMID:19720867
Evolutionary Analysis and Expression Profiling of Zebra Finch Immune Genes

PubMed Central

Ekblom, Robert; French, Lisa; Slate, Jon; Burke, Terry

2010-01-01

Genes of the immune system are generally considered to evolve rapidly due to host–parasite coevolution. They are therefore of great interest in evolutionary biology and molecular ecology. In this study, we manually annotated 144 avian immune genes from the zebra finch (Taeniopygia guttata) genome and conducted evolutionary analyses of these by comparing them with their orthologs in the chicken (Gallus gallus). Genes classified as immune receptors showed elevated dN/dS ratios compared with other classes of immune genes. Immune genes in general also appear to be evolving more rapidly than other genes, as inferred from a higher dN/dS ratio compared with the rest of the genome. Furthermore, ten genes (of 27) for which sequence data were available from at least three bird species showed evidence of positive selection acting on specific codons. From transcriptome data of eight different tissues, we found evidence for expression of 106 of the studied immune genes, with primary expression of most of these in bursa, blood, and spleen. These immune-related genes showed a more tissue-specific expression pattern than other genes in the zebra finch genome. Several of the avian immune genes investigated here provide strong candidates for in-depth studies of molecular adaptation in birds. PMID:20884724
Genome-wide comparative analysis of papain-like cysteine protease family genes in castor bean and physic nut.

PubMed

Zou, Zhi; Huang, Qixing; Xie, Guishui; Yang, Lifu

2018-01-10

Papain-like cysteine proteases (PLCPs) are a class of proteolytic enzymes involved in many plant processes. Compared with the extensive research in Arabidopsis thaliana, little is known about them in castor bean (Ricinus communis) and physic nut (Jatropha curcas), two Euphorbiaceous plants without any recent whole-genome duplication. In this study, a total of 26 and 23 PLCP genes were identified from the genomes of castor bean and physic nut, respectively, which can be divided into nine subfamilies based on phylogenetic analysis: RD21, CEP, XCP, XBCP3, THI, SAG12, RD19, ALP and CTB. Although most of them have orthologs in Arabidopsis, several members in subfamilies RD21, CEP, XBCP3 and SAG12 form new groups or subgroups as observed in other species, suggesting that specific gene loss occurred in Arabidopsis. Recent gene duplicates were also identified in these two species, but they are limited to the SAG12 subfamily and were all derived from local duplication. Expression profiling revealed diverse patterns of different family members over various tissues. Furthermore, the evolutionary characteristics of PLCP genes were also compared and discussed. Our findings provide a useful reference to characterize PLCP genes and investigate the family evolution in Euphorbiaceae and species beyond.

Cytogenomic profiling of breast cancer brain metastases reveals potential for repurposing targeted therapeutics

PubMed Central

Bollig-Fischer, Aliccia; Michelhaugh, Sharon K.; Wijesinghe, Priyanga; Dyson, Greg; Kruger, Adele; Palanisamy, Nallasivam; Choi, Lydia; Alosh, Baraa; Ali-Fehmi, Rouba; Mittal, Sandeep

2015-01-01

Breast cancer brain metastases remain a significant clinical problem. Chemotherapy is ineffective and a lack of treatment options results in poor patient outcomes. Targeted therapeutics have proven to be highly effective in primary breast cancer, but lack of molecular genomic characterization of metastatic brain tumors is hindering the development of new treatment regimens. Here we help to fill this void by reporting on gene copy number variation (CNV) in 10 breast cancer metastatic brain tumors, assayed by array comparative genomic hybridization (aCGH). Results were compared to a list of cancer genes verified by others to influence cancer. Cancer gene aberrations were identified in all specimens and pathway-level analysis was applied to aggregate data, which identified stem cell pluripotency pathway enrichment and highlighted recurring, significant amplification of SOX2, PIK3CA, NTRK1, GNAS, CTNNB1, and FGFR1. For a subset of the metastatic brain tumor samples (n=4) we compared patient-matched primary breast cancer specimens. The results of our CGH analysis and validation by alternative methods indicate that oncogenic signals driving growth of metastatic tumors exist in the original cancer. This report contributes support for more rapid development of new treatments of metastatic brain tumors, the use of genomic-based diagnostic tools and repurposed drug treatments. PMID:25970776
Cytogenomic profiling of breast cancer brain metastases reveals potential for repurposing targeted therapeutics.

PubMed

Bollig-Fischer, Aliccia; Michelhaugh, Sharon K; Wijesinghe, Priyanga; Dyson, Greg; Kruger, Adele; Palanisamy, Nallasivam; Choi, Lydia; Alosh, Baraa; Ali-Fehmi, Rouba; Mittal, Sandeep

2015-06-10

Breast cancer brain metastases remain a significant clinical problem. Chemotherapy is ineffective and a lack of treatment options results in poor patient outcomes. Targeted therapeutics have proven to be highly effective in primary breast cancer, but lack of molecular genomic characterization of metastatic brain tumors is hindering the development of new treatment regimens. Here we help to fill this void by reporting on gene copy number variation (CNV) in 10 breast cancer metastatic brain tumors, assayed by array comparative genomic hybridization (aCGH). Results were compared to a list of cancer genes verified by others to influence cancer. Cancer gene aberrations were identified in all specimens and pathway-level analysis was applied to aggregate data, which identified stem cell pluripotency pathway enrichment and highlighted recurring, significant amplification of SOX2, PIK3CA, NTRK1, GNAS, CTNNB1, and FGFR1. For a subset of the metastatic brain tumor samples (n = 4) we compared patient-matched primary breast cancer specimens. The results of our CGH analysis and validation by alternative methods indicate that oncogenic signals driving growth of metastatic tumors exist in the original cancer. This report contributes support for more rapid development of new treatments of metastatic brain tumors, the use of genomic-based diagnostic tools and repurposed drug treatments.
Genome size diversity in orchids: consequences and evolution

PubMed Central

Leitch, I. J.; Kahandawala, I.; Suda, J.; Hanson, L.; Ingrouille, M. J.; Chase, M. W.; Fay, M. F.

2009-01-01

Background The amount of DNA comprising the genome of an organism (its genome size) varies a remarkable 40 000-fold across eukaryotes, yet most groups are characterized by much narrower ranges (e.g. 14-fold in gymnosperms, 3- to 4-fold in mammals). Angiosperms stand out as one of the most variable groups with genome sizes varying nearly 2000-fold. Nevertheless within angiosperms the majority of families are characterized by genomes which are small and vary little. Species with large genomes are mostly restricted to a few monocot families including Orchidaceae. Scope A survey of the literature revealed that genome size data for Orchidaceae are comparatively rare representing just 327 species. Nevertheless they reveal that Orchidaceae are currently the most variable angiosperm family with genome sizes ranging 168-fold (1C = 0·33–55·4 pg). Analysing the data provided insights into the distribution, evolution and possible consequences to the plant of this genome size diversity. Conclusions Superimposing the data onto the increasingly robust phylogenetic tree of Orchidaceae revealed how different subfamilies were characterized by distinct genome size profiles. Epidendroideae possessed the greatest range of genome sizes, although the majority of species had small genomes. In contrast, the largest genomes were found in subfamilies Cypripedioideae and Vanilloideae. Genome size evolution within this subfamily was analysed as this is the only one with reasonable representation of data. This approach highlighted striking differences in genome size and karyotype evolution between the closely related Cypripedium, Paphiopedilum and Phragmipedium. As to the consequences of genome size diversity, various studies revealed that this has both practical (e.g. application of genetic fingerprinting techniques) and biological consequences (e.g. affecting where and when an orchid may grow) and emphasizes the importance of obtaining further genome size data given the considerable phylogenetic gaps which have been highlighted by the current study. PMID:19168860
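The 168-fold range quoted in the abstract above follows directly from the reported 1C extremes. The short sketch below simply reproduces that arithmetic; the function name is illustrative and is not from the paper.

```python
def fold_range(smallest: float, largest: float) -> float:
    """Ratio of the largest to the smallest genome size (same units)."""
    if smallest <= 0:
        raise ValueError("smallest genome size must be positive")
    return largest / smallest


if __name__ == "__main__":
    # 1C values in picograms, as reported in the abstract for Orchidaceae.
    orchid_fold = fold_range(0.33, 55.4)
    print(round(orchid_fold))  # 168
```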
Genes on B chromosomes: old questions revisited with new tools.

PubMed

Banaei-Moghaddam, Ali M; Martis, Mihaela M; Macas, Jiří; Gundlach, Heidrun; Himmelbach, Axel; Altschmied, Lothar; Mayer, Klaus F X; Houben, Andreas

2015-01-01

B chromosomes are supernumerary dispensable parts of the karyotype which appear in some individuals of some populations in some species. Often, they have been considered as 'junk DNA' or genomic parasites without functional genes. Due to recent advances in sequencing technologies, it has become possible to investigate their DNA composition, transcriptional activity and effects on the host transcriptome profile in detail. Here, we review the most recent findings regarding the gene content of B chromosomes and their transcriptional activities and discuss these findings in the context of comparable biological phenomena, like sex chromosomes, aneuploidy and pseudogenes. Recent data suggest that B chromosomes carry transcriptionally active genic sequences which could affect the transcriptome profile of their host genome. These findings are gradually changing our view that B chromosomes are solely genetically inert selfish elements without any functional genes. On the one hand, this could partly explain the deleterious effects which are associated with their presence. On the other hand, it makes B chromosomes a useful model for studying regulatory mechanisms of duplicated genes and their evolutionary consequences. Copyright © 2014 Elsevier B.V. All rights reserved.
Accurate read-based metagenome characterization using a hierarchical suite of unique signatures

PubMed Central

Freitas, Tracey Allen K.; Li, Po-E; Scholz, Matthew B.; Chain, Patrick S. G.

2015-01-01

A major challenge in the field of shotgun metagenomics is the accurate identification of organisms present within a microbial community, based on classification of short sequence reads. Though existing microbial community profiling methods have attempted to rapidly classify the millions of reads output from modern sequencers, the combination of incomplete databases, similarity among otherwise divergent genomes, errors and biases in sequencing technologies, and the large volumes of sequencing data required for metagenome sequencing has led to unacceptably high false discovery rates (FDR). Here, we present the application of a novel, gene-independent and signature-based metagenomic taxonomic profiling method with significantly and consistently smaller FDR than any other available method. Our algorithm circumvents false positives using a series of non-redundant signature databases and examines Genomic Origins Through Taxonomic CHAllenge (GOTTCHA). GOTTCHA was tested and validated on 20 synthetic and mock datasets ranging in community composition and complexity, was applied successfully to data generated from spiked environmental and clinical samples, and robustly demonstrates superior performance compared with other available tools. PMID:25765641
EvoCor: a platform for predicting functionally related genes using phylogenetic and expression profiles.

PubMed

Dittmar, W James; McIver, Lauren; Michalak, Pawel; Garner, Harold R; Valdez, Gregorio

2014-07-01

The wealth of publicly available gene expression and genomic data provides unique opportunities for computational inference to discover groups of genes that function to control specific cellular processes. Such genes are likely to have co-evolved and be expressed in the same tissues and cells. Unfortunately, the expertise and computational resources required to compare tens of genomes and gene expression data sets make this type of analysis difficult for the average end-user. Here, we describe the implementation of a web server that predicts genes involved in affecting specific cellular processes together with a gene of interest. We termed the server 'EvoCor', to denote that it detects functional relationships among genes through evolutionary analysis and gene expression correlation. This web server integrates profiles of sequence divergence derived by a Hidden Markov Model (HMM) and tissue-wide gene expression patterns to determine putative functional linkages between pairs of genes. This server is easy to use and freely available at http://pilot-hmm.vbi.vt.edu/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Genome-wide RNA profiling of long-lasting stem cell-like memory CD8 T cells induced by Yellow Fever vaccination in humans.

PubMed

Fuertes Marraco, Silvia A; Soneson, Charlotte; Delorenzi, Mauro; Speiser, Daniel E

2015-09-01

The live-attenuated Yellow Fever (YF) vaccine YF-17D induces a broad and polyfunctional CD8 T cell response in humans. Recently, we identified a population of stem cell-like memory CD8 T cells induced by YF-17D that persists at stable frequency for at least 25 years after vaccination. YF-17D is thus a model system of human CD8 T cell biology that furthermore allows long-lasting, antigen-specific human memory CD8 T cells to be tracked and studied. Here, we describe in detail the sample characteristics and preparation of a microarray dataset acquired for genome-wide gene expression profiling of long-lasting YF-specific stem cell-like memory CD8 T cells, compared to the reference CD8 T cell differentiation subsets from total CD8 T cells. We also describe the quality controls, annotations and exploratory analyses of the dataset. The microarray data are available from the Gene Expression Omnibus (GEO) public repository with accession number GSE65804.
Interplay of heritage and habitat in the distribution of bacterial signal transduction systems.

PubMed

Galperin, Michael Y; Higdon, Roger; Kolker, Eugene

2010-04-01

Comparative analysis of the complete genome sequences from a variety of poorly studied organisms aims at predicting ecological and behavioral properties of these organisms and helping to characterize their habitats. This task requires finding appropriate descriptors that could be correlated with the core traits of each system and would allow meaningful comparisons. Using relatively simple bacterial models, first attempts have been made to introduce suitable metrics to describe the complexity of an organism's signaling machinery, which included introducing the "bacterial IQ" score. Here, we use an updated census of prokaryotic signal transduction systems to improve this parameter and evaluate its consistency within selected bacterial phyla. We also introduce a more elaborate descriptor, a set of profiles of relative abundance of members of each family of signal transduction proteins encoded in each genome. We show that these family profiles are well conserved within each genus and are often consistent within families of bacteria. Thus, they reflect evolutionary relationships between organisms as well as individual adaptations of each organism to its specific ecological niche.
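The family-profile descriptor in the abstract above can be pictured as a vector of per-family fractions for each genome. The sketch below is only one plausible reading: it assumes the profile is normalized by the total number of signal transduction proteins in the genome (the abstract does not state the denominator), and the family names, counts, and the L1 distance used to compare two profiles are hypothetical illustrations rather than the authors' method.

```python
from typing import Dict


def relative_abundance_profile(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert per-family protein counts into fractions that sum to 1.

    Assumption: normalization by the genome's total count of signaling proteins.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("genome has no counted signal transduction proteins")
    return {family: n / total for family, n in counts.items()}


def profile_distance(a: Dict[str, float], b: Dict[str, float]) -> float:
    """L1 distance between two profiles (an illustrative choice of metric)."""
    families = set(a) | set(b)
    return sum(abs(a.get(f, 0.0) - b.get(f, 0.0)) for f in families)


if __name__ == "__main__":
    # Hypothetical counts for two genomes from the same genus.
    genome_1 = {"histidine_kinase": 30, "response_regulator": 50, "chemoreceptor": 20}
    genome_2 = {"histidine_kinase": 15, "response_regulator": 25, "chemoreceptor": 10}
    p1 = relative_abundance_profile(genome_1)
    p2 = relative_abundance_profile(genome_2)
    print(p1)
    print(profile_distance(p1, p2))
```

Under this reading, two genomes with different total counts but the same proportions have identical profiles, which is the kind of within-genus conservation the abstract describes.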
Survival in extreme environment by "preserve-expand-specialize" strategy: lessons from comparative genomics of an anhydrobiotic midge.

NASA Astrophysics Data System (ADS)

Gusev, Oleg; Sugimoto, Manabu; Novikova, Nataliya; Sychev, Vladimir; Okuda, Takashi; Kikawada, Takahiro

2012-07-01

Anhydrobiotic chironomid larvae of Polypedilum vanderplanki (Diptera) can withstand prolonged complete desiccation as well as other external stresses, including ionizing radiation. Recent experiments showed that this insect is able to survive long-term exposure to real outer space. At the same time, we found that dehydration causes alterations in chromatin structure and severe fragmentation of nuclear DNA in the cells of the larvae despite successful anhydrobiosis. Analysis of several remote populations of the chironomid in Africa suggests that desiccation-related DNA damage might be a driving genetic force for rapid radiation within the species. First results of the ongoing genome project suggest that the origin and evolution of anhydrobiosis in this single insect species are related to rapid duplication of the genes coding for late embryogenesis abundant (LEA) proteins and other molecular agents directly involved in desiccation resistance in the cells. Analysis of genome-wide mRNA expression profiles in larvae subjected to desiccation shows that the joint activity of large multi-gene coding regions in the genome is involved in the control of anhydrobiosis-related molecular adaptations in the chironomid.
Unclassified renal cell carcinoma: a clinicopathological, comparative genomic hybridization, and whole-genome exon sequencing study.

PubMed

Hu, Zhen-Yan; Pang, Li-Juan; Qi, Yan; Kang, Xue-Ling; Hu, Jian-Ming; Wang, Lianghai; Liu, Kun-Peng; Ren, Yuan; Cui, Mei; Song, Li-Li; Li, Hong-An; Zou, Hong; Li, Feng

2014-01-01

Unclassified renal cell carcinoma (URCC) is a rare variant of RCC, accounting for only 3-5% of all cases. Studies on the molecular genetics of URCC are limited, and hence, we report on 2 cases of URCC analyzed using comparative genome hybridization (CGH) and the genome-wide human exon GeneChip technique to identify the genomic alterations of URCC. Both URCC patients (mean age, 72 years) presented at an advanced stage and died within 30 months post-surgery. Histologically, the URCCs were composed of undifferentiated, multinucleated, giant cells with eosinophilic cytoplasm. Immunostaining revealed that both URCC cases had strong p53 protein expression and partial expression of cluster of differentiation-10 and cytokeratin. The CGH profiles showed chromosomal imbalances in both URCC cases: gains were observed in chromosomes 1p11-12, 1q12-13, 2q20-23, 3q22-23, 8p12, and 16q11-15, whereas losses were detected on chromosomes 1q22-23, 3p12-22, 5p30-ter, 6p, 11q, 16q18-22, 17p12-14, and 20p. Compared with 18 normal renal tissues, 40 mutated genes were detected in the URCC tissues, including 32 missense and 8 silent mutations. Functional enrichment analysis revealed that the missense mutation genes were involved in 11 different biological processes and pathways, including cell cycle regulation, lipid localization and transport, neuropeptide signaling, organic ether metabolism, and ATP-binding cassette transporter signaling. Our findings indicate that URCC may be a highly aggressive cancer, and the genetic alterations identified herein may provide clues regarding the tumorigenesis of URCC and serve as a basis for the development of targeted therapies against URCC in the future.
Unclassified renal cell carcinoma: a clinicopathological, comparative genomic hybridization, and whole-genome exon sequencing study

PubMed Central

Hu, Zhen-Yan; Pang, Li-Juan; Qi, Yan; Kang, Xue-Ling; Hu, Jian-Ming; Wang, Lianghai; Liu, Kun-Peng; Ren, Yuan; Cui, Mei; Song, Li-Li; Li, Hong-An; Zou, Hong; Li, Feng

2014-01-01

Unclassified renal cell carcinoma (URCC) is a rare variant of RCC, accounting for only 3-5% of all cases. Studies on the molecular genetics of URCC are limited, and hence, we report on 2 cases of URCC analyzed using comparative genome hybridization (CGH) and the genome-wide human exon GeneChip technique to identify the genomic alterations of URCC. Both URCC patients (mean age, 72 years) presented at an advanced stage and died within 30 months post-surgery. Histologically, the URCCs were composed of undifferentiated, multinucleated, giant cells with eosinophilic cytoplasm. Immunostaining revealed that both URCC cases had strong p53 protein expression and partial expression of cluster of differentiation-10 and cytokeratin. The CGH profiles showed chromosomal imbalances in both URCC cases: gains were observed in chromosomes 1p11-12, 1q12-13, 2q20-23, 3q22-23, 8p12, and 16q11-15, whereas losses were detected on chromosomes 1q22-23, 3p12-22, 5p30-ter, 6p, 11q, 16q18-22, 17p12-14, and 20p. Compared with 18 normal renal tissues, 40 mutated genes were detected in the URCC tissues, including 32 missense and 8 silent mutations. Functional enrichment analysis revealed that the missense mutation genes were involved in 11 different biological processes and pathways, including cell cycle regulation, lipid localization and transport, neuropeptide signaling, organic ether metabolism, and ATP-binding cassette transporter signaling. Our findings indicate that URCC may be a highly aggressive cancer, and the genetic alterations identified herein may provide clues regarding the tumorigenesis of URCC and serve as a basis for the development of targeted therapies against URCC in the future. PMID:25120763
Identification and profiling of novel microRNAs in the Brassica rapa genome based on small RNA deep sequencing

PubMed Central

2012-01-01

Background MicroRNAs (miRNAs) are one of the functional non-coding small RNAs involved in the epigenetic control of the plant genome. Although plants contain both evolutionarily conserved miRNAs and species-specific miRNAs within their genomes, computational methods often only identify evolutionarily conserved miRNAs. The recent sequencing of the Brassica rapa genome enables us to identify miRNAs and their putative target genes. In this study, we sought to provide a more comprehensive prediction of B. rapa miRNAs based on high throughput small RNA deep sequencing. Results We sequenced small RNAs from five types of tissue: seedlings, roots, petioles, leaves, and flowers. By analyzing 2.75 million unique reads that mapped to the B. rapa genome, we identified 216 novel and 196 conserved miRNAs that were predicted to target approximately 20% of the genome’s protein coding genes. Quantitative analysis of miRNAs from the five types of tissue revealed that novel miRNAs were expressed in diverse tissues but their expression levels were lower than those of the conserved miRNAs. Comparative analysis of the miRNAs between the B. rapa and Arabidopsis thaliana genomes demonstrated that redundant copies of conserved miRNAs in the B. rapa genome may have been deleted after whole genome triplication. Novel miRNA members seemed to have spontaneously arisen from the B. rapa and A. thaliana genomes, suggesting the species-specific expansion of miRNAs. We have made this data publicly available in a miRNA database of B. rapa called BraMRs. The database allows the user to retrieve miRNA sequences, their expression profiles, and a description of their target genes from the five tissue types investigated here. Conclusions This is the first report to identify novel miRNAs from Brassica crops using genome-wide high throughput techniques. The combination of computational methods and small RNA deep sequencing provides robust predictions of miRNAs in the genome. The finding of numerous novel miRNAs, many with few target genes and low expression levels, suggests the rapid evolution of miRNA genes. The development of a miRNA database, BraMRs, enables us to integrate miRNA identification, target prediction, and functional annotation of target genes. BraMRs will represent a valuable public resource with which to study the epigenetic control of B. rapa and other closely related Brassica species. The database is available at the following link: http://bramrs.rna.kr [1]. PMID:23163954
The Malus domestica sugar transporter gene family: identifications based on genome and expression profiling related to the accumulation of fruit sugars

PubMed Central

Wei, Xiaoyu; Liu, Fengli; Chen, Cheng; Ma, Fengwang; Li, Mingjun

2014-01-01

In plants, sugar transporters are involved not only in long-distance transport, but also in sugar accumulations in sink cells. To identify members of sugar transporter gene families and to analyze their function in fruit sugar accumulation, we conducted a phylogenetic analysis of the Malus domestica genome. Expression profiling was performed with shoot tips, mature leaves, and developed fruit of “Gala” apple. Genes for sugar alcohol [including 17 sorbitol transporters (SOTs)], sucrose, and monosaccharide transporters, plus SWEET genes, were selected as candidates in 31, 9, 50, and 27 loci, respectively, of the genome. The monosaccharide transporter family appears to include five subfamilies (30 MdHTs, 8 MdEDR6s, 5 MdTMTs, 3 MdvGTs, and 4 MdpGLTs). Phylogenetic analysis of the protein sequences indicated that orthologs exist among Malus, Vitis, and Arabidopsis. Investigations of transcripts revealed that 68 candidate transporters are expressed in apple, albeit to different extents. Here, we discuss their possible roles based on the relationship between their levels of expression and sugar concentrations. The high accumulation of fructose in apple fruit is possibly linked to the coordination and cooperation between MdTMT1/2 and MdEDR6. By contrast, these fruits show low MdSWEET4.1 expression and a high flux of fructose produced from sorbitol. Our study provides an exhaustive survey of sugar transporter genes and demonstrates that sugar transporter gene families in M. domestica are comparable to those in other species. Expression profiling of these transporters will likely contribute to improving our understanding of their physiological functions in fruit formation and the development of sweetness properties. PMID:25414708
The Malus domestica sugar transporter gene family: identifications based on genome and expression profiling related to the accumulation of fruit sugars.

PubMed

Wei, Xiaoyu; Liu, Fengli; Chen, Cheng; Ma, Fengwang; Li, Mingjun

2014-01-01

In plants, sugar transporters are involved not only in long-distance transport, but also in sugar accumulations in sink cells. To identify members of sugar transporter gene families and to analyze their function in fruit sugar accumulation, we conducted a phylogenetic analysis of the Malus domestica genome. Expression profiling was performed with shoot tips, mature leaves, and developed fruit of "Gala" apple. Genes for sugar alcohol [including 17 sorbitol transporters (SOTs)], sucrose, and monosaccharide transporters, plus SWEET genes, were selected as candidates in 31, 9, 50, and 27 loci, respectively, of the genome. The monosaccharide transporter family appears to include five subfamilies (30 MdHTs, 8 MdEDR6s, 5 MdTMTs, 3 MdvGTs, and 4 MdpGLTs). Phylogenetic analysis of the protein sequences indicated that orthologs exist among Malus, Vitis, and Arabidopsis. Investigations of transcripts revealed that 68 candidate transporters are expressed in apple, albeit to different extents. Here, we discuss their possible roles based on the relationship between their levels of expression and sugar concentrations. The high accumulation of fructose in apple fruit is possibly linked to the coordination and cooperation between MdTMT1/2 and MdEDR6. By contrast, these fruits show low MdSWEET4.1 expression and a high flux of fructose produced from sorbitol. Our study provides an exhaustive survey of sugar transporter genes and demonstrates that sugar transporter gene families in M. domestica are comparable to those in other species. Expression profiling of these transporters will likely contribute to improving our understanding of their physiological functions in fruit formation and the development of sweetness properties.
Evaluation of somatic copy number estimation tools for whole-exome sequencing data.

PubMed

Nam, Jae-Yong; Kim, Nayoung K D; Kim, Sang Cheol; Joung, Je-Gun; Xi, Ruibin; Lee, Semin; Park, Peter J; Park, Woong-Yang

2016-03-01

Whole-exome sequencing (WES) has become a standard method for detecting genetic variants in human diseases. Although the primary use of WES data has been the identification of single nucleotide variations and indels, these data also offer a possibility of detecting copy number variations (CNVs) at high resolution. However, WES data have uneven read coverage along the genome owing to the target capture step, and the development of a robust WES-based CNV tool is challenging. Here, we evaluate six WES somatic CNV detection tools: ADTEx, CONTRA, Control-FREEC, EXCAVATOR, ExomeCNV and Varscan2. Using WES data from 50 kidney chromophobe, 50 bladder urothelial carcinoma, and 50 stomach adenocarcinoma patients from The Cancer Genome Atlas, we compared the CNV calls from the six tools with a reference CNV set that was identified by both single nucleotide polymorphism array 6.0 and whole-genome sequencing data. We found that these algorithms gave highly variable results: visual inspection reveals significant differences between the WES-based segmentation profiles and the reference profile, as well as among the WES-based profiles. Using a 50% overlap criterion, 13-77% of WES CNV calls were covered by CNVs from the reference set, up to 21% of the copy gains were called as losses or vice versa, and dramatic differences in CNV sizes and CNV numbers were observed. Overall, ADTEx and EXCAVATOR had the best performance with relatively high precision and sensitivity. We suggest that the current algorithms for somatic CNV detection from WES data are limited in their performance and that more robust algorithms are needed. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
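The 50% overlap criterion in the abstract above can be sketched as a simple interval computation. The abstract does not spell out the exact definition used (for example, whether overlap must be reciprocal or must match copy-number direction), so the version below assumes the simplest reading: a WES call counts as covered when at least half of its length lies inside reference CNV segments on the same chromosome. All function names are illustrative.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on a single chromosome


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals so bases are not double counted."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_fraction(call: Interval, reference: List[Interval]) -> float:
    """Fraction of the call's length that lies inside the reference segments."""
    start, end = call
    length = end - start
    if length <= 0:
        raise ValueError("call must have positive length")
    covered = 0
    for r_start, r_end in merge_intervals(reference):
        covered += max(0, min(end, r_end) - max(start, r_start))
    return covered / length


def is_covered(call: Interval, reference: List[Interval],
               min_fraction: float = 0.5) -> bool:
    """Apply the (assumed, one-directional) overlap criterion."""
    return covered_fraction(call, reference) >= min_fraction


if __name__ == "__main__":
    reference_cnvs = [(1_000, 5_000), (8_000, 9_000)]
    print(is_covered((4_000, 6_000), reference_cnvs))  # half of the call overlaps
    print(is_covered((5_500, 8_200), reference_cnvs))  # only a small part overlaps
```

Counting the proportion of calls for which `is_covered` returns true gives a precision-like figure comparable in spirit to the 13-77% range reported, subject to the assumptions stated above.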
Accurate Prediction of Inducible Transcription Factor Binding Intensities In Vivo

PubMed Central

Siepel, Adam; Lis, John T.

2012-01-01

DNA sequence and local chromatin landscape act jointly to determine transcription factor (TF) binding intensity profiles. To disentangle these influences, we developed an experimental approach, called protein/DNA binding followed by high-throughput sequencing (PB–seq), that allows the binding energy landscape to be characterized genome-wide in the absence of chromatin. We applied our methods to the Drosophila Heat Shock Factor (HSF), which inducibly binds a target DNA sequence element (HSE) following heat shock stress. PB–seq involves incubating sheared naked genomic DNA with recombinant HSF, partitioning the HSF–bound and HSF–free DNA, and then detecting HSF–bound DNA by high-throughput sequencing. We compared PB–seq binding profiles with ones observed in vivo by ChIP–seq and developed statistical models to predict the observed departures from idealized binding patterns based on covariates describing the local chromatin environment. We found that DNase I hypersensitivity and tetra-acetylation of H4 were the most influential covariates in predicting changes in HSF binding affinity. We also investigated the extent to which DNA accessibility, as measured by digital DNase I footprinting data, could be predicted from MNase–seq data and the ChIP–chip profiles for many histone modifications and TFs, and found GAGA element associated factor (GAF), tetra-acetylation of H4, and H4K16 acetylation to be the most predictive covariates. Lastly, we generated an unbiased model of HSF binding sequences, which revealed distinct biophysical properties of the HSF/HSE interaction and a previously unrecognized substructure within the HSE. These findings provide new insights into the interplay between the genomic sequence and the chromatin landscape in determining transcription factor binding intensity. PMID:22479205
Whole-genome sequence, SNP chips and pedigree structure: building demographic profiles in domestic dog breeds to optimize genetic-trait mapping.

PubMed

Dreger, Dayna L; Rimbault, Maud; Davis, Brian W; Bhatnagar, Adrienne; Parker, Heidi G; Ostrander, Elaine A

2016-12-01

In the decade following publication of the draft genome sequence of the domestic dog, extraordinary advances with application to several fields have been credited to the canine genetic system. Taking advantage of closed breeding populations and the subsequent selection for aesthetic and behavioral characteristics, researchers have leveraged the dog as an effective natural model for the study of complex traits, such as disease susceptibility, behavior and morphology, generating unique contributions to human health and biology. When designing genetic studies using purebred dogs, it is essential to consider the unique demography of each population, including estimation of effective population size and timing of population bottlenecks. The analytical design approach for genome-wide association studies (GWAS) and analysis of whole-genome sequence (WGS) experiments are inextricable from demographic data. We have performed a comprehensive study of genomic homozygosity, using high-depth WGS data for 90 individuals, and Illumina HD SNP data from 800 individuals representing 80 breeds. These data were coupled with extensive pedigree data analyses for 11 breeds that, together, allowed us to compute breed structure, demography, and molecular measures of genome diversity. Our comparative analyses characterize the extent, formation and implication of breed-specific diversity as it relates to population structure. These data demonstrate the relationship between breed-specific genome dynamics and population architecture, and provide important considerations influencing the technological and cohort design of association and other genomic studies. © 2016. Published by The Company of Biologists Ltd.
Whole-genome sequence, SNP chips and pedigree structure: building demographic profiles in domestic dog breeds to optimize genetic-trait mapping

PubMed Central

Dreger, Dayna L.; Rimbault, Maud; Davis, Brian W.; Bhatnagar, Adrienne; Parker, Heidi G.

2016-01-01

In the decade following publication of the draft genome sequence of the domestic dog, extraordinary advances with application to several fields have been credited to the canine genetic system. Taking advantage of closed breeding populations and the subsequent selection for aesthetic and behavioral characteristics, researchers have leveraged the dog as an effective natural model for the study of complex traits, such as disease susceptibility, behavior and morphology, generating unique contributions to human health and biology. When designing genetic studies using purebred dogs, it is essential to consider the unique demography of each population, including estimation of effective population size and timing of population bottlenecks. The analytical design approach for genome-wide association studies (GWAS) and analysis of whole-genome sequence (WGS) experiments are inextricable from demographic data. We have performed a comprehensive study of genomic homozygosity, using high-depth WGS data for 90 individuals, and Illumina HD SNP data from 800 individuals representing 80 breeds. These data were coupled with extensive pedigree data analyses for 11 breeds that, together, allowed us to compute breed structure, demography, and molecular measures of genome diversity. Our comparative analyses characterize the extent, formation and implication of breed-specific diversity as it relates to population structure. These data demonstrate the relationship between breed-specific genome dynamics and population architecture, and provide important considerations influencing the technological and cohort design of association and other genomic studies. PMID:27874836
Comparative transcriptome profiling analyses during the lag phase uncover YAP1, PDR1, PDR3, RPN4, and HSF1 as key regulatory genes in genomic adaptation to the lignocellulose derived inhibitor HMF for Saccharomyces cerevisiae

USDA-ARS's Scientific Manuscript database

The yeast Saccharomyces cerevisiae is able to adapt and in situ detoxify lignocellulose derived inhibitors such as furfural and hydroxymethylfurfural (HMF). The length of lag phase for cell growth in response to the inhibitor challenge has been used to measure tolerance of strain performance. Mechan...
Complete Genome Sequence and Comparative Metabolic Profiling of the Prototypical Enteroaggregative Escherichia coli Strain 042

PubMed Central

Chaudhuri, Roy R.; Sebaihia, Mohammed; Hobman, Jon L.; Webber, Mark A.; Leyton, Denisse L.; Goldberg, Martin D.; Cunningham, Adam F.; Scott-Tucker, Anthony; Ferguson, Paul R.; Thomas, Christopher M.; Frankel, Gad; Tang, Christoph M.; Dudley, Edward G.; Roberts, Ian S.; Rasko, David A.; Pallen, Mark J.; Parkhill, Julian; Nataro, James P.; Thomson, Nicholas R.; Henderson, Ian R.

2010-01-01

Background Escherichia coli can experience a multifaceted life, in some cases acting as a commensal while in other cases causing intestinal and/or extraintestinal disease. Several studies suggest enteroaggregative E. coli are the predominant cause of E. coli-mediated diarrhea in the developed world and are second only to Campylobacter sp. as a cause of bacterial-mediated diarrhea. Furthermore, enteroaggregative E. coli are a predominant cause of persistent diarrhea in the developing world where infection has been associated with malnourishment and growth retardation. Methods In this study we determined the complete genomic sequence of E. coli 042, the prototypical member of the enteroaggregative E. coli, which has been shown to cause disease in volunteer studies. We performed genomic and phylogenetic comparisons with other E. coli strains revealing previously uncharacterised virulence factors including a variety of secreted proteins and a capsular polysaccharide biosynthetic locus. In addition, by using Biolog™ Phenotype Microarrays we have provided a full metabolic profiling of E. coli 042 and the non-pathogenic lab strain E. coli K-12. We have highlighted the genetic basis for many of the metabolic differences between E. coli 042 and E. coli K-12. Conclusion This study provides a genetic context for the vast amount of experimental and epidemiological data published thus far and provides a template for future diagnostic and intervention strategies. PMID:20098708

OrthoDB v8: update of the hierarchical catalog of orthologs and the underlying free software.

PubMed

Kriventseva, Evgenia V; Tegenfeldt, Fredrik; Petty, Tom J; Waterhouse, Robert M; Simão, Felipe A; Pozdnyakov, Igor A; Ioannidis, Panagiotis; Zdobnov, Evgeny M

2015-01-01

Orthology, refining the concept of homology, is the cornerstone of evolutionary comparative studies. With the ever-increasing availability of genomic data, inference of orthology has become instrumental for generating hypotheses about gene functions crucial to many studies. This update of the OrthoDB hierarchical catalog of orthologs (http://www.orthodb.org) covers 3027 complete genomes, including the most comprehensive set of 87 arthropods, 61 vertebrates, 227 fungi and 2627 bacteria (sampling the most complete and representative genomes from over 11,000 available). In addition to the most extensive integration of functional annotations from UniProt, InterPro, GO, OMIM, model organism phenotypes and COG functional categories, OrthoDB uniquely provides evolutionary annotations including rates of ortholog sequence divergence, copy-number profiles, sibling groups and gene architectures. We re-designed the entirety of the OrthoDB website from the underlying technology to the user interface, enabling the user to specify species of interest and to select the relevant orthology level by the NCBI taxonomy. The text searches allow use of complex logic with various identifiers of genes, proteins, domains, ontologies or annotation keywords and phrases. Gene copy-number profiles can also be queried. This release comes with the freely available underlying ortholog clustering pipeline (http://www.orthodb.org/software). © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Systematic evaluation of bias in microbial community profiles induced by whole genome amplification.

PubMed

Direito, Susana O L; Zaura, Egija; Little, Miranda; Ehrenfreund, Pascale; Röling, Wilfred F M

2014-03-01

Whole genome amplification methods facilitate the detection and characterization of microbial communities in low biomass environments. We examined the extent to which the actual community structure is reliably revealed and factors contributing to bias. One widely used [multiple displacement amplification (MDA)] and one new primer-free method [primase-based whole genome amplification (pWGA)] were compared using a polymerase chain reaction (PCR)-based method as control. Pyrosequencing of an environmental sample and principal component analysis revealed that MDA impacted community profiles more strongly than pWGA and indicated that this related to species GC content, although an influence of DNA integrity could not be excluded. Subsequently, biases by species GC content, DNA integrity and fragment size were separately analysed using defined mixtures of DNA from various species. We found significantly less amplification of species with the highest GC content for MDA-based templates and, to a lesser extent, for pWGA. DNA fragmentation also interfered severely: species with more fragmented DNA were less amplified with MDA and pWGA. pWGA was unable to amplify low molecular weight DNA (< 1.5 kb), whereas MDA was inefficient. We conclude that pWGA is the most promising method for characterization of microbial communities in low-biomass environments and for currently planned astrobiological missions to Mars. © 2013 Society for Applied Microbiology and John Wiley & Sons Ltd.
Tissue-Specific Transcriptomic Profiling of Sorghum propinquum using a Rice Genome Array

PubMed Central

Zhang, Ting; Zhao, Xiuqin; Huang, Liyu; Liu, Xiaoyue; Zong, Ying; Zhu, Linghua; Yang, Daichang; Fu, Binying

2013-01-01

Sorghum (Sorghum bicolor) is one of the world's most important cereal crops. S. propinquum is a perennial wild relative of S. bicolor with well-developed rhizomes. Functional genomics analysis of S. propinquum, especially with respect to molecular mechanisms related to rhizome growth and development, can contribute to the development of more sustainable grain, forage, and bioenergy cropping systems. In this study, we used a whole rice genome oligonucleotide microarray to obtain tissue-specific gene expression profiles of S. propinquum with special emphasis on rhizome development. A total of 548 tissue-enriched genes were detected, including 31 and 114 unique genes that were expressed predominantly in the rhizome tips (RT) and internodes (RI), respectively. Further GO analysis indicated that the functions of these tissue-enriched genes corresponded to their characteristic biological processes. A few distinct cis-elements, including ABA-responsive RY repeat CATGCA, sugar-repressive TTATCC, and GA-responsive TAACAA, were found to be prevalent in RT-enriched genes, implying an important role in rhizome growth and development. Comprehensive comparative analysis of these rhizome-enriched genes and rhizome-specific genes previously identified in Oryza longistaminata and S. propinquum indicated that phytohormones, including ABA, GA, and SA, are key regulators of gene expression during rhizome development. Co-localization of rhizome-enriched genes with rhizome-related QTLs in rice and sorghum generated functional candidates for future cloning of genes associated with rhizome growth and development. PMID:23536906
SOX2 and p63 colocalize at genetic loci in squamous cell carcinomas

PubMed Central

Watanabe, Hideo; Ma, Qiuping; Peng, Shouyong; Adelmant, Guillaume; Swain, Danielle; Song, Wenyu; Fox, Cameron; Francis, Joshua M.; Pedamallu, Chandra Sekhar; DeLuca, David S.; Brooks, Angela N.; Wang, Su; Que, Jianwen; Rustgi, Anil K.; Wong, Kwok-kin; Ligon, Keith L.; Liu, X. Shirley; Marto, Jarrod A.; Meyerson, Matthew; Bass, Adam J.

2014-01-01

The transcription factor SOX2 is an essential regulator of pluripotent stem cells and promotes development and maintenance of squamous epithelia. We previously reported that SOX2 is an oncogene and subject to highly recurrent genomic amplification in squamous cell carcinomas (SCCs). Here, we have further characterized the function of SOX2 in SCC. Using ChIP-seq analysis, we compared SOX2-regulated gene profiles in multiple SCC cell lines to ES cell profiles and determined that SOX2 binds to distinct genomic loci in SCCs. In SCCs, SOX2 preferentially interacts with the transcription factor p63, as opposed to the transcription factor OCT4, which is the preferred SOX2 binding partner in ES cells. SOX2 and p63 exhibited overlapping genomic occupancy at a large number of loci in SCCs; however, coordinate binding of SOX2 and p63 was absent in ES cells. We further demonstrated that SOX2 and p63 jointly regulate gene expression, including the oncogene ETV4, which was essential for SOX2-amplified SCC cell survival. Together, these findings demonstrate that the action of SOX2 in SCC differs substantially from its role in pluripotency. The identification of the SCC-associated interaction between SOX2 and p63 will enable deeper characterization of the downstream targets of this interaction in SCC and normal squamous epithelial physiology. PMID:24590290
SOX2 and p63 colocalize at genetic loci in squamous cell carcinomas.

PubMed

Watanabe, Hideo; Ma, Qiuping; Peng, Shouyong; Adelmant, Guillaume; Swain, Danielle; Song, Wenyu; Fox, Cameron; Francis, Joshua M; Pedamallu, Chandra Sekhar; DeLuca, David S; Brooks, Angela N; Wang, Su; Que, Jianwen; Rustgi, Anil K; Wong, Kwok-kin; Ligon, Keith L; Liu, X Shirley; Marto, Jarrod A; Meyerson, Matthew; Bass, Adam J

2014-04-01

The transcription factor SOX2 is an essential regulator of pluripotent stem cells and promotes development and maintenance of squamous epithelia. We previously reported that SOX2 is an oncogene and subject to highly recurrent genomic amplification in squamous cell carcinomas (SCCs). Here, we have further characterized the function of SOX2 in SCC. Using ChIP-seq analysis, we compared SOX2-regulated gene profiles in multiple SCC cell lines to ES cell profiles and determined that SOX2 binds to distinct genomic loci in SCCs. In SCCs, SOX2 preferentially interacts with the transcription factor p63, as opposed to the transcription factor OCT4, which is the preferred SOX2 binding partner in ES cells. SOX2 and p63 exhibited overlapping genomic occupancy at a large number of loci in SCCs; however, coordinate binding of SOX2 and p63 was absent in ES cells. We further demonstrated that SOX2 and p63 jointly regulate gene expression, including the oncogene ETV4, which was essential for SOX2-amplified SCC cell survival. Together, these findings demonstrate that the action of SOX2 in SCC differs substantially from its role in pluripotency. The identification of the SCC-associated interaction between SOX2 and p63 will enable deeper characterization of the downstream targets of this interaction in SCC and normal squamous epithelial physiology.
RNASeq-based genome annotation and identification of long-noncoding RNAs in the grapevine cultivar 'Riesling'

USDA-ARS's Scientific Manuscript database

The technological advances of RNA-seq and de novo transcriptome assembly have enabled genome annotation and transcriptome profiling in heterozygous species. This is a promising approach to improving the annotation of the reference genome sequence of grapevine (Vitis vinifera L.), a species of high-l...
Genomic stability and physiological assessments of live offspring sired by a bull clone, Starbuck II.

PubMed

Ortegon, H; Betts, D H; Lin, L; Coppola, G; Perrault, S D; Blondin, P; King, W A

2007-01-01

It appears that overt phenotypic abnormalities observed in some domestic animal clones are not transmitted to their progeny. The current study monitored Holstein heifers sired by a bull clone, Starbuck II, from weaning to puberty. Genomic stability was assessed by telomere length status and chromosomal analysis. Growth parameters, blood profiles, physical exams and reproductive parameters were assessed for 12 months (and compared to age-matched control heifers). Progeny sired by the clone bull did not differ (P>0.05) in weight, length and height compared to controls. However, progeny had lower heart rates (HR) (P=0.009), respiratory rates (RR) (P=0.007) and body temperature (P=0.03). Hematological profiles were within normal ranges and did not differ (P>0.05) between the two groups. External and internal genitalia were normal and both groups reached puberty at expected ages. Progeny had two or three ovarian follicular waves per estrous cycle and serum progesterone concentrations were similar (P=0.99) to controls. Telomere lengths of sperm and blood cells from Starbuck II were not different (P>0.05) from those of non-cloned cattle; telomere lengths of progeny were not different (P>0.05) from age-matched controls. In addition, progeny had normal karyotypes in peripheral blood leukocytes compared to controls (89.1% versus 86.3% diploid, respectively). In summary, heifers sired by a bull clone had normal chromosomal stability, growth, physical, hematological and reproductive parameters, compared to normal heifers. Furthermore, they had moderate stress responses to routine handling and restraint.
Predicting Protein Function by Genomic Context: Quantitative Evaluation and Qualitative Inferences

PubMed Central

Huynen, Martijn; Snel, Berend; Lathe, Warren; Bork, Peer

2000-01-01

Various new methods have been proposed to predict functional interactions between proteins based on the genomic context of their genes. The types of genomic context that they use are Type I: the fusion of genes; Type II: the conservation of gene-order or co-occurrence of genes in potential operons; and Type III: the co-occurrence of genes across genomes (phylogenetic profiles). Here we compare these types for their coverage, their correlations with various types of functional interaction, and their overlap with homology-based function assignment. We apply the methods to Mycoplasma genitalium, the standard benchmarking genome in computational and experimental genomics. Quantitatively, conservation of gene order is the technique with the highest coverage, applying to 37% of the genes. By combining gene order conservation with gene fusion (6%), the co-occurrence of genes in operons in absence of gene order conservation (8%), and the co-occurrence of genes across genomes (11%), significant context information can be obtained for 50% of the genes (the categories overlap). Qualitatively, we observe that the functional interactions between genes are stronger as the requirements for physical neighborhood on the genome are more stringent, while the fraction of potential false positives decreases. Moreover, only in cases in which gene order is conserved in a substantial fraction of the genomes, in this case six out of twenty-five, does a single type of functional interaction (physical interaction) clearly dominate (>80%). In other cases, complementary function information from homology searches, which is available for most of the genes with significant genomic context, is essential to predict the type of interaction. Using a combination of genomic context and homology searches, new functional features can be predicted for 10% of M. genitalium genes. PMID:10958638
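The Type III context described here (co-occurrence of genes across genomes, i.e. phylogenetic profiles) can be sketched in a few lines. The example below represents each gene as a presence/absence vector over a fixed list of genomes and scores two genes by the Jaccard similarity of their profiles. Jaccard is a swapped-in, commonly used set-similarity measure chosen for illustration; the abstract does not state which score the authors used, and the function name and input layout are hypothetical.

```python
def profile_similarity(profile_a, profile_b):
    """Jaccard similarity of two presence/absence profiles.

    Each profile is a sequence of 0/1 values, one entry per genome, in the
    same genome order (an assumed representation for illustration).
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomes")

    # Genomes in which both genes are present.
    both = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 and b == 1)
    # Genomes in which at least one of the two genes is present.
    either = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 or b == 1)
    return both / either if either else 0.0


# Two hypothetical genes scored across four hypothetical genomes.
gene_x = (1, 1, 0, 1)
gene_y = (1, 1, 0, 0)
score = profile_similarity(gene_x, gene_y)
```

Genes with highly similar profiles are candidates for a functional link; in practice a significance threshold would have to be calibrated against the number of genomes compared.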
Whole genome sequencing as a tool to investigate a cluster of seven cases of listeriosis in Austria and Germany, 2011-2013.

PubMed

Schmid, D; Allerberger, F; Huhulescu, S; Pietzka, A; Amar, C; Kleta, S; Prager, R; Preußel, K; Aichinger, E; Mellmann, A

2014-05-01

A cluster of seven human cases of listeriosis occurred in Austria and in Germany between April 2011 and July 2013. The Listeria monocytogenes serovar (SV) 1/2b isolates shared pulsed-field gel electrophoresis (PFGE) and fluorescent amplified fragment length polymorphism (fAFLP) patterns indistinguishable from those from five food producers. The seven human isolates, a control strain with a different PFGE/fAFLP profile and ten food isolates were subjected to whole genome sequencing (WGS) in a blinded fashion. A gene-by-gene comparison (multilocus sequence typing (MLST)+) was performed, and the resulting whole genome allelic profiles were compared using SeqSphere(+) software version 1.0. On analysis of 2298 genes, the four human outbreak isolates from 2012 to 2013 had different alleles at ≤6 genes, i.e. differed by ≤6 genes from each other; the dendrogram placed these isolates in between five Austrian unaged soft cheese isolates from producer A (≤19-gene difference from the human cluster) and two Austrian ready-to-eat meat isolates from producer B (≤8-gene difference from the human cluster). Both food products appeared on grocery bills prospectively collected by these outbreak cases after hospital discharge. Epidemiological results on food consumption and MLST+ clearly separated the three cases in 2011 from the four 2012-2013 outbreak cases (≥48 different genes). We showed that WGS is capable of discriminating L. monocytogenes SV1/2b clones not distinguishable by PFGE and fAFLP. The listeriosis outbreak described clearly underlines the potential of sequence-based typing methods to offer enhanced resolution and comparability of typing systems for public health applications. © 2014 The Authors Clinical Microbiology and Infection © 2014 European Society of Clinical Microbiology and Infectious Diseases.
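The gene-by-gene comparison of allelic profiles described in this abstract amounts to counting the loci at which two isolates carry different alleles. The sketch below shows that counting step under stated assumptions: each profile is a dictionary mapping a locus name to an allele number, and loci that are missing or uncalled (None) in either isolate are skipped. The data layout and function name are hypothetical, and this is not the SeqSphere+ algorithm.

```python
def allele_differences(profile_a, profile_b):
    """Count loci where two isolates carry different alleles.

    Profiles map locus name -> allele number; None marks an uncalled locus.
    Only loci called in both isolates are compared (an assumed convention).
    """
    shared_loci = profile_a.keys() & profile_b.keys()
    return sum(
        1
        for locus in shared_loci
        if profile_a[locus] is not None
        and profile_b[locus] is not None
        and profile_a[locus] != profile_b[locus]
    )


# Hypothetical three-locus profiles for two isolates.
isolate_1 = {"locus_1": 4, "locus_2": 7, "locus_3": 2}
isolate_2 = {"locus_1": 4, "locus_2": 9, "locus_3": None}
distance = allele_differences(isolate_1, isolate_2)
```

Pairwise distances of this kind are what a dendrogram such as the one mentioned in the abstract would be built from; how missing loci are handled materially affects the counts, so the skipping rule above should be treated as one option among several.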
Genome-Wide Transcriptional Profiling of the Purple Sulfur Bacterium Allochromatium vinosum DSM 180T during Growth on Different Reduced Sulfur Compounds

PubMed Central

Weissgerber, Thomas; Dobler, Nadine; Polen, Tino; Latus, Jeanette; Stockdreher, Yvonne

2013-01-01

The purple sulfur bacterium Allochromatium vinosum DSM 180T is one of the best-studied sulfur-oxidizing anoxygenic phototrophic bacteria, and it has been developed into a model organism for laboratory-based studies of oxidative sulfur metabolism. Here, we took advantage of the organism's high metabolic versatility and performed whole-genome transcriptional profiling to investigate the response of A. vinosum cells upon exposure to sulfide, thiosulfate, elemental sulfur, or sulfite compared to photoorganoheterotrophic growth on malate. Differential expression of 1,178 genes was observed, corresponding to 30% of the A. vinosum genome. Relative transcription of 551 genes increased significantly during growth on one of the different sulfur sources, while the relative transcript abundance of 627 genes decreased. A significant number of genes that revealed strongly enhanced relative transcription levels have documented sulfur metabolism-related functions. Among these are the dsr genes, including dsrAB for dissimilatory sulfite reductase, and the sgp genes for the proteins of the sulfur globule envelope, thus confirming former results. In addition, we identified new genes encoding proteins with appropriate subcellular localization and properties to participate in oxidative dissimilatory sulfur metabolism. Those four genes for hypothetical proteins that exhibited the strongest increases of mRNA levels on sulfide and elemental sulfur, respectively, were chosen for inactivation and phenotypic analyses of the respective mutant strains. This approach verified the importance of the encoded proteins for sulfur globule formation during the oxidation of sulfide and thiosulfate and thereby also documented the suitability of comparative transcriptomics for the identification of new sulfur-related genes in anoxygenic phototrophic sulfur bacteria. PMID:23873913
Clinical implementation of integrated whole-genome copy number and mutation profiling for glioblastoma

PubMed Central

Ramkissoon, Shakti H.; Bi, Wenya Linda; Schumacher, Steven E.; Ramkissoon, Lori A.; Haidar, Sam; Knoff, David; Dubuc, Adrian; Brown, Loreal; Burns, Margot; Cryan, Jane B.; Abedalthagafi, Malak; Kang, Yun Jee; Schultz, Nikolaus; Reardon, David A.; Lee, Eudocia Q.; Rinne, Mikael L.; Norden, Andrew D.; Nayak, Lakshmi; Ruland, Sandra; Doherty, Lisa M.; LaFrankie, Debra C.; Horvath, Margaret; Aizer, Ayal A.; Russo, Andrea; Arvold, Nils D.; Claus, Elizabeth B.; Al-Mefty, Ossama; Johnson, Mark D.; Golby, Alexandra J.; Dunn, Ian F.; Chiocca, E. Antonio; Trippa, Lorenzo; Santagata, Sandro; Folkerth, Rebecca D.; Kantoff, Philip; Rollins, Barrett J.; Lindeman, Neal I.; Wen, Patrick Y.; Ligon, Azra H.; Beroukhim, Rameen; Alexander, Brian M.; Ligon, Keith L.

2015-01-01

Background Multidimensional genotyping of formalin-fixed paraffin-embedded (FFPE) samples has the potential to improve diagnostics and clinical trials for brain tumors, but prospective use in the clinical setting is not yet routine. We report our experience with implementing a multiplexed copy number and mutation-testing program in a diagnostic laboratory certified by the Clinical Laboratory Improvement Amendments. Methods We collected and analyzed clinical testing results from whole-genome array comparative genomic hybridization (OncoCopy) of 420 brain tumors, including 148 glioblastomas. Mass spectrometry–based mutation genotyping (OncoMap, 471 mutations) was performed on 86 glioblastomas. Results OncoCopy was successful in 99% of samples for which sufficient DNA was obtained (n = 415). All clinically relevant loci for glioblastomas were detected, including amplifications (EGFR, PDGFRA, MET) and deletions (EGFRvIII, PTEN, 1p/19q). Glioblastoma patients ≤40 years old had distinct profiles compared with patients >40 years. OncoMap testing reliably identified mutations in IDH1, TP53, and PTEN. Seventy-seven glioblastoma patients enrolled on trials, of whom 51% participated in targeted therapeutic trials where multiplex data informed eligibility or outcomes. Data integration identified patients with complete tumor suppressor inactivation, albeit rarely (5% of patients) due to lack of whole-gene coverage in OncoMap. Conclusions Combined use of multiplexed copy number and mutation detection from FFPE samples in the clinical setting can efficiently replace singleton tests for clinical diagnosis and prognosis in most settings. Our results support incorporation of these assays into clinical trials as integral biomarkers and their potential to impact interpretation of results. Limited tumor suppressor variant capture by targeted genotyping highlights the need for whole-gene sequencing in glioblastoma. PMID:25754088
Transposon identification using profile HMMs

PubMed Central

2010-01-01

Background Transposons are "jumping genes" that account for large quantities of repetitive content in genomes. They are known to affect transcriptional regulation in several different ways, and are implicated in many human diseases. Transposons are related to microRNAs and viruses, and many genes, pseudogenes, and gene promoters are derived from transposons or have origins in transposon-induced duplication. Modeling transposon-derived genomic content is difficult because they are poorly conserved. Profile hidden Markov models (profile HMMs), widely used for protein sequence family modeling, are rarely used for modeling DNA sequence families. The algorithm commonly used to estimate the parameters of profile HMMs, Baum-Welch, is prone to prematurely converge to local optima. The DNA domain is especially problematic for the Baum-Welch algorithm, since it has only four letters as opposed to the twenty residues of the amino acid alphabet. Results We demonstrate with a simulation study and with an application to modeling the MIR family of transposons that two recently introduced methods, Conditional Baum-Welch and Dynamic Model Surgery, achieve better estimates of the parameters of profile HMMs across a range of conditions. Conclusions We argue that these new algorithms expand the range of potential applications of profile HMMs to many important DNA sequence family modeling problems, including that of searching for and modeling the virus-like transposons that are found in all known genomes. PMID:20158867
Integrated analysis of chromosome copy number variation and gene expression in cervical carcinoma

PubMed Central

Yan, Deng; Yi, Song; Chiu, Wang Chi; Qin, Liu Gui; Kin, Wong Hoi; Kwok Hung, Chung Tony; Linxiao, Han; Wai, Choy Kwong; Yi, Sui; Tao, Yang; Tao, Tang

2017-01-01

Objective This study was conducted to explore chromosomal copy number variations (CNV) and transcript expression and to examine pathways in cervical pathogenesis using genome-wide high resolution microarrays. Methods Genome-wide chromosomal CNVs were investigated in 6 cervical cancer cell lines by Human Genome CGH Microarray Kit (4x44K). Gene expression profiles in cervical cancer cell lines, primary cervical carcinoma and normal cervical epithelium tissues were also studied using the Whole Human Genome Microarray Kit (4x44K). Results Fifty common chromosomal CNVs were identified in the cervical cancer cell lines. Correlation analysis revealed that gene up-regulation or down-regulation is significantly correlated with genomic amplification (P=0.009) or deletion (P=0.006) events. Expression profiles were identified through cluster analysis. Gene annotation analysis indicated that cell cycle pathways were significantly (P=1.15E-08) affected in cervical cancer. Common CNVs were associated with cervical cancer. Conclusion Chromosomal CNVs may contribute to their transcript expression in cervical cancer. PMID:29312578
BLISS is a versatile and quantitative method for genome-wide profiling of DNA double-strand breaks.

PubMed

Yan, Winston X; Mirzazadeh, Reza; Garnerone, Silvano; Scott, David; Schneider, Martin W; Kallas, Tomasz; Custodio, Joaquin; Wernersson, Erik; Li, Yinqing; Gao, Linyi; Federova, Yana; Zetsche, Bernd; Zhang, Feng; Bienko, Magda; Crosetto, Nicola

2017-05-12

Precisely measuring the location and frequency of DNA double-strand breaks (DSBs) along the genome is instrumental to understanding genomic fragility, but current methods are limited in versatility, sensitivity or practicality. Here we present Breaks Labeling In Situ and Sequencing (BLISS), featuring the following: (1) direct labelling of DSBs in fixed cells or tissue sections on a solid surface; (2) low-input requirement by linear amplification of tagged DSBs by in vitro transcription; (3) quantification of DSBs through unique molecular identifiers; and (4) easy scalability and multiplexing. We apply BLISS to profile endogenous and exogenous DSBs in low-input samples of cancer cells, embryonic stem cells and liver tissue. We demonstrate the sensitivity of BLISS by assessing the genome-wide off-target activity of two CRISPR-associated RNA-guided endonucleases, Cas9 and Cpf1, observing that Cpf1 has higher specificity than Cas9. Our results establish BLISS as a versatile, sensitive and efficient method for genome-wide DSB mapping in many applications.
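Feature (3) of the method, quantification of DSBs through unique molecular identifiers (UMIs), can be illustrated with a minimal counting sketch. The idea shown is that reads sharing both a genomic position and a UMI are treated as amplification copies of one labelled break, so the number of distinct UMIs at a position estimates the number of break events there. The read layout (chromosome, position, UMI) and the function name are assumptions, and the sketch ignores sequencing errors within UMIs, which real pipelines typically have to tolerate; it is not the authors' pipeline.

```python
from collections import defaultdict


def count_dsb_events(reads):
    """Estimate break events per site as the number of distinct UMIs.

    `reads` is an iterable of (chrom, position, umi) tuples; this layout is
    a hypothetical simplification of aligned sequencing reads.
    """
    umis_per_site = defaultdict(set)
    for chrom, position, umi in reads:
        # Duplicate (site, UMI) pairs collapse into one entry in the set.
        umis_per_site[(chrom, position)].add(umi)
    return {site: len(umis) for site, umis in umis_per_site.items()}


# Four hypothetical reads: two are copies of the same tagged break.
reads = [
    ("chr1", 1000, "ACGTACGT"),
    ("chr1", 1000, "ACGTACGT"),
    ("chr1", 1000, "TTGCAAGC"),
    ("chr2", 5000, "ACGTACGT"),
]
events = count_dsb_events(reads)
```

Counting distinct UMIs rather than raw reads is what keeps the estimate from being inflated by the amplification step.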
Integrated analysis of chromosome copy number variation and gene expression in cervical carcinoma.

PubMed

Yan, Deng; Yi, Song; Chiu, Wang Chi; Qin, Liu Gui; Kin, Wong Hoi; Kwok Hung, Chung Tony; Linxiao, Han; Wai, Choy Kwong; Yi, Sui; Tao, Yang; Tao, Tang

2017-12-12

This study was conducted to explore chromosomal copy number variations (CNV) and transcript expression and to examine pathways in cervical pathogenesis using genome-wide high resolution microarrays. Genome-wide chromosomal CNVs were investigated in 6 cervical cancer cell lines by Human Genome CGH Microarray Kit (4x44K). Gene expression profiles in cervical cancer cell lines, primary cervical carcinoma and normal cervical epithelium tissues were also studied using the Whole Human Genome Microarray Kit (4x44K). Fifty common chromosomal CNVs were identified in the cervical cancer cell lines. Correlation analysis revealed that gene up-regulation or down-regulation is significantly correlated with genomic amplification (P=0.009) or deletion (P=0.006) events. Expression profiles were identified through cluster analysis. Gene annotation analysis indicated that cell cycle pathways were significantly (P=1.15E-08) affected in cervical cancer. Common CNVs were associated with cervical cancer. Chromosomal CNVs may contribute to their transcript expression in cervical cancer.
Transforming the practice of medicine using genomics

PubMed Central

Ginsburg, Geoffrey S.; McCarthy, Jeanette J.

2009-01-01

Recent studies have demonstrated the use of genomic data, particularly gene expression signatures, as clinical prognostic factors in complex diseases. Such studies herald the future for genomic medicine and the opportunity for personalized prognosis in a variety of clinical contexts that utilize genome-scale molecular information. Several key areas represent logical and critical next steps in the use of complex genomic profiling data towards the goal of personalized medicine. First, analyses should be geared toward the development of molecular profiles that predict future events – such as major clinical events or the response, resistance, or adverse reaction to therapy. Second, these must move into actual clinical practice by forming the basis for the next generation of clinical trials that will employ these methodologies to stratify patients. Lastly, there remain formidable challenges in the translation of genomic technologies into clinical medicine that will need to be addressed: professional and public education, health outcomes research, reimbursement, regulatory oversight and privacy protection. PMID:22461094
In Silico Ionomics Segregates Parasitic from Free-Living Eukaryotes

PubMed Central

Greganova, Eva; Steinmann, Michael; Mäser, Pascal; Fankhauser, Niklaus

2013-01-01

Ion transporters are fundamental to life. Due to their ancient origin and conservation in sequence, ion transporters are also particularly well suited for comparative genomics of distantly related species. Here, we perform genome-wide ion transporter profiling as a basis for comparative genomics of eukaryotes. From a given predicted proteome, we identify all bona fide ion channels, ion porters, and ion pumps. Concentrating on unicellular eukaryotes (n = 37), we demonstrate that clustering of species according to their repertoire of ion transporters segregates obligate endoparasites (n = 23) on the one hand, from free-living species and facultative parasites (n = 14) on the other hand. This surprising finding indicates strong convergent evolution of the parasites regarding the acquisition and homeostasis of inorganic ions. Random forest classification identifies transporters of ammonia, plus transporters of iron and other transition metals, as the most informative for distinguishing the obligate parasites. Thus, in silico ionomics further underscores the importance of iron in infection biology and suggests access to host sources of nitrogen and transition metals to be selective forces in the evolution of parasitism. This finding is in agreement with the phenomenon of iron withholding as a primordial antimicrobial strategy of infected mammals. PMID:24048281
Building the genomic nation: ‘Homo Brasilis’ and the ‘Genoma Mexicano’ in comparative cultural perspective

PubMed Central

Kent, Michael; García-Deister, Vivette; López-Beltrán, Carlos; Santos, Ricardo Ventura; Schwartz-Marín, Ernesto; Wade, Peter

2015-01-01

This article explores the relationship between genetic research, nationalism and the construction of collective social identities in Latin America. It makes a comparative analysis of two research projects – the ‘Genoma Mexicano’ and the ‘Homo Brasilis’ – both of which sought to establish national and genetic profiles. Both have reproduced and strengthened the idea of their respective nations of focus, incorporating biological elements into debates on social identities. Also, both have placed the unifying figure of the mestizo/mestiço at the heart of national identity constructions, and in so doing have displaced alternative identity categories, such as those based on race. However, having been developed in different national contexts, these projects have had distinct scientific and social trajectories: in Mexico, the genomic mestizo is mobilized mainly in relation to health, while in Brazil the key arena is that of race. We show the importance of the nation as a frame for mobilizing genetic data in public policy debates, and demonstrate how race comes in and out of focus in different Latin American national contexts of genomic research, while never completely disappearing. PMID:27479999
In silico ionomics segregates parasitic from free-living eukaryotes.

PubMed

Greganova, Eva; Steinmann, Michael; Mäser, Pascal; Fankhauser, Niklaus

2013-01-01

Ion transporters are fundamental to life. Due to their ancient origin and conservation in sequence, ion transporters are also particularly well suited for comparative genomics of distantly related species. Here, we perform genome-wide ion transporter profiling as a basis for comparative genomics of eukaryotes. From a given predicted proteome, we identify all bona fide ion channels, ion porters, and ion pumps. Concentrating on unicellular eukaryotes (n = 37), we demonstrate that clustering of species according to their repertoire of ion transporters segregates obligate endoparasites (n = 23) on the one hand, from free-living species and facultative parasites (n = 14) on the other hand. This surprising finding indicates strong convergent evolution of the parasites regarding the acquisition and homeostasis of inorganic ions. Random forest classification identifies transporters of ammonia, plus transporters of iron and other transition metals, as the most informative for distinguishing the obligate parasites. Thus, in silico ionomics further underscores the importance of iron in infection biology and suggests access to host sources of nitrogen and transition metals to be selective forces in the evolution of parasitism. This finding is in agreement with the phenomenon of iron withholding as a primordial antimicrobial strategy of infected mammals.
Determination of the genotoxic effects of Convolvulus arvensis extracts on corn (Zea mays L.) seeds.

PubMed

Sunar, Serap; Yildirim, Nalan; Aksakal, Ozkan; Agar, Guleray

2013-06-01

In this research, methanolic extracts of Convolvulus arvensis were tested for genotoxic and inhibitory activity on the total soluble protein content and the genomic template stability of corn (Zea mays L.) seeds. The methanol extracts of leaf, stem and root of C. arvensis were diluted to 50, 75 and 100 μl concentrations and applied to corn seeds. The total soluble protein and genomic template stability results were compared with the control. The results showed that the 100 μl extracts of diluted leaf, stem and root in particular had a strong inhibitory activity on genomic template stability. The changes that occurred in random amplification of polymorphic DNA (RAPD) profiles after C. arvensis extract treatment included variation in band intensity, loss of bands and appearance of new bands compared with the control. The results also revealed that increasing the concentration of C. arvensis extract increased the total soluble protein content in maize. The results suggested that RAPD analysis and total protein analysis could be applied as a suitable biomarker assay for the detection of genotoxic effects of plant allelochemicals.

Reciprocal changes in gene expression profiles of cocultured breast epithelial cells and primary fibroblasts.

PubMed

Rozenchan, Patricia Bortman; Carraro, Dirce Maria; Brentani, Helena; de Carvalho Mota, Louise Danielle; Bastos, Elen Pereira; e Ferreira, Elisa Napolitano; Torres, Cesar H; Katayama, Maria Lúcia Hirata; Roela, Rosimeire Aparecida; Lyra, Eduardo C; Soares, Fernando Augusto; Folgueira, Maria Aparecida Azevedo Koike; Góes, João Carlos Guedes Sampaio; Brentani, Maria Mitzi

2009-12-15

The importance of epithelial-stroma interaction in normal breast development and tumor progression has been recognized. To identify genes that were regulated by these reciprocal interactions, we cocultured a nonmalignant (MCF10A) and a breast cancer-derived (MDA-MB231) basal cell line with fibroblasts isolated from breast benign-disease adjacent tissues (NAF) or with carcinoma-associated fibroblasts (CAF) in a transwell system. Gene expression profiles of each coculture pair were compared with the corresponding monocultures using a customized microarray. In contrast to the large alterations in the genomic profiles of epithelial cells, fibroblasts were less affected. In MDA-MB231, highly represented genes downregulated by CAF-derived factors coded for proteins important for the specificity of vectorial transport between the ER and Golgi, possibly affecting cell polarity, whereas the response of MCF10A comprised an induction of genes coding for stress-responsive proteins, representing a prosurvival effect. While NAF downregulated genes encoding proteins associated with glycolipid and fatty acid biosynthesis in MDA-MB231, potentially affecting membrane biogenesis, in MCF10A, genes critical for growth control and adhesion were altered. NAFs responded to coculture with MDA-MB231 by a decrease in the expression of genes induced by TGFbeta1 and associated with motility. However, there was little change in the NAF gene expression profile influenced by MCF10A. CAFs responded to the presence of both epithelial cell lines by inducing genes implicated in cell proliferation. Our data indicate that interactions between breast fibroblasts and basal epithelial cells resulted in alterations in the genomic profiles of both cell types, which may help to clarify some aspects of this heterotypic signaling. Copyright (c) 2009 UICC.
Genome and transcriptome adaptation accompanying emergence of the definitive type 2 host-restricted Salmonella enterica serovar Typhimurium pathovar.

PubMed

Kingsley, Robert A; Kay, Sally; Connor, Thomas; Barquist, Lars; Sait, Leanne; Holt, Kathryn E; Sivaraman, Karthi; Wileman, Thomas; Goulding, David; Clare, Simon; Hale, Christine; Seshasayee, Aswin; Harris, Simon; Thomson, Nicholas R; Gardner, Paul; Rabsch, Wolfgang; Wigley, Paul; Humphrey, Tom; Parkhill, Julian; Dougan, Gordon

2013-08-27

Salmonella enterica serovar Typhimurium definitive type 2 (DT2) is host restricted to Columba livia (rock or feral pigeon) but is also closely related to S. Typhimurium isolates that circulate in livestock and cause a zoonosis characterized by gastroenteritis in humans. DT2 isolates formed a distinct phylogenetic cluster within S. Typhimurium based on whole-genome-sequence polymorphisms. Comparative genome analysis of DT2 94-213 and S. Typhimurium SL1344, DT104, and D23580 identified few differences in gene content with the exception of variations within prophages. However, DT2 94-213 harbored 22 pseudogenes that were intact in other closely related S. Typhimurium strains. We report a novel in silico approach to identify single amino acid substitutions in proteins that have a high probability of a functional impact. One polymorphism identified using this method, a single-residue deletion in the Tar protein, abrogated chemotaxis to aspartate in vitro. DT2 94-213 also exhibited an altered transcriptional profile in response to culture at 42°C compared to that of SL1344. Such differentially regulated genes included a number involved in flagellum biosynthesis and motility. IMPORTANCE Whereas Salmonella enterica serovar Typhimurium can infect a wide range of animal species, some variants within this serovar exhibit a more limited host range and altered disease potential. Phylogenetic analysis based on whole-genome sequences can identify lineages associated with specific virulence traits, including host adaptation. This study represents one of the first to link pathogen-specific genetic signatures, including coding capacity, genome degradation, and transcriptional responses to host adaptation within a Salmonella serovar. We performed comparative genome analysis of reference and pigeon-adapted definitive type 2 (DT2) S. Typhimurium isolates alongside phenotypic and transcriptome analyses, to identify genetic signatures linked to host adaptation within the DT2 lineage.
Effective normalization for copy number variation detection from whole genome sequencing.

PubMed

Janevski, Angel; Varadan, Vinay; Kamalakaran, Sitharthan; Banerjee, Nilanjana; Dimitrova, Nevenka

2012-01-01

Whole genome sequencing enables a high resolution view of the human genome and provides unique insights into genome structure at an unprecedented scale. There have been a number of tools to infer copy number variation in the genome. These tools, while validated, also include a number of parameters that are configurable to genome data being analyzed. These algorithms allow for normalization to account for individual and population-specific effects on individual genome CNV estimates but the impact of these changes on the estimated CNVs is not well characterized. We evaluate in detail the effect of normalization methodologies in two CNV algorithms FREEC and CNV-seq using whole genome sequencing data from 8 individuals spanning four populations. We apply FREEC and CNV-seq to a sequencing data set consisting of 8 genomes. We use multiple configurations corresponding to different read-count normalization methodologies in FREEC, and statistically characterize the concordance of the CNV calls between FREEC configurations and the analogous output from CNV-seq. The normalization methodologies evaluated in FREEC are: GC content, mappability and control genome. We further stratify the concordance analysis within genic, non-genic, and a collection of validated variant regions. The GC content normalization methodology generates the highest number of altered copy number regions. Both mappability and control genome normalization reduce the total number and length of copy number regions. Mappability normalization yields Jaccard indices in the 0.07 - 0.3 range, whereas using a control genome normalization yields Jaccard index values around 0.4 with normalization based on GC content. The most critical impact of using mappability as a normalization factor is substantial reduction of deletion CNV calls. The output of another method based on control genome normalization, CNV-seq, resulted in comparable CNV call profiles, and substantial agreement in variable gene and CNV region calls. 
Choice of read-count normalization methodology has a substantial effect on CNV calls and the use of genomic mappability or an appropriately chosen control genome can optimize the output of CNV analysis.
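The concordance measure reported above can be sketched as a base-pair Jaccard index between two sets of CNV calls on one chromosome. Whether the study computed the index over base pairs or over whole regions is not stated here, so the base-pair definition, the half-open interval convention, and the function names below are assumptions for illustration only.

```python
def merge(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged

def total_length(intervals):
    """Sum of interval lengths; assumes the intervals do not overlap."""
    return sum(end - start for start, end in intervals)

def jaccard(calls_a, calls_b):
    """Jaccard index = covered bases in both call sets / covered bases in either."""
    a, b = merge(calls_a), merge(calls_b)
    union = total_length(merge(a + b))
    # Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
    intersection = total_length(a) + total_length(b) - union
    return intersection / union if union else 0.0

# Toy call sets (coordinates are made up, not data from the study).
calls_gc = [(100, 200), (300, 400)]
calls_mappability = [(150, 250)]
score = jaccard(calls_gc, calls_mappability)
```

A genome-wide version would apply the same calculation per chromosome and sum the intersection and union lengths before dividing.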
Characterization of the glutathione S-transferase gene family through ESTs and expression analyses within common and pigmented cultivars of Citrus sinensis (L.) Osbeck.

PubMed

Licciardello, Concetta; D'Agostino, Nunzio; Traini, Alessandra; Recupero, Giuseppe Reforgiato; Frusciante, Luigi; Chiusano, Maria Luisa

2014-02-03

Glutathione S-transferases (GSTs) represent a ubiquitous gene family encoding detoxification enzymes able to recognize reactive electrophilic xenobiotic molecules as well as compounds of endogenous origin. Anthocyanin pigments require GSTs for their transport into the vacuole since their cytoplasmic retention is toxic to the cell. Anthocyanin accumulation in Citrus sinensis (L.) Osbeck fruit flesh determines different phenotypes affecting the typical pigmentation of Sicilian blood oranges. In this paper we describe: i) the characterization of the GST gene family in C. sinensis through a systematic EST analysis; ii) the validation of the EST assembly by exploiting the genome sequences of C. sinensis and C. clementina and their genome annotations; iii) GST gene expression profiling in six tissues/organs and in two different sweet orange cultivars, Cadenera (common) and Moro (pigmented). We identified 61 GST transcripts, described the full- or partial-length nature of the sequences and assigned to each sequence the GST class membership exploiting a comparative approach and the classification scheme proposed for plant species. A total of 23 full-length sequences were defined. Fifty-four of the 61 transcripts were successfully aligned to the C. sinensis and C. clementina genomes. Tissue specific expression profiling demonstrated that the expression of some GST transcripts was 'tissue-affected' and cultivar specific. A comparative analysis of C. sinensis GSTs with those from other plant species was also considered. Data from the current analysis are accessible at http://biosrv.cab.unina.it/citrusGST/, with the aim to provide a reference resource for C. sinensis GSTs. This study aimed at the characterization of the GST gene family in C. sinensis. 
Based on expression patterns from two different cultivars and on sequence-comparative analyses, we also highlighted that two sequences, a Phi class GST and a Mapeg class GST, could be involved in the conjugation of anthocyanin pigments and in their transport into the vacuole, specifically in fruit flesh of the pigmented cultivar.
The Genomic and Transcriptomic Landscape of a HeLa Cell Line

PubMed Central

Landry, Jonathan J. M.; Pyl, Paul Theodor; Rausch, Tobias; Zichner, Thomas; Tekkedil, Manu M.; Stütz, Adrian M.; Jauch, Anna; Aiyar, Raeka S.; Pau, Gregoire; Delhomme, Nicolas; Gagneur, Julien; Korbel, Jan O.; Huber, Wolfgang; Steinmetz, Lars M.

2013-01-01

HeLa is the most widely used model cell line for studying human cellular and molecular biology. To date, no genomic reference for this cell line has been released, and experiments have relied on the human reference genome. Effective design and interpretation of molecular genetic studies performed using HeLa cells require accurate genomic information. Here we present a detailed genomic and transcriptomic characterization of a HeLa cell line. We performed DNA and RNA sequencing of a HeLa Kyoto cell line and analyzed its mutational portfolio and gene expression profile. Segmentation of the genome according to copy number revealed a remarkably high level of aneuploidy and numerous large structural variants at unprecedented resolution. Some of the extensive genomic rearrangements are indicative of catastrophic chromosome shattering, known as chromothripsis. Our analysis of the HeLa gene expression profile revealed that several pathways, including cell cycle and DNA repair, exhibit significantly different expression patterns from those in normal human tissues. Our results provide the first detailed account of genomic variants in the HeLa genome, yielding insight into their impact on gene expression and cellular function as well as their origins. This study underscores the importance of accounting for the strikingly aberrant characteristics of HeLa cells when designing and interpreting experiments, and has implications for the use of HeLa as a model of human biology. PMID:23550136
Ancient Duplications and Expression Divergence in the Globin Gene Superfamily of Vertebrates: Insights from the Elephant Shark Genome and Transcriptome

PubMed Central

Opazo, Juan C.; Toloza-Villalobos, Jessica; Burmester, Thorsten; Venkatesh, Byrappa; Storz, Jay F.

2015-01-01

Comparative analyses of vertebrate genomes continue to uncover a surprising diversity of genes in the globin gene superfamily, some of which have very restricted phyletic distributions despite their antiquity. Genomic analysis of the globin gene repertoire of cartilaginous fish (Chondrichthyes) should be especially informative about the duplicative origins and ancestral functions of vertebrate globins, as divergence between Chondrichthyes and bony vertebrates represents the most basal split within the jawed vertebrates. Here, we report a comparative genomic analysis of the vertebrate globin gene family that includes the complete globin gene repertoire of the elephant shark (Callorhinchus milii). Using genomic sequence data from representatives of all major vertebrate classes, integrated analyses of conserved synteny and phylogenetic relationships revealed that the last common ancestor of vertebrates possessed a repertoire of at least seven globin genes: single copies of androglobin and neuroglobin, four paralogous copies of globin X, and the single-copy progenitor of the entire set of vertebrate-specific globins. Combined with expression data, the genomic inventory of elephant shark globins yielded four especially surprising findings: 1) there is no trace of the neuroglobin gene (a highly conserved gene that is present in all other jawed vertebrates that have been examined to date), 2) myoglobin is highly expressed in heart, but not in skeletal muscle (reflecting a possible ancestral condition in vertebrates with single-circuit circulatory systems), 3) elephant shark possesses two highly divergent globin X paralogs, one of which is preferentially expressed in gonads, and 4) elephant shark possesses two structurally distinct α-globin paralogs, one of which is preferentially expressed in the brain. 
Expression profiles of elephant shark globin genes reveal distinct specializations of function relative to orthologs in bony vertebrates and suggest hypotheses about ancestral functions of vertebrate globins. PMID:25743544
Mining a database of single amplified genomes from Red Sea brine pool extremophiles—improving reliability of gene function prediction using a profile and pattern matching algorithm (PPMA)

PubMed Central

Grötzinger, Stefan W.; Alam, Intikhab; Ba Alawi, Wail; Bajic, Vladimir B.; Stingl, Ulrich; Eppinger, Jörg

2014-01-01

Reliable functional annotation of genomic data is the key step in the discovery of novel enzymes. Intrinsic sequencing data quality problems of single amplified genomes (SAGs) and the poor homology of novel extremophile genomes pose significant challenges for the attribution of functions to the coding sequences identified. The anoxic deep-sea brine pools of the Red Sea are a promising source of novel enzymes with unique evolutionary adaptation. Sequencing data from Red Sea brine pool cultures and SAGs are annotated and stored in the Integrated Data Warehouse of Microbial Genomes (INDIGO). Low sequence homology of annotated genes (no similarity for 35% of these genes) may translate into false positives when searching for specific functions. The Profile and Pattern Matching (PPM) strategy described here was developed to eliminate false positive annotations of enzyme function before progressing to labor-intensive hyper-saline gene expression and characterization. It utilizes InterPro-derived Gene Ontology (GO)-terms (which represent enzyme function profiles) and annotated relevant PROSITE IDs (which are linked to an amino acid consensus pattern). The PPM algorithm was tested on 15 protein families, which were selected based on scientific and commercial potential. An initial list of 2577 enzyme commission (E.C.) numbers was translated into 171 GO-terms and 49 consensus patterns. A subset of INDIGO sequences consisting of 58 SAGs from six different taxa of bacteria and archaea was selected from six different brine pool environments. Those SAGs code for 74,516 genes, which were independently scanned for the GO-terms (profile filter) and PROSITE IDs (pattern filter). Following stringent reliability filtering, the non-redundant hits (106 profile hits and 147 pattern hits) are classified as reliable if at least two relevant descriptors (GO-terms and/or consensus patterns) are present.
Scripts for annotation, as well as for the PPM algorithm, are available through the INDIGO website. PMID:24778629
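The final reliability rule described above (at least two relevant descriptors per gene) can be sketched as follows. This is a minimal illustration, not the INDIGO scripts: the function name, the input layout, and the placeholder identifiers such as "GO:A" and "PS:X" are all hypothetical.

```python
def reliable_hits(gene_annotations, relevant_go, relevant_prosite, min_descriptors=2):
    """Keep genes carrying at least `min_descriptors` relevant descriptors.

    `gene_annotations` maps a gene ID to a (go_terms, prosite_ids) pair;
    this layout is an assumption. A descriptor counts if it is a relevant
    GO-term (profile filter) or a relevant PROSITE ID (pattern filter).
    """
    reliable = {}
    for gene, (go_terms, prosite_ids) in gene_annotations.items():
        matched = (set(go_terms) & relevant_go) | (set(prosite_ids) & relevant_prosite)
        if len(matched) >= min_descriptors:
            reliable[gene] = sorted(matched)
    return reliable

# Placeholder identifiers, not real GO-terms or PROSITE IDs.
relevant_go = {"GO:A", "GO:B"}
relevant_prosite = {"PS:X"}
gene_annotations = {
    "gene1": (["GO:A"], ["PS:X"]),     # one profile hit + one pattern hit
    "gene2": (["GO:A"], []),           # a single descriptor only
    "gene3": (["GO:A", "GO:B"], []),   # two profile hits
    "gene4": (["GO:C"], ["PS:Y"]),     # descriptors present but not relevant
}
hits = reliable_hits(gene_annotations, relevant_go, relevant_prosite)
```

Requiring two independent descriptors trades some sensitivity for fewer false positives, which matches the stated goal of filtering candidates before labor-intensive expression work.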
Mass Spectrometry Based Ultrasensitive DNA Methylation Profiling Using Target Fragmentation Assay.

PubMed

Lin, Xiang-Cheng; Zhang, Ting; Liu, Lan; Tang, Hao; Yu, Ru-Qin; Jiang, Jian-Hui

2016-01-19

Efficient tools for profiling DNA methylation in specific genes are essential for epigenetics and clinical diagnostics. Current DNA methylation profiling techniques have been limited by inconvenient implementation, requirements for specific reagents, and inferior accuracy in quantifying the degree of methylation. We develop a novel mass spectrometry method, the target fragmentation assay (TFA), which enables profiling of methylation in specific sequences. This method combines selective capture of the DNA target from restricted cleavage of genomic DNA using magnetic separation with MS detection of the nonenzymatic hydrolysates of the target DNA. This method is shown to be highly sensitive, with a detection limit as low as 0.056 amol, allowing direct profiling of methylation using genomic DNA without preamplification. Moreover, this method offers a unique advantage in accurately determining the DNA methylation level. The clinical applicability was demonstrated by DNA methylation analysis using prostate tissue samples, implying the potential of this method as a useful tool for DNA methylation profiling in early detection of related diseases.
Clinically Applicable Inhibitors Impacting Genome Stability.

PubMed

Prakash, Anu; Garcia-Moreno, Juan F; Brown, James A L; Bourke, Emer

2018-05-13

Advances in technology have facilitated the molecular profiling (genomic and transcriptomic) of tumours, and have led to improved stratification of patients and the individualisation of treatment regimes. To fully realize the potential of truly personalised treatment options, we need targeted therapies that precisely disrupt the compensatory pathways identified by profiling which allow tumours to survive or gain resistance to treatments. Here, we discuss recent advances in novel therapies that impact the genome (chromosomes and chromatin), pathways targeted and the stage of the pathways targeted. The current state of research will be discussed, with a focus on compounds that have advanced into trials (clinical and pre-clinical). We will discuss inhibitors of specific DNA damage responses and other genome stability pathways, including those in development, which are likely to synergistically combine with current therapeutic options. Tumour profiling data, combined with the knowledge of new treatments that affect the regulation of essential tumour signalling pathways, are revealing fundamental insights into cancer progression and resistance mechanisms. This is the forefront of the next evolution of advanced oncology medicine that will ultimately lead to improved survival and may, one day, result in many cancers becoming chronic conditions, rather than fatal diseases.
Versatile Gene-Specific Sequence Tags for Arabidopsis Functional Genomics: Transcript Profiling and Reverse Genetics Applications

PubMed Central

Hilson, Pierre; Allemeersch, Joke; Altmann, Thomas; Aubourg, Sébastien; Avon, Alexandra; Beynon, Jim; Bhalerao, Rishikesh P.; Bitton, Frédérique; Caboche, Michel; Cannoot, Bernard; Chardakov, Vasil; Cognet-Holliger, Cécile; Colot, Vincent; Crowe, Mark; Darimont, Caroline; Durinck, Steffen; Eickhoff, Holger; de Longevialle, Andéol Falcon; Farmer, Edward E.; Grant, Murray; Kuiper, Martin T.R.; Lehrach, Hans; Léon, Céline; Leyva, Antonio; Lundeberg, Joakim; Lurin, Claire; Moreau, Yves; Nietfeld, Wilfried; Paz-Ares, Javier; Reymond, Philippe; Rouzé, Pierre; Sandberg, Goran; Segura, Maria Dolores; Serizet, Carine; Tabrett, Alexandra; Taconnat, Ludivine; Thareau, Vincent; Van Hummelen, Paul; Vercruysse, Steven; Vuylsteke, Marnik; Weingartner, Magdalena; Weisbeek, Peter J.; Wirta, Valtteri; Wittink, Floyd R.A.; Zabeau, Marc; Small, Ian

2004-01-01

Microarray transcript profiling and RNA interference are two new technologies crucial for large-scale gene function studies in multicellular eukaryotes. Both rely on sequence-specific hybridization between complementary nucleic acid strands, prompting us to create a collection of gene-specific sequence tags (GSTs) that represent at least 21,500 Arabidopsis genes and are compatible with both approaches. The GSTs were carefully selected to ensure that each of them shared no significant similarity with any other region in the Arabidopsis genome. They were synthesized by PCR amplification from genomic DNA. Spotted microarrays fabricated from the GSTs show good dynamic range, specificity, and sensitivity in transcript profiling experiments. The GSTs have also been transferred to bacterial plasmid vectors via recombinational cloning protocols. These cloned GSTs constitute the ideal starting point for a variety of functional approaches, including reverse genetics. We have subcloned GSTs on a large scale into vectors designed for gene silencing in plant cells. We show that in planta expression of GST hairpin RNA results in the expected phenotypes in silenced Arabidopsis lines. These versatile GST resources provide novel and powerful tools for functional genomics. PMID:15489341
Pancreatic Cancer Genomics 2.0: Profiling Metastases.

PubMed

Collisson, Eric A; Maitra, Anirban

2017-03-13

Pancreatic ductal adenocarcinoma, even when diagnosed early, nearly always metastasizes. Recurrent mutations and genomic instability are early events in the disease. Two recent papers advance our understanding of how the cancer genome evolves as the primary tumor migrates from its origin in the pancreas to colonize distant metastatic sites. Copyright © 2017 Elsevier Inc. All rights reserved.
The dynamics of genome replication using deep sequencing

PubMed Central

Müller, Carolin A.; Hawkins, Michelle; Retkute, Renata; Malla, Sunir; Wilson, Ray; Blythe, Martin J.; Nakato, Ryuichiro; Komata, Makiko; Shirahige, Katsuhiko; de Moura, Alessandro P.S.; Nieduszynski, Conrad A.

2014-01-01

Eukaryotic genomes are replicated from multiple DNA replication origins. We present complementary deep sequencing approaches to measure origin location and activity in Saccharomyces cerevisiae. Measuring the increase in DNA copy number during a synchronous S-phase allowed the precise determination of genome replication. To map origin locations, replication forks were stalled close to their initiation sites; therefore, copy number enrichment was limited to origins. Replication timing profiles were generated from asynchronous cultures using fluorescence-activated cell sorting. Applying this technique we show that the replication profiles of haploid and diploid cells are indistinguishable, indicating that both cell types use the same cohort of origins with the same activities. Finally, increasing sequencing depth allowed the direct measure of replication dynamics from an exponentially growing culture. This is the first time this approach, called marker frequency analysis, has been successfully applied to a eukaryote. These data provide a high-resolution resource and methodological framework for studying genome biology. PMID:24089142
Value-based genomics.

PubMed

Gong, Jun; Pan, Kathy; Fakih, Marwan; Pal, Sumanta; Salgia, Ravi

2018-03-20

Advancements in next-generation sequencing have greatly enhanced the development of biomarker-driven cancer therapies. The affordability and availability of next-generation sequencers have allowed for the commercialization of next-generation sequencing platforms that have found widespread use for clinical-decision making and research purposes. Despite the greater availability of tumor molecular profiling by next-generation sequencing at our doorsteps, the achievement of value-based care, or improving patient outcomes while reducing overall costs or risks, in the era of precision oncology remains a looming challenge. In this review, we highlight available data through a pre-established and conceptualized framework for evaluating value-based medicine to assess the cost (efficiency), clinical benefit (effectiveness), and toxicity (safety) of genomic profiling in cancer care. We also provide perspectives on future directions of next-generation sequencing from targeted panels to whole-exome or whole-genome sequencing and describe potential strategies needed to attain value-based genomics.
Value-based genomics

PubMed Central

Gong, Jun; Pan, Kathy; Fakih, Marwan; Pal, Sumanta; Salgia, Ravi

2018-01-01

Advancements in next-generation sequencing have greatly enhanced the development of biomarker-driven cancer therapies. The affordability and availability of next-generation sequencers have allowed for the commercialization of next-generation sequencing platforms that have found widespread use for clinical-decision making and research purposes. Despite the greater availability of tumor molecular profiling by next-generation sequencing at our doorsteps, the achievement of value-based care, or improving patient outcomes while reducing overall costs or risks, in the era of precision oncology remains a looming challenge. In this review, we highlight available data through a pre-established and conceptualized framework for evaluating value-based medicine to assess the cost (efficiency), clinical benefit (effectiveness), and toxicity (safety) of genomic profiling in cancer care. We also provide perspectives on future directions of next-generation sequencing from targeted panels to whole-exome or whole-genome sequencing and describe potential strategies needed to attain value-based genomics. PMID:29644010
Bisulfite-independent analysis of CpG island methylation enables genome-scale stratification of single cells

PubMed Central

Han, Lin; Wu, Hua-Jun; Zhu, Haiying; Kim, Kun-Yong; Marjani, Sadie L.; Riester, Markus; Euskirchen, Ghia; Zi, Xiaoyuan; Yang, Jennifer; Han, Jasper; Snyder, Michael; Park, In-Hyun; Irizarry, Rafael; Weissman, Sherman M.

2017-01-01

Conventional DNA bisulfite sequencing has been extended to the single-cell level, but its coverage consistency is insufficient for parallel comparison. Here we report a novel method for genome-wide CpG island (CGI) methylation sequencing for single cells (scCGI-seq), combining methylation-sensitive restriction enzyme digestion and multiple displacement amplification for selective detection of methylated CGIs. We applied this method to single cells from two hematopoietic cell types, K562 and GM12878, and to small populations of fibroblasts and induced pluripotent stem cells. The method detected 21 798 CGIs (76% of all CGIs) per cell, and the number of CGIs consistently detected from all 16 profiled single cells was 20 864 (72.7%), with 12 961 promoters covered. This coverage represents a substantial improvement over results obtained using single cell reduced representation bisulfite sequencing, with a 66-fold increase in the fraction of consistently profiled CGIs across individual cells. Single cells of the same type were more similar to each other than to other types, but also displayed epigenetic heterogeneity. The method was further validated by comparing the CpG methylation pattern, the methylation profile of CGIs/promoters and repeat regions, and 41 classes of known regulatory markers to the ENCODE data. Although not every minor methylation difference between cells is detectable, scCGI-seq provides a solid tool for unsupervised stratification of a heterogeneous cell population. PMID:28126923
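The "consistently detected" figure in the abstract above is the number of CGIs found in every profiled cell. A minimal sketch of that calculation follows, assuming each cell's detected CGIs are available as a set of identifiers. The function names, cell labels and CGI identifiers are hypothetical and are not taken from the scCGI-seq pipeline.

```python
from typing import Dict, Iterable, Set


def consistently_detected(per_cell: Dict[str, Iterable[str]]) -> Set[str]:
    """Return the CGIs detected in every profiled cell (set intersection)."""
    cell_sets = [set(cgis) for cgis in per_cell.values()]
    if not cell_sets:
        return set()
    return set.intersection(*cell_sets)


def consistent_fraction(per_cell: Dict[str, Iterable[str]], total_cgis: int) -> float:
    """Fraction of all annotated CGIs that were detected in every cell."""
    if total_cgis <= 0:
        raise ValueError("total_cgis must be positive")
    return len(consistently_detected(per_cell)) / total_cgis


# Toy data with hypothetical identifiers; real input would hold thousands of CGIs per cell.
cells = {
    "cell_1": {"cgi_a", "cgi_b", "cgi_c"},
    "cell_2": {"cgi_a", "cgi_b"},
    "cell_3": {"cgi_a", "cgi_b", "cgi_d"},
}
shared = consistently_detected(cells)
fraction = consistent_fraction(cells, total_cgis=4)
```

With the toy data, only the two CGIs present in all three cells count as consistently detected, which is half of the four annotated CGIs.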
NeisseriaBase: a specialised Neisseria genomic resource and analysis platform.

PubMed

Zheng, Wenning; Mutha, Naresh V R; Heydari, Hamed; Dutta, Avirup; Siow, Cheuk Chuen; Jakubovics, Nicholas S; Wee, Wei Yee; Tan, Shi Yang; Ang, Mia Yang; Wong, Guat Jah; Choo, Siew Woh

2016-01-01

Background. The gram-negative Neisseria is associated with two of the most potent human epidemic diseases: meningococcal meningitis and gonorrhoea. In both cases, disease is caused by bacteria colonizing human mucosal membrane surfaces. Overall, the genus shows great diversity and genetic variation mainly due to its ability to acquire and incorporate genetic material from a diverse range of sources through horizontal gene transfer. Although a number of databases exist for the Neisseria genomes, they are mostly focused on the pathogenic species. In this study we present the freely available NeisseriaBase, a database dedicated to the genus Neisseria encompassing the complete and draft genomes of 15 pathogenic and commensal Neisseria species. Methods. The genomic data were retrieved from the National Center for Biotechnology Information (NCBI), annotated using the RAST server, and then stored in a MySQL database. The protein-coding genes were further analyzed to obtain information such as GC content (%), predicted hydrophobicity and molecular weight (Da) using in-house Perl scripts. The web application was developed following the secure four-tier web application architecture: (1) client workstation, (2) web server, (3) application server, and (4) database server. The web interface was constructed using PHP, JavaScript, jQuery, AJAX and CSS, utilizing the model-view-controller (MVC) framework. The in-house bioinformatics tools implemented in NeisseriaBase were developed using the Python, Perl, BioPerl and R languages. Results. Currently, NeisseriaBase houses 603,500 Coding Sequences (CDSs), 16,071 RNAs and 13,119 tRNA genes from 227 Neisseria genomes. The database is equipped with interactive web interfaces. Incorporation of the JBrowse genome browser in the database enables fast and smooth browsing of Neisseria genomes. NeisseriaBase includes the standard BLAST program to facilitate homology searching, and for Virulence Factor Database (VFDB) specific homology searches, the VFDB BLAST is also incorporated into the database. In addition, NeisseriaBase is equipped with in-house designed tools such as the Pairwise Genome Comparison tool (PGC) for comparative genomic analysis and the Pathogenomics Profiling Tool (PathoProT) for the comparative pathogenomics analysis of Neisseria strains. Discussion. This user-friendly database not only provides access to a host of genomic resources on Neisseria but also enables high-quality comparative genome analysis, which is crucial for the expanding scientific community interested in Neisseria research. This database is freely available at http://neisseria.um.edu.my.
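The per-gene properties mentioned in the Methods above (GC content and predicted hydrophobicity) can be illustrated with a short sketch. The authors used in-house Perl scripts that are not shown in the abstract, so the Python below is only an assumed equivalent. In particular, the use of the Kyte-Doolittle scale (averaged over residues as a GRAVY score) is an assumption, since the abstract does not name a hydrophobicity scale, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values; an assumed scale, not named in the abstract.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gc_content(cds: str) -> float:
    """Return the GC content of a nucleotide sequence as a percentage."""
    seq = cds.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def gravy(protein: str) -> float:
    """Return the mean Kyte-Doolittle hydropathy over all residues."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty sequence")
    # Raises KeyError on non-standard residues, which keeps bad input visible.
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


example_gc = gc_content("ATGC")   # two of four bases are G or C
example_gravy = gravy("AI")       # mean of the values for A and I
```

Molecular weight could be added in the same style with a residue-mass table, which is omitted here to keep the sketch short.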
NeisseriaBase: a specialised Neisseria genomic resource and analysis platform

PubMed Central

Zheng, Wenning; Mutha, Naresh V.R.; Heydari, Hamed; Dutta, Avirup; Siow, Cheuk Chuen; Jakubovics, Nicholas S.; Wee, Wei Yee; Tan, Shi Yang; Ang, Mia Yang; Wong, Guat Jah

2016-01-01

Background. The gram-negative Neisseria is associated with two of the most potent human epidemic diseases: meningococcal meningitis and gonorrhoea. In both cases, disease is caused by bacteria colonizing human mucosal membrane surfaces. Overall, the genus shows great diversity and genetic variation mainly due to its ability to acquire and incorporate genetic material from a diverse range of sources through horizontal gene transfer. Although a number of databases exist for the Neisseria genomes, they are mostly focused on the pathogenic species. In this study we present the freely available NeisseriaBase, a database dedicated to the genus Neisseria encompassing the complete and draft genomes of 15 pathogenic and commensal Neisseria species. Methods. The genomic data were retrieved from the National Center for Biotechnology Information (NCBI), annotated using the RAST server, and then stored in a MySQL database. The protein-coding genes were further analyzed to obtain information such as GC content (%), predicted hydrophobicity and molecular weight (Da) using in-house Perl scripts. The web application was developed following the secure four-tier web application architecture: (1) client workstation, (2) web server, (3) application server, and (4) database server. The web interface was constructed using PHP, JavaScript, jQuery, AJAX and CSS, utilizing the model-view-controller (MVC) framework. The in-house bioinformatics tools implemented in NeisseriaBase were developed using the Python, Perl, BioPerl and R languages. Results. Currently, NeisseriaBase houses 603,500 Coding Sequences (CDSs), 16,071 RNAs and 13,119 tRNA genes from 227 Neisseria genomes. The database is equipped with interactive web interfaces. Incorporation of the JBrowse genome browser in the database enables fast and smooth browsing of Neisseria genomes. NeisseriaBase includes the standard BLAST program to facilitate homology searching, and for Virulence Factor Database (VFDB) specific homology searches, the VFDB BLAST is also incorporated into the database. In addition, NeisseriaBase is equipped with in-house designed tools such as the Pairwise Genome Comparison tool (PGC) for comparative genomic analysis and the Pathogenomics Profiling Tool (PathoProT) for the comparative pathogenomics analysis of Neisseria strains. Discussion. This user-friendly database not only provides access to a host of genomic resources on Neisseria but also enables high-quality comparative genome analysis, which is crucial for the expanding scientific community interested in Neisseria research. This database is freely available at http://neisseria.um.edu.my. PMID:27017950
The PEPR GeneChip data warehouse, and implementation of a dynamic time series query tool (SGQT) with graphical interface.

PubMed

Chen, Josephine; Zhao, Po; Massaro, Donald; Clerch, Linda B; Almon, Richard R; DuBois, Debra C; Jusko, William J; Hoffman, Eric P

2004-01-01

Publicly accessible DNA databases (genome browsers) are rapidly accelerating post-genomic research (see http://www.genome.ucsc.edu/), with integrated genomic DNA, gene structure, EST/splicing and cross-species ortholog data. DNA databases have relatively low dimensionality; the genome is a linear code that anchors all associated data. In contrast, RNA expression and protein databases need to be able to handle very high dimensional data, with time, tissue, cell type and genes as interrelated variables. The high dimensionality of microarray expression profile data and the lack of a standard experimental platform have complicated the development of web-accessible databases and analytical tools. We have designed and implemented a public resource of expression profile data containing 1024 human, mouse and rat Affymetrix GeneChip expression profiles, generated in the same laboratory and subject to the same quality and procedural controls (Public Expression Profiling Resource; PEPR). Our Oracle-based PEPR data warehouse includes a novel time series query analysis tool (SGQT), enabling dynamic generation of graphs and spreadsheets showing the action of any transcript of interest over time. In this report, we demonstrate the utility of this tool using a 27 time point, in vivo muscle regeneration series. This data warehouse and its associated analysis tools provide access to multidimensional microarray data through web-based interfaces, both for download of all types of raw data for independent analysis and for straightforward gene-based queries. Planned implementations of PEPR will include web-based remote entry of projects adhering to quality control and standard operating procedure (QC/SOP) criteria, and automated output of alternative probe set algorithms for each project (see http://microarray.cnmcresearch.org/pgadatatable.asp).
The PEPR GeneChip data warehouse, and implementation of a dynamic time series query tool (SGQT) with graphical interface

PubMed Central

Chen, Josephine; Zhao, Po; Massaro, Donald; Clerch, Linda B.; Almon, Richard R.; DuBois, Debra C.; Jusko, William J.; Hoffman, Eric P.

2004-01-01

Publicly accessible DNA databases (genome browsers) are rapidly accelerating post-genomic research (see http://www.genome.ucsc.edu/), with integrated genomic DNA, gene structure, EST/splicing and cross-species ortholog data. DNA databases have relatively low dimensionality; the genome is a linear code that anchors all associated data. In contrast, RNA expression and protein databases need to be able to handle very high dimensional data, with time, tissue, cell type and genes as interrelated variables. The high dimensionality of microarray expression profile data and the lack of a standard experimental platform have complicated the development of web-accessible databases and analytical tools. We have designed and implemented a public resource of expression profile data containing 1024 human, mouse and rat Affymetrix GeneChip expression profiles, generated in the same laboratory and subject to the same quality and procedural controls (Public Expression Profiling Resource; PEPR). Our Oracle-based PEPR data warehouse includes a novel time series query analysis tool (SGQT), enabling dynamic generation of graphs and spreadsheets showing the action of any transcript of interest over time. In this report, we demonstrate the utility of this tool using a 27 time point, in vivo muscle regeneration series. This data warehouse and its associated analysis tools provide access to multidimensional microarray data through web-based interfaces, both for download of all types of raw data for independent analysis and for straightforward gene-based queries. Planned implementations of PEPR will include web-based remote entry of projects adhering to quality control and standard operating procedure (QC/SOP) criteria, and automated output of alternative probe set algorithms for each project (see http://microarray.cnmcresearch.org/pgadatatable.asp). PMID:14681485
Differential transcriptome analysis reveals genes related to cold tolerance in seabuckthorn carpenter moth, Eogystia hippophaecolus

PubMed Central

Hu, Ping; Wang, Tao; Tao, Jing; Zong, Shixiang

2017-01-01

The seabuckthorn carpenter moth, Eogystia hippophaecolus (Lepidoptera: Cossidae), is an important pest of sea buckthorn (Hippophae rhamnoides), a shrub that has significant ecological and economic value in China. E. hippophaecolus is highly cold tolerant, but few studies have been conducted to elucidate the molecular mechanisms underlying its cold resistance. Here we sequenced the E. hippophaecolus transcriptome using RNA-Seq technology and performed de novo assembly from the short paired-end reads. We investigated the larval response to cold stress by comparing gene expression profiles between treatments. We obtained 118,034 unigenes, of which 22,161 were annotated with gene descriptions, conserved domains, gene ontology terms, and metabolic pathways. These resulted in 57 GO terms and 193 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. By comparing transcriptome profiles for differential gene expression, we identified many differentially expressed proteins and genes, including heat shock proteins and cuticular proteins, which have previously been reported to be involved in cold resistance of insects. This study provides a global transcriptome analysis and an assessment of differential gene expression in E. hippophaecolus under cold stress. We found seven differentially expressed genes in common between developmental stages, which were verified with qPCR. Our findings facilitate future genomic studies aimed at improving our understanding of the molecular mechanisms underlying the response of insects to low temperatures. PMID:29131867

Differential transcriptome analysis reveals genes related to cold tolerance in seabuckthorn carpenter moth, Eogystia hippophaecolus.

PubMed

Cui, Mingming; Hu, Ping; Wang, Tao; Tao, Jing; Zong, Shixiang

2017-01-01

The seabuckthorn carpenter moth, Eogystia hippophaecolus (Lepidoptera: Cossidae), is an important pest of sea buckthorn (Hippophae rhamnoides), a shrub that has significant ecological and economic value in China. E. hippophaecolus is highly cold tolerant, but few studies have been conducted to elucidate the molecular mechanisms underlying its cold resistance. Here we sequenced the E. hippophaecolus transcriptome using RNA-Seq technology and performed de novo assembly from the short paired-end reads. We investigated the larval response to cold stress by comparing gene expression profiles between treatments. We obtained 118,034 unigenes, of which 22,161 were annotated with gene descriptions, conserved domains, gene ontology terms, and metabolic pathways. These resulted in 57 GO terms and 193 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. By comparing transcriptome profiles for differential gene expression, we identified many differentially expressed proteins and genes, including heat shock proteins and cuticular proteins, which have previously been reported to be involved in cold resistance of insects. This study provides a global transcriptome analysis and an assessment of differential gene expression in E. hippophaecolus under cold stress. We found seven differentially expressed genes in common between developmental stages, which were verified with qPCR. Our findings facilitate future genomic studies aimed at improving our understanding of the molecular mechanisms underlying the response of insects to low temperatures.
High Glutathione and Glutathione Peroxidase-2 Levels Mediate Cell-Type-Specific DNA Damage Protection in Human Induced Pluripotent Stem Cells

PubMed Central

Dannenmann, Benjamin; Lehle, Simon; Hildebrand, Dominic G.; Kübler, Ayline; Grondona, Paula; Schmid, Vera; Holzer, Katharina; Fröschl, Mirjam; Essmann, Frank; Rothfuss, Oliver; Schulze-Osthoff, Klaus

2015-01-01

Pluripotent stem cells must strictly maintain genomic integrity to prevent transmission of mutations. In human induced pluripotent stem cells (iPSCs), we found that genome surveillance is achieved in two ways, namely, a hypersensitivity to apoptosis and a very low accumulation of DNA lesions. The low apoptosis threshold was mediated by constitutive p53 expression and a marked upregulation of proapoptotic p53 target genes of the BCL-2 family, ensuring efficient iPSC removal upon genotoxic insults. Intriguingly, despite the elevated apoptosis sensitivity, both mitochondrial and nuclear DNA lesions induced by genotoxins were less frequent in iPSCs than in fibroblasts. Gene profiling showed that mRNA expression of several antioxidant proteins was considerably upregulated in iPSCs. Knockdown of glutathione peroxidase-2 and depletion of glutathione impaired protection against DNA lesions. Thus, iPSCs ensure genomic integrity through enhanced apoptosis induction and increased antioxidant defense, contributing to protection against DNA damage. PMID:25937369
Bridging the gap between genome analysis and precision breeding in potato.

PubMed

Gebhardt, Christiane

2013-04-01

Efficiency and precision in plant breeding can be enhanced by using diagnostic DNA-based markers for the selection of superior cultivars. This technique has been applied to many crops, including potatoes. The first generation of diagnostic DNA-based markers useful in potato breeding were enabled by several developments: genetic linkage maps based on DNA polymorphisms, linkage mapping of qualitative and quantitative agronomic traits, cloning and functional analysis of genes for pathogen resistance and genes controlling plant metabolism, and association genetics in collections of tetraploid varieties and advanced breeding clones. Although these have led to significant improvements in potato genetics, the prediction of most, if not all, natural variation in agronomic traits by diagnostic markers ultimately requires the identification of the causal genes and their allelic variants. This objective will be facilitated by new genomic tools, such as genomic resequencing and comparative profiling of the proteome, transcriptome, and metabolome in combination with phenotyping genetic materials relevant for variety development. Copyright © 2012 Elsevier Ltd. All rights reserved.
Ectomycorrhizal ecology is imprinted in the genome of the dominant symbiotic fungus Cenococcum geophilum

PubMed Central

Peter, Martina; Kohler, Annegret; Ohm, Robin A.; Kuo, Alan; Krützmann, Jennifer; Morin, Emmanuelle; Arend, Matthias; Barry, Kerrie W.; Binder, Manfred; Choi, Cindy; Clum, Alicia; Copeland, Alex; Grisel, Nadine; Haridas, Sajeet; Kipfer, Tabea; LaButti, Kurt; Lindquist, Erika; Lipzen, Anna; Maire, Renaud; Meier, Barbara; Mihaltcheva, Sirma; Molinier, Virginie; Murat, Claude; Pöggeler, Stefanie; Quandt, C. Alisha; Sperisen, Christoph; Tritt, Andrew; Tisserant, Emilie; Crous, Pedro W.; Henrissat, Bernard; Nehls, Uwe; Egli, Simon; Spatafora, Joseph W.; Grigoriev, Igor V.; Martin, Francis M.

2016-01-01

The most frequently encountered symbiont on tree roots is the ascomycete Cenococcum geophilum, the only mycorrhizal species within the largest fungal class Dothideomycetes, a class known for devastating plant pathogens. Here we show that the symbiotic genomic idiosyncrasies of ectomycorrhizal basidiomycetes are also present in C. geophilum, with symbiosis-induced, taxon-specific genes of unknown function and reduced numbers of plant cell wall-degrading enzymes. C. geophilum still holds a significant set of genes in categories known to be involved in pathogenesis and shows an increased genome size due to transposable element proliferation. Transcript profiling revealed a striking upregulation of membrane transporters, including aquaporin water channels and sugar transporters, and mycorrhiza-induced small secreted proteins (MiSSPs) in ectomycorrhiza compared with free-living mycelium. The frequency with which this symbiont is found on tree roots and its possible role in water and nutrient transport in symbiosis call for further studies on mechanisms of host and environmental adaptation. PMID:27601008
The NCI Genomic Data Commons as an engine for precision medicine.

PubMed

Jensen, Mark A; Ferretti, Vincent; Grossman, Robert L; Staudt, Louis M

2017-07-27

The National Cancer Institute Genomic Data Commons (GDC) is an information system for storing, analyzing, and sharing genomic and clinical data from patients with cancer. The recent high-throughput sequencing of cancer genomes and transcriptomes has produced a big data problem that precludes many cancer biologists and oncologists from gleaning knowledge from these data regarding the nature of malignant processes and the relationship between tumor genomic profiles and treatment response. The GDC aims to democratize access to cancer genomic data and to foster the sharing of these data to promote precision medicine approaches to the diagnosis and treatment of cancer.
Genome-wide comparative transcriptome analysis of CMS-D2 and its maintainer and restorer lines in upland cotton.

PubMed

Wu, Jianyong; Zhang, Meng; Zhang, Bingbing; Zhang, Xuexian; Guo, Liping; Qi, Tingxiang; Wang, Hailin; Zhang, Jinfa; Xing, Chaozhu

2017-06-08

Cytoplasmic male sterility (CMS) conferred by the cytoplasm from Gossypium harknessii (D2) is an important system for hybrid seed production in Upland cotton (G. hirsutum). The male sterility of CMS-D2 (i.e., A line) can be restored to fertility by a restorer (i.e., R line) carrying the restorer gene Rf1 transferred from the D2 nuclear genome. However, the molecular mechanisms of CMS-D2 and its restoration are poorly understood. In this study, a genome-wide comparative transcriptome analysis was performed to identify differentially expressed genes (DEGs) in flower buds among the isogenic fertile R line and sterile A line derived from a backcross population (BC8F1) and the recurrent parent, i.e., the maintainer (B line). A total of 1464 DEGs were identified among the three isogenic lines, and the Rf1-carrying Chr_D05 and its homeologous Chr_A05 had more DEGs than other chromosomes. The results of GO and KEGG enrichment analysis showed differences in circadian rhythm between the fertile and sterile lines. Eleven DEGs were selected for validation using qRT-PCR, confirming the accuracy of the RNA-seq results. Through genome-wide comparative transcriptome analysis, the differential expression profiles of CMS-D2 and its maintainer and restorer lines in Upland cotton were identified. Our results provide an important foundation for further studies into the molecular mechanisms of the interactions between the restorer gene Rf1 and the CMS-D2 cytoplasm.
Annotation and Classification of CRISPR-Cas Systems

PubMed Central

Makarova, Kira S.; Koonin, Eugene V.

2018-01-01

The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas (CRISPR-associated proteins) is a prokaryotic adaptive immune system that is represented in most archaea and many bacteria. Among the currently known prokaryotic defense systems, the CRISPR-Cas genomic loci show unprecedented complexity and diversity. Classification of CRISPR-Cas variants that would capture their evolutionary relationships to the maximum possible extent is essential for comparative genomic and functional characterization of this theoretically and practically important system of adaptive immunity. To this end, a multipronged approach has been developed that combines phylogenetic analysis of the conserved Cas proteins with comparison of gene repertoires and arrangements in CRISPR-Cas loci. This approach led to the current classification of CRISPR-Cas systems into three distinct types and ten subtypes for each of which signature genes have been identified. Comparative genomic analysis of the CRISPR-Cas systems in new archaeal and bacterial genomes performed over the 3 years elapsed since the development of this classification makes it clear that new types and subtypes of CRISPR-Cas need to be introduced. Moreover, this classification system captures only part of the complexity of CRISPR-Cas organization and evolution, due to the intrinsic modularity and evolutionary mobility of these immunity systems, resulting in numerous recombinant variants. Moreover, most of the cas genes evolve rapidly, complicating the family assignment for many Cas proteins and the use of family profiles for the recognition of CRISPR-Cas subtype signatures. Further progress in the comparative analysis of CRISPR-Cas systems requires integration of the most sensitive sequence comparison tools, protein structure comparison, and refined approaches for comparison of gene neighborhoods. PMID:25981466
Annotation and Classification of CRISPR-Cas Systems.

PubMed

Makarova, Kira S; Koonin, Eugene V

2015-01-01

The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas (CRISPR-associated proteins) is a prokaryotic adaptive immune system that is represented in most archaea and many bacteria. Among the currently known prokaryotic defense systems, the CRISPR-Cas genomic loci show unprecedented complexity and diversity. Classification of CRISPR-Cas variants that would capture their evolutionary relationships to the maximum possible extent is essential for comparative genomic and functional characterization of this theoretically and practically important system of adaptive immunity. To this end, a multipronged approach has been developed that combines phylogenetic analysis of the conserved Cas proteins with comparison of gene repertoires and arrangements in CRISPR-Cas loci. This approach led to the current classification of CRISPR-Cas systems into three distinct types and ten subtypes for each of which signature genes have been identified. Comparative genomic analysis of the CRISPR-Cas systems in new archaeal and bacterial genomes performed over the 3 years elapsed since the development of this classification makes it clear that new types and subtypes of CRISPR-Cas need to be introduced. Moreover, this classification system captures only part of the complexity of CRISPR-Cas organization and evolution, due to the intrinsic modularity and evolutionary mobility of these immunity systems, resulting in numerous recombinant variants. Moreover, most of the cas genes evolve rapidly, complicating the family assignment for many Cas proteins and the use of family profiles for the recognition of CRISPR-Cas subtype signatures. Further progress in the comparative analysis of CRISPR-Cas systems requires integration of the most sensitive sequence comparison tools, protein structure comparison, and refined approaches for comparison of gene neighborhoods.
Comparative analysis among the small RNA populations of source, sink and conductive tissues in two different plant-virus pathosystems.

PubMed

Herranz, Mari Carmen; Navarro, Jose Antonio; Sommen, Evelien; Pallas, Vicente

2015-02-22

In plants, RNA silencing plays a fundamental role as a defence mechanism against viruses. In recent years, deep-sequencing technology has made it possible to analyze the sRNA profile of a large variety of virus-infected tissues. Nevertheless, most of these studies have been restricted to a single tissue, and no comparative analysis between phloem and source/sink tissues has been conducted. In the present work, we compared the sRNA populations of source, sink and conductive (phloem) tissues in two different plant-virus pathosystems. We chose two cucurbit species infected with two viruses that differ greatly in genome organization and replication strategy: Melon necrotic spot virus (MNSV) and Prunus necrotic ringspot virus (PNRSV). Our findings showed, in both systems, an increase in the 21-nt total sRNAs together with a decrease in those of 24 nt in all the infected tissues, except for the phloem, where the ratio of 21/24-nt sRNA species remained constant. Comparing the vsRNAs, both PNRSV- and MNSV-infected plants share the same vsRNA size distribution in all the analyzed tissues. Similar accumulation levels of sense and antisense vsRNAs were observed in both systems, except for roots, which showed a prevalence of (+) vsRNAs in both pathosystems. Additionally, overrepresented discrete sites along the viral genome (hot spots) were identified and validated by stem-loop RT-PCR. Although vsRNAs were scarce in PNRSV-infected plants, both viruses modulated the host sRNA profile. We compare for the first time the sRNA profile of four different tissues, including source, sink and conductive (phloem) tissues, in two plant-virus pathosystems. Our results indicate that the antiviral silencing machinery in melon and cucumber acts mainly through DCL4. Upon infection, the total sRNA pattern in phloem remains unchanged, in contrast to the rest of the analyzed tissues, indicating a certain tissue tropism of this population. Regardless of the accumulation level of the vsRNAs, both viruses were able to modulate the host sRNA pattern.
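The 21/24-nt comparison described in the abstract above rests on a read-length distribution. A minimal sketch of how such a distribution and ratio might be computed is given below, assuming the sRNA reads are already adapter-trimmed and held in memory as strings. The function names and toy reads are hypothetical and do not reproduce the authors' analysis.

```python
from collections import Counter
from typing import Dict, Iterable, List


def size_distribution(reads: Iterable[str]) -> Dict[int, float]:
    """Return the fraction of reads at each length, keyed by length in nt."""
    counts = Counter(len(read) for read in reads)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads supplied")
    return {size: n / total for size, n in sorted(counts.items())}


def ratio_21_to_24(reads: Iterable[str]) -> float:
    """Return the ratio of 21-nt reads to 24-nt reads."""
    counts = Counter(len(read) for read in reads)
    if counts[24] == 0:
        raise ValueError("no 24-nt reads; ratio is undefined")
    return counts[21] / counts[24]


# Toy library: three 21-nt reads and one 24-nt read (sequences are placeholders).
toy_reads: List[str] = ["A" * 21] * 3 + ["A" * 24]
toy_distribution = size_distribution(toy_reads)
toy_ratio = ratio_21_to_24(toy_reads)
```

Comparing this ratio between mock and infected samples of the same tissue is one simple way to express the shift the abstract reports. A real analysis would also normalize for library size.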
Genome-wide gene expression perturbation induced by loss of C2 chromosome in allotetraploid Brassica napus L.

PubMed Central

Zhu, Bin; Shao, Yujiao; Pan, Qi; Ge, Xianhong; Li, Zaiyun

2015-01-01

Aneuploidy with loss of entire chromosomes from the normal complement disrupts the balanced genome and is tolerated only by polyploid plants. In this study, monosomic and nullisomic plants that had lost one or two copies of chromosome C2 from allotetraploid Brassica napus L. (2n = 38, AACC) were produced and compared for their phenotype and transcriptome. The monosomics had a plant phenotype very similar to that of the original donor, but the nullisomics had a much smaller stature and a shorter growth period. Comparative analyses of the global transcript profiles against the euploid donor revealed genome-wide alterations in gene expression in the two aneuploids, and the majority of their differentially expressed genes (DEGs) resulted from the trans-acting effects of having zero or one copy of chromosome C2. The higher number of up-regulated genes than down-regulated genes on other chromosomes suggested that the genome responded to the C2 loss by enhancing the expression of certain genes. Notably, more DEGs were detected in the monosomics than in the nullisomics, in contrast with their phenotypes. Gene expression on the other chromosomes was differently affected, and several dysregulated domains in which up- or down-regulated genes clearly clustered were identifiable. However, the mean gene expression (MGE) for homoeologous chromosome A2 decreased with the C2 loss. Some genes on C2 and their expression were correlated with the phenotype deviations in the aneuploids. These results provide new insights into the transcriptomic perturbation of the allopolyploid genome elicited by the loss of an individual chromosome. PMID:26442076
Comparative genomics identifies distinct lineages of S. Enteritidis from Queensland, Australia.

PubMed

Graham, Rikki M A; Hiley, Lester; Rathnayake, Irani U; Jennison, Amy V

2018-01-01

Salmonella enterica is a major cause of gastroenteritis and foodborne illness in Australia, where notification rates in the state of Queensland are the highest in the country. S. Enteritidis is among the five most common serotypes reported in Queensland and it is a priority for epidemiological surveillance due to concerns regarding its emergence in Australia. Using whole genome sequencing, we have analysed the genomic epidemiology of 217 S. Enteritidis isolates from Queensland, and observed that they fall into three distinct clades, which we have differentiated as Clades A, B and C. Phage types and MLST sequence types differed between the clades, and comparative genomic analysis has shown that each has a unique profile of prophage and genomic islands. Several of the phage regions present in the S. Enteritidis reference strain P125109 were absent in Clades A and C, and these clades also differed in the presence of pathogenicity islands, containing complete SPI-6 and SPI-19 regions, whereas P125109 does not. Antimicrobial resistance markers were found in 39 isolates, all but one of which belonged to Clade B. Phylogenetic analysis of the Queensland isolates in the context of 170 international strains showed that Queensland Clade B isolates group together with the previously identified global clade, while the other two clades are distinct and appear largely restricted to Australia. Locally sourced environmental isolates included in this analysis all belonged to Clades A and C, which is consistent with the theory that these clades are a source of locally acquired infection, while Clade B isolates are mostly travel related.
Genomic Analysis of Childhood Brain Tumors: Methods for Genome-Wide Discovery and Precision Medicine Become Mainstream.

PubMed

Mack, Stephen C; Northcott, Paul A

2017-07-20

Recent breakthroughs in next-generation sequencing technology and complementary genomic platforms have transformed our capacity to interrogate the molecular landscapes of human cancers, including childhood brain tumors. Numerous high-throughput genomic studies have been reported for the major histologic brain tumor entities diagnosed in children, including interrogations at the level of the genome, epigenome, and transcriptome, many of which have yielded essential new insights into disease biology. The nature of these discoveries has been largely platform dependent, exemplifying the usefulness of applying different genomic and computational strategies, or integrative approaches, to address specific biologic and/or clinical questions. The goal of this article is to summarize the spectrum of molecular profiling methods available for investigating genomic aspects of childhood brain tumors in both the research and the clinical setting. We provide an overview of the main next-generation sequencing and array-based technologies currently being applied in this field and draw from key examples in the recent neuro-oncology literature to illustrate how these genomic approaches have profoundly advanced our understanding of individual tumor entities. Moreover, we discuss the current status of genomic profiling in the clinic and how different platforms are being used to improve patient diagnosis and stratification, as well as to identify actionable targets for informing molecularly guided therapies, especially for patients for whom conventional standard-of-care treatments have failed. Both the demand for genomic testing and the main challenges associated with incorporating genomics into the clinical management of pediatric patients with brain tumors are discussed, as are recommendations for incorporating these assays into future clinical trials.
SMURFLite: combining simplified Markov random fields with simulated evolution improves remote homology detection for beta-structural proteins into the twilight zone.

PubMed

Daniels, Noah M; Hosur, Raghavendra; Berger, Bonnie; Cowen, Lenore J

2012-05-01

One of the most successful methods to date for recognizing protein sequences that are evolutionarily related has been profile hidden Markov models (HMMs). However, these models do not capture pairwise statistical preferences of residues that are hydrogen bonded in beta sheets. These dependencies have been partially captured in the HMM setting by simulated evolution in the training phase and can be fully captured by Markov random fields (MRFs). However, the MRFs can be computationally prohibitive when beta strands are interleaved in complex topologies. We introduce SMURFLite, a method that combines both simplified MRFs and simulated evolution to substantially improve remote homology detection for beta structures. Unlike previous MRF-based methods, SMURFLite is computationally feasible on any beta-structural motif. We test SMURFLite on all propeller and barrel folds in the mainly-beta class of the SCOP hierarchy in stringent cross-validation experiments. We show a mean 26% (median 16%) improvement in area under curve (AUC) for beta-structural motif recognition as compared with HMMER (a well-known HMM method), a mean 33% (median 19%) improvement as compared with RAPTOR (a well-known threading method), and even a mean 18% (median 10%) improvement in AUC over HHpred (a profile-profile HMM method), despite HHpred's use of extensive additional training data. We demonstrate SMURFLite's ability to scale to whole genomes by running a SMURFLite library of 207 beta-structural SCOP superfamilies against the entire genome of Thermotoga maritima, and make over 100 new fold predictions. Availability and implementation: A webserver that runs SMURFLite is available at: http://smurf.cs.tufts.edu/smurflite/
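The AUC comparison reported above can be illustrated with a minimal sketch. It computes AUC as the probability that a true family member scores above a non-member, then the relative gain of one method over another. All scores below are made up for illustration, the function names are hypothetical, and the abstract does not state whether its percentages are relative or absolute improvements; this sketch assumes relative.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def relative_improvement(new, old):
    """Relative gain of `new` over `old` (0.10 means 10% better)."""
    return (new - old) / old


# Hypothetical scores: the same negatives, scored by two methods.
negatives = [0.8, 0.3, 0.2, 0.1]
baseline_positives = [0.9, 0.4, 0.35]   # assumed baseline method
improved_positives = [0.9, 0.85, 0.35]  # assumed improved method

auc_baseline = auc(baseline_positives, negatives)
auc_improved = auc(improved_positives, negatives)
gain = relative_improvement(auc_improved, auc_baseline)
```

In a cross-validation setting such as the one described, a value like `gain` would be computed per fold or superfamily and then summarized by its mean and median.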
Molecular epidemiology of an outbreak of Legionnaires' disease associated with a cooling tower in Genova-Sestri Ponente, Italy.

PubMed

Castellani Pastoris, M; Ciceroni, L; Lo Monaco, R; Goldoni, P; Mentore, B; Flego, G; Cattani, L; Ciarrocchi, S; Pinto, A; Visca, P

1997-12-01

Fatty acid profile analysis, monoclonal antibody (MAb) subtyping, pulsed-field gel electrophoresis (PFGE), arbitrarily primed polymerase chain reaction (AP-PCR), and ribotyping were used to compare clinical and environmental Legionella pneumophila serogroup 1 isolates from an outbreak of Legionnaires' disease presumptively associated with cooling towers. According to the Oxford subtyping scheme, the MAb subtype of patients' isolates and of two strains originating from a cooling tower was Pontiac, whereas the other isolates were subtype Olda. The strains showed no intrinsic strain-to-strain difference in fatty acid profiles, and ribotyping and length polymorphism of the 16S-23S rDNA intervening regions failed to reveal any differences between the isolates. Conversely, PFGE and AP-PCR appeared to be more discriminatory, as the same genomic profile was found for the clinical and some environmental strains. Meteorologic and epidemiological data and the results of molecular analysis of the Legionella pneumophila serogroup 1 isolates support the hypothesis that the infection was transmitted from one of the cooling towers to the indoor environment of the same building, to homes in proximity that had open windows, and to the streets. In fact, the outbreak diminished and later ended after a part in the tower was replaced. This investigation demonstrates the utility of combined molecular methods (i.e., phenotypic and genomic typing) in comparing epidemiologically linked clinical and environmental isolates. Finally, the outbreak confirms the risk of Legionnaires' disease posed by cooling towers, mainly when atmospheric thermal and humidity inversions occur. This finding emphasizes the need to determine whether the source of infection is in the living or working environment or somewhere else.
Whole-genome sequencing of Asian lung cancers: second-hand smoke unlikely to be responsible for higher incidence of lung cancer among Asian never-smokers.

PubMed

Krishnan, Vidhya G; Ebert, Philip J; Ting, Jason C; Lim, Elaine; Wong, Swee-Seong; Teo, Audrey S M; Yue, Yong G; Chua, Hui-Hoon; Ma, Xiwen; Loh, Gary S L; Lin, Yuhao; Tan, Joanna H J; Yu, Kun; Zhang, Shenli; Reinhard, Christoph; Tan, Daniel S W; Peters, Brock A; Lincoln, Stephen E; Ballinger, Dennis G; Laramie, Jason M; Nilsen, Geoffrey B; Barber, Thomas D; Tan, Patrick; Hillmer, Axel M; Ng, Pauline C

2014-11-01

Asian nonsmoking populations have a higher incidence of lung cancer compared with their European counterparts. There is a long-standing hypothesis that the increase of lung cancer in Asian never-smokers is due to environmental factors such as second-hand smoke. We analyzed whole-genome sequencing of 30 Asian lung cancers. Unsupervised clustering of mutational signatures separated the patients into two categories of either all the never-smokers or all the smokers or ex-smokers. In addition, nearly one third of the ex-smokers and smokers classified with the never-smoker-like cluster. The somatic variant profiles of Asian lung cancers were similar to those of European origin, with G·C>T·A being predominant in smokers. We found EGFR and TP53 to be the most frequently mutated genes, with mutations in 50% and 27% of individuals, respectively. Among the 16 never-smokers, 69% had an EGFR mutation compared with 29% of 14 smokers/ex-smokers. Asian never-smokers had lung cancer signatures distinct from the smoker signature and their mutation profiles were similar to those of European never-smokers. The profiles of Asian and European smokers are also similar. Taken together, these results suggested that the same mutational mechanisms underlie the etiology for both ethnic groups. Thus, the high incidence of lung cancer in Asian never-smokers seems unlikely to be due to second-hand smoke or other carcinogens that cause oxidative DNA damage, implying that routine EGFR testing is warranted in the Asian population regardless of smoking status. ©2014 American Association for Cancer Research.
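The EGFR percentages quoted above can be cross-checked with simple arithmetic. The sketch below back-calculates patient counts from the rounded percentages and group sizes given in the abstract; the counts themselves are inferred, not stated in the source, and the helper name is hypothetical.

```python
def count_from_percent(percent, group_size):
    """Nearest whole number of patients implied by a rounded percentage."""
    return round(percent / 100 * group_size)


NEVER_SMOKERS = 16  # group size from the abstract
SMOKERS = 14        # smokers and ex-smokers, from the abstract

# Inferred counts (assumption: the percentages were rounded to integers).
egfr_never = count_from_percent(69, NEVER_SMOKERS)  # 11.04 rounds to 11
egfr_smokers = count_from_percent(29, SMOKERS)      # 4.06 rounds to 4

# Overall EGFR mutation frequency across all 30 tumours.
overall = (egfr_never + egfr_smokers) / (NEVER_SMOKERS + SMOKERS)
```

Under these assumptions the two subgroup figures add up to 15 of 30 tumours, which agrees with the 50% overall EGFR mutation frequency reported in the same abstract.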
Noncoding RNA Expression and Targeted Next-Generation Sequencing Distinguish Tubulocystic Renal Cell Carcinoma (TC-RCC) from Other Renal Neoplasms.

PubMed

Lawrie, Charles H; Armesto, María; Fernandez-Mercado, Marta; Arestín, María; Manterola, Lorea; Goicoechea, Ibai; Larrea, Erika; Caffarel, María M; Araujo, Angela M; Sole, Carla; Sperga, Maris; Alvarado-Cabrero, Isabel; Michal, Michal; Hes, Ondrej; López, José I

2018-01-01

Tubulocystic renal cell carcinoma (TC-RCC) is a rare, recently described renal neoplasm characterized by gross, microscopic, and immunohistochemical differences from other renal tumor types and was recently classified as a distinct entity. However, this distinction remains controversial, particularly because some genetic studies suggest a close relationship with papillary RCC (PRCC). The molecular basis of this disease remains largely unexplored. We therefore performed noncoding (nc) RNA/miRNA expression analysis and targeted next-generation sequencing mutational profiling on 13 TC-RCC cases (11 pure, two mixed TC-RCC/PRCC) and compared them with other renal neoplasms. The expression profile of miRNAs and other ncRNAs in TC-RCC was distinct, and we validated 10 differentially expressed miRNAs by quantitative RT-PCR, including miR-155 and miR-34a, which were significantly down-regulated compared with PRCC cases (n = 22). With the use of targeted next-generation sequencing we identified mutations in 14 different genes, most frequently (>60% of TC-RCC cases) in the ABL1 and PDGFRA genes. These mutations were present in <5% of clear cell RCC, PRCC, or chromophobe RCC cases (n > 600) of The Cancer Genome Atlas database. In summary, this study is by far the largest molecular study of TC-RCC cases and the first to investigate either ncRNA expression or their genomic profile. These results add molecular evidence that TC-RCC is indeed a distinct entity from PRCC and other renal neoplasms. Copyright © 2018 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
Molecular subtype classification of urothelial carcinoma in Lynch syndrome.

PubMed

Therkildsen, Christina; Eriksson, Pontus; Höglund, Mattias; Jönsson, Mats; Sjödahl, Gottfrid; Nilbert, Mef; Liedberg, Fredrik

2018-05-23

Lynch syndrome confers an increased risk for urothelial carcinoma (UC). Molecular subtypes may be relevant to prognosis and therapeutic possibilities, but have to date not been defined in Lynch syndrome-associated urothelial cancer. We aimed to provide a molecular description of Lynch syndrome-associated UC. Thus, Lynch syndrome-associated UCs of the upper urinary tract and the urinary bladder were identified in the Danish hereditary nonpolyposis colorectal cancer (HNPCC) register and were transcriptionally and immunohistochemically profiled and further related to data from 307 sporadic urothelial carcinomas. Whole-genome mRNA expression profiles of 41 tumors and immunohistochemical stainings against FGFR3, KRT5, CCNB1, RB1, and CDKN2A (p16) of 37 tumors from patients with Lynch syndrome were generated. Pathological data, microsatellite instability, anatomic location, and overall survival data were analyzed and compared with sporadic bladder cancer. The 41 Lynch syndrome-associated UC developed at a mean age of 61 years with 59% women. mRNA expression profiling and immunostaining classified the majority of the Lynch syndrome-associated UC as urothelial-like tumors with only 20% being genomically unstable, basal/SCC-like, or other subtypes. The subtypes were associated with stage, grade, and microsatellite instability. Comparison to larger datasets revealed that Lynch syndrome-associated UC shares molecular similarities with sporadic UC. In conclusion, transcriptomic and immunohistochemical profiling identifies a predominance of the urothelial-like molecular subtype in Lynch syndrome and reveals that the molecular subtypes of sporadic bladder cancer are relevant also within this hereditary, mismatch-repair defective subset. © 2018 The Authors. Published by FEBS Press and John Wiley & Sons Ltd.
Divergent transcriptional profiles in pediatric asthma patients of low and high socioeconomic status.

PubMed

Miller, Gregory E; Chen, Edith; Shalowitz, Madeleine U; Story, Rachel E; Leigh, Adam K K; Ham, Paula; Arevalo, Jesusa M G; Cole, Steve W

2018-06-01

There are marked socioeconomic disparities in pediatric asthma control, but the molecular origins of these disparities are not well understood. To fill this gap, we performed genome-wide expression profiling of monocytes and T-helper cells from pediatric asthma patients of lower and higher socioeconomic status (SES). Ninety-nine children with asthma participated in a cross-sectional assessment. Of these children, 87% were atopic, and most had disease of mild (54%) or moderate (29%) severity. Children were from lower-SES (n = 49; household income <$50 000) or higher-SES (n = 50; household income >$140 000) families. Peripheral blood monocytes and T-helper cells were isolated for genome-wide expression profiling of mRNA. Lower-SES children had worse asthma quality of life relative to higher-SES children, by both their own and their parents' reports. Although the groups had similar disease severity and potential confounds were controlled, their transcriptional profiles differed notably. The monocytes of lower-SES children showed transcriptional indications of up-regulated anti-microbial and pro-inflammatory activity. The T-helper cells of lower-SES children also had comparatively reduced expression of genes encoding γ-interferon and tumor necrosis factor-α, cytokines that orchestrate Type 1 responses. They also showed up-regulated activity of transcription factors that polarize cells towards Type 2 responses and promote Th17 cell maturation. Collectively, these patterns implicate pro-inflammatory monocytes and Type 2 cytokine activity as mechanisms contributing to worse asthma control among lower-SES children. © 2018 Wiley Periodicals, Inc.
Comprehensive Genomic Analysis and Expression Profiling of Phospholipase C Gene Family during Abiotic Stresses and Development in Rice

PubMed Central

Singh, Amarjeet; Kanwar, Poonam; Pandey, Amita; Tyagi, Akhilesh K.; Sopory, Sudhir K.; Kapoor, Sanjay; Pandey, Girdhar K.

2013-01-01

Background: Phospholipase C (PLC) is one of the major lipid hydrolysing enzymes, implicated in lipid mediated signaling. PLCs have been found to play a significant role in abiotic stress triggered signaling and developmental processes in various plant species. Genome wide identification and expression analysis have been carried out for this gene family in Arabidopsis, yet not much has been accomplished in the crop plant rice. Methodology/Principal Findings: An exhaustive in-silico exploration of the rice genome using various online databases and tools resulted in the identification of nine PLC encoding genes. Based on sequence, motif and phylogenetic analysis, the rice PLC gene family could be divided into phosphatidylinositol-specific PLCs (PI-PLCs) and phosphatidylcholine-PLCs (PC-PLC or NPC) classes with four and five members, respectively. A comparative analysis revealed that PLCs are conserved in Arabidopsis (dicots) and rice (monocot) at gene structure and protein level but they might have evolved through a separate evolutionary path. Transcript profiling using gene chip microarray and quantitative RT-PCR showed that most of the PLC members were expressed significantly and differentially under abiotic stresses (salt, cold and drought) and during various developmental stages with condition/stage specific and overlapping expression. This finding suggested an important role of different rice PLC members in abiotic stress triggered signaling and plant development, which was also supported by the presence of relevant cis-regulatory elements in their promoters. Sub-cellular localization of a few selected PLC members in Nicotiana benthamiana and onion epidermal cells has provided a clue about their site of action and functional behaviour. Conclusion/Significance: The genome wide identification, structural and expression analysis and knowledge of sub-cellular localization of PLC gene family envisage the functional characterization of these genes in crop plants in the near future. PMID:23638098
Genome-wide analysis of WRKY gene family in Cucumis sativus

PubMed Central

2011-01-01

Background: WRKY proteins are a large family of transcriptional regulators in higher plants. They are involved in many biological processes, such as plant development, metabolism, and responses to biotic and abiotic stresses. Prior to the present study, only one full-length cucumber WRKY protein had been reported. The recent publication of the draft genome sequence of cucumber allowed us to conduct a genome-wide search for cucumber WRKY proteins, and to compare these positively identified proteins with their homologs in model plants, such as Arabidopsis. Results: We identified a total of 55 WRKY genes in the cucumber genome. According to structural features of their encoded proteins, the cucumber WRKY (CsWRKY) genes were classified into three groups (group 1-3). Analysis of expression profiles of CsWRKY genes indicated that 48 WRKY genes display differential expression either in their transcript abundance or in their expression patterns under normal growth conditions, and 23 WRKY genes were differentially expressed in response to at least one abiotic stress (cold, drought or salinity). The expression profiles of stress-inducible CsWRKY genes were correlated with those of their putative Arabidopsis WRKY (AtWRKY) orthologs, except for the group 3 WRKY genes. Interestingly, duplicated group 3 AtWRKY genes appear to have been under positive selection pressure during evolution. In contrast, there was no evidence of recent gene duplication or positive selection pressure among CsWRKY group 3 genes, which may have led to the expressional divergence of group 3 orthologs. Conclusions: Fifty-five WRKY genes were identified in cucumber and the structure of their encoded proteins, their expression, and their evolution were examined. Considering that there has been extensive expansion of group 3 WRKY genes in angiosperms, the occurrence of different evolutionary events could explain the functional divergence of these genes. PMID:21955985

Comprehensive genomic analysis and expression profiling of phospholipase C gene family during abiotic stresses and development in rice.

PubMed

Singh, Amarjeet; Kanwar, Poonam; Pandey, Amita; Tyagi, Akhilesh K; Sopory, Sudhir K; Kapoor, Sanjay; Pandey, Girdhar K

2013-01-01

Phospholipase C (PLC) is one of the major lipid hydrolysing enzymes, implicated in lipid mediated signaling. PLCs have been found to play a significant role in abiotic stress triggered signaling and developmental processes in various plant species. Genome wide identification and expression analysis have been carried out for this gene family in Arabidopsis, yet not much has been accomplished in the crop plant rice. An exhaustive in-silico exploration of the rice genome using various online databases and tools resulted in the identification of nine PLC encoding genes. Based on sequence, motif and phylogenetic analysis, the rice PLC gene family could be divided into phosphatidylinositol-specific PLCs (PI-PLCs) and phosphatidylcholine-PLCs (PC-PLC or NPC) classes with four and five members, respectively. A comparative analysis revealed that PLCs are conserved in Arabidopsis (dicots) and rice (monocot) at gene structure and protein level but they might have evolved through a separate evolutionary path. Transcript profiling using gene chip microarray and quantitative RT-PCR showed that most of the PLC members were expressed significantly and differentially under abiotic stresses (salt, cold and drought) and during various developmental stages with condition/stage specific and overlapping expression. This finding suggested an important role of different rice PLC members in abiotic stress triggered signaling and plant development, which was also supported by the presence of relevant cis-regulatory elements in their promoters. Sub-cellular localization of a few selected PLC members in Nicotiana benthamiana and onion epidermal cells has provided a clue about their site of action and functional behaviour. The genome wide identification, structural and expression analysis and knowledge of sub-cellular localization of PLC gene family envisage the functional characterization of these genes in crop plants in the near future.
Metabolomic profiling and genomic analysis of wheat aneuploid lines to identify genes controlling biochemical pathways in mature grain.

PubMed

Francki, Michael G; Hayton, Sarah; Gummer, Joel P A; Rawlinson, Catherine; Trengove, Robert D

2016-02-01

Metabolomics is becoming an increasingly important tool in plant genomics to decipher the function of genes controlling biochemical pathways responsible for trait variation. Although theoretical models can integrate genes and metabolites for trait variation, biological networks require validation using appropriate experimental genetic systems. In this study, we applied an untargeted metabolite analysis to mature grain of wheat homoeologous group 3 ditelosomic lines, selected compounds that showed significant variation between the wheat line Chinese Spring and at least one ditelosomic line, tracked the genes encoding enzymes of their biochemical pathway using the wheat genome survey sequence and determined the genetic components underlying metabolite variation. A total of 412 analytes were resolved in the wheat grain metabolome, and principal component analysis indicated significant differences in metabolite profiles between Chinese Spring and each ditelosomic line. The grain metabolome identified 55 compounds positively matched against a mass spectral library, the majority of which showed significant differences between Chinese Spring and at least one ditelosomic line. Trehalose and branched-chain amino acids were selected for detailed investigation, and it was expected that if genes encoding enzymes directly related to their biochemical pathways were located on homoeologous group 3 chromosomes, then corresponding ditelosomic lines would have a significant reduction in metabolites compared with Chinese Spring. Although a proportion showed a reduction, some lines showed significant increases in metabolites, indicating that genes directly and indirectly involved in biosynthetic pathways likely regulate the metabolome. Therefore, this study demonstrated that wheat aneuploid lines are a suitable experimental genetic system to validate metabolomics-genomics networks. © 2015 Society for Experimental Biology, Association of Applied Biologists and John Wiley & Sons Ltd.
De novo assembled expressed gene catalog of a fast-growing Eucalyptus tree produced by Illumina mRNA-Seq

PubMed Central

2010-01-01

Background: De novo assembly of transcript sequences produced by short-read DNA sequencing technologies offers a rapid approach to obtain expressed gene catalogs for non-model organisms. A draft genome sequence will be produced in 2010 for a Eucalyptus tree species (E. grandis) representing the most important hardwood fibre crop in the world. Genome annotation of this valuable woody plant and genetic dissection of its superior growth and productivity will be greatly facilitated by the availability of a comprehensive collection of expressed gene sequences from multiple tissues and organs. Results: We present an extensive expressed gene catalog for a commercially grown E. grandis × E. urophylla hybrid clone constructed using only Illumina mRNA-Seq technology and de novo assembly. A total of 18,894 transcript-derived contigs, a large proportion of which represent full-length protein coding genes, were assembled and annotated. Analysis of assembly quality, length and diversity shows that this dataset represents the most comprehensive expressed gene catalog for any Eucalyptus tree. mRNA-Seq analysis furthermore allowed digital expression profiling of all of the assembled transcripts across diverse xylogenic and non-xylogenic tissues, which is invaluable for ascribing putative gene functions. Conclusions: De novo assembly of Illumina mRNA-Seq reads is an efficient approach for transcriptome sequencing and profiling in Eucalyptus and other non-model organisms. The transcriptome resource (Eucspresso, http://eucspresso.bi.up.ac.za/) generated by this study will be of value for genomic analysis of woody biomass production in Eucalyptus and for comparative genomic analysis of growth and development in woody and herbaceous plants. PMID:21122097
Genome-wide analysis of WRKY gene family in Cucumis sativus.

PubMed

Ling, Jian; Jiang, Weijie; Zhang, Ying; Yu, Hongjun; Mao, Zhenchuan; Gu, Xingfang; Huang, Sanwen; Xie, Bingyan

2011-09-28

WRKY proteins are a large family of transcriptional regulators in higher plants. They are involved in many biological processes, such as plant development, metabolism, and responses to biotic and abiotic stresses. Prior to the present study, only one full-length cucumber WRKY protein had been reported. The recent publication of the draft genome sequence of cucumber allowed us to conduct a genome-wide search for cucumber WRKY proteins, and to compare these positively identified proteins with their homologs in model plants, such as Arabidopsis. We identified a total of 55 WRKY genes in the cucumber genome. According to structural features of their encoded proteins, the cucumber WRKY (CsWRKY) genes were classified into three groups (group 1-3). Analysis of expression profiles of CsWRKY genes indicated that 48 WRKY genes display differential expression either in their transcript abundance or in their expression patterns under normal growth conditions, and 23 WRKY genes were differentially expressed in response to at least one abiotic stress (cold, drought or salinity). The expression profiles of stress-inducible CsWRKY genes were correlated with those of their putative Arabidopsis WRKY (AtWRKY) orthologs, except for the group 3 WRKY genes. Interestingly, duplicated group 3 AtWRKY genes appear to have been under positive selection pressure during evolution. In contrast, there was no evidence of recent gene duplication or positive selection pressure among CsWRKY group 3 genes, which may have led to the expressional divergence of group 3 orthologs. Fifty-five WRKY genes were identified in cucumber and the structure of their encoded proteins, their expression, and their evolution were examined. Considering that there has been extensive expansion of group 3 WRKY genes in angiosperms, the occurrence of different evolutionary events could explain the functional divergence of these genes.
A MBD-seq protocol for large-scale methylome-wide studies with (very) low amounts of DNA.

PubMed

Aberg, Karolina A; Chan, Robin F; Shabalin, Andrey A; Zhao, Min; Turecki, Gustavo; Staunstrup, Nicklas Heine; Starnawska, Anna; Mors, Ole; Xie, Lin Y; van den Oord, Edwin Jcg

2017-09-01

We recently showed that, after optimization, our methyl-CpG binding domain sequencing (MBD-seq) application approximates the methylome-wide coverage obtained with whole-genome bisulfite sequencing (WGB-seq), but at a cost that enables adequately powered large-scale association studies. A prior drawback of MBD-seq is the relatively large amount of genomic DNA (ideally >1 µg) required to obtain high-quality data. Biomaterials are typically expensive to collect, provide a finite amount of DNA, and may simply not yield sufficient starting material. The ability to use low amounts of DNA will increase the breadth and number of studies that can be conducted. Therefore, we further optimized the enrichment step. With this low starting material protocol, MBD-seq performed as well as, or better than, the protocol requiring ample starting material (>1 µg). Using only 15 ng of DNA as input, there is minimal loss in data quality, achieving 93% of the coverage of WGB-seq (with standard amounts of input DNA) at similar false-positive rates. Furthermore, across a large number of genomic features, the MBD-seq methylation profiles closely tracked those observed for WGB-seq with even slightly larger effect sizes. This suggests that MBD-seq provides similar information about the methylome and classifies methylation status somewhat more accurately. Performance decreases with <15 ng DNA as starting material but, even with as little as 5 ng, MBD-seq still achieves 90% of the coverage of WGB-seq with comparable genome-wide methylation profiles. Thus, the proposed protocol is an attractive option for adequately powered and cost-effective methylome-wide investigations using (very) low amounts of DNA.
Predicting Survival within the Lung Cancer Histopathological Hierarchy Using a Multi-Scale Genomic Model of Development

PubMed Central

Liu, Hongye; Kho, Alvin T; Kohane, Isaac S; Sun, Yao

2006-01-01

Background: The histopathologic heterogeneity of lung cancer remains a significant confounding factor in its diagnosis and prognosis, spurring numerous recent efforts to find a molecular classification of the disease that has clinical relevance. Methods and Findings: Molecular profiles of tumors from 186 patients representing four different lung cancer subtypes (and 17 normal lung tissue samples) were compared with a mouse lung development model using principal component analysis in both temporal and genomic domains. An algorithm for the classification of lung cancers using a multi-scale developmental framework was developed. Kaplan–Meier survival analysis was conducted for lung adenocarcinoma patient subgroups identified via their developmental association. We found multi-scale genomic similarities between four human lung cancer subtypes and the developing mouse lung that are prognostically meaningful. Significant association was observed between the localization of human lung cancer cases along the principal mouse lung development trajectory and the corresponding patient survival rate at three distinct levels of classical histopathologic resolution: among different lung cancer subtypes, among patients within the adenocarcinoma subtype, and within the stage I adenocarcinoma subclass. The earlier the genomic association between a human tumor profile and the mouse lung development sequence, the poorer the patient's prognosis. Furthermore, decomposing this principal lung development trajectory identified a gene set that was significantly enriched for pyrimidine metabolism and cell-adhesion functions specific to lung development and oncogenesis. Conclusions: From a multi-scale disease modeling perspective, the molecular dynamics of murine lung development provide an effective framework that is not only data driven but also informed by the biology of development for elucidating the mechanisms of human lung cancer biology and its clinical outcome. PMID:16800721
Discovering time-lagged rules from microarray data using gene profile classifiers

PubMed Central

2011-01-01

Background: Gene regulatory networks have an essential role in every process of life. In this regard, genome-wide time series data are becoming increasingly available, providing the opportunity to discover the time-delayed gene regulatory networks that govern the majority of these molecular processes. Results: This paper aims at reconstructing gene regulatory networks from multiple genome-wide microarray time series datasets. To this end, a new model-free algorithm called GRNCOP2 (Gene Regulatory Network inference by Combinatorial OPtimization 2), which is a significant evolution of the GRNCOP algorithm, was developed using combinatorial optimization of gene profile classifiers. The method is capable of inferring potential time-delay relationships with any span of time between genes from various time series datasets given as input. The proposed algorithm was applied to time series data composed of twenty yeast genes that are highly relevant for the cell-cycle study, and the results were compared against several related approaches. The outcomes have shown that GRNCOP2 outperforms the contrasted methods in terms of the proposed metrics, and that the results are consistent with previous biological knowledge. Additionally, a genome-wide study on multiple publicly available time series data was performed. In this case, the experimentation has exhibited the soundness and scalability of the new method, which inferred highly-related statistically-significant gene associations. Conclusions: A novel method for inferring time-delayed gene regulatory networks from genome-wide time series datasets is proposed in this paper. The method was carefully validated with several publicly available data sets. The results have demonstrated that the algorithm constitutes a usable model-free approach capable of predicting meaningful relationships between genes, revealing the time-trends of gene regulation. PMID:21524308
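The general idea of a time-lagged rule between two gene profiles can be sketched as follows. This is not the GRNCOP2 algorithm, whose details the abstract does not give; it is a simplified stand-in that assumes each profile is discretized at its own mean and scores a rule by how often the regulator's state at time t matches the target's state at time t + lag. All names and data are hypothetical.

```python
def binarize(series):
    """Discretize a profile: 1 where expression exceeds the series mean, else 0."""
    mean = sum(series) / len(series)
    return [1 if x > mean else 0 for x in series]


def lagged_agreement(regulator, target, lag):
    """Fraction of time points where regulator[t] and target[t + lag] share a state."""
    r, t = binarize(regulator), binarize(target)
    pairs = list(zip(r[:len(r) - lag], t[lag:]))
    return sum(1 for a, b in pairs if a == b) / len(pairs)


def best_lag(regulator, target, max_lag):
    """Lag in 0..max_lag with the highest agreement (smallest lag wins ties)."""
    return max(range(max_lag + 1),
               key=lambda lag: lagged_agreement(regulator, target, lag))


# Hypothetical profiles: gene_b repeats gene_a's pattern one time step later.
gene_a = [1, 5, 5, 1, 1, 5, 5, 1]
gene_b = [1, 1, 5, 5, 1, 1, 5, 5]

lag = best_lag(gene_a, gene_b, max_lag=2)
score = lagged_agreement(gene_a, gene_b, lag)
```

A real method would also need to handle repression (inverse agreement), multiple datasets, and statistical significance, none of which this sketch attempts.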
Genome-wide organization and expression profiling of the R2R3-MYB transcription factor family in pineapple (Ananas comosus).

PubMed

Liu, Chaoyang; Xie, Tao; Chen, Chenjie; Luan, Aiping; Long, Jianmei; Li, Chuhao; Ding, Yaqi; He, Yehua

2017-07-01

The MYB proteins comprise one of the largest families of plant transcription factors, which are involved in various plant physiological and biochemical processes. Pineapple (Ananas comosus) is one of the three most important tropical fruits worldwide. The completion of pineapple genome sequencing provides a great opportunity to investigate the organization and evolutionary traits of pineapple MYB genes at the genome-wide level. In the present study, a total of 94 pineapple R2R3-MYB genes were identified and further phylogenetically classified into 26 subfamilies, as supported by the conserved gene structures and motif composition. Collinearity analysis indicated that segmental duplication events played a crucial role in the expansion of the pineapple MYB gene family. Further comparative phylogenetic analysis suggested that there have been functional divergences of the MYB gene family during plant evolution. RNA-seq data from different tissues and developmental stages revealed distinct temporal and spatial expression profiles of the AcMYB genes. Further quantitative expression analysis showed the specific expression patterns of the selected putative stress-related AcMYB genes in response to distinct abiotic stress and hormonal treatments. The comprehensive expression analysis of the pineapple MYB genes, especially the tissue-preferential and stress-responsive genes, could provide valuable clues for further functional characterization. In this work, we systematically identified AcMYB genes by analyzing the pineapple genome sequence using a set of bioinformatics approaches. Our findings provide a global insight into the organization, phylogeny and expression patterns of the pineapple R2R3-MYB genes, and hence contribute to the greater understanding of their biological roles in pineapple.
Evaluation of Signature Erosion in Ebola Virus Due to Genomic Drift and Its Impact on the Performance of Diagnostic Assays

PubMed Central

Sozhamannan, Shanmuga; Holland, Mitchell Y.; Hall, Adrienne T.; Negrón, Daniel A.; Ivancich, Mychal; Koehler, Jeffrey W.; Minogue, Timothy D.; Campbell, Catherine E.; Berger, Walter J.; Christopher, George W.; Goodwin, Bruce G.; Smith, Michael A.

2015-01-01

Genome sequence analyses of the 2014 Ebola Virus (EBOV) isolates revealed a potential problem with the diagnostic assays currently in use; i.e., drifting genomic profiles of the virus may affect the sensitivity or even produce false-negative results. We evaluated signature erosion in ebolavirus molecular assays using an in silico approach and found frequent potential false-negative and false-positive results. We further empirically evaluated many EBOV assays under real-time PCR conditions using EBOV Kikwit (1995) and Makona (2014) RNA templates. These results revealed differences in performance between assays but were comparable between the old and new EBOV templates. Using a whole genome approach and a novel algorithm, termed BioVelocity, we identified new signatures that are unique to each of EBOV, Sudan virus (SUDV), and Reston virus (RESTV). Interestingly, many of the current assay signatures do not fall within these regions, indicating a potential drawback in the past assay design strategies. The new signatures identified in this study may be evaluated with real-time reverse transcription PCR (rRT-PCR) assay development and validation. In addition, we discuss regulatory implications and timely availability to impact a rapidly evolving outbreak using existing but perhaps less than optimal assays versus redesigning these assays to address genomic changes. PMID:26090727
Evolutionary characterization of Ty3/gypsy-like LTR retrotransposons in the parasitic cestode Echinococcus granulosus.

PubMed

Bae, Young-An

2016-11-01

Cyclophyllidean cestodes including Echinococcus granulosus have a smaller genome and show characteristics such as loss of the gut, a segmented body plan, and accelerated growth rate in hosts compared with other tissue-invading helminths. In an effort to address the molecular mechanism relevant to genome shrinkage, the evolutionary status of long-terminal-repeat (LTR) retrotransposons, which are known as the most potent genomic modulators, was investigated in the E. granulosus draft genome. A majority of the E. granulosus LTR retrotransposons were classified into a novel characteristic clade, named Saci-2, of the Ty3/gypsy family, while the remaining elements belonged to the CsRn1 clade of the same family. Their nucleotide sequences were heavily corrupted by frequent base substitutions and segmental losses. The ceased mobile activity of the major retrotransposons and the subsequent intrinsic DNA loss in their inactive progenies might have contributed to the decrease in genome size. Apart from the degenerate copies, a gag gene originating from a CsRn1-like element exhibited substantial evidence suggesting its domestication, including a preserved coding profile and transcriptional activity, the presence of syntenic orthologues in cestodes, and selective pressure acting on the gene. To my knowledge, the endogenized gag gene is reported for the first time in invertebrates, though its biological function remains elusive.
Genome-Wide Identification, Characterization and Phylogenetic Analysis of ATP-Binding Cassette (ABC) Transporter Genes in Common Carp (Cyprinus carpio).

PubMed

Liu, Xiang; Li, Shangqi; Peng, Wenzhu; Feng, Shuaisheng; Feng, Jianxin; Mahboob, Shahid; Al-Ghanim, Khalid A; Xu, Peng

2016-01-01

The ATP-binding cassette (ABC) gene family is considered to be one of the largest gene families in all forms of prokaryotic and eukaryotic life. Although the ABC transporter genes have been annotated in some species, detailed information about the ABC superfamily and the evolutionary characterization of ABC genes in common carp (Cyprinus carpio) are still unclear. In this research, we identified 61 ABC transporter genes in the common carp genome. Phylogenetic analysis revealed that they could be classified into seven subfamilies, namely 11 ABCAs, six ABCBs, 19 ABCCs, eight ABCDs, two ABCEs, four ABCFs, and 11 ABCGs. Comparative analysis of the ABC genes in seven vertebrate species including common carp, showed that at least 10 common carp genes were retained from the third round of whole genome duplication, while 12 duplicated ABC genes may have come from the fourth round of whole genome duplication. Gene losses were also observed for 14 ABC genes. Expression profiles of the 61 ABC genes in six common carp tissues (brain, heart, spleen, kidney, intestine, and gill) revealed extensive functional divergence among the ABC genes. Different copies of some genes had tissue-specific expression patterns, which may indicate some gene function specialization. This study provides essential genomic resources for future studies in common carp.
EuPathDB: the eukaryotic pathogen genomics database resource

PubMed Central

Aurrecoechea, Cristina; Barreto, Ana; Basenko, Evelina Y.; Brestelli, John; Brunk, Brian P.; Cade, Shon; Crouch, Kathryn; Doherty, Ryan; Falke, Dave; Fischer, Steve; Gajria, Bindu; Harb, Omar S.; Heiges, Mark; Hertz-Fowler, Christiane; Hu, Sufen; Iodice, John; Kissinger, Jessica C.; Lawrence, Cris; Li, Wei; Pinney, Deborah F.; Pulman, Jane A.; Roos, David S.; Shanmugasundram, Achchuthan; Silva-Franco, Fatima; Steinbiss, Sascha; Stoeckert, Christian J.; Spruill, Drew; Wang, Haiming; Warrenfeltz, Susanne; Zheng, Jie

2017-01-01

The Eukaryotic Pathogen Genomics Database Resource (EuPathDB, http://eupathdb.org) is a collection of databases covering 170+ eukaryotic pathogens (protists & fungi), along with relevant free-living and non-pathogenic species, and select pathogen hosts. To facilitate the discovery of meaningful biological relationships, the databases couple preconfigured searches with visualization and analysis tools for comprehensive data mining via intuitive graphical interfaces and APIs. All data are analyzed with the same workflows, including creation of gene orthology profiles, so data are easily compared across data sets, data types and organisms. EuPathDB is updated with numerous new analysis tools, features, data sets and data types. New tools include GO, metabolic pathway and word enrichment analyses plus an online workspace for analysis of personal, non-public, large-scale data. Expanded data content is mostly genomic and functional genomic data while new data types include protein microarray, metabolic pathways, compounds, quantitative proteomics, copy number variation, and polysomal transcriptomics. New features include consistent categorization of searches, data sets and genome browser tracks; redesigned gene pages; effective integration of alternative transcripts; and a EuPathDB Galaxy instance for private analyses of a user's data. Forthcoming upgrades include user workspaces for private integration of data with existing EuPathDB data and improved integration and presentation of host–pathogen interactions. PMID:27903906
Genome-Wide Identification, Characterization and Phylogenetic Analysis of ATP-Binding Cassette (ABC) Transporter Genes in Common Carp (Cyprinus carpio)

PubMed Central

Peng, Wenzhu; Feng, Shuaisheng; Feng, Jianxin; Mahboob, Shahid; Al-Ghanim, Khalid A.

2016-01-01

The ATP-binding cassette (ABC) gene family is considered to be one of the largest gene families in all forms of prokaryotic and eukaryotic life. Although the ABC transporter genes have been annotated in some species, detailed information about the ABC superfamily and the evolutionary characterization of ABC genes in common carp (Cyprinus carpio) are still unclear. In this research, we identified 61 ABC transporter genes in the common carp genome. Phylogenetic analysis revealed that they could be classified into seven subfamilies, namely 11 ABCAs, six ABCBs, 19 ABCCs, eight ABCDs, two ABCEs, four ABCFs, and 11 ABCGs. Comparative analysis of the ABC genes in seven vertebrate species including common carp, showed that at least 10 common carp genes were retained from the third round of whole genome duplication, while 12 duplicated ABC genes may have come from the fourth round of whole genome duplication. Gene losses were also observed for 14 ABC genes. Expression profiles of the 61 ABC genes in six common carp tissues (brain, heart, spleen, kidney, intestine, and gill) revealed extensive functional divergence among the ABC genes. Different copies of some genes had tissue-specific expression patterns, which may indicate some gene function specialization. This study provides essential genomic resources for future studies in common carp. PMID:27058731
Proteomic and Epigenetic Analysis of Rice after Seed Spaceflight and Ground-Base Ion Radiations

NASA Astrophysics Data System (ADS)

Wang, Wei; Sun, Yeqing; Peng, Yuming; Zhao, Qian; Wen, Bin; Yang, Jun

Highly ionizing radiation (HZE) in space is considered the main factor causing biological effects in plant seeds. In previous work, we compared the proteomic profiles of rice plants grown after seed spaceflights to ground controls by two-dimensional difference gel electrophoresis (2-D DIGE) with mass spectrometry and found that the protein expression profiles were changed and that differentially expressed proteins participated in most of the biological processes of rice. To further evaluate the dosage effects of space radiation and compare low- and high-dose ion effects, we carried out three independent ground-base ionizing radiation experiments with different cumulative doses (low-dose range: 2~1000 mGy, high-dose range: 2000~20000 mGy) on rice seeds and performed proteomic analysis of seedlings. We found that protein expression profiles showed obvious boundaries between low- and high-dose radiation groups. Rates of differentially expressed proteins showed a dose-dependent effect, reaching the highest value at the 2000 mGy dosage point in all three radiation experiments, while proteins that responded to low-dose radiation tended to change their expression at the minimum dosage (2 mGy). Proteins participating in rice biological processes also responded differently to low- and high-dose radiation: proteins involved in energy metabolism and photosynthesis tended to be regulated after low-dose radiation, while proteins related to stress response, protein folding and cell redox homeostasis tended to change their expression after high-dose radiation. Comparing the proteomic profiles between ground-base radiations and spaceflights, we noted that ground-base low-dose ion radiation had biological effects similar to those of the space environment. In addition, we discovered that protein nucleoside diphosphate kinase 1 (NDPK1) showed obviously increased regulation after spaceflights and ion radiations. NDPK1 catalyzes nucleotide metabolism and is reported to be involved in the DNA repair process. Its expression sensitivity and specificity were confirmed by RT-PCR and western blot analysis, indicating its potential to be used as a space radiation biomarker. Space radiation might induce epigenetic effects on rice plants, especially changes in DNA methylation. Early results suggested that there were correlations between DNA methylation polymorphism and genomic mutation rates. In addition, the 5-methylcytosine located in the promoter and exon regions of coding genes could regulate gene expression and thus influence protein expression. Whether there is a correlation between genomic DNA methylation changes and the protein expression profile alterations caused by space radiation is therefore worth further investigation. We used the same rice samples treated by carbon ion radiation with different doses (0, 10, 20, 100, 200, 1000, 2000, 5000, 20000 mGy) and applied methylation sensitive amplification polymorphism (MSAP) to scan for genomic DNA methylation changes. Interestingly, DNA methylation polymorphism rates also presented a dose-dependent effect and showed the same changing trend as rates of differentially expressed proteins. Whether there are correlations between epigenetic and proteomic effects of space radiation is worth further investigation.
Targeted and genome-scale methylomics reveals gene body signatures in human cell lines

PubMed Central

Ball, Madeleine Price; Li, Jin Billy; Gao, Yuan; Lee, Je-Hyuk; LeProust, Emily; Park, In-Hyun; Xie, Bin; Daley, George Q.; Church, George M.

2012-01-01

Cytosine methylation, an epigenetic modification of DNA, is a target of growing interest for developing high throughput profiling technologies. Here we introduce two new, complementary techniques for cytosine methylation profiling utilizing next generation sequencing technology: bisulfite padlock probes (BSPPs) and methyl sensitive cut counting (MSCC). In the first method, we designed a set of ~10,000 BSPPs distributed over the ENCODE pilot project regions to take advantage of existing expression and chromatin immunoprecipitation data. We observed a pattern of low promoter methylation coupled with high gene body methylation in highly expressed genes. Using the second method, MSCC, we gathered genome-scale data for 1.4 million HpaII sites and confirmed that gene body methylation in highly expressed genes is a consistent phenomenon over the entire genome. Our observations highlight the usefulness of techniques which are not inherently or intentionally biased in favor of only profiling particular subsets like CpG islands or promoter regions. PMID:19329998
Decoherence in yeast cell populations and its implications for genome-wide expression noise.

PubMed

Briones, M R S; Bosco, F

2009-01-20

Gene expression "noise" is commonly defined as the stochastic variation of gene expression levels in different cells of the same population under identical growth conditions. Here, we tested whether this "noise" is amplified with time, as a consequence of decoherence in global gene expression profiles (genome-wide microarrays) of synchronized cells. The stochastic component of transcription causes fluctuations that tend to be amplified as time progresses, leading to a decay of correlations of expression profiles, in perfect analogy with elementary relaxation processes. Measuring decoherence, defined here as a decay in the auto-correlation function of yeast genome-wide expression profiles, we found a slowdown in the decay of correlations, opposite to what would be expected if, as in mixing systems, correlations decay exponentially as the equilibrium state is reached. Our results indicate that the populational variation in gene expression (noise) is a consequence of temporal decoherence, in which the slow decay of correlations is a signature of strong interdependence of the transcription dynamics of different genes.
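The abstract above defines decoherence as a decay in the auto-correlation function of genome-wide expression profiles but does not give the formula used. As a rough sketch under that caveat, one plausible reading is the Pearson correlation between the profile at the first time point and the profile at each later time point. The function names and toy numbers below are hypothetical and are not the authors' data or code.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def autocorrelation_curve(profiles):
    """Correlate the expression profile at time 0 with the profile at each time t.

    `profiles` is a list of time points; each time point is a list of
    expression values, one per gene, in a fixed gene order.
    """
    reference = profiles[0]
    return [pearson(reference, profile) for profile in profiles]


# Toy data (hypothetical): four genes measured at three time points.
profiles = [
    [1.0, 2.0, 3.0, 4.0],  # t = 0
    [1.0, 2.0, 3.0, 5.0],  # t = 1, still close to the starting profile
    [4.0, 1.0, 3.0, 2.0],  # t = 2, largely scrambled
]
print(autocorrelation_curve(profiles))
```

Plotting such a curve against time and comparing it with an exponential decay is the kind of check the abstract alludes to when it reports a slowdown in the decay of correlations.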
Functional genomics provides insights into the role of Propionibacterium freudenreichii ssp. shermanii JS in cheese ripening.

PubMed

Ojala, Teija; Laine, Pia K S; Ahlroos, Terhi; Tanskanen, Jarna; Pitkänen, Saara; Salusjärvi, Tuomas; Kankainen, Matti; Tynkkynen, Soile; Paulin, Lars; Auvinen, Petri

2017-01-16

Propionibacterium freudenreichii is a commercially important bacterium that is essential for the development of the characteristic eyes and flavor of Swiss-type cheeses. These bacteria grow actively and produce large quantities of flavor compounds during cheese ripening at warm temperatures but also appear to contribute to the aroma development during the subsequent cold storage of cheese. Here, we advance our understanding of the role of P. freudenreichii in cheese ripening by presenting the 2.68-Mbp annotated genome sequence of P. freudenreichii ssp. shermanii JS and determining its global transcriptional profiles during industrial cheese-making using transcriptome sequencing. The annotation of the genome identified a total of 2377 protein-coding genes and revealed the presence of enzymes and pathways for formation of several flavor compounds. Based on transcriptome profiling, the expression of 348 protein-coding genes was altered between the warm and cold room ripening of cheese. Several propionate, acetate, and diacetyl/acetoin production related genes had higher expression levels in the warm room, whereas a general slowing down of the metabolism and an activation of mobile genetic elements was seen in the cold room. A few ripening-related and amino acid catabolism involved genes were induced or remained active in the cold room, indicating that strain JS contributes to the aroma development also during cold room ripening. In addition, we performed a comparative genomic analysis of strain JS and 29 other Propionibacterium strains of 10 different species, including an isolate of both P. freudenreichii subspecies freudenreichii and shermanii. Ortholog grouping of the predicted protein sequences revealed that close to 86% of the ortholog groups of strain JS, including a variety of ripening-related ortholog groups, were conserved across the P. freudenreichii isolates. Taken together, this study contributes to the understanding of the genomic basis of P. freudenreichii and sheds light on its activities during cheese ripening. Copyright © 2016 Elsevier B.V. All rights reserved.
Effects of immunostimulation on social behavior, chemical communication and genome-wide gene expression in honey bee workers (Apis mellifera)

PubMed Central

2012-01-01

Background: Social insects, such as honey bees, use molecular, physiological and behavioral responses to combat pathogens and parasites. The honey bee genome contains all of the canonical insect immune response pathways, and several studies have demonstrated that pathogens can activate expression of immune effectors. Honey bees also use behavioral responses, termed social immunity, to collectively defend their hives from pathogens and parasites. These responses include hygienic behavior (where workers remove diseased brood) and allo-grooming (where workers remove ectoparasites from nestmates). We have previously demonstrated that immunostimulation causes changes in the cuticular hydrocarbon profiles of workers, which results in altered worker-worker social interactions. Thus, cuticular hydrocarbons may enable workers to identify sick nestmates, and adjust their behavior in response. Here, we test the specificity of behavioral, chemical and genomic responses to immunostimulation by challenging workers with a panel of different immune stimulants (saline, Sephadex beads and Gram-negative bacteria E. coli). Results: While only bacteria-injected bees elicited altered behavioral responses from healthy nestmates compared to controls, all treatments resulted in significant changes in cuticular hydrocarbon profiles. Immunostimulation caused significant changes in expression of hundreds of genes, the majority of which have not been identified as members of the canonical immune response pathways. Furthermore, several new candidate genes that may play a role in cuticular hydrocarbon biosynthesis were identified. Effects of immune challenge on expression of several genes involved in immune response, cuticular hydrocarbon biosynthesis, and the Notch signaling pathway were confirmed using quantitative real-time PCR. Finally, we identified common genes regulated by pathogen challenge in honey bees and other insects. Conclusions: These results demonstrate that honey bee genomic responses to immunostimulation are substantially broader than the previously identified canonical immune response pathways, and may mediate the behavioral changes associated with social immunity by orchestrating changes in chemical signaling. These studies lay the groundwork for future research into the genomic responses of honey bees to native honey bee parasites and pathogens. PMID:23072398
Effects of immunostimulation on social behavior, chemical communication and genome-wide gene expression in honey bee workers (Apis mellifera).

PubMed

Richard, Freddie-Jeanne; Holt, Holly L; Grozinger, Christina M

2012-10-16

Social insects, such as honey bees, use molecular, physiological and behavioral responses to combat pathogens and parasites. The honey bee genome contains all of the canonical insect immune response pathways, and several studies have demonstrated that pathogens can activate expression of immune effectors. Honey bees also use behavioral responses, termed social immunity, to collectively defend their hives from pathogens and parasites. These responses include hygienic behavior (where workers remove diseased brood) and allo-grooming (where workers remove ectoparasites from nestmates). We have previously demonstrated that immunostimulation causes changes in the cuticular hydrocarbon profiles of workers, which results in altered worker-worker social interactions. Thus, cuticular hydrocarbons may enable workers to identify sick nestmates, and adjust their behavior in response. Here, we test the specificity of behavioral, chemical and genomic responses to immunostimulation by challenging workers with a panel of different immune stimulants (saline, Sephadex beads and Gram-negative bacteria E. coli). While only bacteria-injected bees elicited altered behavioral responses from healthy nestmates compared to controls, all treatments resulted in significant changes in cuticular hydrocarbon profiles. Immunostimulation caused significant changes in expression of hundreds of genes, the majority of which have not been identified as members of the canonical immune response pathways. Furthermore, several new candidate genes that may play a role in cuticular hydrocarbon biosynthesis were identified. Effects of immune challenge on expression of several genes involved in immune response, cuticular hydrocarbon biosynthesis, and the Notch signaling pathway were confirmed using quantitative real-time PCR. Finally, we identified common genes regulated by pathogen challenge in honey bees and other insects. 
These results demonstrate that honey bee genomic responses to immunostimulation are substantially broader than the previously identified canonical immune response pathways, and may mediate the behavioral changes associated with social immunity by orchestrating changes in chemical signaling. These studies lay the groundwork for future research into the genomic responses of honey bees to native honey bee parasites and pathogens.
Implications of publicly available genomic data resources in searching for therapeutic targets of obesity and type 2 diabetes.

PubMed

Jung, Sungwon

2018-04-20

Obesity and type 2 diabetes (T2D) are two major conditions that are related to metabolic disorders and affect a large population. Although there have been significant efforts to identify their therapeutic targets, few benefits have come from comprehensive molecular profiling. This limited availability of comprehensive molecular profiling of obesity and T2D may be due to multiple challenges, as these conditions involve multiple organs and collecting tissue samples from subjects is more difficult in obesity and T2D than in other diseases, where surgical treatments are popular choices. While there is no repository of comprehensive molecular profiling data for obesity and T2D, multiple existing data resources can be utilized to cover various aspects of these conditions. This review presents studies with available genomic data resources for obesity and T2D and discusses genome-wide association studies (GWAS), a knockout (KO)-based phenotyping study, and gene expression profiles. These studies, based on their assessed coverage and characteristics, can provide insights into how such data can be utilized to identify therapeutic targets for obesity and T2D.

Measuring and Reducing Off-Target Activities of Programmable Nucleases Including CRISPR-Cas9

PubMed Central

Koo, Taeyoung; Lee, Jungjoon; Kim, Jin-Soo

2015-01-01

Programmable nucleases, which include zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and RNA-guided engineered nucleases (RGENs) repurposed from the type II clustered, regularly interspaced short palindromic repeats (CRISPR)-CRISPR-associated protein 9 (Cas9) system are now widely used for genome editing in higher eukaryotic cells and whole organisms, revolutionising almost every discipline in biological research, medicine, and biotechnology. All of these nucleases, however, induce off-target mutations at sites homologous in sequence with on-target sites, limiting their utility in many applications including gene or cell therapy. In this review, we compare methods for detecting nuclease off-target mutations. We also review methods for profiling genome-wide off-target effects and discuss how to reduce or avoid off-target mutations. PMID:25985872
Genomic Prediction of Testcross Performance in Canola (Brassica napus)

PubMed Central

Jan, Habib U.; Abbadi, Amine; Lücke, Sophie; Nichols, Richard A.; Snowdon, Rod J.

2016-01-01

Genomic selection (GS) is a modern breeding approach where genome-wide single-nucleotide polymorphism (SNP) marker profiles are simultaneously used to estimate performance of untested genotypes. In this study, the potential of genomic selection methods to predict testcross performance for hybrid canola breeding was assessed for various agronomic traits based on genome-wide marker profiles. A total of 475 genetically diverse spring-type canola pollinator lines were genotyped at 24,403 single-copy, genome-wide SNP loci. In parallel, the 950 F1 testcross combinations between the pollinators and two representative testers were evaluated for a number of important agronomic traits including seedling emergence, days to flowering, lodging, oil yield and seed yield along with essential seed quality characters including seed oil content and seed glucosinolate content. A ridge-regression best linear unbiased prediction (RR-BLUP) model was applied in combination with 500 cross-validations for each trait to predict testcross performance, both across the whole population as well as within individual subpopulations or clusters, based solely on SNP profiles. Subpopulations were determined using multidimensional scaling and K-means clustering. Genomic prediction accuracy across the whole population was highest for seed oil content (0.81) followed by oil yield (0.75) and lowest for seedling emergence (0.29). For seed yield, seed glucosinolate, lodging resistance and days to onset of flowering (DTF), prediction accuracies were 0.45, 0.61, 0.39 and 0.56, respectively. Prediction accuracies could be increased for some traits by treating subpopulations separately; a strategy which only led to moderate improvements for some traits with low heritability, like seedling emergence. No useful or consistent increase in accuracy was obtained by inclusion of a population substructure covariate in the model. 
Testcross performance prediction using genome-wide SNP markers shows considerable potential for pre-selection of promising hybrid combinations prior to resource-intensive field testing over multiple locations and years. PMID:26824924
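The abstract above names RR-BLUP but gives no implementation details. The marker-effect estimates of RR-BLUP coincide with ridge regression when the penalty equals the ratio of residual to marker-effect variance, so the idea can be sketched with a plain ridge solver. Everything below is an assumption for illustration: the penalty is fixed by hand rather than estimated from variance components, there is no intercept, and the tiny marker matrix is invented.

```python
def ridge_fit(X, y, lam):
    """Solve (X^T X + lam * I) beta = X^T y by Gaussian elimination (lam > 0)."""
    n, p = len(X), len(X[0])
    # Build the augmented matrix [X^T X + lam * I | X^T y].
    M = []
    for i in range(p):
        row = [sum(X[k][i] * X[k][j] for k in range(n)) + (lam if i == j else 0.0)
               for j in range(p)]
        row.append(sum(X[k][i] * y[k] for k in range(n)))
        M.append(row)
    # Forward elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, p):
            factor = M[r][col] / M[col][col]
            for c in range(col, p + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    beta = [0.0] * p
    for i in reversed(range(p)):
        rest = sum(M[i][j] * beta[j] for j in range(i + 1, p))
        beta[i] = (M[i][p] - rest) / M[i][i]
    return beta


def predict(X, beta):
    """Predicted trait value for each genotype: sum of marker code times marker effect."""
    return [sum(x * b for x, b in zip(row, beta)) for row in X]


# Toy data (hypothetical): 3 genotypes scored at 2 markers, one trait value each.
markers = [[1, 0], [0, 1], [1, 1]]
trait = [2.0, 3.0, 5.0]
effects = ridge_fit(markers, trait, lam=1.0)
print(effects, predict(markers, effects))
```

In a cross-validation setting like the one described, the effects would be fitted on a training subset and prediction accuracy reported as the correlation between predicted and observed values in the held-out lines.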
Genomic profiling of multiple sequentially acquired tumor metastatic sites from an “exceptional responder” lung adenocarcinoma patient reveals extensive genomic heterogeneity and novel somatic variants driving treatment response.

Cancer.gov

Biswas et al. describe an “exceptional responder” lung adenocarcinoma patient who survived with metastatic lung adenocarcinoma for 7 years while undergoing single or combination ERBB2-directed therapies. Whole-genome, whole-exome, and high-coverage ion-torrent targeted sequencing were used to demonstrate extreme genomic heterogeneity between the lung and lymph node metastatic
Microarray-based genomic profiling reveals novel genomic aberrations in follicular lymphoma which associate with patient survival and gene expression status.

PubMed

Schwaenen, Carsten; Viardot, Andreas; Berger, Hilmar; Barth, Thomas F E; Bentink, Stefan; Döhner, Hartmut; Enz, Martina; Feller, Alfred C; Hansmann, Martin-Leo; Hummel, Michael; Kestler, Hans A; Klapper, Wolfram; Kreuz, Markus; Lenze, Dido; Loeffler, Markus; Möller, Peter; Müller-Hermelink, Hans-Konrad; Ott, German; Rosolowski, Maciej; Rosenwald, Andreas; Ruf, Sandra; Siebert, Reiner; Spang, Rainer; Stein, Harald; Truemper, Lorenz; Lichter, Peter; Bentz, Martin; Wessendorf, Swen

2009-01-01

Follicular lymphoma (FL) is characterized by a large number of chromosomal aberrations. However, their exact genomic extension and involved target genes remain to be determined. For this purpose, we used array-based intermediate-high resolution genomic profiling in combination with Affymetrix gene expression analysis. Tumor specimens from 128 FL patients were analyzed for the presence of genomic aberrations and the results were correlated to clinical data sets and mRNA expression levels. In 114 (89%) of the 128 analyzed cases, a total of 688 genomic aberrations (384 gains/amplifications and 304 losses) were detected. Frequent genomic aberrations were: -1p36 (18%), +2p15 (24%), -3q (14%), -6q (25%), +7p (19%), +7q (23%), +8q (14%), -9p (16%), -11q (15%), +12q (20%), -13q (11%), -17p (16%), +18p (18%), and +18q (28%). Critical segments of these imbalances were delineated to genomic fragments with a minimum size down to 0.2 Mb. By comparison of these with mRNA gene expression data, putative candidate genes were identified. Moreover, we found that deletions affecting the tumor suppressor gene CDKN2A/B on 9p21 were detected in nontransformed FL grade I-II. For this aberration as well as for -6q25 and -6q26, an association with inferior survival was observed.
GDA, a web-based tool for Genomics and Drugs integrated analysis.

PubMed

Caroli, Jimmy; Sorrentino, Giovanni; Forcato, Mattia; Del Sal, Giannino; Bicciato, Silvio

2018-05-25

Several major screenings of genetic profiling and drug testing in cancer cell lines proved that the integration of genomic portraits and compound activities is effective in discovering new genetic markers of drug sensitivity and clinically relevant anticancer compounds. Although most genetic and drug response data are publicly available, user-friendly tools for their integrative analysis remain limited, which hampers effective exploitation of this information. Here, we present GDA, a web-based tool for Genomics and Drugs integrated Analysis that combines drug response data for >50 800 compounds with mutations and gene expression profiles across 73 cancer cell lines. Genomic and pharmacological data are integrated through a modular architecture that allows users to identify compounds active towards cancer cell lines bearing a specific genomic background and, conversely, the mutational or transcriptional status of cells responding or not responding to a specific compound. Results are presented through intuitive graphical representations and supplemented with information obtained from public repositories. As both personalized targeted therapies and drug-repurposing are gaining increasing attention, GDA represents a resource to formulate hypotheses on the interplay between genomic traits and drug response in cancer. GDA is freely available at http://gda.unimore.it/.
The HIV-1 integrase-LEDGF allosteric inhibitor MUT-A: resistance profile, impairment of virus maturation and infectivity but without influence on RNA packaging or virus immunoreactivity.

PubMed

Amadori, Céline; van der Velden, Yme Ubeles; Bonnard, Damien; Orlov, Igor; van Bel, Nikki; Le Rouzic, Erwann; Miralles, Laia; Brias, Julie; Chevreuil, Francis; Spehner, Daniele; Chasset, Sophie; Ledoussal, Benoit; Mayr, Luzia; Moreau, François; García, Felipe; Gatell, José; Zamborlini, Alessia; Emiliani, Stéphane; Ruff, Marc; Klaholz, Bruno P; Moog, Christiane; Berkhout, Ben; Plana, Montserrat; Benarous, Richard

2017-11-09

HIV-1 Integrase (IN) interacts with the cellular co-factor LEDGF/p75 and tethers the HIV preintegration complex to the host genome enabling integration. Recently a new class of IN inhibitors was described, the IN-LEDGF allosteric inhibitors (INLAIs). Designed to interfere with the IN-LEDGF interaction during integration, the major impact of these inhibitors was surprisingly found on virus maturation, causing a reverse transcription defect in target cells. Here we describe the MUT-A compound as a genuine INLAI with an original chemical structure based on a new type of scaffold, a thiophene ring. MUT-A has all the characteristics of INLAI compounds such as inhibition of IN-LEDGF/p75 interaction, IN multimerization, dual antiretroviral (ARV) activities, normal packaging of genomic viral RNA and complete Gag protein maturation. MUT-A has more potent ARV activity compared to other INLAIs previously reported, but a similar profile of resistance mutations and absence of ARV activity on SIV. HIV-1 virions produced in the presence of MUT-A were non-infectious with the formation of eccentric condensates outside of the core. In studying the immunoreactivity of these non-infectious virions, we found that inactivated HIV-1 particles were captured by anti-HIV-specific neutralizing and non-neutralizing antibodies (b12, 2G12, PGT121, 4D4, 10-1074, 10E8, VRC01) with efficiencies comparable to non-treated virus. Autologous CD4+ T lymphocyte proliferation and cytokine induction by monocyte-derived dendritic cells (MDDC) pulsed either with MUT-A-inactivated HIV or non-treated HIV were also comparable. Although strongly defective in infectivity, HIV-1 virions produced in the presence of the MUT-A INLAI have a normal protein and genomic RNA content as well as B and T cell immunoreactivities comparable to non-treated HIV-1.
These inactivated viruses might form an attractive new approach in vaccine research in an attempt to study if this new type of immunogen could elicit an immune response against HIV-1 in animal models.
DNA Methylation and Transcription Patterns in Intestinal Epithelial Cells From Pediatric Patients With Inflammatory Bowel Diseases Differentiate Disease Subtypes and Associate With Outcome.

PubMed

Howell, Kate Joanne; Kraiczy, Judith; Nayak, Komal M; Gasparetto, Marco; Ross, Alexander; Lee, Claire; Mak, Tim N; Koo, Bon-Kyoung; Kumar, Nitin; Lawley, Trevor; Sinha, Anupam; Rosenstiel, Philip; Heuschkel, Robert; Stegle, Oliver; Zilbauer, Matthias

2018-02-01

We analyzed DNA methylation patterns and transcriptomes of primary intestinal epithelial cells (IEC) of children newly diagnosed with inflammatory bowel diseases (IBD) to learn more about pathogenesis. We obtained mucosal biopsies (N = 236) collected from terminal ileum and ascending and sigmoid colons of children (median age 13 years) newly diagnosed with IBD (43 with Crohn's disease [CD], 23 with ulcerative colitis [UC]), and 30 children without IBD (controls). Patients were recruited and managed at a hospital in the United Kingdom from 2013 through 2016. We also obtained biopsies collected at later stages from a subset of patients. IECs were purified and analyzed for genome-wide DNA methylation patterns and gene expression profiles. Adjacent microbiota were isolated from biopsies and analyzed by 16S gene sequencing. We generated intestinal organoid cultures from a subset of samples and genome-wide DNA methylation analysis was performed. We found gut segment-specific differences in DNA methylation and transcription profiles of IECs from children with IBD vs controls; some were independent of mucosal inflammation. Changes in gut microbiota between IBD and control groups were not as large and were difficult to assess because of large amounts of intra-individual variation. Only IECs from patients with CD had changes in DNA methylation and transcription patterns in terminal ileum epithelium, compared with controls. Colon epithelium from patients with CD and from patients with ulcerative colitis had distinct changes in DNA methylation and transcription patterns, compared with controls. In IECs from patients with IBD, changes in DNA methylation, compared with controls, were stable over time and were partially retained in ex-vivo organoid cultures. Statistical analyses of epithelial cell profiles allowed us to distinguish children with CD or UC from controls; profiles correlated with disease outcome parameters, such as the requirement for treatment with biologic agents. 
We identified specific changes in DNA methylation and transcriptome patterns in IECs from pediatric patients with IBD compared with controls. These data indicate that IECs undergo changes during IBD development and could be involved in pathogenesis. Further analyses of primary IECs from patients with IBD could improve our understanding of the large variations in disease progression and outcomes. Copyright © 2018 AGA Institute. Published by Elsevier Inc. All rights reserved.
Transcript profiling reveals expression differences in wild-type and glabrous soybean lines

PubMed Central

2011-01-01

Background: Trichome hairs affect diverse agronomic characters such as seed weight and yield, prevent insect damage and reduce loss of water, but their molecular control has not been extensively studied in soybean. Several detailed models for trichome development have been proposed for Arabidopsis thaliana, but their applicability to important crops such as cotton and soybean is not fully known. Results: Two high-throughput transcript sequencing methods, Digital Gene Expression (DGE) Tag Profiling and RNA-Seq, were used to compare the transcriptional profiles in wild-type (cv. Clark standard, CS) and a mutant (cv. Clark glabrous, i.e., trichomeless or hairless, CG) soybean isoline that carries the dominant P1 allele. DGE data and RNA-Seq data were mapped to the cDNAs (Glyma models) predicted from the reference soybean genome, Williams 82. Extending the model length by 250 bp at both ends resulted in significantly more matches of authentic DGE tags, indicating that many of the predicted gene models are prematurely truncated at the 5' and 3' UTRs. The genome-wide comparative study of the transcript profiles of the wild-type versus mutant line revealed a number of differentially expressed genes. One gene highly expressed in wild-type soybean, Glyma04g35130, was of interest because it has high homology to the cotton gene GhRDL1, which has been identified as being involved in cotton fiber initiation and is a member of the BURP protein family. Sequence comparison of Glyma04g35130 between Williams 82 and our sequences derived from the CS and CG isolines revealed various SNPs and indels, including addition of one nucleotide C in the CG and insertion of ~60 bp in the third exon of CS that causes a frameshift mutation and premature truncation of peptides in both lines as compared to Williams 82.
Conclusion: Although not a candidate for the P1 locus, a BURP family member (Glyma04g35130) from soybean has been shown to be abundantly expressed in the CS line and very weakly expressed in the glabrous CG line. RNA-Seq and DGE data are compared and provide experimental data on the expression of predicted soybean gene models as well as an overview of the genes expressed in young shoot tips of two closely related isolines. PMID:22029708
Clinical Actionability of Comprehensive Genomic Profiling for Management of Rare or Refractory Cancers

PubMed Central

Hirshfield, Kim M.; Tolkunov, Denis; Zhong, Hua; Ali, Siraj M.; Stein, Mark N.; Murphy, Susan; Vig, Hetal; Vazquez, Alexei; Glod, John; Moss, Rebecca A.; Belyi, Vladimir; Chan, Chang S.; Chen, Suzie; Goodell, Lauri; Foran, David; Yelensky, Roman; Palma, Norma A.; Sun, James X.; Miller, Vincent A.; Stephens, Philip J.; Ross, Jeffrey S.; Kaufman, Howard; Poplin, Elizabeth; Mehnert, Janice; Tan, Antoinette R.; Bertino, Joseph R.; Aisner, Joseph; DiPaola, Robert S.

2016-01-01

Background. The frequency with which targeted tumor sequencing results will lead to implemented change in care is unclear. Prospective assessment of the feasibility and limitations of using genomic sequencing is critically important. Methods. A prospective clinical study was conducted on 100 patients with diverse-histology, rare, or poor-prognosis cancers to evaluate the clinical actionability of a Clinical Laboratory Improvement Amendments (CLIA)-certified, comprehensive genomic profiling assay (FoundationOne), using formalin-fixed, paraffin-embedded tumors. The primary objectives were to assess utility, feasibility, and limitations of genomic sequencing for genomically guided therapy or other clinical purpose in the setting of a multidisciplinary molecular tumor board. Results. Of the tumors from the 92 patients with sufficient tissue, 88 (96%) had at least one genomic alteration (average 3.6, range 0–10). Commonly altered pathways included p53 (46%), RAS/RAF/MAPK (rat sarcoma; rapidly accelerated fibrosarcoma; mitogen-activated protein kinase) (45%), receptor tyrosine kinases/ligand (44%), PI3K/AKT/mTOR (phosphatidylinositol-4,5-bisphosphate 3-kinase; protein kinase B; mammalian target of rapamycin) (35%), transcription factors/regulators (31%), and cell cycle regulators (30%). Many low frequency but potentially actionable alterations were identified in diverse histologies. Use of comprehensive profiling led to implementable clinical action in 35% of tumors with genomic alterations, including genomically guided therapy, diagnostic modification, and trigger for germline genetic testing. Conclusion. Use of targeted next-generation sequencing in the setting of an institutional molecular tumor board led to implementable clinical action in more than one third of patients with rare and poor-prognosis cancers. Major barriers to implementation of genomically guided therapy were clinical status of the patient and drug access. 
Early and serial sequencing in the clinical course and expanded access to genomically guided early-phase clinical trials and targeted agents may increase actionability. Implications for Practice: Identification of key factors that facilitate use of genomic tumor testing results and implementation of genomically guided therapy may lead to enhanced benefit for patients with rare or difficult to treat cancers. Clinical use of a targeted next-generation sequencing assay in the setting of an institutional molecular tumor board led to implementable clinical action in over one third of patients with rare and poor prognosis cancers. The major barriers to implementation of genomically guided therapy were clinical status of the patient and drug access both on trial and off label. Approaches to increase actionability include early and serial sequencing in the clinical course and expanded access to genomically guided early phase clinical trials and targeted agents. PMID:27566247
WebaCGH: an interactive online tool for the analysis and display of array comparative genomic hybridisation data.

PubMed

Frankenberger, Casey; Wu, Xiaolin; Harmon, Jerry; Church, Deanna; Gangi, Lisa M; Munroe, David J; Urzúa, Ulises

2006-01-01

Gene copy number variations occur both in normal cells and in numerous pathologies including cancer and developmental diseases. Array comparative genomic hybridisation (aCGH) is an emerging technology that allows detection of chromosomal gains and losses in a high-resolution format. When aCGH is performed on cDNA and oligonucleotide microarrays, the impact of DNA copy number on gene transcription profiles may be directly compared. We have created an online software tool, WebaCGH, that functions to (i) upload aCGH and gene transcription results from multiple experiments; (ii) identify significant aberrant regions using a local Z-score threshold in user-selected chromosomal segments subjected to smoothing with moving averages; and (iii) display results in a graphical format with full genome and individual chromosome views. In the individual chromosome display, data can be zoomed in/out in both dimensions (i.e. ratio and physical location) and plotted features can have 'mouse over' linking to outside databases to identify loci of interest. Uploaded data can be stored indefinitely for subsequent retrieval and analysis. WebaCGH was created as a Java-based web application using the open-source database MySQL. WebaCGH is freely accessible at http://129.43.22.27/WebaCGH/welcome.htm. Contact: Xiaolin Wu (forestwu@mail.nih.gov) or Ulises Urzúa (uurzua@med.uchile.cl).
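The aberration-calling step in (ii) can be illustrated with a minimal Python sketch. This is not the WebaCGH implementation: the window size, the Z-score threshold of 2.0, the function names, and the toy log-ratio values are all assumptions chosen for illustration, and "local" is taken here to mean that the Z-score uses the mean and standard deviation of the selected segment only.

```python
from statistics import mean, pstdev

def moving_average(values, window=3):
    """Centered moving average; the window is truncated at the segment edges."""
    half = window // 2
    return [mean(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]

def call_aberrations(log_ratios, window=3, z_threshold=2.0):
    """Label each probe of one chromosomal segment as gain, loss, or normal.

    Z-scores are computed from the mean and standard deviation of the
    smoothed values inside this segment only (an assumption of this sketch).
    """
    smoothed = moving_average(log_ratios, window)
    mu, sigma = mean(smoothed), pstdev(smoothed)
    if sigma == 0:  # flat segment: nothing stands out
        return ["normal"] * len(smoothed)
    calls = []
    for value in smoothed:
        z = (value - mu) / sigma
        if z >= z_threshold:
            calls.append("gain")
        elif z <= -z_threshold:
            calls.append("loss")
        else:
            calls.append("normal")
    return calls

# Hypothetical log2 ratios: a short amplified stretch in a flat background.
segment = [0.0] * 10 + [1.5, 1.6, 1.4] + [0.0] * 10
print(call_aberrations(segment))
```

Smoothing before thresholding trades resolution for robustness: a wider window suppresses single-probe noise but blurs the boundaries of short aberrations.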
Synteny analysis of genes and distribution of loci controlling oil content and fatty acid profile based on QTL alignment map in Brassica napus.

PubMed

Raboanatahiry, Nadia; Chao, Hongbo; Guo, Liangxing; Gan, Jianping; Xiang, Jun; Yan, Mingli; Zhang, Libin; Yu, Longjiang; Li, Maoteng

2017-10-12

Deciphering the genetic architecture of a species is a good way to understand its evolutionary history and to tailor its profile for breeding elite cultivars with desirable traits. Aligning QTLs from diverse populations in one map, and using that map both for comparison and as a basis for multiple analyses, provides stronger evidence for understanding the genetic system related to a given phenotype. In this study, 439 genes involved in fatty acid (FA) and triacylglycerol (TAG) biosynthesis were identified in Brassica napus. The B. napus genome showed mixed gene loss and insertion compared to B. rapa and B. oleracea, and the C genome had more inserted genes. Identified QTLs for oil content (OC-QTLs) and fatty acids (FA-QTLs) from nine reported populations were projected on the physical map of the reference genome "Darmor-bzh" to generate a map. Thus, 335 FA-QTLs and OC-QTLs could be highlighted, and 82 QTLs were overlapping. Chromosome C3 contained 22 overlapping QTLs covering all traits studied except C18:3. In total, 218 candidate genes potentially involved in FA and TAG biosynthesis were identified in 162 QTL confidence intervals, and some of them might affect many traits. Also, 76 of these candidate genes were found inside 57 overlapping QTLs, and candidate genes for oil content were in the majority (61/76 genes). Sixteen genes were found in overlapping QTLs involving three populations, and the remaining 60 genes were found in overlapping QTLs of two populations. Interaction network and pathway analysis of these candidate genes indicated ten genes that might have a strong influence over the other genes that control fatty acid and oil formation. The present results provide new information on the genetic basis of FA and TAG formation in B. napus. A map including QTLs from numerous populations was built, which could serve as a reference for studying the genome profile of B. napus, and new potential genes emerged which might affect seed oil.
New useful leads were shown for the selection of populations and/or genes of interest for breeding improvement.
Genome-wide sequence variations between wild and cultivated tomato species revisited by whole genome sequence mapping.

PubMed

Sahu, Kamlesh Kumar; Chattopadhyay, Debasis

2017-06-02

Cultivated tomato (Solanum lycopersicum L.) is the second most important vegetable crop after potato and one of thirteen interfertile species of the genus Solanum. Domestication and continuous selection for desirable traits made the cultivated tomato species susceptible to many stresses as compared to the wild species. In this study, we analyzed and compared the genomes of wild and cultivated tomato accessions to identify the genomic regions that encountered changes during domestication. Analysis was based on SNP and InDel mining of twenty-nine accessions of twelve wild tomato species and forty accessions of cultivated tomato. The percentage of common SNPs among the accessions within a species corresponded with the reproductive behavior of the species. SNP profiles of the wild tomato species within a phylogenetic subsection varied with their geographical distribution. Interestingly, the ratio of genic SNPs to total SNPs increased with phylogenetic distance of the wild tomato species from the domesticated species, suggesting that variations in gene-coding regions play a major role in speciation. We retrieved 2439 physical positions in 1594 genes, including 32 resistance-related genes, where all the wild accessions possessed a common wild variant allele different from all the cultivated accessions studied. Tajima's D analysis predicted a very strong purifying selection associated with domestication in nearly 1% of its genome, half of which is contributed by chromosome 11. This genomic region with a low Tajima's D value hosts a variety of genes associated with important agronomic traits such as fruit size, tiller number and wax deposition. Our analysis revealed a broad-spectrum genetic base in wild tomato species and erosion of that in cultivated tomato due to recurrent selection for agronomically important traits.
Identification of the common wild variant alleles and the genomic regions undergoing purifying selection during cultivation would facilitate future breeding programs through introgression from wild species.
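For readers unfamiliar with the statistic, the following Python sketch computes Tajima's D for a small alignment using the standard formula, which compares the mean number of pairwise differences with the number of segregating sites. It is not the pipeline used in the study (whose windowing and software are not described here); the function name and the toy sequences in the usage note are hypothetical.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(sequences):
    """Tajima's D for a list of equal-length aligned sequences (n >= 4)."""
    n = len(sequences)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    # S: number of segregating (polymorphic) alignment columns.
    s = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if s == 0:
        raise ValueError("no segregating sites; D is undefined")
    # pi: mean number of pairwise differences between sequences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants that depend only on the sample size n.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Difference of the two diversity estimates, scaled by its standard deviation.
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))
```

In this sketch, a toy alignment in which every variant is a singleton, such as `["AAAA", "TAAA", "ATAA", "AATA"]`, gives a negative value (an excess of rare variants), whereas variants at intermediate frequency, such as `["AAA", "AAA", "TTT", "TTT"]`, give a positive value.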
Genomic Confirmation of Hybridisation and Recent Inbreeding in a Vector-Isolated Leishmania Population

PubMed Central

Smith, Barbara A.; Imamura, Hideo; Sanders, Mandy; Svobodova, Milena; Volf, Petr; Berriman, Matthew; Cotton, James A.; Smith, Deborah F.

2014-01-01

Although asexual reproduction via clonal propagation has been proposed as the principal reproductive mechanism across parasitic protozoa of the Leishmania genus, sexual recombination has long been suspected, based on hybrid marker profiles detected in field isolates from different geographical locations. The recent experimental demonstration of a sexual cycle in Leishmania within sand flies has confirmed the occurrence of hybridisation, but knowledge of the parasite life cycle in the wild still remains limited. Here, we use whole genome sequencing to investigate the frequency of sexual reproduction in Leishmania, by sequencing the genomes of 11 Leishmania infantum isolates from sand flies and 1 patient isolate in a focus of cutaneous leishmaniasis in the Çukurova province of southeast Turkey. This is the first genome-wide examination of a vector-isolated population of Leishmania parasites. A genome-wide pattern of patchy heterozygosity and SNP density was observed both within individual strains and across the whole group. Comparisons with other Leishmania donovani complex genome sequences suggest that these isolates are derived from a single cross of two diverse strains with subsequent recombination within the population. This interpretation is supported by a statistical model of the genomic variability for each strain compared to the L. infantum reference genome strain as well as genome-wide scans for recombination within the population. Further analysis of these heterozygous blocks indicates that the two parents were phylogenetically distinct. Patterns of linkage disequilibrium indicate that this population reproduced primarily clonally following the original hybridisation event, but that some recombination also occurred. This observation allowed us to estimate the relative rates of sexual and asexual reproduction within this population, to our knowledge the first quantitative estimate of these events during the Leishmania life cycle. PMID:24453988
Rice Phospholipase A Superfamily: Organization, Phylogenetic and Expression Analysis during Abiotic Stresses and Development

PubMed Central

Singh, Amarjeet; Baranwal, Vinay; Shankar, Alka; Kanwar, Poonam; Ranjan, Rajeev; Yadav, Sandeep; Pandey, Amita; Kapoor, Sanjay; Pandey, Girdhar K.

2012-01-01

Background: Phospholipase A (PLA) is an important group of enzymes responsible for phospholipid hydrolysis in lipid signaling. PLAs have been implicated in abiotic stress signaling and developmental events in various plant species. Genome-wide analysis of the PLA superfamily has been carried out in the dicot plant Arabidopsis. A comprehensive genome-wide analysis of PLAs has not yet been presented for the crop plant rice. Methodology/Principal Findings: A comprehensive bioinformatics analysis identified a total of 31 PLA-encoding genes in the rice genome, which are divided into three classes based on their sequences and phylogeny: phospholipase A1 (PLA1), patatin-like phospholipases (pPLA) and low molecular weight secretory phospholipase A2 (sPLA2). A subset of 10 rice PLAs exhibited chromosomal duplication, emphasizing the role of duplication in the expansion of this gene family in rice. Microarray expression profiling revealed a number of PLA members expressing differentially and significantly under abiotic stresses and reproductive development. Comparative expression analysis with Arabidopsis PLAs revealed a high degree of functional conservation between the orthologs in the two plant species, which also indicated the vital role of PLAs in stress signaling and plant development across different plant species. Moreover, sub-cellular localization of a few candidates suggests their differential localization and functional role in lipid signaling. Conclusion/Significance: The comprehensive analysis and expression profiling would provide a critical platform for the functional characterization of the candidate PLA genes in crop plants. PMID:22363522
Social genomics of healthy and disordered internet gaming.

PubMed

Snodgrass, Jeffrey G; Dengah Ii, H J François; Lacy, Michael G; Else, Robert J; Polzer, Evan R; Arevalo, Jesusa M G; Cole, Steven W

2018-06-20

To combine social genomics with cultural approaches to expand understandings of the somatic health dynamics of online gaming, including in the controversial nosological construct of internet gaming disorder (IGD). In blood samples from 56 U.S. gamers, we examined expression of the conserved transcriptional response to adversity (CTRA), a leukocyte gene expression profile activated by chronic stress. We compared positively engaged and problem gamers, as identified by an ethnographically developed measure, the Positive and Negative Gaming Experiences Scale (PNGE-42), and also by a clinically derived IGD scale (IGDS-SF9). CTRA profiles showed a clear relationship with PNGE-42, with a substantial linkage to offline social support, but were not meaningfully associated with disordered play as measured by IGDS-SF9. Our study advances understanding of the psychobiology of play, demonstrating via novel transcriptomic methods the association of negatively experienced internet play with biological measures of chronic threat, uncertainty, and distress. Our findings are consistent with the view that problematic patterns of online gaming are a proxy for broader patterns of biopsychosocial stress and distress such as loneliness, rather than a psychiatric disorder sui generis, which might exist apart from gamers' other life problems. By confirming the biological correlates of certain patterns of internet gaming, culturally-sensitive genomics approaches such as this can inform both evolutionary theorizing regarding the nature of play, as well as current psychiatric debates about the appropriateness of modeling distressful gaming on substance addiction and problem gambling. © 2018 Wiley Periodicals, Inc.
Transcriptional profiling of rat skeletal muscle hypertrophy under restriction of blood flow.

PubMed

Xu, Shouyu; Liu, Xueyun; Chen, Zhenhuang; Li, Gaoquan; Chen, Qin; Zhou, Guoqing; Ma, Ruijie; Yao, Xinmiao; Huang, Xiao

2016-12-15

Blood flow restriction (BFR) under low-intensity resistance training (LIRT) can produce similar effects upon muscles to those of high-intensity resistance training (HIRT) while overcoming many of the restrictions to HIRT that occur in a clinical setting. However, the potential molecular mechanisms of BFR-induced muscle hypertrophy remain largely unknown. Here, using a BFR rat model, we aim to better elucidate the mechanisms regulating muscle hypertrophy as induced by BFR and reveal possible clinical therapeutic targets for atrophy cases. We performed genome-wide screening with microarray analysis to identify unique differentially expressed genes during rat muscle hypertrophy. We then successfully separated the differentially expressed genes from BFR-treated soleus samples by comparing the Affymetrix rat Genome U34 2.0 array with the control. Using qRT-PCR and immunohistochemistry (IHC) we also analyzed other related differentially expressed genes. Results suggested that muscle hypertrophy induced by BFR is essentially regulated by the rate of protein turnover. Specifically, PI3K/AKT and MAPK pathways act as positive regulators in controlling protein synthesis, whereas ubiquitin-proteasome acts as a negative regulator. This represents the first general genome-wide investigation of the gene expression profile in the rat soleus after BFR treatment. This may aid our understanding of the molecular mechanisms regulating and controlling muscle hypertrophy and provide support to the BFR strategies aiming to prevent muscle atrophy in a clinical setting. Copyright © 2016 Elsevier B.V. All rights reserved.
Identification of genomic aberrations associated with lymph node metastasis in diffuse-type gastric cancer.

PubMed

Choi, Ji-Hye; Kim, Young-Bae; Ahn, Ji Mi; Kim, Min Jae; Bae, Won Jung; Han, Sang-Uk; Woo, Hyun Goo; Lee, Dakeun

2018-04-06

Diffuse-type gastric cancer (DGC) is a GC subtype with heterogeneous clinical outcomes. Lymph node metastasis of DGC heralds a dismal progression, which hampers the curative treatment of patients. However, the genomic heterogeneity of DGC remains unknown. To identify genomic variations associated with lymph node metastasis in DGC, we performed whole exome sequencing on 23 cases of DGC and paired non-tumor tissues and compared the mutation profiles according to the presence (N3, n = 13) or absence (N0, n = 10) of regional lymph node metastasis. Overall, we identified 185 recurrently mutated genes in DGC, which included a significant novel mutation at CMTM2, as well as previously known mutations at CDH1, RHOA, and TP53. Noticeably, CMTM2 expression could predict the prognostic outcomes of DGC but not intestinal-type GC (IGC), indicating pivotal roles of CMTM2 in DGC progression. In addition, we identified a recurrent loss of heterozygosity (LOH) of DNA copy numbers at the 3p12-pcen locus in DGC. A comparison of N0 and N3 tumors showed that N3 tumors exhibited more frequent DNA copy number aberrations, including copy-neutral LOH and mutations of CpTpT trinucleotides, than N0 tumors (P = 0.2 × 10^-3). In conclusion, DGCs have distinct profiles of somatic mutations and DNA copy numbers according to the status of lymph node metastasis, and this might be helpful in delineating the pathobiology of DGC.
A comparison of PCR assays for beak and feather disease virus and high resolution melt (HRM) curve analysis of replicase associated protein and capsid genes.

PubMed

Das, Shubhagata; Sarker, Subir; Ghorashi, Seyed Ali; Forwood, Jade K; Raidal, Shane R

2016-11-01

Beak and feather disease virus (BFDV) threatens a wide range of endangered psittacine birds worldwide. In this study, we assessed a novel PCR assay and genetic screening method using high-resolution melt (HRM) curve analysis for BFDV targeting the capsid (Cap) gene (HRM-Cap) alongside conventional PCR detection as well as a PCR method that targets a much smaller fragment of the virus genome in the replicase initiator protein (Rep) gene (HRM-Rep). Limits of detection, sensitivity, specificity and discriminatory power for differentiating BFDV sequences were compared. HRM-Cap had a high positive predictive value and could readily differentiate between a reference genotype and 17 other diverse BFDV genomes with more discriminatory power (genotype confidence percentage) than HRM-Rep. Melt curve profiles generated by HRM-Cap correlated with unique DNA sequence profiles for each individual test genome. The limit of detection of HRM-Cap was lower (2×10^-5 ng/reaction or 48 viral copies) than that for both HRM-Rep and conventional BFDV PCR, which had similar sensitivity (2×10^-6 ng or 13 viral copies/reaction). However, when used in a diagnostic setting with 348 clinical samples there was strong agreement between HRM-Cap and conventional PCR (kappa=0.87, P<0.01, 98% specificity) and HRM-Cap demonstrated higher specificity (99.9%) than HRM-Rep (80.3%). Copyright © 2016 Elsevier B.V. All rights reserved.
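The kappa value reported above is Cohen's kappa, a chance-corrected measure of agreement between two assays. The Python sketch below shows how it is computed from a 2×2 table of paired results. The abstract does not give the underlying counts, so the function name, argument names, and the counts in the usage note are hypothetical and do not reproduce the study's data.

```python
def cohens_kappa(both_pos, only_a_pos, only_b_pos, both_neg):
    """Cohen's kappa for two binary assays (A and B) run on the same samples."""
    total = both_pos + only_a_pos + only_b_pos + both_neg
    # Observed agreement: fraction of samples on which the assays agree.
    observed = (both_pos + both_neg) / total
    # Marginal positive rates of each assay.
    a_pos = (both_pos + only_a_pos) / total
    b_pos = (both_pos + only_b_pos) / total
    # Agreement expected by chance if the assays were independent.
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)
```

With made-up counts of 40 samples positive by both assays, 10 positive only by assay A, 5 positive only by assay B, and 45 negative by both, observed agreement is 0.85, chance agreement is 0.50, and kappa is 0.70.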
Comparative systems biology across an evolutionary gradient within the Shewanella genus.

PubMed

Konstantinidis, Konstantinos T; Serres, Margrethe H; Romine, Margaret F; Rodrigues, Jorge L M; Auchtung, Jennifer; McCue, Lee-Ann; Lipton, Mary S; Obraztsova, Anna; Giometti, Carol S; Nealson, Kenneth H; Fredrickson, James K; Tiedje, James M

2009-09-15

To what extent genotypic differences translate to phenotypic variation remains a poorly understood issue of paramount importance for several cornerstone concepts of microbiology including the species definition. Here, we take advantage of the completed genomic sequences, expressed proteomic profiles, and physiological studies of 10 closely related Shewanella strains and species to provide quantitative insights into this issue. Our analyses revealed that, despite extensive horizontal gene transfer within these genomes, the genotypic and phenotypic similarities among the organisms were generally predictable from their evolutionary relatedness. The power of the predictions depended on the degree of ecological specialization of the organisms evaluated. Using the gradient of evolutionary relatedness formed by these genomes, we were able to partly isolate the effect of ecology from that of evolutionary divergence and to rank the different cellular functions in terms of their rates of evolution. Our ranking also revealed that whole-cell protein expression differences among these organisms, when the organisms were grown under identical conditions, were relatively larger than differences at the genome level, suggesting that similarity in gene regulation and expression should constitute another important parameter for (new) species description. Collectively, our results provide important new information toward beginning a systems-level understanding of bacterial species and genera.
An Effective Big Data Supervised Imbalanced Classification Approach for Ortholog Detection in Related Yeast Species

PubMed Central

Galpert, Deborah; del Río, Sara; Herrera, Francisco; Ancede-Gallardo, Evys; Antunes, Agostinho; Agüero-Chapin, Guillermin

2015-01-01

Orthology detection requires more effective scaling algorithms. In this paper, a set of gene pair features based on similarity measures (alignment scores, sequence length, gene membership to conserved regions, and physicochemical profiles) are combined in a supervised pairwise ortholog detection approach to improve effectiveness considering low ortholog ratios in relation to the possible pairwise comparison between two genomes. In this scenario, big data supervised classifiers managing imbalance between ortholog and nonortholog pair classes allow for an effective scaling solution built from two genomes and extended to other genome pairs. The supervised approach was compared with RBH, RSD, and OMA algorithms by using the following yeast genome pairs: Saccharomyces cerevisiae-Kluyveromyces lactis, Saccharomyces cerevisiae-Candida glabrata, and Saccharomyces cerevisiae-Schizosaccharomyces pombe as benchmark datasets. Because of the large amount of imbalanced data, the building and testing of the supervised model were only possible by using big data supervised classifiers managing imbalance. Evaluation metrics taking low ortholog ratios into account were applied. From the effectiveness perspective, MapReduce Random Oversampling combined with Spark SVM outperformed RBH, RSD, and OMA, probably because of the consideration of gene pair features beyond alignment similarities combined with the advances in big data supervised classification. PMID:26605337
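For orientation, reciprocal best hits (RBH), one of the baseline algorithms named in this record, can be sketched in a few lines. This illustrates the baseline only, not the supervised classifier the authors propose; the gene identifiers and alignment scores below are hypothetical.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring subject gene.

    `scores` maps (query, subject) pairs to an alignment score.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return gene pairs that are each other's best hit in both directions."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical scores between genes of genome A and genome B.
a_vs_b = {("a1", "b1"): 200, ("a1", "b2"): 150, ("a2", "b2"): 90}
b_vs_a = {("b1", "a1"): 195, ("b2", "a1"): 160, ("b2", "a2"): 80}
pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
```

In this toy input only a1 and b1 choose each other, so a2 is left without a predicted ortholog. Because RBH relies on alignment score alone, it gives a sense of why the record's authors attribute their gain to gene pair features beyond alignment similarity.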

Digestive tumor bank protocol: from surgical specimens to genomic studies of digestive cancers.

PubMed

Popescu, I; Stroescu, C; Dumitrascu, T; Herlea, V; Paslaru, Liliana; Lazar, V; Boissin, H; Taieb, J; Horeanga, Ionela

2006-01-01

Cancer is a complex polygenic and multifactorial disease, resulting from successive dynamic changes in the genome of somatic cells and from the accumulation of molecular alterations in both tumour cells and host cells. For the majority of cancers, including many malignancies of the gastrointestinal tract, our current means of diagnosis and treatment of the tumors are grossly insufficient. In recent years the development of several gene expression profiling methods such as comparative genomic hybridization (CGH), differential display, serial analysis of gene expression (SAGE) and DNA arrays, together with the sequencing of the human genome, has provided an opportunity to monitor and investigate the complete cascade of molecular events leading to tumor development and progression. Given the central role played by surgeons in the current management of patients with solid cancers, it is of paramount importance for them to know the principles characterizing these laboratory tools in order to critically assess the results originating from this biotechnology. We describe in this article the scientific partnership between Fundeni Clinical Institute Bucharest, Romania and RNtech Company, Paris, France for the development of a center of biological resources (Biobank) as well as the standardized protocol of working with the biological samples, the ongoing projects and the future perspectives.
An Effective Big Data Supervised Imbalanced Classification Approach for Ortholog Detection in Related Yeast Species.

PubMed

Galpert, Deborah; Del Río, Sara; Herrera, Francisco; Ancede-Gallardo, Evys; Antunes, Agostinho; Agüero-Chapin, Guillermin

2015-01-01

Orthology detection requires more effective scaling algorithms. In this paper, a set of gene pair features based on similarity measures (alignment scores, sequence length, gene membership to conserved regions, and physicochemical profiles) are combined in a supervised pairwise ortholog detection approach to improve effectiveness considering low ortholog ratios in relation to the possible pairwise comparison between two genomes. In this scenario, big data supervised classifiers managing imbalance between ortholog and nonortholog pair classes allow for an effective scaling solution built from two genomes and extended to other genome pairs. The supervised approach was compared with RBH, RSD, and OMA algorithms by using the following yeast genome pairs: Saccharomyces cerevisiae-Kluyveromyces lactis, Saccharomyces cerevisiae-Candida glabrata, and Saccharomyces cerevisiae-Schizosaccharomyces pombe as benchmark datasets. Because of the large amount of imbalanced data, the building and testing of the supervised model were only possible by using big data supervised classifiers managing imbalance. Evaluation metrics taking low ortholog ratios into account were applied. From the effectiveness perspective, MapReduce Random Oversampling combined with Spark SVM outperformed RBH, RSD, and OMA, probably because of the consideration of gene pair features beyond alignment similarities combined with the advances in big data supervised classification.
Basket Studies: Redefining Clinical Trials in the Era of Genome-Driven Oncology.

PubMed

Tao, Jessica J; Schram, Alison M; Hyman, David M

2018-01-29

Understanding a tumor's detailed molecular profile has become increasingly necessary to deliver the standard of care for patients with advanced cancer. Innovations in both tumor genomic sequencing technology and the development of drugs that target molecular alterations have fueled recent gains in genome-driven oncology care. "Basket studies," or histology-agnostic clinical trials in genomically selected patients, represent one important research tool to continue making progress in this field. We review key aspects of genome-driven oncology care, including the purpose and utility of basket studies, biostatistical considerations in trial design, genomic knowledgebase development, and patient matching and enrollment models, which are critical for translating our genomic knowledge into clinically meaningful outcomes.
Oligonucleotide arrays vs. metaphase-comparative genomic hybridisation and BAC arrays for single-cell analysis: first applications to preimplantation genetic diagnosis for Robertsonian translocation carriers.

PubMed

Ramos, Laia; del Rey, Javier; Daina, Gemma; García-Aragonés, Manel; Armengol, Lluís; Fernandez-Encinas, Alba; Parriego, Mònica; Boada, Montserrat; Martinez-Passarell, Olga; Martorell, Maria Rosa; Casagran, Oriol; Benet, Jordi; Navarro, Joaquima

2014-01-01

Comprehensive chromosome analysis techniques such as metaphase-Comparative Genomic Hybridisation (CGH) and array-CGH are available for single-cell analysis. However, while metaphase-CGH and BAC array-CGH have been widely used for Preimplantation Genetic Diagnosis, oligonucleotide array-CGH has not been used in an extensive way. A comparison between oligonucleotide array-CGH and metaphase-CGH has been performed analysing 15 single fibroblasts from aneuploid cell-lines and 18 single blastomeres from human cleavage-stage embryos. Afterwards, oligonucleotide array-CGH and BAC array-CGH were also compared analysing 16 single blastomeres from human cleavage-stage embryos. All three comprehensive analysis techniques provided broadly similar cytogenetic profiles; however, non-identical profiles appeared when extensive aneuploidies were present in a cell. Both array techniques provided an optimised analysis procedure and a higher resolution than metaphase-CGH. Moreover, oligonucleotide array-CGH was able to define extra segmental imbalances in 14.7% of the blastomeres and it better determined the specific unbalanced chromosome regions due to a higher resolution of the technique (≈ 20 kb). Applicability of oligonucleotide array-CGH for Preimplantation Genetic Diagnosis has been demonstrated in two cases of Robertsonian translocation carriers 45,XY,der(13;14)(q10;q10). Transfer of euploid embryos was performed in both cases and pregnancy was achieved by one of the couples. This is the first time that an oligonucleotide array-CGH approach has been successfully applied to Preimplantation Genetic Diagnosis for balanced chromosome rearrangement carriers.
Oligonucleotide Arrays vs. Metaphase-Comparative Genomic Hybridisation and BAC Arrays for Single-Cell Analysis: First Applications to Preimplantation Genetic Diagnosis for Robertsonian Translocation Carriers

PubMed Central

Ramos, Laia; del Rey, Javier; Daina, Gemma; García-Aragonés, Manel; Armengol, Lluís; Fernandez-Encinas, Alba; Parriego, Mònica; Boada, Montserrat; Martinez-Passarell, Olga; Martorell, Maria Rosa; Casagran, Oriol; Benet, Jordi; Navarro, Joaquima

2014-01-01

Comprehensive chromosome analysis techniques such as metaphase-Comparative Genomic Hybridisation (CGH) and array-CGH are available for single-cell analysis. However, while metaphase-CGH and BAC array-CGH have been widely used for Preimplantation Genetic Diagnosis, oligonucleotide array-CGH has not been used in an extensive way. A comparison between oligonucleotide array-CGH and metaphase-CGH has been performed analysing 15 single fibroblasts from aneuploid cell-lines and 18 single blastomeres from human cleavage-stage embryos. Afterwards, oligonucleotide array-CGH and BAC array-CGH were also compared analysing 16 single blastomeres from human cleavage-stage embryos. All three comprehensive analysis techniques provided broadly similar cytogenetic profiles; however, non-identical profiles appeared when extensive aneuploidies were present in a cell. Both array techniques provided an optimised analysis procedure and a higher resolution than metaphase-CGH. Moreover, oligonucleotide array-CGH was able to define extra segmental imbalances in 14.7% of the blastomeres and it better determined the specific unbalanced chromosome regions due to a higher resolution of the technique (≈20 kb). Applicability of oligonucleotide array-CGH for Preimplantation Genetic Diagnosis has been demonstrated in two cases of Robertsonian translocation carriers 45,XY,der(13;14)(q10;q10). Transfer of euploid embryos was performed in both cases and pregnancy was achieved by one of the couples. This is the first time that an oligonucleotide array-CGH approach has been successfully applied to Preimplantation Genetic Diagnosis for balanced chromosome rearrangement carriers. PMID:25415307
A methodological study of genome-wide DNA methylation analyses using matched archival formalin-fixed paraffin embedded and fresh frozen breast tumors.

PubMed

Espinal, Allyson C; Wang, Dan; Yan, Li; Liu, Song; Tang, Li; Hu, Qiang; Morrison, Carl D; Ambrosone, Christine B; Higgins, Michael J; Sucheston-Campbell, Lara E

2017-02-28

DNA from archival formalin-fixed and paraffin embedded (FFPE) tissue is an invaluable resource for genome-wide methylation studies although concerns about poor quality may limit its use. In this study, we compared DNA methylation profiles of breast tumors using DNA from fresh-frozen (FF) tissues and three types of matched FFPE samples. For 9/10 patients, correlation and unsupervised clustering analysis revealed that the FF and FFPE samples were consistently correlated with each other and clustered into distinct subgroups. Greater than 84% of the top 100 loci previously shown to differentiate ER+ and ER- tumors in FF tissues were also differentially methylated loci (DML) in FFPE tissues. Weighted Correlation Gene Network Analyses (WCGNA) grouped the DML loci into 16 modules in FF tissue, with ~85% of the module membership preserved across tissue types. Restored FFPE and matched FF samples were profiled using the Illumina Infinium HumanMethylation450K platform. Methylation levels (β-values) across all loci and the top 100 loci previously shown to differentiate tumors by estrogen receptor status (ER+ or ER-) in a larger FF study were compared between matched FF and FFPE samples using Pearson's correlation, hierarchical clustering and WCGNA. Positive predictive values and sensitivity levels for detecting DML in FF samples were calculated in an independent FFPE cohort. FFPE breast tumor samples show lower overall detection of DMLs than FF samples; however, FFPE and FF DMLs compare favorably. These results support the emerging consensus that the 450K platform can be employed to investigate epigenetics in large sets of archival FFPE tissues.
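Two of the comparisons named in this record, Pearson's correlation of β-values between matched samples and the positive predictive value and sensitivity of DML calls, can be sketched as follows. This is a minimal illustration assuming the FF calls are treated as the reference set; the probe identifiers and β-values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ppv_and_sensitivity(reference, called):
    """PPV and sensitivity of `called` loci against a `reference` set."""
    true_pos = len(reference & called)
    return true_pos / len(called), true_pos / len(reference)


# Hypothetical beta-values for four loci in one matched FF/FFPE pair.
ff_beta = [0.10, 0.50, 0.90, 0.30]
ffpe_beta = [0.15, 0.55, 0.95, 0.35]
r = pearson(ff_beta, ffpe_beta)

# Hypothetical DML calls (probe IDs are placeholders).
ff_dml = {"cg1", "cg2", "cg3", "cg4"}
ffpe_dml = {"cg1", "cg2", "cg5"}
ppv, sens = ppv_and_sensitivity(ff_dml, ffpe_dml)
```

A uniform offset between the two tissue types, as in this toy pair, leaves the correlation at 1. A high correlation therefore does not on its own show that absolute methylation levels agree.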
Genome-Wide Classification and Evolutionary and Expression Analyses of Citrus MYB Transcription Factor Families in Sweet Orange

PubMed Central

Hou, Xiao-Jin; Li, Si-Bei; Liu, Sheng-Rui; Hu, Chun-Gen; Zhang, Jin-Zhi

2014-01-01

MYB family genes are widely distributed in plants and comprise one of the largest transcription factor families involved in various developmental processes and defense responses of plants. To date, few MYB genes and little expression profiling have been reported for citrus. Here, we describe and classify 177 members of the sweet orange MYB gene (CsMYB) family in terms of their genomic gene structures and similarity to their putative Arabidopsis orthologs. According to these analyses, these CsMYBs were categorized into four groups (4R-MYB, 3R-MYB, 2R-MYB and 1R-MYB). Gene structure analysis revealed that 1R-MYB genes possess relatively more introns as compared with 2R-MYB genes. Investigation of their chromosomal localizations revealed that these CsMYBs are distributed across nine chromosomes. Sweet orange includes a relatively small number of MYB genes compared with the 198 members in Arabidopsis, presumably due to a paralog reduction related to repetitive sequence insertion into promoter and non-coding transcribed region of the genes. Comparative studies of CsMYBs and Arabidopsis showed that CsMYBs had fewer gene duplication events. Expression analysis revealed that the MYB gene family has a wide expression profile in sweet orange development and plays important roles in development and stress responses. In addition, 337 new putative microsatellites with flanking sequences sufficient for primer design were also identified from the 177 CsMYBs. These results provide a useful reference for the selection of candidate MYB genes for cloning and further functional analysis for citrus. PMID:25375352
Epigenetic Regulation of ZBTB18 Promotes Glioblastoma Progression. | Office of Cancer Genomics

Cancer.gov

Glioblastoma (GBM) comprises distinct subtypes characterized by their molecular profile. Mesenchymal identity in GBM has been associated with a comparatively unfavorable prognosis, primarily due to inherent resistance of these tumors to current therapies. The identification of molecular determinants of mesenchymal transformation could potentially allow for the discovery of new therapeutic targets. Zinc Finger and BTB Domain Containing 18 (ZBTB18/ZNF238/RP58) is a zinc finger transcriptional repressor with a crucial role in brain development and neuronal differentiation.
The Chthonomonas calidirosea Genome Is Highly Conserved across Geographic Locations and Distinct Chemical and Microbial Environments in New Zealand's Taupō Volcanic Zone.

PubMed

Lee, Kevin C; Stott, Matthew B; Dunfield, Peter F; Huttenhower, Curtis; McDonald, Ian R; Morgan, Xochitl C

2016-06-15

Chthonomonas calidirosea T49(T) is a low-abundance, carbohydrate-scavenging, and thermophilic soil bacterium with a seemingly disorganized genome. We hypothesized that the C. calidirosea genome would be highly responsive to local selection pressure, resulting in the divergence of its genomic content, genome organization, and carbohydrate utilization phenotype across environments. We tested this hypothesis by sequencing the genomes of four C. calidirosea isolates obtained from four separate geothermal fields in the Taupō Volcanic Zone, New Zealand. For each isolation site, we measured physicochemical attributes and defined the associated microbial community by 16S rRNA gene sequencing. Despite their ecological and geographical isolation, the genome sequences showed low divergence (maximum, 1.17%). Isolate-specific variations included single-nucleotide polymorphisms (SNPs), restriction-modification systems, and mobile elements but few major deletions and no major rearrangements. The 50-fold variation in C. calidirosea relative abundance among the four sites correlated with site environmental characteristics but not with differences in genomic content. Conversely, the carbohydrate utilization profiles of the C. calidirosea isolates corresponded to the inferred isolate phylogenies, which only partially paralleled the geographical relationships among the sample sites. Genomic sequence conservation does not entirely parallel geographic distance, suggesting that stochastic dispersal and localized extinction, which allow for rapid population homogenization with little restriction by geographical barriers, are possible mechanisms of C. calidirosea distribution. This dispersal and extinction mechanism is likely not limited to C. calidirosea but may shape the populations and genomes of many other low-abundance free-living taxa. 
This study compares the genomic sequence variations and metabolisms of four strains of Chthonomonas calidirosea, a rare thermophilic bacterium from the phylum Armatimonadetes. It additionally compares the microbial communities and chemistry of each of the geographically distinct sites from which the four C. calidirosea strains were isolated. C. calidirosea was previously reported to possess a highly disorganized genome, but it was unclear whether this reflected rapid evolution. Here, we show that each isolation site has a distinct chemistry and microbial community, but despite this, the C. calidirosea genome is highly conserved across all isolation sites. Furthermore, genomic sequence differences only partially paralleled geographic distance, suggesting that C. calidirosea genotypes are not primarily determined by adaptive evolution. Instead, the presence of C. calidirosea may be driven by stochastic dispersal and localized extinction. This ecological mechanism may apply to many other low-abundance taxa. Copyright © 2016 Lee et al.
The Chthonomonas calidirosea Genome Is Highly Conserved across Geographic Locations and Distinct Chemical and Microbial Environments in New Zealand's Taupō Volcanic Zone

PubMed Central

Lee, Kevin C.; Stott, Matthew B.; Dunfield, Peter F.; Huttenhower, Curtis; McDonald, Ian R.

2016-01-01

ABSTRACT Chthonomonas calidirosea T49T is a low-abundance, carbohydrate-scavenging, and thermophilic soil bacterium with a seemingly disorganized genome. We hypothesized that the C. calidirosea genome would be highly responsive to local selection pressure, resulting in the divergence of its genomic content, genome organization, and carbohydrate utilization phenotype across environments. We tested this hypothesis by sequencing the genomes of four C. calidirosea isolates obtained from four separate geothermal fields in the Taupō Volcanic Zone, New Zealand. For each isolation site, we measured physicochemical attributes and defined the associated microbial community by 16S rRNA gene sequencing. Despite their ecological and geographical isolation, the genome sequences showed low divergence (maximum, 1.17%). Isolate-specific variations included single-nucleotide polymorphisms (SNPs), restriction-modification systems, and mobile elements but few major deletions and no major rearrangements. The 50-fold variation in C. calidirosea relative abundance among the four sites correlated with site environmental characteristics but not with differences in genomic content. Conversely, the carbohydrate utilization profiles of the C. calidirosea isolates corresponded to the inferred isolate phylogenies, which only partially paralleled the geographical relationships among the sample sites. Genomic sequence conservation does not entirely parallel geographic distance, suggesting that stochastic dispersal and localized extinction, which allow for rapid population homogenization with little restriction by geographical barriers, are possible mechanisms of C. calidirosea distribution. This dispersal and extinction mechanism is likely not limited to C. calidirosea but may shape the populations and genomes of many other low-abundance free-living taxa. 
IMPORTANCE This study compares the genomic sequence variations and metabolisms of four strains of Chthonomonas calidirosea, a rare thermophilic bacterium from the phylum Armatimonadetes. It additionally compares the microbial communities and chemistry of each of the geographically distinct sites from which the four C. calidirosea strains were isolated. C. calidirosea was previously reported to possess a highly disorganized genome, but it was unclear whether this reflected rapid evolution. Here, we show that each isolation site has a distinct chemistry and microbial community, but despite this, the C. calidirosea genome is highly conserved across all isolation sites. Furthermore, genomic sequence differences only partially paralleled geographic distance, suggesting that C. calidirosea genotypes are not primarily determined by adaptive evolution. Instead, the presence of C. calidirosea may be driven by stochastic dispersal and localized extinction. This ecological mechanism may apply to many other low-abundance taxa. PMID:27060125
Genome-Wide Profiling Reveals That Herbal Medicine Jinfukang-Induced Polyadenylation Alteration Is Involved in Anti-Lung Cancer Activity

PubMed Central

Li, Guoqing; Shao, Jinhui; Liu, Cong; Lu, Jun; Zhao, Xiaodong

2017-01-01

Alternative polyadenylation (APA) plays an important role in regulation of gene expression and is involved in many biological processes. As eukaryotic cells receive a variety of external signals, genes produce diverse transcriptional isoforms and exhibit different translation efficiency. The traditional Chinese medicine (TCM) Jinfukang (JFK) has been effectively used for lung cancer treatment. In this study, we investigated whether JFK exerts its antitumor effect by modulating APA patterns in lung cancer cells. We performed a genome-wide APA site profiling analysis in JFK-treated A549 lung cancer cells with the 3T-seq approach that we reported previously. Compared with untreated A549 cells, in JFK-treated A549 cells we observed APA-mediated 3′ UTR alterations in 310 genes, including 77 genes with shortened 3′ UTRs. In particular, we identified TMEM123, a gene involved in oncotic cell death, which produced transcripts with shortened 3′ UTR and thus was upregulated upon JFK treatment. Taken together, our studies suggest that APA might be one of the antitumor mechanisms of JFK and provide a new insight for the understanding of TCM against cancer. PMID:29234412
Genome-Wide Profiling Reveals That Herbal Medicine Jinfukang-Induced Polyadenylation Alteration Is Involved in Anti-Lung Cancer Activity.

PubMed

Kou, Yao; Li, Guoqing; Shao, Jinhui; Liu, Cong; Wu, Jun; Lu, Jun; Zhao, Xiaodong; Tian, Jing

2017-01-01

Alternative polyadenylation (APA) plays an important role in regulation of gene expression and is involved in many biological processes. As eukaryotic cells receive a variety of external signals, genes produce diverse transcriptional isoforms and exhibit different translation efficiency. The traditional Chinese medicine (TCM) Jinfukang (JFK) has been effectively used for lung cancer treatment. In this study, we investigated whether JFK exerts its antitumor effect by modulating APA patterns in lung cancer cells. We performed a genome-wide APA site profiling analysis in JFK-treated A549 lung cancer cells with the 3T-seq approach that we reported previously. Compared with untreated A549 cells, in JFK-treated A549 cells we observed APA-mediated 3' UTR alterations in 310 genes, including 77 genes with shortened 3' UTRs. In particular, we identified TMEM123, a gene involved in oncotic cell death, which produced transcripts with shortened 3' UTR and thus was upregulated upon JFK treatment. Taken together, our studies suggest that APA might be one of the antitumor mechanisms of JFK and provide a new insight for the understanding of TCM against cancer.
GTA: a game theoretic approach to identifying cancer subnetwork markers.

PubMed

Farahmand, S; Goliaei, S; Ansari-Pour, N; Razaghi-Moghadam, Z

2016-03-01

The identification of genetic markers (e.g. genes, pathways and subnetworks) for cancer has been one of the most challenging research areas in recent years. A subset of these studies attempt to analyze genome-wide expression profiles to identify markers with high reliability and reusability across independent whole-transcriptome microarray datasets. Therefore, the functional relationships of genes are integrated with their expression data. However, for a more accurate representation of the functional relationships among genes, utilization of the protein-protein interaction network (PPIN) seems to be necessary. Herein, a novel game theoretic approach (GTA) is proposed for the identification of cancer subnetwork markers by integrating genome-wide expression profiles and PPIN. The GTA method was applied to three distinct whole-transcriptome breast cancer datasets to identify the subnetwork markers associated with metastasis. To evaluate the performance of our approach, the identified subnetwork markers were compared with gene-based, pathway-based and network-based markers. We show that GTA is not only capable of identifying robust metastatic markers but also provides a higher classification performance. In addition, based on these GTA-based subnetworks, we identified a new bona fide candidate gene for breast cancer susceptibility.
Comparative Genomic Analysis of Two Clonally Related Multidrug Resistant Mycobacterium tuberculosis by Single Molecule Real Time Sequencing.

PubMed

Leung, Kenneth Siu-Sing; Siu, Gilman Kit-Hang; Tam, Kingsley King-Gee; To, Sabrina Wai-Chi; Rajwani, Rahim; Ho, Pak-Leung; Wong, Samson Sai-Yin; Zhao, Wei W; Ma, Oliver Chiu-Kit; Yam, Wing-Cheong

2017-01-01

Background: Multidrug-resistant tuberculosis (MDR-TB) is posing a major threat to global TB control. In this study, we focused on two consecutive MDR-TB isolates obtained from the same patient before and after the initiation of anti-TB treatment. To better understand the genomic characteristics of MDR-TB, Single Molecule Real-Time (SMRT) Sequencing and comparative genomic analyses were performed to identify mutations that contributed to the stepwise development of drug resistance and growth fitness in MDR-TB under in vivo challenge of anti-TB drugs. Result: Both the pre-treatment and post-treatment strains demonstrated concordant phenotypic and genotypic susceptibility profiles toward rifampicin, pyrazinamide, streptomycin, fluoroquinolones, aminoglycosides, cycloserine, ethionamide, and para-aminosalicylic acid. However, although both strains carried identical missense mutations at rpoB S531L, inhA C-15T, and embB M306V, MYCOTB Sensititre assay showed that the post-treatment strain had 16-, 8-, and 4-fold elevation in the minimum inhibitory concentrations (MICs) toward rifabutin, isoniazid, and ethambutol respectively. The results have indicated the presence of additional resistance-related mutations governing the stepwise development of MDR-TB. Further comparative genomic analyses have identified three additional polymorphisms between the clinical isolates. These include a single nucleotide deletion at nucleotide position 360 of rv0888 in the pre-treatment strain, and a missense mutation at rv3303c (lpdA) V44I and a 6-bp inframe deletion at codon 67-68 in rv2071c (cobM) in the post-treatment strain. Multiple sequence alignment showed that these mutations were occurring at highly conserved regions among pathogenic mycobacteria. Using structural-based and sequence-based algorithms, we further predicted that the mutations potentially have deleterious effect on protein function.
Conclusion: This is the first study that compared the full genomes of two clonally-related MDR-TB clinical isolates during the course of anti-TB treatment. Our work has demonstrated the robustness of SMRT Sequencing in identifying mutations among MDR-TB clinical isolates. Comparative genome analysis also suggested novel mutations at rv0888, lpdA, and cobM that might explain the difference in antibiotic resistance and growth pattern between the two MDR-TB strains.
A BAC clone fingerprinting approach to the detection of human genome rearrangements

PubMed Central

Krzywinski, Martin; Bosdet, Ian; Mathewson, Carrie; Wye, Natasja; Brebner, Jay; Chiu, Readman; Corbett, Richard; Field, Matthew; Lee, Darlene; Pugh, Trevor; Volik, Stas; Siddiqui, Asim; Jones, Steven; Schein, Jacquie; Collins, Collin; Marra, Marco

2007-01-01

We present a method, called fingerprint profiling (FPP), that uses restriction digest fingerprints of bacterial artificial chromosome clones to detect and classify rearrangements in the human genome. The approach uses alignment of experimental fingerprint patterns to in silico digests of the sequence assembly and is capable of detecting micro-deletions (1-5 kb) and balanced rearrangements. Our method has compelling potential for use as a whole-genome method for the identification and characterization of human genome rearrangements. PMID:17953769
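The core comparison in this record, an in silico digest of assembled sequence checked against an experimental fingerprint, can be sketched as below. This is a simplified illustration and not the FPP implementation: the toy sequence, the choice of EcoRI (G^AATTC), and the tolerance-based band matching are all assumptions made for the example.

```python
def digest(sequence, site, cut_offset):
    """Fragment lengths from cutting a linear sequence at every `site`.

    `cut_offset` is the cut position within the recognition site,
    e.g. 1 for EcoRI, which cuts G^AATTC.
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def missing_bands(expected, observed, tolerance):
    """Expected fragment sizes with no observed band within `tolerance`."""
    return [e for e in expected
            if not any(abs(e - o) <= tolerance for o in observed)]


# Toy 23-bp sequence containing two EcoRI sites.
reference = "AAAA" + "GAATTC" + "CCCCC" + "GAATTC" + "TT"
expected = digest(reference, "GAATTC", cut_offset=1)

# Hypothetical experimental fingerprint lacking one expected band.
absent = missing_bands(expected, observed=[5, 7], tolerance=0)
```

An expected band that is absent from the experimental fingerprint, or an observed band with no in silico counterpart, is the kind of discrepancy that would flag a clone as a candidate for a rearrangement relative to the assembly.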
Understanding development and stem cells using single cell-based analyses of gene expression

PubMed Central

Kumar, Pavithra; Tan, Yuqi

2017-01-01

In recent years, genome-wide profiling approaches have begun to uncover the molecular programs that drive developmental processes. In particular, technical advances that enable genome-wide profiling of thousands of individual cells have provided the tantalizing prospect of cataloging cell type diversity and developmental dynamics in a quantitative and comprehensive manner. Here, we review how single-cell RNA sequencing has provided key insights into mammalian developmental and stem cell biology, emphasizing the analytical approaches that are specific to studying gene expression in single cells. PMID:28049689
Cancer diagnostics: The journey from histomorphology to molecular profiling.

PubMed

Ahmed, Atif A; Abedalthagafi, Malak

2016-09-06

Although histomorphology has made significant advances in the understanding of cancer etiology, classification and pathogenesis, it is sometimes complicated by morphologic ambiguities and other shortcomings that necessitate the development of ancillary tests to complement its diagnostic value. A new approach to cancer patient management consists of targeting specific molecules or gene mutations in the cancer genome by inhibitory therapy. Molecular diagnostic tests and genomic profiling methods are increasingly being developed to identify the tumor's targetable molecular profile, which is the basis of targeted therapy. Novel targeted therapy has revolutionized the treatment of gastrointestinal stromal tumor, renal cell carcinoma and other cancers that were previously difficult to treat with standard chemotherapy. In this review, we discuss the role of histomorphology in cancer diagnosis and management and the rising role of molecular profiling in targeted therapy. Molecular profiling in certain diagnostic and therapeutic difficulties may provide a practical and useful complement to histomorphology and opens new avenues for targeted therapy and alternative methods of cancer patient management.
High-resolution mapping of transcription factor binding sites on native chromatin

PubMed Central

Kasinathan, Sivakanthan; Orsi, Guillermo A.; Zentner, Gabriel E.; Ahmad, Kami; Henikoff, Steven

2014-01-01

Sequence-specific DNA-binding proteins including transcription factors (TFs) are key determinants of gene regulation and chromatin architecture. Formaldehyde cross-linking and sonication followed by Chromatin ImmunoPrecipitation (X-ChIP) is widely used for profiling of TF binding, but is limited by low resolution and poor specificity and sensitivity. We present a simple protocol that starts with micrococcal nuclease-digested uncross-linked chromatin and is followed by affinity purification of TFs and paired-end sequencing. The resulting ORGANIC (Occupied Regions of Genomes from Affinity-purified Naturally Isolated Chromatin) profiles of Saccharomyces cerevisiae Abf1 and Reb1 provide highly accurate base-pair resolution maps that are not biased toward accessible chromatin, and do not require input normalization. We also demonstrate the high specificity of our method when applied to larger genomes by profiling Drosophila melanogaster GAGA Factor and Pipsqueak. Our results suggest that ORGANIC profiling is a widely applicable high-resolution method for sensitive and specific profiling of direct protein-DNA interactions. PMID:24336359
A house finch (Haemorhous mexicanus) spleen transcriptome reveals intra- and interspecific patterns of gene expression, alternative splicing and genetic diversity in passerines.

PubMed

Zhang, Qu; Hill, Geoffrey E; Edwards, Scott V; Backström, Niclas

2014-04-24

With its plumage color dimorphism and unique history in North America, including a recent population expansion and an epizootic of Mycoplasma gallisepticum (MG), the house finch (Haemorhous mexicanus) is a model species for studying sexual selection, plumage coloration and host-parasite interactions. As part of our ongoing efforts to make available genomic resources for this species, here we report a transcriptome assembly derived from genes expressed in spleen. We characterize transcriptomes from two populations with different histories of demography and disease exposure: a recently founded population in the eastern US that has been exposed to MG for over a decade and a native population from the western range that has never been exposed to MG. We utilize this resource to quantify conservation in gene expression in passerine birds over approximately 50 MY by comparing splenic expression profiles for 9,646 house finch transcripts and those from zebra finch and find that fewer than half of all genes expressed in spleen in either species are expressed in both species. Comparative gene annotations from several vertebrate species suggest that the house finch transcriptomes contain ~15 genes not yet found in previously sequenced vertebrate genomes. The house finch transcriptomes harbour ~85,000 SNPs, ~20,000 of which are non-synonymous. Although not yet validated by biological or technical replication, we identify a set of genes exhibiting differences between populations in gene expression (n = 182; 2% of all transcripts), allele frequencies (76 FST outliers) and alternative splicing as well as genes with several fixed non-synonymous substitutions; this set includes genes with functions related to double-strand break repair and immune response. The two house finch spleen transcriptome profiles will add to the increasing data on genome and transcriptome sequence information from natural populations.
Differences in splenic expression between house finch and zebra finch imply either significant evolutionary turnover of splenic expression patterns or different physiological states of the individuals examined. The transcriptome resource will enhance the potential to annotate an eventual house finch genome, and the set of gene-based high-quality SNPs will help clarify the genetic underpinnings of host-pathogen interactions and sexual selection.
Genome-Wide Profiles of Extra-cranial Malignant Rhabdoid Tumors Reveal Heterogeneity and Dysregulated Developmental Pathways

Cancer.gov

Malignant rhabdoid tumors (MRTs) are rare lethal tumors of childhood that most commonly occur in the kidney and brain. MRTs are driven by SMARCB1 loss, but the molecular consequences of SMARCB1 loss in extra-cranial tumors have not been comprehensively described and genomic resources for analyses of extra-cranial MRT are limited.
