Science.gov

Sample records for comparing genomic expression

  1. Genome-level identification, gene expression, and comparative analysis of porcine ß-defensin genes

    PubMed Central

    2012-01-01

    Background Beta-defensins (β-defensins) are innate immune peptides with evolutionary conservation across a wide range of species and has been suggested to play important roles in innate immune reactions against pathogens. However, the complete β-defensin repertoire in the pig has not been fully addressed. Result A BLAST analysis was performed against the available pig genomic sequence in the NCBI database to identify β-defensin-related sequences using previously reported β-defensin sequences of pigs, humans, and cattle. The porcine β-defensin gene clusters were mapped to chromosomes 7, 14, 15 and 17. The gene expression analysis of 17 newly annotated porcine β-defensin genes across 15 tissues using semi-quantitative reverse transcription polymerase chain reaction (RT-PCR) showed differences in their tissue distribution, with the kidney and testis having the largest pBD expression repertoire. We also analyzed single nucleotide polymorphisms (SNPs) in the mature peptide region of pBD genes from 35 pigs of 7 breeds. We found 8 cSNPs in 7 pBDs. Conclusion We identified 29 porcine β-defensin (pBD) gene-like sequences, including 17 unreported pBDs in the porcine genome. Comparative analysis of β-defensin genes in the pig genome with those in human and cattle genomes showed structural conservation of β-defensin syntenic regions among these species. PMID:23150902

  2. BμG@Sbase—a microbial gene expression and comparative genomic database

    PubMed Central

    Witney, Adam A.; Waldron, Denise E.; Brooks, Lucy A.; Tyler, Richard H.; Withers, Michael; Stoker, Neil G.; Wren, Brendan W.; Butcher, Philip D.; Hinds, Jason

    2012-01-01

    The reducing cost of high-throughput functional genomic technologies is creating a deluge of high volume, complex data, placing the burden on bioinformatics resources and tool development. The Bacterial Microarray Group at St George's (BμG@S) has been at the forefront of bacterial microarray design and analysis for over a decade and while serving as a hub of a global network of microbial research groups has developed BμG@Sbase, a microbial gene expression and comparative genomic database. BμG@Sbase (http://bugs.sgul.ac.uk/bugsbase/) is a web-browsable, expertly curated, MIAME-compliant database that stores comprehensive experimental annotation and multiple raw and analysed data formats. Consistent annotation is enabled through a structured set of web forms, which guide the user through the process following a set of best practices and controlled vocabulary. The database currently contains 86 expertly curated publicly available data sets (with a further 124 not yet published) and full annotation information for 59 bacterial microarray designs. The data can be browsed and queried using an explorer-like interface; integrating intuitive tree diagrams to present complex experimental details clearly and concisely. Furthermore the modular design of the database will provide a robust platform for integrating other data types beyond microarrays into a more Systems analysis based future. PMID:21948792

  3. Comparing genomic expression patterns across plant species reveals highly diverged transcriptional dynamics in response to salt stress

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Rice and barley are both members of Poaceae (grass family) but have a marked difference in salt tolerance. The molecular mechanism underlying this difference was previously unexplored. This study employs a comparative genomics approach to identify analogous and contrasting gene expression patterns b...

  4. Ensembl comparative genomics resources

    PubMed Central

    Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J.; Searle, Stephen M. J.; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

    2016-01-01

    Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org. PMID:26896847

  5. Ensembl comparative genomics resources.

    PubMed

    Herrero, Javier; Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J; Searle, Stephen M J; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

    2016-01-01

    Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org. PMID:26896847

  6. Comparative genomics reveals tissue-specific regulation of prolactin receptor gene expression.

    PubMed

    Schennink, Anke; Trott, Josephine F; Manjarin, Rodrigo; Lemay, Danielle G; Freking, Bradley A; Hovey, Russell C

    2015-02-01

    Prolactin (PRL), acting via the PRL receptor (PRLR), controls hundreds of biological processes across a range of species. Endocrine PRL elicits well-documented effects on target tissues such as the mammary glands and reproductive organs in addition to coordinating whole-body homeostasis during states such as lactation or adaptive responses to the environment. While changes in PRLR expression likely facilitates these tissue-specific responses to circulating PRL, the mechanisms regulating this regulation in non-rodent species has received limited attention. We performed a wide-scale analysis of PRLR 5' transcriptional regulation in pig tissues. Apart from the abundantly expressed and widely conserved exon 1, we identified alternative splicing of transcripts from an additional nine first exons of the porcine PRLR (pPRLR) gene. Notably, exon 1.5 transcripts were expressed most abundantly in the heart, while expression of exon 1.3-containing transcripts was greatest in the kidneys and small intestine. Expression of exon 1.3 mRNAs within the kidneys was most abundant in the renal cortex, and increased during gestation. A comparative analysis revealed a human homologue to exon 1.3, hE1N2, which was also principally transcribed in the kidneys and small intestines, and an exon hE1N3 was only expressed in the kidneys of humans. Promoter alignment revealed conserved motifs within the proximal promoter upstream of exon 1.3, including putative binding sites for hepatocyte nuclear factor-1 and Sp1. Together, these results highlight the diverse, conserved and tissue-specific regulation of PRLR expression in the targets for PRL, which may function to coordinate complex physiological states such as lactation and osmoregulation. PMID:25358647

  7. Comparative analysis of the expressed genome of the infective juvenile entomopathogenic nematode, Heterorhabditis bacteriophora.

    PubMed

    Sandhu, Sukhinder K; Jagdale, Ganpati B; Hogenhout, Saskia A; Grewal, Parwinder S

    2006-02-01

    We report the first cDNA-sequencing project of the entomopathogenic nematode, Heterorhabditis bacteriophora. A total of 1246 expressed sequence tags (ESTs) were generated by random sequencing of clones from a cDNA library of the infective juvenile stage. The ESTs were annotated resulting in 1072 useful ESTs that were categorized into functional categories according to Kyoto Encyclopedia of Genes and Genomes. Approximately 459 of 1072 ESTs (43%) had significant similarities to annotated sequences in GenBank. Of these, 417 had significant similarities to the free-living nematode Caenorhanditis elegans proteins. Most ESTs (18%) belonged to the genetic information processing category followed by metabolism (15% ESTs) and environmental information processing (15%) pathways. Several interesting ESTs were found that may have roles in the infectivity and survival of infective juveniles. These included proteases, dauer pathway genes (akt-1, pdk-1 & daf-7) and aging and stress resistance genes such as superoxide dismutase (sod-4), heat shock genes (hsp-4 & hsp-6), and eat genes, and signaling proteins like G-protein coupled receptors, regulators of G-protein signaling (rgs), and serine/threonine kinases. Other interesting ESTs include systemic RNAi defective protein (sid-1), ribonuclease III family members (rnh-2 &rnc) and transposase gene (Tc3A). About 67% of the ESTs did not find matches in any of the searched databases suggesting potentially novel genes in this enomopathogenic nematode. Note: Sequences described in this paper have been deposited in Genbank under the accessions DN 152655-DN 152999, and DN 153000-DN 153726. PMID:16414368

  8. Ebolavirus comparative genomics

    DOE PAGESBeta

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; Uberbacher, Edward C.; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; et al

    2015-07-14

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. We examine the dynamics of this genome, comparing more than one hundred currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus, and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of themore » same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP), and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. In conclusion, this information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies.« less

  9. Ebolavirus comparative genomics

    PubMed Central

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; Uberbacher, Edward C.; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S.; Pedersen, Thomas D.; Wassenaar, Trudy M.; Ussery, David W.

    2015-01-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP) and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. This information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies. This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). PMID:26175035

  10. Ebolavirus comparative genomics.

    PubMed

    Jun, Se-Ran; Leuze, Michael R; Nookaew, Intawat; Uberbacher, Edward C; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S; Pedersen, Thomas D; Wassenaar, Trudy M; Ussery, David W

    2015-09-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP) and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. This information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies.This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). PMID:26175035

  11. Comparative genomic analyses in Asparagus.

    PubMed

    Kuhl, Joseph C; Havey, Michael J; Martin, William J; Cheung, Foo; Yuan, Qiaoping; Landherr, Lena; Hu, Yi; Leebens-Mack, James; Town, Christopher D; Sink, Kenneth C

    2005-12-01

    Garden asparagus (Asparagus officinalis L.) belongs to the monocot family Asparagaceae in the order Asparagales. Onion (Allium cepa L.) and Asparagus officinalis are 2 of the most economically important plants of the core Asparagales, a well supported monophyletic group within the Asparagales. Coding regions in onion have lower GC contents than the grasses. We compared the GC content of 3374 unique expressed sequence tags (ESTs) from A. officinalis with Lycoris longituba and onion (both members of the core Asparagales), Acorus americanus (sister to all other monocots), the grasses, and Arabidopsis. Although ESTs in A. officinalis and Acorus had a higher average GC content than Arabidopsis, Lycoris, and onion, all were clearly lower than the grasses. The Asparagaceae have the smallest nuclear genomes among all plants in the core Asparagales, which typically have huge genomes. Within the Asparagaceae, European Asparagus species have approximately twice the nuclear DNA of that of southern African Asparagus species. We cloned and sequenced 20 genomic amplicons from European A. officinalis and the southern African species Asparagus plumosus and observed no clear evidence for a recent genome doubling in A. officinalis relative to A. plumosus. These results indicate that members of the genus Asparagus with smaller genomes may be useful genomic models for plants in the core Asparagales. PMID:16391674

  12. Identification of a novel enhancer of brain expression near the apoE gene cluster by comparative genomics

    SciTech Connect

    Zheng, Ping; Pennacchio, Len A.; Goff, Wilfried Le; Rubin, Edward M.; Smith, Jonathan D.

    2003-10-01

    Comparative analysis of the human and mouse genomic sequences downstream of the apolipoprotein E gene (APOE) revealed a highly conserved element with previously undefined function. In reporter gene transfection studies, this element which is located f42 kb distal to APOE was found to have silencer activity in a subset of cell lines examined. Analysis of transgenic mice containing a fusion construct linking this distal 631 bp conserved element to a reporter gene comprised of the human APOE gene with its proximal promoter resulted in robust brain expression of the transgenic human apoE mRNA in three independent transgenic lines, supporting the identification of a novel brain controlling region (BCR). Further studies using immunohistochemistry revealed widespread human apoE localization throughout the brains of the BCR-apoE transgenic mice with prominent expression in the cortex and diencephalon. In addition, double-label immunofluorescence performed on brain sections and cultures of primary cortical cells localized human apoE protein to cortical neurons and microglia. These studies demonstrate that comparative sequence analysis is a successful strategy to predict candidate regulatory regions in vivo, although they do not imply that this element controls apoE expression physiologically.

  13. Aspergillus parasiticus SU-1 genome sequence, predicted chromosome structure, and comparative gene expression under aflatoxin-inducing conditions: evidence that differential expression contributes to species phenotype.

    PubMed

    Linz, John E; Wee, Josephine; Roze, Ludmila V

    2014-08-01

    The filamentous fungi Aspergillus parasiticus and Aspergillus flavus produce the carcinogenic secondary metabolite aflatoxin on susceptible crops. These species differ in the quantity of aflatoxins B1, B2, G1, and G2 produced in culture, in the ability to produce the mycotoxin cyclopiazonic acid, and in morphology of mycelia and conidiospores. To understand the genetic basis for differences in biochemistry and morphology, we conducted next-generation sequence (NGS) analysis of the A. parasiticus strain SU-1 genome and comparative gene expression (RNA sequence analysis [RNA Seq]) analysis of A. parasiticus SU-1 and A. flavus strain NRRL 3357 (3357) grown under aflatoxin-inducing and -noninducing culture conditions. Although A. parasiticus SU-1 and A. flavus 3357 are highly similar in genome structure and gene organization, we observed differences in the presence of specific mycotoxin gene clusters and differential expression of specific mycotoxin genes and gene clusters that help explain differences in the type and quantity of mycotoxins synthesized. Using computer-aided analysis of secondary metabolite clusters (antiSMASH), we demonstrated that A. parasiticus SU-1 and A. flavus 3357 may carry up to 93 secondary metabolite gene clusters, and surprisingly, up to 10% of the genome appears to be dedicated to secondary metabolite synthesis. The data also suggest that fungus-specific zinc binuclear cluster (C6) transcription factors play an important role in regulation of secondary metabolite cluster expression. Finally, we identified uniquely expressed genes in A. parasiticus SU-1 that encode C6 transcription factors and genes involved in secondary metabolism and stress response/cellular defense. Future work will focus on these differentially expressed A. parasiticus SU-1 loci to reveal their role in determining distinct species characteristics. PMID:24951444

  14. PLEXdb: Plant and Pathogen Expression Database and Tools for Comparative and Functional Genomics Analysis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    PLEXdb is a plant expression database that supports all Affymetrix microarray designs for plants and plant pathogens. PLEXdb provides annotation and hand-curated microarray data. Experiments deposited in PLEXdb are checked for MIAME/Plant compliance and completeness, then processed by normalizing th...

  15. Comparative genomics reveals tissue-specific regulation of prolactin receptor gene expression

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Prolactin (PRL), acting via the prolactin receptor, fulfills a diversity of biological functions including the maintenance of solute balance and mineral homeostasis via tissues such as the heart, kidneys and intestine. Expression and activity of the prolactin receptor (PRLR) is regulated by various ...

  16. Genome-wide expression analysis comparing hypertrophic changes in normal and dysferlinopathy mice

    PubMed Central

    Lee, Yun-Sil; Talbot, C. Conover; Lee, Se-Jin

    2015-01-01

    Because myostatin normally limits skeletal muscle growth, there are extensive efforts to develop myostatin inhibitors for clinical use. One potential concern is that in muscle degenerative diseases, inducing hypertrophy may increase stress on dystrophic fibers. Our study shows that blocking this pathway in dysferlin deficient mice results in early improvement in histopathology but ultimately accelerates muscle degeneration. Hence, benefits of this approach should be weighed against these potential detrimental effects. Here, we present detailed experimental methods and analysis for the gene expression profiling described in our recently published study in Human Molecular Genetics (Lee et al., 2015). Our data sets have been deposited in the Gene Expression Omnibus (GEO) database (GSE62945) and are available at http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE62945. Our data provide a resource for exploring molecular mechanisms that are related to hypertrophy-induced, accelerated muscular degeneration in dysferlinopathy. PMID:26697388

  17. Genome-wide expression analysis comparing hypertrophic changes in normal and dysferlinopathy mice.

    PubMed

    Lee, Yun-Sil; Talbot, C Conover; Lee, Se-Jin

    2015-12-01

    Because myostatin normally limits skeletal muscle growth, there are extensive efforts to develop myostatin inhibitors for clinical use. One potential concern is that in muscle degenerative diseases, inducing hypertrophy may increase stress on dystrophic fibers. Our study shows that blocking this pathway in dysferlin deficient mice results in early improvement in histopathology but ultimately accelerates muscle degeneration. Hence, benefits of this approach should be weighed against these potential detrimental effects. Here, we present detailed experimental methods and analysis for the gene expression profiling described in our recently published study in Human Molecular Genetics (Lee et al., 2015). Our data sets have been deposited in the Gene Expression Omnibus (GEO) database (GSE62945) and are available at http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE62945. Our data provide a resource for exploring molecular mechanisms that are related to hypertrophy-induced, accelerated muscular degeneration in dysferlinopathy. PMID:26697388

  18. Phytozome Comparative Plant Genomics Portal

    SciTech Connect

    Goodstein, David; Batra, Sajeev; Carlson, Joseph; Hayes, Richard; Phillips, Jeremy; Shu, Shengqiang; Schmutz, Jeremy; Rokhsar, Daniel

    2014-09-09

    The Dept. of Energy Joint Genome Institute is a genomics user facility supporting DOE mission science in the areas of Bioenergy, Carbon Cycling, and Biogeochemistry. The Plant Program at the JGI applies genomic, analytical, computational and informatics platforms and methods to: 1. Understand and accelerate the improvement (domestication) of bioenergy crops 2. Characterize and moderate plant response to climate change 3. Use comparative genomics to identify constrained elements and infer gene function 4. Build high quality genomic resource platforms of JGI Plant Flagship genomes for functional and experimental work 5. Expand functional genomic resources for Plant Flagship genomes

  19. COMPARATIVE GENOMICS IN LEGUMES

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The legume plant family will soon include three sequenced genomes. The majority of the gene-containing portions of the model legumes Medicago truncatula and Lotus japonicus have been sequenced in clone-by-clone projects, and the sequencing of the soybean genome is underway in a whole-genome shotgun ...

  20. Identification of differentially expressed small non-coding RNAs in the legume endosymbiont Sinorhizobium meliloti by comparative genomics.

    PubMed

    del Val, Coral; Rivas, Elena; Torres-Quesada, Omar; Toro, Nicolás; Jiménez-Zurdo, José I

    2007-12-01

    Bacterial small non-coding RNAs (sRNAs) are being recognized as novel widespread regulators of gene expression in response to environmental signals. Here, we present the first search for sRNA-encoding genes in the nitrogen-fixing endosymbiont Sinorhizobium meliloti, performed by a genome-wide computational analysis of its intergenic regions. Comparative sequence data from eight related alpha-proteobacteria were obtained, and the interspecies pairwise alignments were scored with the programs eQRNA and RNAz as complementary predictive tools to identify conserved and stable secondary structures corresponding to putative non-coding RNAs. Northern experiments confirmed that eight of the predicted loci, selected among the original 32 candidates as most probable sRNA genes, expressed small transcripts. This result supports the combined use of eQRNA and RNAz as a robust strategy to identify novel sRNAs in bacteria. Furthermore, seven of the transcripts accumulated differentially in free-living and symbiotic conditions. Experimental mapping of the 5'-ends of the detected transcripts revealed that their encoding genes are organized in autonomous transcription units with recognizable promoter and, in most cases, termination signatures. These findings suggest novel regulatory functions for sRNAs related to the interactions of alpha-proteobacteria with their eukaryotic hosts. PMID:17971083

  1. Identification of differentially expressed small non-coding RNAs in the legume endosymbiont Sinorhizobium meliloti by comparative genomics

    PubMed Central

    del Val, Coral; Rivas, Elena; Torres-Quesada, Omar; Toro, Nicolás; Jiménez-Zurdo, José I

    2007-01-01

    Bacterial small non-coding RNAs (sRNAs) are being recognized as novel widespread regulators of gene expression in response to environmental signals. Here, we present the first search for sRNA-encoding genes in the nitrogen-fixing endosymbiont Sinorhizobium meliloti, performed by a genome-wide computational analysis of its intergenic regions. Comparative sequence data from eight related α-proteobacteria were obtained, and the interspecies pairwise alignments were scored with the programs eQRNA and RNAz as complementary predictive tools to identify conserved and stable secondary structures corresponding to putative non-coding RNAs. Northern experiments confirmed that eight of the predicted loci, selected among the original 32 candidates as most probable sRNA genes, expressed small transcripts. This result supports the combined use of eQRNA and RNAz as a robust strategy to identify novel sRNAs in bacteria. Furthermore, seven of the transcripts accumulated differentially in free-living and symbiotic conditions. Experimental mapping of the 5′-ends of the detected transcripts revealed that their encoding genes are organized in autonomous transcription units with recognizable promoter and, in most cases, termination signatures. These findings suggest novel regulatory functions for sRNAs related to the interactions of α-proteobacteria with their eukaryotic hosts. PMID:17971083

  2. [Research proceedings on primate comparative genomics].

    PubMed

    Liao, Cheng-Hong; Su, Bing

    2012-02-01

    With the accomplishment of genome sequencing of human, chimpanzee and other primates, there has been a great amount of primate genome information accumulated. Primate comparative genomics has become a new research field at current genome era. In this article, we reviewed recent progress in phylogeny, genome structure and gene expression of human and nonhuman primates, and we elaborated the major biological differences among human, chimpanzee and other non-human primate species, which is informative in revealing the mechanism of human evolution. PMID:22345018

  3. Comparative genomics of Brassicaceae crops

    PubMed Central

    Sharma, Ashutosh; Li, Xiaonan; Lim, Yong Pyo

    2014-01-01

    The family Brassicaceae is one of the major groups of the plant kingdom and comprises diverse species of great economic, agronomic and scientific importance, including the model plant Arabidopsis. The sequencing of the Arabidopsis genome has revolutionized our knowledge in the field of plant biology and provides a foundation in genomics and comparative biology. Genomic resources have been utilized in Brassica for diversity analyses, construction of genetic maps and identification of agronomic traits. In Brassicaceae, comparative sequence analysis across the species has been utilized to understand genome structure, evolution and the detection of conserved genomic segments. In this review, we focus on the progress made in genetic resource development, genome sequencing and comparative mapping in Brassica and related species. The utilization of genomic resources and next-generation sequencing approaches in improvement of Brassica crops is also discussed. PMID:24987286

  4. Comparative genomics of Brassicaceae crops.

    PubMed

    Sharma, Ashutosh; Li, Xiaonan; Lim, Yong Pyo

    2014-05-01

    The family Brassicaceae is one of the major groups of the plant kingdom and comprises diverse species of great economic, agronomic and scientific importance, including the model plant Arabidopsis. The sequencing of the Arabidopsis genome has revolutionized our knowledge in the field of plant biology and provides a foundation in genomics and comparative biology. Genomic resources have been utilized in Brassica for diversity analyses, construction of genetic maps and identification of agronomic traits. In Brassicaceae, comparative sequence analysis across the species has been utilized to understand genome structure, evolution and the detection of conserved genomic segments. In this review, we focus on the progress made in genetic resource development, genome sequencing and comparative mapping in Brassica and related species. The utilization of genomic resources and next-generation sequencing approaches in improvement of Brassica crops is also discussed. PMID:24987286

  5. Identification of candidate genes in Populus cell wall biosynthesis using text-mining, co-expression network and comparative genomics

    SciTech Connect

    Yang, Xiaohan; Ye, Chuyu; Bisaria, Anjali; Tuskan, Gerald A; Kalluri, Udaya C

    2011-01-01

    Populus is an important bioenergy crop for bioethanol production. A greater understanding of cell wall biosynthesis processes is critical in reducing biomass recalcitrance, a major hindrance in efficient generation of ethanol from lignocellulosic biomass. Here, we report the identification of candidate cell wall biosynthesis genes through the development and application of a novel bioinformatics pipeline. As a first step, via text-mining of PubMed publications, we obtained 121 Arabidopsis genes that had the experimental evidences supporting their involvement in cell wall biosynthesis or remodeling. The 121 genes were then used as bait genes to query an Arabidopsis co-expression database and additional genes were identified as neighbors of the bait genes in the network, increasing the number of genes to 548. The 548 Arabidopsis genes were then used to re-query the Arabidopsis co-expression database and re-construct a network that captured additional network neighbors, expanding to a total of 694 genes. The 694 Arabidopsis genes were computationally divided into 22 clusters. Queries of the Populus genome using the Arabidopsis genes revealed 817 Populus orthologs. Functional analysis of gene ontology and tissue-specific gene expression indicated that these Arabidopsis and Populus genes are high likelihood candidates for functional genomics in relation to cell wall biosynthesis.

  6. An Expressed Sequence Tag (EST)-enriched genetic map of turbot (Scophthalmus maximus): a useful framework for comparative genomics across model and farmed teleosts

    PubMed Central

    2012-01-01

    Background The turbot (Scophthalmus maximus) is a relevant species in European aquaculture. The small turbot genome provides a source for genomics strategies to use in order to understand the genetic basis of productive traits, particularly those related to sex, growth and pathogen resistance. Genetic maps represent essential genomic screening tools allowing to localize quantitative trait loci (QTL) and to identify candidate genes through comparative mapping. This information is the backbone to develop marker-assisted selection (MAS) programs in aquaculture. Expressed sequenced tag (EST) resources have largely increased in turbot, thus supplying numerous type I markers suitable for extending the previous linkage map, which was mostly based on anonymous loci. The aim of this study was to construct a higher-resolution turbot genetic map using EST-linked markers, which will turn out to be useful for comparative mapping studies. Results A consensus gene-enriched genetic map of the turbot was constructed using 463 SNP and microsatellite markers in nine reference families. This map contains 438 markers, 180 EST-linked, clustered at 24 linkage groups. Linkage and comparative genomics evidences suggested additional linkage group fusions toward the consolidation of turbot map according to karyotype information. The linkage map showed a total length of 1402.7 cM with low average intermarker distance (3.7 cM; ~2 Mb). A global 1.6:1 female-to-male recombination frequency (RF) ratio was observed, although largely variable among linkage groups and chromosome regions. Comparative sequence analysis revealed large macrosyntenic patterns against model teleost genomes, significant hits decreasing from stickleback (54%) to zebrafish (20%). Comparative mapping supported particular chromosome rearrangements within Acanthopterygii and aided to assign unallocated markers to specific turbot linkage groups. Conclusions The new gene-enriched high-resolution turbot map represents a

  7. High-resolution detection of recurrent aberrations in lung adenocarcinomas by array comparative genomic hybridization and expression analysis of selective genes by quantitative PCR.

    PubMed

    Zhu, Hong; Wong, Maria Pik; Tin, Vicky

    2014-06-01

    Genomic abnormalities are the hallmark of cancers and may harbor potential candidate genes important for cancer development and progression. We performed array comparative genomic hybridization (array CGH) on 36 cases of primary lung adenocarcinoma (AD) using an array containing 2621 BAC or PAC clones spanning the genome at an average interval of 1 Mb. Array CGH identified the commonest aberrations consisting of DNA gains within 1p, 1q, 5p, 5q, 7p, 7q, 8q, 11q, 12p, 13q, 16p, 17q, 20q, and losses with 6q, 9p, 10q and 18q. High-level copy gains involved mainly 7p21-p15 and 20q13.3. Dual color fluorescence in situ hybridization (FISH) was performed on a selective locus for validation of array CGH results. Genomic aberrations were compared with different clinicopathological features and a trend of higher number of aberrations in tumors with aggressive phenotypes and current tobacco exposure was identified. According to array CGH data, 23 candidate genes were selected for quantitative PCR (qPCR) analysis. The concordance observed between the genomic and expression changes in most of the genes suggested that they could be candidate cancer-related genes that contributed to the development of lung AD. PMID:24728343

  8. Identification, expression, and comparative genomic analysis of the IPT and CKX gene families in Chinese cabbage (Brassica rapa ssp. pekinensis)

    PubMed Central

    2013-01-01

    Background Cytokinins (CKs) have significant roles in various aspects of plant growth and development, and they are also involved in plant stress adaptations. The fine-tuning of the controlled CK levels in individual tissues, cells, and organelles is properly maintained by isopentenyl transferases (IPTs) and cytokinin oxidase/dehydrogenases (CKXs). Chinese cabbage is one of the most economically important vegetable crops worldwide. The whole genome sequencing of Brassica rapa enables us to perform the genome-wide identification and functional analysis of the IPT and CKX gene families. Results In this study, a total of 13 BrIPT genes and 12 BrCKX genes were identified. The gene structures, conserved domains and phylogenetic relationships were analyzed. The isoelectric point, subcellular localization and glycosylation sites of the proteins were predicted. Segmental duplicates were found in both BrIPT and BrCKX gene families. We also analyzed evolutionary patterns and divergence of the IPT and CKX genes in the Cruciferae family. The transcription levels of BrIPT and BrCKX genes were analyzed to obtain an initial picture of the functions of these genes. Abiotic stress elements related to adverse environmental stimuli were found in the promoter regions of BrIPT and BrCKX genes and they were confirmed to respond to drought and high salinity conditions. The effects of 6-BA and ABA on the expressions of BrIPT and BrCKX genes were also investigated. Conclusions The expansion of BrIPT and BrCKX genes after speciation from Arabidopsis thaliana is mainly attributed to segmental duplication events during the whole genome triplication (WGT) and substantial duplicated genes are lost during the long evolutionary history. Genes produced by segmental duplication events have changed their expression patterns or may adopted new functions and thus are obtained. BrIPT and BrCKX genes respond well to drought and high salinity stresses, and their transcripts are affected by exogenous

  9. Genomic expression during human myelopoiesis

    PubMed Central

    Ferrari, Francesco; Bortoluzzi, Stefania; Coppe, Alessandro; Basso, Dario; Bicciato, Silvio; Zini, Roberta; Gemelli, Claudia; Danieli, Gian Antonio; Ferrari, Sergio

    2007-01-01

    Background Human myelopoiesis is an exciting biological model for cellular differentiation since it represents a plastic process where multipotent stem cells gradually limit their differentiation potential, generating different precursor cells which finally evolve into distinct terminally differentiated cells. This study aimed at investigating the genomic expression during myeloid differentiation through a computational approach that integrates gene expression profiles with functional information and genome organization. Results Gene expression data from 24 experiments for 8 different cell types of the human myelopoietic lineage were used to generate an integrated myelopoiesis dataset of 9,425 genes, each reliably associated to a unique genomic position and chromosomal coordinate. Lists of genes constitutively expressed or silent during myelopoiesis and of genes differentially expressed in commitment phase of myelopoiesis were first identified using a classical data analysis procedure. Then, the genomic distribution of myelopoiesis genes was investigated integrating transcriptional and functional characteristics of genes. This approach allowed identifying specific chromosomal regions significantly highly or weakly expressed, and clusters of differentially expressed genes and of transcripts related to specific functional modules. Conclusion The analysis of genomic expression during human myelopoiesis using an integrative computational approach allowed discovering important relationships between genomic position, biological function and expression patterns and highlighting chromatin domains, including genes with coordinated expression and lineage-specific functions. PMID:17683550

  10. Comparative genomics for biodiversity conservation

    PubMed Central

    Grueber, Catherine E.

    2015-01-01

    Genomic approaches are gathering momentum in biology and emerging opportunities lie in the creative use of comparative molecular methods for revealing the processes that influence diversity of wildlife. However, few comparative genomic studies are performed with explicit and specific objectives to aid conservation of wild populations. Here I provide a brief overview of comparative genomic approaches that offer specific benefits to biodiversity conservation. Because conservation examples are few, I draw on research from other areas to demonstrate how comparing genomic data across taxa may be used to inform the characterisation of conservation units and studies of hybridisation, as well as studies that provide conservation outcomes from a better understanding of the drivers of divergence. A comparative approach can also provide valuable insight into the threatening processes that impact rare species, such as emerging diseases and their management in conservation. In addition to these opportunities, I note areas where additional research is warranted. Overall, comparing and contrasting the genomic composition of threatened and other species provide several useful tools for helping to preserve the molecular biodiversity of the global ecosystem. PMID:26106461

  11. Comparative genomic analysis of transgenic poplar dwarf mutant reveals numerous differentially expressed genes involved in energy flow.

    PubMed

    Chen, Su; Bai, Shuang; Liu, Guifeng; Li, Huiyu; Jiang, Jing

    2014-01-01

    In our previous research, the Tamarix androssowii LEA gene (Tamarix androssowii late embryogenesis abundant protein Mrna, GenBank ID: DQ663481) was transferred into Populus simonii × Populus nigra. Among the eleven transgenic lines, one exhibited a dwarf phenotype compared to the wild type and other transgenic lines, named dwf1. To uncover the mechanisms underlying this phenotype, digital gene expression libraries were produced from dwf1, wild-type, and other normal transgenic lines, XL-5 and XL-6. Gene expression profile analysis indicated that dwf1 had a unique gene expression pattern in comparison to the other two transgenic lines. Finally, a total of 1246 dwf1-unique differentially expressed genes were identified. These genes were further subjected to gene ontology and pathway analysis. Results indicated that photosynthesis and carbohydrate metabolism related genes were significantly affected. In addition, many transcription factors genes were also differentially expressed in dwf1. These various differentially expressed genes may be critical for dwarf mutant formation; thus, the findings presented here might provide insight for our understanding of the mechanisms of tree growth and development. PMID:25192286

  12. Screening of Tissue-Specific Genes and Promoters in Tomato by Comparing Genome Wide Expression Profiles of Arabidopsis Orthologues

    PubMed Central

    Lim, Chan Ju; Lee, Ha Yeon; Kim, Woong Bom; Lee, Bok-Sim; Kim, Jungeun; Ahmad, Raza; Kim, Hyun A; Yi, So Young; Hur, Cheol-Goo; Kwon, Suk-Yoon

    2012-01-01

    Constitutive overexpression of transgenes occasionally interferes with normal growth and developmental processes in plants. Thus, the development of tissue-specific promoters that drive transgene expression has become agriculturally important. To identify tomato tissue-specific promoters, tissue-specific genes were screened using a series of in silico-based and experimental procedures, including genome-wide orthologue searches of tomato and Arabidopsis databases, isolation of tissue-specific candidates using an Arabidopsis microarray database, and validation of tissue specificity by reverse transcription-polymerase chain reaction (RT-PCR) analysis and promoter assay. Using these procedures, we found 311 tissue-specific candidate genes and validated 10 tissue-specific genes by RT-PCR. Among these identified genes, histochemical analysis of five isolated promoter::GUS transgenic tomato and Arabidopsis plants revealed that their promoters have different but distinct tissue-specific activities in anther, fruit, and root, respectively. Therefore, it appears these in silico-based screening approaches in addition to the identification of new tissue-specific genes and promoters will be helpful for the further development of tailored crop development. PMID:22699756

  13. Enhancer Identification through Comparative Genomics

    SciTech Connect

    Visel, Axel; Bristow, James; Pennacchio, Len A.

    2006-10-01

    With the availability of genomic sequence from numerousvertebrates, a paradigm shift has occurred in the identification ofdistant-acting gene regulatory elements. In contrast to traditionalgene-centric studies in which investigators randomly scanned genomicfragments that flank genes of interest in functional assays, the modernapproach begins electronically with publicly available comparativesequence datasets that provide investigators with prioritized lists ofputative functional sequences based on their evolutionary conservation.However, although a large number of tools and resources are nowavailable, application of comparative genomic approaches remains far fromtrivial. In particular, it requires users to dynamically consider thespecies and methods for comparison depending on the specific biologicalquestion under investigation. While there is currently no single generalrule to this end, it is clear that when applied appropriately,comparative genomic approaches exponentially increase our power ingenerating biological hypotheses for subsequent experimentaltesting.

  14. Enhancer Identification through Comparative Genomics

    PubMed Central

    Visel, Axel; Bristow, James; Pennacchio, Len A.

    2007-01-01

    With the availability of genomic sequence from numerous vertebrates, a paradigm shift has occurred in the identification of distant-acting gene regulatory elements. In contrast to traditional gene-centric studies in which investigators randomly scanned genomic fragments that flank genes of interest in functional assays, the modern approach begins electronically with publicly available comparative sequence datasets that provide investigators with prioritized lists of putative functional sequences based on their evolutionary conservation. However, although a large number of tools and resources are now available, application of comparative genomic approaches remains far from trivial. In particular, it requires users to dynamically consider the species and methods for comparison depending on the specific biological question under investigation. While there is currently no single general rule to this end, it is clear that when applied appropriately, comparative genomic approaches exponentially increase our power in generating biological hypotheses for subsequent experimental testing. It is anticipated that cardiac-related genes and the identification of their distant-acting transcriptional enhancers are particularly poised to benefit from these modern capabilities. PMID:17276707

  15. Comparative genomic analysis and expression of the APETALA2-like genes from barley, wheat, and barley-wheat amphiploids

    PubMed Central

    Gil-Humanes, Javier; Pistón, Fernando; Martín, Antonio; Barro, Francisco

    2009-01-01

    Background The APETALA2-like genes form a large multi-gene family of transcription factors which play an important role during the plant life cycle, being key regulators of many developmental processes. Many studies in Arabidopsis have revealed that the APETALA2 (AP2) gene is implicated in the establishment of floral meristem and floral organ identity as well as temporal and spatial regulation of flower homeotic gene expression. Results In this work, we have cloned and characterised the AP2-like gene from accessions of Hordeum chilense and Hordeum vulgare, wild and domesticated barley, respectively, and compared with other AP2 homoeologous genes, including the Q gene in wheat. The Hordeum AP2-like genes contain two plant-specific DNA binding motifs called AP2 domains, as does the Q gene of wheat. We confirm that the H. chilense AP2-like gene is located on chromosome 5Hch. Patterns of expression of the AP2-like genes were examined in floral organs and other tissues in barley, wheat and in tritordeum amphiploids (barley × wheat hybrids). In tritordeum amphiploids, the level of transcription of the barley AP2-like gene was lower than in its barley parental and the chromosome substitutions 1D/1Hch and 2D/2Hch were seen to modify AP2 gene expression levels. Conclusion The results are of interest in order to understand the role of the AP2-like gene in the spike morphology of barley and wheat, and to understand the regulation of this gene in the amphiploids obtained from barley-wheat crossing. This information may have application in cereal breeding programs to up- or down-regulate the expression of AP2-like genes in order to modify spike characteristics and to obtain free-threshing plants. PMID:19480686

  16. Comparative genomics: methods and applications

    NASA Astrophysics Data System (ADS)

    Haubold, Bernhard; Wiehe, Thomas

    2004-09-01

    Interpreting the functional content of a given genomic sequence is one of the central challenges of biology today. Perhaps the most promising approach to this problem is based on the comparative method of classic biology in the modern guise of sequence comparison. For instance, protein-coding regions tend to be conserved between species. Hence, a simple method for distinguishing a functional exon from the chance absence of stop codons is to investigate its homologue from closely related species. Predicting regulatory elements is even more difficult than exon prediction, but again, comparisons pinpointing conserved sequence motifs upstream of translation start sites are helping to unravel gene regulatory networks. In addition to interspecific studies, intraspecific sequence comparison yields insights into the evolutionary forces that have acted on a species in the past. Of particular interest here is the identification of selection events such as selective sweeps. Both intra- and interspecific sequence comparisons are based on a variety of computational methods, including alignment, phylogenetic reconstruction, and coalescent theory. This article surveys the biology and the central computational ideas applied in recent comparative genomics projects. We argue that the most fruitful method of understanding the functional content of genomes is to study them in the context of related genomic sequences. In particular, such a study may reveal selection, a fundamental pointer to biological relevance.

  17. Comparative Genomics of Carp Herpesviruses

    PubMed Central

    Kurobe, Tomofumi; Gatherer, Derek; Cunningham, Charles; Korf, Ian; Fukuda, Hideo; Hedrick, Ronald P.; Waltzek, Thomas B.

    2013-01-01

    Three alloherpesviruses are known to cause disease in cyprinid fish: cyprinid herpesviruses 1 and 3 (CyHV1 and CyHV3) in common carp and koi and cyprinid herpesvirus 2 (CyHV2) in goldfish. We have determined the genome sequences of CyHV1 and CyHV2 and compared them with the published CyHV3 sequence. The CyHV1 and CyHV2 genomes are 291,144 and 290,304 bp, respectively, in size, and thus the CyHV3 genome, at 295,146 bp, remains the largest recorded among the herpesviruses. Each of the three genomes consists of a unique region flanked at each terminus by a sizeable direct repeat. The CyHV1, CyHV2, and CyHV3 genomes are predicted to contain 137, 150, and 155 unique, functional protein-coding genes, respectively, of which six, four, and eight, respectively, are duplicated in the terminal repeat. The three viruses share 120 orthologous genes in a largely colinear arrangement, of which up to 55 are also conserved in the other member of the genus Cyprinivirus, anguillid herpesvirus 1. Twelve genes are conserved convincingly in all sequenced alloherpesviruses, and two others are conserved marginally. The reference CyHV3 strain has been reported to contain five fragmented genes that are presumably nonfunctional. The CyHV2 strain has two fragmented genes, and the CyHV1 strain has none. CyHV1, CyHV2, and CyHV3 have five, six, and five families of paralogous genes, respectively. One family unique to CyHV1 is related to cellular JUNB, which encodes a transcription factor involved in oncogenesis. To our knowledge, this is the first time that JUNB-related sequences have been reported in a herpesvirus. PMID:23269803

  18. Comparative genomics of carp herpesviruses.

    PubMed

    Davison, Andrew J; Kurobe, Tomofumi; Gatherer, Derek; Cunningham, Charles; Korf, Ian; Fukuda, Hideo; Hedrick, Ronald P; Waltzek, Thomas B

    2013-03-01

    Three alloherpesviruses are known to cause disease in cyprinid fish: cyprinid herpesviruses 1 and 3 (CyHV1 and CyHV3) in common carp and koi and cyprinid herpesvirus 2 (CyHV2) in goldfish. We have determined the genome sequences of CyHV1 and CyHV2 and compared them with the published CyHV3 sequence. The CyHV1 and CyHV2 genomes are 291,144 and 290,304 bp, respectively, in size, and thus the CyHV3 genome, at 295,146 bp, remains the largest recorded among the herpesviruses. Each of the three genomes consists of a unique region flanked at each terminus by a sizeable direct repeat. The CyHV1, CyHV2, and CyHV3 genomes are predicted to contain 137, 150, and 155 unique, functional protein-coding genes, respectively, of which six, four, and eight, respectively, are duplicated in the terminal repeat. The three viruses share 120 orthologous genes in a largely colinear arrangement, of which up to 55 are also conserved in the other member of the genus Cyprinivirus, anguillid herpesvirus 1. Twelve genes are conserved convincingly in all sequenced alloherpesviruses, and two others are conserved marginally. The reference CyHV3 strain has been reported to contain five fragmented genes that are presumably nonfunctional. The CyHV2 strain has two fragmented genes, and the CyHV1 strain has none. CyHV1, CyHV2, and CyHV3 have five, six, and five families of paralogous genes, respectively. One family unique to CyHV1 is related to cellular JUNB, which encodes a transcription factor involved in oncogenesis. To our knowledge, this is the first time that JUNB-related sequences have been reported in a herpesvirus. PMID:23269803

  19. Comparative genomic hybridization: an overview.

    PubMed Central

    Houldsworth, J.; Chaganti, R. S.

    1994-01-01

    Comparative genomic hybridization (CGH) is a newly described molecular-cytogenetic assay that globally assays for chromosomal gains and losses in a genomic complement. In this assay, normal human metaphase chromosomes are competitively hybridized with two differentially labeled genomic DNAs (test and reference), which upon fluorescence microscopy, reveal the chromosomal locations of copy number changes in DNA sequences between the two complements. Application of CGH to DNAs extracted from fresh frozen specimens and cell lines of various tumor types has revealed a number of recurring chromosomal gains and losses that were undetected by traditional cytogenetic analysis. Few previously known sites were found to be in higher copy number, or lost by CGH, while many novel amplified regions were identified. These regions warrant further molecular genetic studies aimed at isolating the perturbed genes. Since CGH can also be performed on DNA extracted from formalin-fixed paraffin-embedded archived tumor specimens with few modifications, gains and losses of genetic material can be determined for specimens that would otherwise be unanalyzable. Prospective and retrospective application of CGH to tumor specimens would permit correlative studies to be performed, possibly identifying diagnostic and prognostic indicators of disease. CGH may also have a future role in detection and identification of chromosomal abnormalities in prenatal diagnosis and in dysmorphic anomalies. Images Figure 1 Figure 2 PMID:7992829

  20. Comparative primate genomics: emerging patterns of genome content and dynamics

    PubMed Central

    Rogers, Jeffrey; Gibbs, Richard A.

    2014-01-01

    Preface Advances in genome sequencing technologies have created new opportunities for comparative primate genomics. Genome assemblies have been published for several primates, with analyses of several others underway. Whole genome assemblies for the great apes provide remarkable new information about the evolutionary origins of the human genome and the processes involved. Genomic data for macaques and other nonhuman primates provide valuable insight into genetic similarities and differences among species used as models for disease-related research. This review summarizes current knowledge regarding primate genome content and dynamics and offers a series of goals for the near future. PMID:24709753

  1. Comparative primate genomics: emerging patterns of genome content and dynamics.

    PubMed

    Rogers, Jeffrey; Gibbs, Richard A

    2014-05-01

    Advances in genome sequencing technologies have created new opportunities for comparative primate genomics. Genome assemblies have been published for various primate species, and analyses of several others are underway. Whole-genome assemblies for the great apes provide remarkable new information about the evolutionary origins of the human genome and the processes involved. Genomic data for macaques and other non-human primates offer valuable insights into genetic similarities and differences among species that are used as models for disease-related research. This Review summarizes current knowledge regarding primate genome content and dynamics, and proposes a series of goals for the near future. PMID:24709753

  2. Cocoa/Cotton Comparative Genomics

    Technology Transfer Automated Retrieval System (TEKTRAN)

    With genome sequence from two members of the Malvaceae family recently made available, we are exploring syntenic relationships, gene content, and evolutionary trajectories between the cacao and cotton genomes. An assembly of cacao (Theobroma cacao) using Illumina and 454 sequence technology yielded ...

  3. Haemonchus contortus: Genome Structure, Organization and Comparative Genomics.

    PubMed

    Laing, R; Martinelli, A; Tracey, A; Holroyd, N; Gilleard, J S; Cotton, J A

    2016-01-01

    One of the first genome sequencing projects for a parasitic nematode was that for Haemonchus contortus. The open access data from the Wellcome Trust Sanger Institute provided a valuable early resource for the research community, particularly for the identification of specific genes and genetic markers. Later, a second sequencing project was initiated by the University of Melbourne, and the two draft genome sequences for H. contortus were published back-to-back in 2013. There is a pressing need for long-range genomic information for genetic mapping, population genetics and functional genomic studies, so we are continuing to improve the Wellcome Trust Sanger Institute assembly to provide a finished reference genome for H. contortus. This review describes this process, compares the H. contortus genome assemblies with draft genomes from other members of the strongylid group and discusses future directions for parasite genomics using the H. contortus model. PMID:27238013

  4. Comparative genomic hybridization (CGH) in genotoxicology.

    PubMed

    Baumgartner, Adolf

    2013-01-01

    In the past two decades comparative genomic hybridization (CGH) and array CGH have become crucial and indispensable tools in clinical diagnostics. Initially developed for the genome-wide screening of chromosomal imbalances in tumor cells, CGH as well as array CGH have also been employed in genotoxicology and most recently in toxicogenomics. The latter methodology allows a multi-endpoint analysis of how genes and proteins react to toxic agents revealing molecular mechanisms of toxicology. This chapter provides a background on the use of CGH and array CGH in the context of genotoxicology as well as a protocol for conventional CGH to understand the basic principles of CGH. Array CGH is still cost intensive and requires suitable analytical algorithms but might become the dominating assay in the future when more companies provide a large variety of different commercial DNA arrays/chips leading to lower costs for array CGH equipment as well as consumables such as DNA chips. As the amount of data generated with microarrays exponentially grows, the demand for powerful adaptive algorithms for analysis, competent databases, as well as a sound regulatory framework will also increase. Nevertheless, chromosomal and array CGH are being demonstrated to be effective tools for investigating copy number changes/variations in the whole genome, DNA expression patterns, as well as loss of heterozygosity after a genotoxic impact. This will lead to new insights into affected genes and the underlying structures of regulatory and signaling pathways in genotoxicology and could conclusively identify yet unknown harmful toxicants. PMID:23896881

  5. Comparative genomics of protoploid Saccharomycetaceae.

    PubMed

    Souciet, Jean-Luc; Dujon, Bernard; Gaillardin, Claude; Johnston, Mark; Baret, Philippe V; Cliften, Paul; Sherman, David J; Weissenbach, Jean; Westhof, Eric; Wincker, Patrick; Jubin, Claire; Poulain, Julie; Barbe, Valérie; Ségurens, Béatrice; Artiguenave, François; Anthouard, Véronique; Vacherie, Benoit; Val, Marie-Eve; Fulton, Robert S; Minx, Patrick; Wilson, Richard; Durrens, Pascal; Jean, Géraldine; Marck, Christian; Martin, Tiphaine; Nikolski, Macha; Rolland, Thomas; Seret, Marie-Line; Casarégola, Serge; Despons, Laurence; Fairhead, Cécile; Fischer, Gilles; Lafontaine, Ingrid; Leh, Véronique; Lemaire, Marc; de Montigny, Jacky; Neuvéglise, Cécile; Thierry, Agnès; Blanc-Lenfle, Isabelle; Bleykasten, Claudine; Diffels, Julie; Fritsch, Emilie; Frangeul, Lionel; Goëffon, Adrien; Jauniaux, Nicolas; Kachouri-Lafond, Rym; Payen, Célia; Potier, Serge; Pribylova, Lenka; Ozanne, Christophe; Richard, Guy-Franck; Sacerdot, Christine; Straub, Marie-Laure; Talla, Emmanuel

    2009-10-01

    Our knowledge of yeast genomes remains largely dominated by the extensive studies on Saccharomyces cerevisiae and the consequences of its ancestral duplication, leaving the evolution of the entire class of hemiascomycetes only partly explored. We concentrate here on five species of Saccharomycetaceae, a large subdivision of hemiascomycetes, that we call "protoploid" because they diverged from the S. cerevisiae lineage prior to its genome duplication. We determined the complete genome sequences of three of these species: Kluyveromyces (Lachancea) thermotolerans and Saccharomyces (Lachancea) kluyveri (two members of the newly described Lachancea clade), and Zygosaccharomyces rouxii. We included in our comparisons the previously available sequences of Kluyveromyces lactis and Ashbya (Eremothecium) gossypii. Despite their broad evolutionary range and significant individual variations in each lineage, the five protoploid Saccharomycetaceae share a core repertoire of approximately 3300 protein families and a high degree of conserved synteny. Synteny blocks were used to define gene orthology and to infer ancestors. Far from representing minimal genomes without redundancy, the five protoploid yeasts contain numerous copies of paralogous genes, either dispersed or in tandem arrays, that, altogether, constitute a third of each genome. Ancient, conserved paralogs as well as novel, lineage-specific paralogs were identified. PMID:19525356

  6. Glutathione S-Transferase Gene Family in Gossypium raimondii and G. arboreum: Comparative Genomic Study and their Expression under Salt Stress

    PubMed Central

    Dong, Yating; Li, Cong; Zhang, Yi; He, Qiuling; Daud, Muhammad K.; Chen, Jinhong; Zhu, Shuijin

    2016-01-01

    Glutathione S-transferases (GSTs) play versatile functions in multiple aspects of plant growth and development. A comprehensive genome-wide survey of this gene family in the genomes of G. raimondii and G. arboreum was carried out in this study. Based on phylogenetic analyses, the GST gene family of both two diploid cotton species could be divided into eight classes, and approximately all the GST genes within the same subfamily shared similar gene structure. Additionally, the gene structures between the orthologs were highly conserved. The chromosomal localization analyses revealed that GST genes were unevenly distributed across the genome in both G. raimondii and G. arboreum. Tandem duplication could be the major driver for the expansion of GST gene families. Meanwhile, the expression analysis for the selected 40 GST genes showed that they exhibited tissue-specific expression patterns and their expression were induced or repressed by salt stress. Those findings shed lights on the function and evolution of the GST gene family in Gossypium species. PMID:26904090

  7. Genetic, comparative genomic, and expression analyses of the Mc1r locus in the polychromatic Midas cichlid fish (Teleostei, Cichlidae Amphilophus sp.) species group.

    PubMed

    Henning, Frederico; Renz, Adina Josepha; Fukamachi, Shoji; Meyer, Axel

    2010-05-01

    Natural populations of the Midas cichlid species in several different crater lakes in Nicaragua exhibit a conspicuous color polymorphism. Most individuals are dark and the remaining have a gold coloration. The color morphs mate assortatively and sympatric population differentiation has been shown based on neutral molecular data. We investigated the color polymorphism using segregation analysis and a candidate gene approach. The segregation patterns observed in a mapping cross between a gold and a dark individual were consistent with a single dominant gene as a cause of the gold phenotype. This suggests that a simple genetic architecture underlies some of the speciation events in the Midas cichlids. We compared the expression levels of several candidate color genes Mc1r, Ednrb1, Slc45a2, and Tfap1a between the color morphs. Mc1r was found to be up regulated in the gold morph. Given its widespread association in color evolution and role on melanin synthesis, the Mc1r locus was further investigated using sequences derived from a genomic library. Comparative analysis revealed conserved synteny in relation to the majority of teleosts and highlighted several previously unidentified conserved non-coding elements (CNEs) in the upstream and downstream regions in the vicinity of Mc1r. The identification of the CNEs regions allowed the comparison of sequences from gold and dark specimens of natural populations. No polymorphisms were found between in the population sample and Mc1r showed no linkage to the gold phenotype in the mapping cross, demonstrating that it is not causally related to the color polymorphism in the Midas cichlid. PMID:20449580

  8. Comparative Bacterial Proteomics: Analysis of the Core Genome Concept

    SciTech Connect

    Callister, Stephen J.; McCue, Lee Ann; Turse, Josh E.; Monroe, Matthew E.; Auberry, Kenneth J.; Smith, Richard D.; Adkins, Joshua N.; Lipton, Mary S.

    2008-02-06

    Comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry. Experimental validation of the existence of this core genome requires extensive measurement and is not typically undertaken. Enabled by an extensive proteome database development over a six year period, we experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. While genomic studies establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits.

  9. Comparative Reannotation of 21 Aspergillus Genomes

    SciTech Connect

    Salamov, Asaf; Riley, Robert; Kuo, Alan; Grigoriev, Igor

    2013-03-08

    We used comparative gene modeling to reannotate 21 Aspergillus genomes. Initial automatic annotation of individual genomes may contain some errors of different nature, e.g. missing genes, incorrect exon-intron structures, 'chimeras', which fuse 2 or more real genes or alternatively splitting some real genes into 2 or more models. The main premise behind the comparative modeling approach is that for closely related genomes most orthologous families have the same conserved gene structure. The algorithm maps all gene models predicted in each individual Aspergillus genome to the other genomes and, for each locus, selects from potentially many competing models, the one which most closely resembles the orthologous genes from other genomes. This procedure is iterated until no further change in gene models is observed. For Aspergillus genomes we predicted in total 4503 new gene models ( ~;;2percent per genome), supported by comparative analysis, additionally correcting ~;;18percent of old gene models. This resulted in a total of 4065 more genes with annotated PFAM domains (~;;3percent increase per genome). Analysis of a few genomes with EST/transcriptomics data shows that the new annotation sets also have a higher number of EST-supported splice sites at exon-intron boundaries.

  10. Comparative functional genomics revealed conservation and diversification of three enhancers of the isl1 gene for motor and sensory neuron-specific expression.

    PubMed

    Uemura, Osamu; Okada, Yohei; Ando, Hideki; Guedj, Mickael; Higashijima, Shin-Ichi; Shimazaki, Takuya; Chino, Naoichi; Okano, Hideyuki; Okamoto, Hitoshi

    2005-02-15

    Islet-1 (Isl1) is a member of the Isl1 family of LIM-homeodomain transcription factors (LIM-HD) that is expressed in a defined subset of motor and sensory neurons during vertebrate embryogenesis. To investigate how this specific expression of isl1 is regulated, we searched for enhancers of the isl1 gene that are conserved in vertebrate evolution. Initially, two enhancer elements, CREST1 and CREST2, were identified downstream of the isl1 locus in the genomes of fugu, chick, mouse, and human by BLAST searching for highly similar elements to those originally identified as motor and sensory neuron-specific enhancers in the zebrafish genome. The combined action of these elements is sufficient for completely recapitulating the subtype-specific expression of the isl1 gene in motor neurons of the mouse spinal cord. Furthermore, by direct comparison of the upstream flanking regions of the zebrafish and human isl1 genes, we identified another highly conserved noncoding element, CREST3, and subsequently C3R, a similar element to CREST3 with two CDP CR1 recognition motifs, in the upstream regions of all other isl1 family members. In mouse and human, CRESTs are located as far as more than 300 kb away from the isl1 locus, while they are much closer to the isl1 locus in zebrafish. Although all of zebrafish CREST2, CREST3, and C3R activate gene expression in the sensory neurons of zebrafish, CREST2 of mouse and human does not have the sequence necessary for sensory neuron-specific expression. Our results revealed both a remarkable conservation of the regulatory elements regulating subtype-specific gene expression in motor and sensory neurons and the dynamic process of reorganization of these elements whereby each element increases the level of cell-type specificity by losing redundant functions with the other elements during vertebrate evolution. PMID:15680372

  11. Gramene 2013: Comparative plant genomics resources

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Gramene (http://www.gramene.org) is a curated online resource for comparative functional genomics in crops and model plant species, currently hosting 27 fully and 10 partially sequenced reference genomes in its build number 38. Its strength derives from the application of a phylogenetic framework fo...

  12. Gramene: a growing plant comparative genomics resource

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Gramene (www.gramene.org) is a curated genetic, genomic and comparative genome analysis resource for the major crop species, such as rice, maize, wheat and many other plant (mainly grass) species. Gramene is an open-source project, with all data and software freely downloadable through the ftp site ...

  13. Gramene 2016: comparative plant genomics and pathway resources

    PubMed Central

    Tello-Ruiz, Marcela K.; Stein, Joshua; Wei, Sharon; Preece, Justin; Olson, Andrew; Naithani, Sushma; Amarasinghe, Vindhya; Dharmawardhana, Palitha; Jiao, Yinping; Mulvaney, Joseph; Kumari, Sunita; Chougule, Kapeel; Elser, Justin; Wang, Bo; Thomason, James; Bolser, Daniel M.; Kerhornou, Arnaud; Walts, Brandon; Fonseca, Nuno A.; Huerta, Laura; Keays, Maria; Tang, Y. Amy; Parkinson, Helen; Fabregat, Antonio; McKay, Sheldon; Weiser, Joel; D'Eustachio, Peter; Stein, Lincoln; Petryszak, Robert; Kersey, Paul J.; Jaiswal, Pankaj; Ware, Doreen

    2016-01-01

    Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the database website adopted a new Drupal management platform. The genomes section features 39 fully assembled reference genomes that are integrated using ontology-based annotation and comparative analyses, and accessed through both visual and programmatic interfaces. Additional community data, such as genetic variation, expression and methylation, are also mapped for a subset of genomes. The Plant Reactome pathway portal (http://plantreactome.gramene.org) provides a reference resource for analyzing plant metabolic and regulatory pathways. In addition to ∼200 curated rice reference pathways, the portal hosts gene homology-based pathway projections for 33 plant species. Both the genome and pathway browsers interface with the EMBL-EBI's Expression Atlas to enable the projection of baseline and differential expression data from curated expression studies in plants. Gramene's archive website (http://archive.gramene.org) continues to provide previously reported resources on comparative maps, markers and QTL. To further aid our users, we have also introduced a live monthly educational webinar series and a Gramene YouTube channel carrying video tutorials. PMID:26553803

  14. Gramene 2016: comparative plant genomics and pathway resources.

    PubMed

    Tello-Ruiz, Marcela K; Stein, Joshua; Wei, Sharon; Preece, Justin; Olson, Andrew; Naithani, Sushma; Amarasinghe, Vindhya; Dharmawardhana, Palitha; Jiao, Yinping; Mulvaney, Joseph; Kumari, Sunita; Chougule, Kapeel; Elser, Justin; Wang, Bo; Thomason, James; Bolser, Daniel M; Kerhornou, Arnaud; Walts, Brandon; Fonseca, Nuno A; Huerta, Laura; Keays, Maria; Tang, Y Amy; Parkinson, Helen; Fabregat, Antonio; McKay, Sheldon; Weiser, Joel; D'Eustachio, Peter; Stein, Lincoln; Petryszak, Robert; Kersey, Paul J; Jaiswal, Pankaj; Ware, Doreen

    2016-01-01

    Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the database website adopted a new Drupal management platform. The genomes section features 39 fully assembled reference genomes that are integrated using ontology-based annotation and comparative analyses, and accessed through both visual and programmatic interfaces. Additional community data, such as genetic variation, expression and methylation, are also mapped for a subset of genomes. The Plant Reactome pathway portal (http://plantreactome.gramene.org) provides a reference resource for analyzing plant metabolic and regulatory pathways. In addition to ∼ 200 curated rice reference pathways, the portal hosts gene homology-based pathway projections for 33 plant species. Both the genome and pathway browsers interface with the EMBL-EBI's Expression Atlas to enable the projection of baseline and differential expression data from curated expression studies in plants. Gramene's archive website (http://archive.gramene.org) continues to provide previously reported resources on comparative maps, markers and QTL. To further aid our users, we have also introduced a live monthly educational webinar series and a Gramene YouTube channel carrying video tutorials. PMID:26553803

  15. Orthology for comparative genomics in the mouse genome database.

    PubMed

    Dolan, Mary E; Baldarelli, Richard M; Bello, Susan M; Ni, Li; McAndrews, Monica S; Bult, Carol J; Kadin, James A; Richardson, Joel E; Ringwald, Martin; Eppig, Janan T; Blake, Judith A

    2015-08-01

    The mouse genome database (MGD) is the model organism database component of the mouse genome informatics system at The Jackson Laboratory. MGD is the international data resource for the laboratory mouse and facilitates the use of mice in the study of human health and disease. Since its beginnings, MGD has included comparative genomics data with a particular focus on human-mouse orthology, an essential component of the use of mouse as a model organism. Over the past 25 years, novel algorithms and addition of orthologs from other model organisms have enriched comparative genomics in MGD data, extending the use of orthology data to support the laboratory mouse as a model of human biology. Here, we describe current comparative data in MGD and review the history and refinement of orthology representation in this resource. PMID:26223881

  16. [Comparative genomic classification of human hepatocellular carcinoma].

    PubMed

    Kaposi-Novák, Pál

    2009-03-01

    Global transcriptome analysis has been successfully applied to characterize various human tumors, including hepatocellular carcinomas. This novel technology can facilitate early diagnosis, as well as prognostic and therapeutic diversification of cancer patients. To enhance access to the genomic information buried in archived pathology samples, we assessed RT-PCR amplification rates in paraffin-embedded tissues preserved in three different fixatives. Reliable amplification could be achieved from all paraffin-embedded specimens, when the amplicon size did not exceed 225 bp. A longer amplicon size resulted in rapid decrease of yield and reproducibility. In addition, formalin provided superior morphology and better reactivity with claudin-4 and -7 immunohistochemistry. Amplification of the initial sample is often required before transcriptome analysis of clinical specimens could be performed. We introduced a random nonamer primed T3 polymerase reaction into the conventional linear RNA amplification protocol. The modified T3T7 method generated a sense strand product ideal for synthesizing indirectly labeled cDNA templates. Microarray analysis of amplified frozen and laser-microdissected Myc and Myc/TGFalpha mouse liver tumors confirmed good reproducibility (r=0.9) of the reaction and conservation of original transcriptional patterns (r=0.78). Finally, we tested the utility of expression profiling for the classification of human HCC samples. By comparing expression data from HGF-treated c-Met conditional knock-out and control primary mouse hepatocytes, we identified 690 HGF/c-Met target genes. Functional analysis of the significant gene set implicated c-Met as key regulator of hepatocyte motility and oxidative homeostasis. Cross comparison of the c-Met-induced transcription signature with human HCC expression profiles revealed a group of tumors (27%) with potentially activated c-Met signaling (MET+). These tumors were characterized by higher vascular invasion rate

  17. Comparative genomics of BCG vaccines.

    PubMed

    Behr, M A

    2001-01-01

    Bacille Calmette-Guérin (BCG) vaccines have been given to more people than any other vaccine. They have also probably resulted in as much controversy as any other vaccine. In clinical trials, the efficacy of BCG vaccination against pulmonary TB has been widely variable. At the same time, a number of investigators have observed phenotypic differences between BCG daughter strains, raising the possibility that differences between BCG products may in some way translate into different outcomes. With recent genomic analysis of BCG strains, it has become possible to piece together the molecular events that have resulted in current BCG vaccines. Between the derivation of BCG in 1921 and the lyophilization of BCG Pasteur 1173 in 1961, there have been at least seven genetic events, including deletions, duplications and a single nucleotide polymorphism. The phenotypic relevance of these changes in BCG vaccines remains to be explored. PMID:11463238

  18. Linking the genomes of nonmodel teleosts through comparative genomics.

    PubMed

    Sarropoulou, E; Nousdili, D; Magoulas, A; Kotoulas, G

    2008-01-01

    Recently the genomes of two more teleost species have been released: the medaka (Oryzias latipes), and the three-spined stickleback (Gasterosteus aculateus). The rapid developments in genomics of fish species paved the way to new and valuable research in comparative genetics and genomics. With the accumulation of information in model species, the genetic and genomic characterization of nonmodel, but economically important species, is now feasible. Furthermore, comparison of low coverage gene maps of aquacultured fish species against fully sequenced fish species will enhance the efficiency of candidate genes identification projected for quantitative trait loci (QTL) scans for traits of commercial interest. This study shows the syntenic relationship between the genomes of six different teleost species, including three fully sequenced model species: Tetraodon nigroviridis, Oryzias latipes, Gasterosteus aculateus, and three marine species of commercial and evolutionary interest: Sparus aurata, Dicentrarchus labrax, Oreochromis spp. All three commercial fish species belong to the order Perciformes, which is the richest in number of species (approximately 10,000) but poor in terms of available genomic information and tools. Syntenic relationships were established by using 800 EST and microsatellites sequences successfully mapped on the RH map of seabream. Comparison to the stickleback genome produced most positive BLAT hits (58%) followed by medaka (32%) and Tetraodon (30%). Thus, stickleback was used as the major stepping stone to compare seabass and tilapia to seabream. In addition to the significance for the aquaculture industry, this approach can encompass important ecological and evolutionary implications. PMID:18297360

  19. Homology-independent metrics for comparative genomics.

    PubMed

    Coutinho, Tarcisio José Domingos; Franco, Glória Regina; Lobo, Francisco Pereira

    2015-01-01

    A mainstream procedure to analyze the wealth of genomic data available nowadays is the detection of homologous regions shared across genomes, followed by the extraction of biological information from the patterns of conservation and variation observed in such regions. Although of pivotal importance, comparative genomic procedures that rely on homology inference are obviously not applicable if no homologous regions are detectable. This fact excludes a considerable portion of "genomic dark matter" with no significant similarity - and, consequently, no inferred homology to any other known sequence - from several downstream comparative genomic methods. In this review we compile several sequence metrics that do not rely on homology inference and can be used to compare nucleotide sequences and extract biologically meaningful information from them. These metrics comprise several compositional parameters calculated from sequence data alone, such as GC content, dinucleotide odds ratio, and several codon bias metrics. They also share other interesting properties, such as pervasiveness (patterns persist on smaller scales) and phylogenetic signal. We also cite examples where these homology-independent metrics have been successfully applied to support several bioinformatics challenges, such as taxonomic classification of biological sequences without homology inference. They where also used to detect higher-order patterns of interactions in biological systems, ranging from detecting coevolutionary trends between the genomes of viruses and their hosts to characterization of gene pools of entire microbial communities. We argue that, if correctly understood and applied, homology-independent metrics can add important layers of biological information in comparative genomic studies without prior homology inference. PMID:26029354

  20. Comparative Genomics in Identifying Aflatoxin Biosynthetic Genes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Aspergillus flavus produces the most toxic and the most carcinogenic mycotoxins, aflatoxin B1 and B2. In order to solve aflatoxin contamination of food commodities, A. flavus genomics tools for identification of genes involved in aflatoxin biosynthesis have been employed. A. flavus Expressed Seque...

  1. Comparative Genomic Analysis of N2-Fixing and Non-N2-Fixing Paenibacillus spp.: Organization, Evolution and Expression of the Nitrogen Fixation Genes

    PubMed Central

    Xie, Jian-Bo; Du, Zhenglin; Bai, Lanqing; Tian, Changfu; Zhang, Yunzhi; Xie, Jiu-Yan; Wang, Tianshu; Liu, Xiaomeng; Chen, Xi; Cheng, Qi; Chen, Sanfeng; Li, Jilun

    2014-01-01

    We provide here a comparative genome analysis of 31 strains within the genus Paenibacillus including 11 new genomic sequences of N2-fixing strains. The heterogeneity of the 31 genomes (15 N2-fixing and 16 non-N2-fixing Paenibacillus strains) was reflected in the large size of the shell genome, which makes up approximately 65.2% of the genes in pan genome. Large numbers of transposable elements might be related to the heterogeneity. We discovered that a minimal and compact nif cluster comprising nine genes nifB, nifH, nifD, nifK, nifE, nifN, nifX, hesA and nifV encoding Mo-nitrogenase is conserved in the 15 N2-fixing strains. The nif cluster is under control of a σ70-depedent promoter and possesses a GlnR/TnrA-binding site in the promoter. Suf system encoding [Fe–S] cluster is highly conserved in N2-fixing and non-N2-fixing strains. Furthermore, we demonstrate that the nif cluster enabled Escherichia coli JM109 to fix nitrogen. Phylogeny of the concatenated NifHDK sequences indicates that Paenibacillus and Frankia are sister groups. Phylogeny of the concatenated 275 single-copy core genes suggests that the ancestral Paenibacillus did not fix nitrogen. The N2-fixing Paenibacillus strains were generated by acquiring the nif cluster via horizontal gene transfer (HGT) from a source related to Frankia. During the history of evolution, the nif cluster was lost, producing some non-N2-fixing strains, and vnf encoding V-nitrogenase or anf encoding Fe-nitrogenase was acquired, causing further diversification of some strains. In addition, some N2-fixing strains have additional nif and nif-like genes which may result from gene duplications. The evolution of nitrogen fixation in Paenibacillus involves a mix of gain, loss, HGT and duplication of nif/anf/vnf genes. This study not only reveals the organization and distribution of nitrogen fixation genes in Paenibacillus, but also provides insight into the complex evolutionary history of nitrogen fixation. PMID:24651173

  2. Comparative genomic analysis of N2-fixing and non-N2-fixing Paenibacillus spp.: organization, evolution and expression of the nitrogen fixation genes.

    PubMed

    Xie, Jian-Bo; Du, Zhenglin; Bai, Lanqing; Tian, Changfu; Zhang, Yunzhi; Xie, Jiu-Yan; Wang, Tianshu; Liu, Xiaomeng; Chen, Xi; Cheng, Qi; Chen, Sanfeng; Li, Jilun

    2014-03-01

    We provide here a comparative genome analysis of 31 strains within the genus Paenibacillus including 11 new genomic sequences of N2-fixing strains. The heterogeneity of the 31 genomes (15 N2-fixing and 16 non-N2-fixing Paenibacillus strains) was reflected in the large size of the shell genome, which makes up approximately 65.2% of the genes in pan genome. Large numbers of transposable elements might be related to the heterogeneity. We discovered that a minimal and compact nif cluster comprising nine genes nifB, nifH, nifD, nifK, nifE, nifN, nifX, hesA and nifV encoding Mo-nitrogenase is conserved in the 15 N2-fixing strains. The nif cluster is under control of a σ(70)-depedent promoter and possesses a GlnR/TnrA-binding site in the promoter. Suf system encoding [Fe-S] cluster is highly conserved in N2-fixing and non-N2-fixing strains. Furthermore, we demonstrate that the nif cluster enabled Escherichia coli JM109 to fix nitrogen. Phylogeny of the concatenated NifHDK sequences indicates that Paenibacillus and Frankia are sister groups. Phylogeny of the concatenated 275 single-copy core genes suggests that the ancestral Paenibacillus did not fix nitrogen. The N2-fixing Paenibacillus strains were generated by acquiring the nif cluster via horizontal gene transfer (HGT) from a source related to Frankia. During the history of evolution, the nif cluster was lost, producing some non-N2-fixing strains, and vnf encoding V-nitrogenase or anf encoding Fe-nitrogenase was acquired, causing further diversification of some strains. In addition, some N2-fixing strains have additional nif and nif-like genes which may result from gene duplications. The evolution of nitrogen fixation in Paenibacillus involves a mix of gain, loss, HGT and duplication of nif/anf/vnf genes. This study not only reveals the organization and distribution of nitrogen fixation genes in Paenibacillus, but also provides insight into the complex evolutionary history of nitrogen fixation. PMID:24651173

  3. Phytozome System for Comparative Plant Genomics

    SciTech Connect

    2011-09-27

    Phytozome is a joint project of the Department of Energy's Joint Genome Institute and the UC Berkeley Center for Integrative Genomics to facilitate comparative genomic studies amongst green plants. Families of orthologous and paralogous genes that represent the modern descendents of ancestral gene sets are constructed at key phylogenetic nodes. These families allow easy access to clade specific orthology/paralogy relationships as well as clade specific genes and gene expansions. As of release 7.0, Phytozome provides access to twenty-five sequenced and annotated green plant genomes which have been clustered into gene families at eleven evolutionarily significant nodes., Where possible, each gene has been annotated with PFAM, KOG, KEGG, and PANTHER assignments, and publicly available annotations from RefSeq, UniProt, TAIR, JGI are lyper-linked and searchable.

  4. Phytozome System for Comparative Plant Genomics

    Energy Science and Technology Software Center (ESTSC)

    2011-09-27

    Phytozome is a joint project of the Department of Energy's Joint Genome Institute and the UC Berkeley Center for Integrative Genomics to facilitate comparative genomic studies amongst green plants. Families of orthologous and paralogous genes that represent the modern descendents of ancestral gene sets are constructed at key phylogenetic nodes. These families allow easy access to clade specific orthology/paralogy relationships as well as clade specific genes and gene expansions. As of release 7.0, Phytozome providesmore » access to twenty-five sequenced and annotated green plant genomes which have been clustered into gene families at eleven evolutionarily significant nodes., Where possible, each gene has been annotated with PFAM, KOG, KEGG, and PANTHER assignments, and publicly available annotations from RefSeq, UniProt, TAIR, JGI are lyper-linked and searchable.« less

  5. Homology-Independent Metrics for Comparative Genomics

    PubMed Central

    Coutinho, Tarcisio José Domingos; Franco, Glória Regina; Lobo, Francisco Pereira

    2015-01-01

    A mainstream procedure to analyze the wealth of genomic data available nowadays is the detection of homologous regions shared across genomes, followed by the extraction of biological information from the patterns of conservation and variation observed in such regions. Although of pivotal importance, comparative genomic procedures that rely on homology inference are obviously not applicable if no homologous regions are detectable. This fact excludes a considerable portion of “genomic dark matter” with no significant similarity — and, consequently, no inferred homology to any other known sequence — from several downstream comparative genomic methods. In this review we compile several sequence metrics that do not rely on homology inference and can be used to compare nucleotide sequences and extract biologically meaningful information from them. These metrics comprise several compositional parameters calculated from sequence data alone, such as GC content, dinucleotide odds ratio, and several codon bias metrics. They also share other interesting properties, such as pervasiveness (patterns persist on smaller scales) and phylogenetic signal. We also cite examples where these homology-independent metrics have been successfully applied to support several bioinformatics challenges, such as taxonomic classification of biological sequences without homology inference. They where also used to detect higher-order patterns of interactions in biological systems, ranging from detecting coevolutionary trends between the genomes of viruses and their hosts to characterization of gene pools of entire microbial communities. We argue that, if correctly understood and applied, homology-independent metrics can add important layers of biological information in comparative genomic studies without prior homology inference. PMID:26029354

  6. Sequencing and comparing whole mitochondrial genomes ofanimals

    SciTech Connect

    Boore, Jeffrey L.; Macey, J. Robert; Medina, Monica

    2005-04-22

    Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, that can be especially powerful. We describe here the protocols commonly used for physically isolating mtDNA, for amplifying these by PCR or RCA, for cloning,sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based on our experiences to date with determining and comparing complete mtDNA sequences.

  7. Comparative genomics of Cluster O mycobacteriophages.

    PubMed

    Cresawn, Steven G; Pope, Welkin H; Jacobs-Sera, Deborah; Bowman, Charles A; Russell, Daniel A; Dedrick, Rebekah M; Adair, Tamarah; Anders, Kirk R; Ball, Sarah; Bollivar, David; Breitenberger, Caroline; Burnett, Sandra H; Butela, Kristen; Byrnes, Deanna; Carzo, Sarah; Cornely, Kathleen A; Cross, Trevor; Daniels, Richard L; Dunbar, David; Findley, Ann M; Gissendanner, Chris R; Golebiewska, Urszula P; Hartzog, Grant A; Hatherill, J Robert; Hughes, Lee E; Jalloh, Chernoh S; De Los Santos, Carla; Ekanem, Kevin; Khambule, Sphindile L; King, Rodney A; King-Smith, Christina; Klyczek, Karen; Krukonis, Greg P; Laing, Christian; Lapin, Jonathan S; Lopez, A Javier; Mkhwanazi, Sipho M; Molloy, Sally D; Moran, Deborah; Munsamy, Vanisha; Pacey, Eddie; Plymale, Ruth; Poxleitner, Marianne; Reyna, Nathan; Schildbach, Joel F; Stukey, Joseph; Taylor, Sarah E; Ware, Vassie C; Wellmann, Amanda L; Westholm, Daniel; Wodarski, Donna; Zajko, Michelle; Zikalala, Thabiso S; Hendrix, Roger W; Hatfull, Graham F

    2015-01-01

    Mycobacteriophages--viruses of mycobacterial hosts--are genetically diverse but morphologically are all classified in the Caudovirales with double-stranded DNA and tails. We describe here a group of five closely related mycobacteriophages--Corndog, Catdawg, Dylan, Firecracker, and YungJamal--designated as Cluster O with long flexible tails but with unusual prolate capsids. Proteomic analysis of phage Corndog particles, Catdawg particles, and Corndog-infected cells confirms expression of half of the predicted gene products and indicates a non-canonical mechanism for translation of the Corndog tape measure protein. Bioinformatic analysis identifies 8-9 strongly predicted SigA promoters and all five Cluster O genomes contain more than 30 copies of a 17 bp repeat sequence with dyad symmetry located throughout the genomes. Comparison of the Cluster O phages provides insights into phage genome evolution including the processes of gene flux by horizontal genetic exchange. PMID:25742016

  8. Comparative Genomics of Cluster O Mycobacteriophages

    PubMed Central

    Cresawn, Steven G.; Pope, Welkin H.; Jacobs-Sera, Deborah; Bowman, Charles A.; Russell, Daniel A.; Dedrick, Rebekah M.; Adair, Tamarah; Anders, Kirk R.; Ball, Sarah; Bollivar, David; Breitenberger, Caroline; Burnett, Sandra H.; Butela, Kristen; Byrnes, Deanna; Carzo, Sarah; Cornely, Kathleen A.; Cross, Trevor; Daniels, Richard L.; Dunbar, David; Findley, Ann M.; Gissendanner, Chris R.; Golebiewska, Urszula P.; Hartzog, Grant A.; Hatherill, J. Robert; Hughes, Lee E.; Jalloh, Chernoh S.; De Los Santos, Carla; Ekanem, Kevin; Khambule, Sphindile L.; King, Rodney A.; King-Smith, Christina; Klyczek, Karen; Krukonis, Greg P.; Laing, Christian; Lapin, Jonathan S.; Lopez, A. Javier; Mkhwanazi, Sipho M.; Molloy, Sally D.; Moran, Deborah; Munsamy, Vanisha; Pacey, Eddie; Plymale, Ruth; Poxleitner, Marianne; Reyna, Nathan; Schildbach, Joel F.; Stukey, Joseph; Taylor, Sarah E.; Ware, Vassie C.; Wellmann, Amanda L.; Westholm, Daniel; Wodarski, Donna; Zajko, Michelle; Zikalala, Thabiso S.; Hendrix, Roger W.; Hatfull, Graham F.

    2015-01-01

    Mycobacteriophages – viruses of mycobacterial hosts – are genetically diverse but morphologically are all classified in the Caudovirales with double-stranded DNA and tails. We describe here a group of five closely related mycobacteriophages – Corndog, Catdawg, Dylan, Firecracker, and YungJamal – designated as Cluster O with long flexible tails but with unusual prolate capsids. Proteomic analysis of phage Corndog particles, Catdawg particles, and Corndog-infected cells confirms expression of half of the predicted gene products and indicates a non-canonical mechanism for translation of the Corndog tape measure protein. Bioinformatic analysis identifies 8–9 strongly predicted SigA promoters and all five Cluster O genomes contain more than 30 copies of a 17 bp repeat sequence with dyad symmetry located throughout the genomes. Comparison of the Cluster O phages provides insights into phage genome evolution including the processes of gene flux by horizontal genetic exchange. PMID:25742016

  9. Comparative genomic hybridization with single cells after whole genome amplification

    SciTech Connect

    Haddad, B.R.; Baldini, A.; Hughes, M.R.

    1994-09-01

    Conventional karyotype analysis is the ideal way to diagnose chromosomal imbalances. However it requires cell culture and chromosome preparation. There are instances where a very small number of cells are available for cytogenetic evaluation and chromosomes cannot be obtained. Comparative genomic hybridization (CGH) is a novel molecular cytogenetic technique that provides information about genetic imbalances affecting the genome. The power of this technique lies in its ability to detect genetic imbalances using total genomic DNA. We have previously demonstrated the feasibility of whole genome amplification from single cells for subsequent analysis of multiple genetic loci by PCR. In this present work, we combine whole genome amplification with CGH to detect chromosomal imbalances from small numbers of cells. Both cytogenetically normal and abnormal cells were individually picked by micromanipulation and subjected to whole genome amplification using random oligonucleotide primers. Amplified test and control DNA were differentially labeled by incorporation of digoxigenin or biotin, mixed together and hybridized to normal male metaphase spreads. Hybridization was detected with two fluorochromes, rhodamine-anti-digoxigenin and FITC -Avidin. Ratio of intensities of the two fluorochromes along the target chromosomes was analyzed using locally developed computer imaging software. Using the combination of whole genome amplification and CGH, we were able to detect different chromosomal aneuploidies from 30, 20, and 10 cells. It can also be applied to the analysis of fetal cells sorted from maternal circulation, or to tumor cells obtained from needle biopsies or from different body fluids and effusions. Finally, its successful application to single cells will have a great impact on preimplantation diagnosis.

  10. VISTA - computational tools for comparative genomics

    SciTech Connect

    Frazer, Kelly A.; Pachter, Lior; Poliakov, Alexander; Rubin,Edward M.; Dubchak, Inna

    2004-01-01

    Comparison of DNA sequences from different species is a fundamental method for identifying functional elements in genomes. Here we describe the VISTA family of tools created to assist biologists in carrying out this task. Our first VISTA server at http://www-gsd.lbl.gov/VISTA/ was launched in the summer of 2000 and was designed to align long genomic sequences and visualize these alignments with associated functional annotations. Currently the VISTA site includes multiple comparative genomics tools and provides users with rich capabilities to browse pre-computed whole-genome alignments of large vertebrate genomes and other groups of organisms with VISTA Browser, submit their own sequences of interest to several VISTA servers for various types of comparative analysis, and obtain detailed comparative analysis results for a set of cardiovascular genes. We illustrate capabilities of the VISTA site by the analysis of a 180 kilobase (kb) interval on human chromosome 5 that encodes for the kinesin family member3A (KIF3A) protein.

  11. Ebolavirus comparative genomics

    SciTech Connect

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; Uberbacher, Edward C.; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S.; Pedersen, Thomas D.; Ussery, David W.

    2015-07-14

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. We examine the dynamics of this genome, comparing more than one hundred currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus, and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP), and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. In conclusion, this information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies.

  12. Comparative genomics of Shiga toxin encoding bacteriophages

    PubMed Central

    2012-01-01

    Background Stx bacteriophages are responsible for driving the dissemination of Stx toxin genes (stx) across their bacterial host range. Lysogens carrying Stx phages can cause severe, life-threatening disease and Stx toxin is an integral virulence factor. The Stx-bacteriophage vB_EcoP-24B, commonly referred to as Ф24B, is capable of multiply infecting a single bacterial host cell at a high frequency, with secondary infection increasing the rate at which subsequent bacteriophage infections can occur. This is biologically unusual, therefore determining the genomic content and context of Ф24B compared to other lambdoid Stx phages is important to understanding the factors controlling this phenomenon and determining whether they occur in other Stx phages. Results The genome of the Stx2 encoding phage, Ф24B was sequenced and annotated. The genomic organisation and general features are similar to other sequenced Stx bacteriophages induced from Enterohaemorrhagic Escherichia coli (EHEC), however Ф24B possesses significant regions of heterogeneity, with implications for phage biology and behaviour. The Ф24B genome was compared to other sequenced Stx phages and the archetypal lambdoid phage, lambda, using the Circos genome comparison tool and a PCR-based multi-loci comparison system. Conclusions The data support the hypothesis that Stx phages are mosaic, and recombination events between the host, phages and their remnants within the same infected bacterial cell will continue to drive the evolution of Stx phage variants and the subsequent dissemination of shigatoxigenic potential. PMID:22799768

  13. The Rice Genome Knowledgebase (RGKbase): an annotation database for rice comparative genomics and evolutionary biology

    PubMed Central

    Wang, Dapeng; Xia, Yan; Li, Xinna; Hou, Lixia; Yu, Jun

    2013-01-01

    Over the past 10 years, genomes of cultivated rice cultivars and their wild counterparts have been sequenced although most efforts are focused on genome assembly and annotation of two major cultivated rice (Oryza sativa L.) subspecies, 93-11 (indica) and Nipponbare (japonica). To integrate information from genome assemblies and annotations for better analysis and application, we now introduce a comparative rice genome database, the Rice Genome Knowledgebase (RGKbase, http://rgkbase.big.ac.cn/RGKbase/). RGKbase is built to have three major components: (i) integrated data curation for rice genomics and molecular biology, which includes genome sequence assemblies, transcriptomic and epigenomic data, genetic variations, quantitative trait loci (QTLs) and the relevant literature; (ii) User-friendly viewers, such as Gbrowse, GeneBrowse and Circos, for genome annotations and evolutionary dynamics and (iii) Bioinformatic tools for compositional and synteny analyses, gene family classifications, gene ontology terms and pathways and gene co-expression networks. RGKbase current includes data from five rice cultivars and species: Nipponbare (japonica), 93-11 (indica), PA64s (indica), the African rice (Oryza glaberrima) and a wild rice species (Oryza brachyantha). We are also constantly introducing new datasets from variety of public efforts, such as two recent releases—sequence data from ∼1000 rice varieties, which are mapped into the reference genome, yielding ample high-quality single-nucleotide polymorphisms and insertions–deletions. PMID:23193278

  14. Decoding the molecular evolution of human cognition using comparative genomics

    PubMed Central

    Usui, Noriyoshi; Co, Marissa; Konopka, Genevieve

    2014-01-01

    Identification of genetic and molecular factors responsible for the specialized cognitive abilities of humans is expected to provide important insights into the mechanisms responsible for disorders of cognition such as autism, schizophrenia, and Alzheimer’s disease. Here, we discuss the use of comparative genomics for identifying salient genes and gene networks that may underlie cognition. We focus on the comparison of human and non-human primate brain gene expression and the utility of building gene co-expression networks for prioritizing hundreds of genes that differ in expression among the species queried. We also discuss the importance and methods for functional studies of individual genes identified. Together, this integration of comparative genomics with cellular and animal models should provide improved systems for developing effective therapeutics for disorders of cognition. PMID:25247723

  15. The Aedes aegypti genome: a comparative perspective.

    PubMed

    Waterhouse, R M; Wyder, S; Zdobnov, E M

    2008-02-01

    The sequencing of the second mosquito genome, Aedes aegypti, in addition to Anopheles gambiae, is a major milestone that will drive molecular-level and genome-wide high-throughput studies of not only these but also other mosquito vectors of human pathogens. Here we overview the ancestry of the mosquito genes, list the major expansions of gene families that may relate to species adaptation processes, as exemplified by CYP9 cytochrome P450 genes, and discuss the conservation of chromosomal gene arrangements among the two mosquitoes and fruit fly. Many more invertebrate genomes are expected to be sequenced in the near future, including additional vectors of human pathogens (see http://www.vectorbase.org), and further comparative analyses will become increasingly refined and informative, hopefully improving our understanding of the genetic basis of phenotypical differences among these species, their vectorial capacity, and ultimately leading to the development of novel disease control strategies. PMID:18237279

  16. A comparative gene expression database for invertebrates

    PubMed Central

    2011-01-01

    Background As whole genome and transcriptome sequencing gets cheaper and faster, a great number of 'exotic' animal models are emerging, rapidly adding valuable data to the ever-expanding Evo-Devo field. All these new organisms serve as a fantastic resource for the research community, but the sheer amount of data, some published, some not, makes detailed comparison of gene expression patterns very difficult to summarize - a problem sometimes even noticeable within a single lab. The need to merge existing data with new information in an organized manner that is publicly available to the research community is now more necessary than ever. Description In order to offer a homogenous way of storing and handling gene expression patterns from a variety of organisms, we have developed the first web-based comparative gene expression database for invertebrates that allows species-specific as well as cross-species gene expression comparisons. The database can be queried by gene name, developmental stage and/or expression domains. Conclusions This database provides a unique tool for the Evo-Devo research community that allows the retrieval, analysis and comparison of gene expression patterns within or among species. In addition, this database enables a quick identification of putative syn-expression groups that can be used to initiate, among other things, gene regulatory network (GRN) projects. PMID:21861937

  17. Comparative genomics of brain size evolution

    PubMed Central

    Enard, Wolfgang

    2014-01-01

    Which genetic changes took place during mammalian, primate and human evolution to build a larger brain? To answer this question, one has to correlate genetic changes with brain size changes across a phylogeny. Such a comparative genomics approach provides unique information to better understand brain evolution and brain development. However, its statistical power is limited for example due to the limited number of species, the presumably complex genetics of brain size evolution and the large search space of mammalian genomes. Hence, it is crucial to add functional information, for example by limiting the search space to genes and regulatory elements known to play a role in the relevant cell types during brain development. Similarly, it is crucial to experimentally follow up on hypotheses generated by such a comparative approach. Recent progress in understanding the molecular and cellular mechanisms of mammalian brain development, in genome sequencing and in genome editing, promises to make a close integration of evolutionary and experimental methods a fruitful approach to better understand the genetics of mammalian brain size evolution. PMID:24904382

  18. Comparative genomics tools applied to bioterrorism defence.

    PubMed

    Slezak, Tom; Kuczmarski, Tom; Ott, Linda; Torres, Clinton; Medeiros, Dan; Smith, Jason; Truitt, Brian; Mulakken, Nisha; Lam, Marisa; Vitalis, Elizabeth; Zemla, Adam; Zhou, Carol Ecale; Gardner, Shea

    2003-06-01

    Rapid advances in the genomic sequencing of bacteria and viruses over the past few years have made it possible to consider sequencing the genomes of all pathogens that affect humans and the crops and livestock upon which our lives depend. Recent events make it imperative that full genome sequencing be accomplished as soon as possible for pathogens that could be used as weapons of mass destruction or disruption. This sequence information must be exploited to provide rapid and accurate diagnostics to identify pathogens and distinguish them from harmless near-neighbours and hoaxes. The Chem-Bio Non-Proliferation (CBNP) programme of the US Department of Energy (DOE) began a large-scale effort of pathogen detection in early 2000 when it was announced that the DOE would be providing bio-security at the 2002 Winter Olympic Games in Salt Lake City, Utah. Our team at the Lawrence Livermore National Lab (LLNL) was given the task of developing reliable and validated assays for a number of the most likely bioterrorist agents. The short timeline led us to devise a novel system that utilised whole-genome comparison methods to rapidly focus on parts of the pathogen genomes that had a high probability of being unique. Assays developed with this approach have been validated by the Centers for Disease Control (CDC). They were used at the 2002 Winter Olympics, have entered the public health system, and have been in continual use for non-publicised aspects of homeland defence since autumn 2001. Assays have been developed for all major threat list agents for which adequate genomic sequence is available, as well as for other pathogens requested by various government agencies. Collaborations with comparative genomics algorithm developers have enabled our LLNL team to make major advances in pathogen detection, since many of the existing tools simply did not scale well enough to be of practical use for this application. It is hoped that a discussion of a real-life practical application of

  19. Genome-wide identification and comparative expression analysis reveal a rapid expansion and functional divergence of duplicated genes in the WRKY gene family of cabbage, Brassica oleracea var. capitata.

    PubMed

    Yao, Qiu-Yang; Xia, En-Hua; Liu, Fei-Hu; Gao, Li-Zhi

    2015-02-15

    WRKY transcription factors (TFs), one of the ten largest TF families in higher plants, play important roles in regulating plant development and resistance. To date, little is known about the WRKY TF family in Brassica oleracea. Recently, the completed genome sequence of cabbage (B. oleracea var. capitata) allows us to systematically analyze WRKY genes in this species. A total of 148 WRKY genes were characterized and classified into seven subgroups that belong to three major groups. Phylogenetic and synteny analyses revealed that the repertoire of cabbage WRKY genes was derived from a common ancestor shared with Arabidopsis thaliana. The B. oleracea WRKY genes were found to be preferentially retained after the whole-genome triplication (WGT) event in its recent ancestor, suggesting that the WGT event had largely contributed to a rapid expansion of the WRKY gene family in B. oleracea. The analysis of RNA-Seq data from various tissues (i.e., roots, stems, leaves, buds, flowers and siliques) revealed that most of the identified WRKY genes were positively expressed in cabbage, and a large portion of them exhibited patterns of differential and tissue-specific expression, demonstrating that these gene members might play essential roles in plant developmental processes. Comparative analysis of the expression level among duplicated genes showed that gene expression divergence was evidently presented among cabbage WRKY paralogs, indicating functional divergence of these duplicated WRKY genes. PMID:25481634

  20. DCODE.ORG Anthology of Comparative Genomic Tools

    SciTech Connect

    Loots, G G; Ovcharenko, I

    2005-01-11

    Comparative genomics provides the means to demarcate functional regions in anonymous DNA sequences. The successful application of this method to identifying novel genes is currently shifting to deciphering the noncoding encryption of gene regulation across genomes. To facilitate the use of comparative genomics to practical applications in genetics and genomics we have developed several analytical and visualization tools for the analysis of arbitrary sequences and whole genomes. These tools include two alignment tools: zPicture and Mulan; a phylogenetic shadowing tool: eShadow for identifying lineage- and species-specific functional elements; two evolutionary conserved transcription factor analysis tools: rVista and multiTF; a tool for extracting cis-regulatory modules governing the expression of co-regulated genes, CREME; and a dynamic portal to multiple vertebrate and invertebrate genome alignments, the ECR Browser. Here we briefly describe each one of these tools and provide specific examples on their practical applications. All the tools are publicly available at the http://www.dcode.org/ web site.

  1. COMPARISON OF COMPARATIVE GENOMIC HYBRIDIZATIONS TECHNOLOGIES ACROSS MICROARRAY PLATFORMS

    EPA Science Inventory

    Comparative Genomic Hybridization (CGH) measures DNA copy number differences between a reference genome and a test genome. The DNA samples are differentially labeled and hybridized to an immobilized substrate. In early CGH experiments, the DNA targets were hybridized to metaphase...

  2. Comparative genome analysis of Basidiomycete fungi

    SciTech Connect

    Riley, Robert; Salamov, Asaf; Henrissat, Bernard; Nagy, Laszlo; Brown, Daren; Held, Benjamin; Baker, Scott; Blanchette, Robert; Boussau, Bastien; Doty, Sharon L.; Fagnan, Kirsten; Floudas, Dimitris; Levasseur, Anthony; Manning, Gerard; Martin, Francis; Morin, Emmanuelle; Otillar, Robert; Pisabarro, Antonio; Walton, Jonathan; Wolfe, Ken; Hibbett, David; Grigoriev, Igor

    2013-08-07

    Fungi of the phylum Basidiomycota (basidiomycetes), make up some 37percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprotrophs including the majority of wood decaying and ectomycorrhizal species. To better understand the genetic diversity of this phylum we compared the genomes of 35 basidiomycetes including 6 newly sequenced genomes. These genomes span extremes of genome size, gene number, and repeat content. Analysis of core genes reveals that some 48percent of basidiomycete proteins are unique to the phylum with nearly half of those (22percent) found in only one organism. Correlations between lifestyle and certain gene families are evident. Phylogenetic patterns of plant biomass-degrading genes in Agaricomycotina suggest a continuum rather than a dichotomy between the white rot and brown rot modes of wood decay. Based on phylogenetically-informed PCA analysis of wood decay genes, we predict that that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has typical ligninolytic class II fungal peroxidases (PODs). This prediction is supported by growth assays in which both fungi exhibit wood decay with white rot-like characteristics. Based on this, we suggest that the white/brown rot dichotomy may be inadequate to describe the full range of wood decaying fungi. Analysis of the rate of discovery of proteins with no or few homologs suggests the value of continued sequencing of basidiomycete fungi.

  3. Comparative Analysis of Genome Sequences with VISTA

    DOE Data Explorer

    Dubchak, Inna

    VISTA is a comprehensive suite of programs and databases developed by and hosted at the Genomics Division of Lawrence Berkeley National Laboratory. They provide information and tools designed to facilitate comparative analysis of genomic sequences. Users have two ways to interact with the suite of applications at the VISTA portal. They can submit their own sequences and alignments for analysis (VISTA servers) or examine pre-computed whole-genome alignments of different species. A key menu option is the Enhancer Browser and Database at http://enhancer.lbl.gov/. The VISTA Enhancer Browser is a central resource for experimentally validated human noncoding fragments with gene enhancer activity as assessed in transgenic mice. Most of these noncoding elements were selected for testing based on their extreme conservation with other vertebrates. The results of this enhancer screen are provided through this publicly available website. The browser also features relevant results by external contributors and a large collection of additional genome-wide conserved noncoding elements which are candidate enhancer sequences. The LBL developers invite external groups to submit computational predictions of developmental enhancers. As of 10/19/2009 the database contains information on 1109 in vivo tested elements - 508 elements with enhancer activity.

  4. Comparative genomics of biotechnologically important yeasts.

    PubMed

    Riley, Robert; Haridas, Sajeet; Wolfe, Kenneth H; Lopes, Mariana R; Hittinger, Chris Todd; Göker, Markus; Salamov, Asaf A; Wisecaver, Jennifer H; Long, Tanya M; Calvey, Christopher H; Aerts, Andrea L; Barry, Kerrie W; Choi, Cindy; Clum, Alicia; Coughlan, Aisling Y; Deshpande, Shweta; Douglass, Alexander P; Hanson, Sara J; Klenk, Hans-Peter; LaButti, Kurt M; Lapidus, Alla; Lindquist, Erika A; Lipzen, Anna M; Meier-Kolthoff, Jan P; Ohm, Robin A; Otillar, Robert P; Pangilinan, Jasmyn L; Peng, Yi; Rokas, Antonis; Rosa, Carlos A; Scheuner, Carmen; Sibirny, Andriy A; Slot, Jason C; Stielow, J Benjamin; Sun, Hui; Kurtzman, Cletus P; Blackwell, Meredith; Grigoriev, Igor V; Jeffries, Thomas W

    2016-08-30

    Ascomycete yeasts are metabolically diverse, with great potential for biotechnology. Here, we report the comparative genome analysis of 29 taxonomically and biotechnologically important yeasts, including 16 newly sequenced. We identify a genetic code change, CUG-Ala, in Pachysolen tannophilus in the clade sister to the known CUG-Ser clade. Our well-resolved yeast phylogeny shows that some traits, such as methylotrophy, are restricted to single clades, whereas others, such as l-rhamnose utilization, have patchy phylogenetic distributions. Gene clusters, with variable organization and distribution, encode many pathways of interest. Genomics can predict some biochemical traits precisely, but the genomic basis of others, such as xylose utilization, remains unresolved. Our data also provide insight into early evolution of ascomycetes. We document the loss of H3K9me2/3 heterochromatin, the origin of ascomycete mating-type switching, and panascomycete synteny at the MAT locus. These data and analyses will facilitate the engineering of efficient biosynthetic and degradative pathways and gateways for genomic manipulation. PMID:27535936

  5. Comparative genomics and evolution of eukaryotic phospholipidbiosynthesis

    SciTech Connect

    Lykidis, Athanasios

    2006-12-01

    Phospholipid biosynthetic enzymes produce diverse molecular structures and are often present in multiple forms encoded by different genes. This work utilizes comparative genomics and phylogenetics for exploring the distribution, structure and evolution of phospholipid biosynthetic genes and pathways in 26 eukaryotic genomes. Although the basic structure of the pathways was formed early in eukaryotic evolution, the emerging picture indicates that individual enzyme families followed unique evolutionary courses. For example, choline and ethanolamine kinases and cytidylyltransferases emerged in ancestral eukaryotes, whereas, multiple forms of the corresponding phosphatidyltransferases evolved mainly in a lineage specific manner. Furthermore, several unicellular eukaryotes maintain bacterial-type enzymes and reactions for the synthesis of phosphatidylglycerol and cardiolipin. Also, base-exchange phosphatidylserine synthases are widespread and ancestral enzymes. The multiplicity of phospholipid biosynthetic enzymes has been largely generated by gene expansion in a lineage specific manner. Thus, these observations suggest that phospholipid biosynthesis has been an actively evolving system. Finally, comparative genomic analysis indicates the existence of novel phosphatidyltransferases and provides a candidate for the uncharacterized eukaryotic phosphatidylglycerol phosphate phosphatase.

  6. Image analysis in comparative genomic hybridization

    SciTech Connect

    Lundsteen, C.; Maahr, J.; Christensen, B.

    1995-01-01

    Comparative genomic hybridization (CGH) is a new technique by which genomic imbalances can be detected by combining in situ suppression hybridization of whole genomic DNA and image analysis. We have developed software for rapid, quantitative CGH image analysis by a modification and extension of the standard software used for routine karyotyping of G-banded metaphase spreads in the Magiscan chromosome analysis system. The DAPI-counterstained metaphase spread is karyotyped interactively. Corrections for image shifts between the DAPI, FITC, and TRITC images are done manually by moving the three images relative to each other. The fluorescence background is subtracted. A mean filter is applied to smooth the FITC and TRITC images before the fluorescence ratio between the individual FITC and TRITC-stained chromosomes is computed pixel by pixel inside the area of the chromosomes determined by the DAPI boundaries. Fluorescence intensity ratio profiles are generated, and peaks and valleys indicating possible gains and losses of test DNA are marked if they exceed ratios below 0.75 and above 1.25. By combining the analysis of several metaphase spreads, consistent findings of gains and losses in all or almost all spreads indicate chromosomal imbalance. Chromosomal imbalances are detected either by visual inspection of fluorescence ratio (FR) profiles or by a statistical approach that compares FR measurements of the individual case with measurements of normal chromosomes. The complete analysis of one metaphase can be carried out in approximately 10 minutes. 8 refs., 7 figs., 1 tab.

  7. Comparative genome map of human and cattle

    SciTech Connect

    Solinas-Toldo, S.; Fries, R.; Lengauer, C.

    1995-06-10

    Chromosomal homologies between individual human chromosomes and the bovine karyotype have been established by using a new approach termed Zoo-FISH. Labeled DNA libraries from flow-sorted human chromosomes were used as probes for fluorescence in situ hybridization on cattle chromosomes. All human DNA libraries, except the Y chromosome library, hybridized to one or more cattle chromosomes, identifying and delineating 50 segments of homology, most of them corresponding to the regions of homology as identified by the previous mapping of individual conserved loci. However, Zoo-FISH refines the comparative maps constructed by molecular gene mapping of individual loci by providing information on the boundaries of conserved regions in the absence of obvious cytogenetic homologies of human and bovine chromosomes. It allows study of karyotypic evolution and opens new avenues for genomic analysis by facilitating the extrapolation of results from the human genome initiative. 50 refs., 3 figs., 1 tab.

  8. Reduction and Expansion in Microsporidian Genome Evolution: New Insights from Comparative Genomics

    PubMed Central

    Heinz, Eva; Watson, Andrew K.; Foster, Peter G.; Sendra, Kacper M.; Heaps, Sarah E.; Hirt, Robert P.; Martin Embley, T.

    2013-01-01

    Microsporidia are an abundant group of obligate intracellular parasites of other eukaryotes, including immunocompromised humans, but the molecular basis of their intracellular lifestyle and pathobiology are poorly understood. New genomes from a taxonomically broad range of microsporidians, complemented by published expression data, provide an opportunity for comparative analyses to identify conserved and lineage-specific patterns of microsporidian genome evolution that have underpinned this success. In this study, we infer that a dramatic bottleneck in the last common microsporidian ancestor (LCMA) left a small conserved core of genes that was subsequently embellished by gene family expansion driven by gene acquisition in different lineages. Novel expressed protein families represent a substantial fraction of sequenced microsporidian genomes and are significantly enriched for signals consistent with secretion or membrane location. Further evidence of selection is inferred from the gain and reciprocal loss of functional domains between paralogous genes, for example, affecting transport proteins. Gene expansions among transporter families preferentially affect those that are located on the plasma membrane of model organisms, consistent with recruitment to plug conserved gaps in microsporidian biosynthesis and metabolism. Core microsporidian genes shared with other eukaryotes are enriched in orthologs that, in yeast, are highly expressed, highly connected, and often essential, consistent with strong negative selection against further reduction of the conserved gene set since the LCMA. Our study reveals that microsporidian genome evolution is a highly dynamic process that has balanced constraint, reductive evolution, and genome expansion during adaptation to an extraordinarily successful obligate intracellular lifestyle. PMID:24259309

  9. Comparative genome sequencing reveals genomic signature of extreme desiccation tolerance in the anhydrobiotic midge.

    PubMed

    Gusev, Oleg; Suetsugu, Yoshitaka; Cornette, Richard; Kawashima, Takeshi; Logacheva, Maria D; Kondrashov, Alexey S; Penin, Aleksey A; Hatanaka, Rie; Kikuta, Shingo; Shimura, Sachiko; Kanamori, Hiroyuki; Katayose, Yuichi; Matsumoto, Takashi; Shagimardanova, Elena; Alexeev, Dmitry; Govorun, Vadim; Wisecaver, Jennifer; Mikheyev, Alexander; Koyanagi, Ryo; Fujie, Manabu; Nishiyama, Tomoaki; Shigenobu, Shuji; Shibata, Tomoko F; Golygina, Veronika; Hasebe, Mitsuyasu; Okuda, Takashi; Satoh, Nori; Kikawada, Takahiro

    2014-01-01

    Anhydrobiosis represents an extreme example of tolerance adaptation to water loss, where an organism can survive in an ametabolic state until water returns. Here we report the first comparative analysis examining the genomic background of extreme desiccation tolerance, which is exclusively found in larvae of the only anhydrobiotic insect, Polypedilum vanderplanki. We compare the genomes of P. vanderplanki and a congeneric desiccation-sensitive midge P. nubifer. We determine that the genome of the anhydrobiotic species specifically contains clusters of multi-copy genes with products that act as molecular shields. In addition, the genome possesses several groups of genes with high similarity to known protective proteins. However, these genes are located in distinct paralogous clusters in the genome apart from the classical orthologues of the corresponding genes shared by both chironomids and other insects. The transcripts of these clustered paralogues contribute to a large majority of the mRNA pool in the desiccating larvae and most likely define successful anhydrobiosis. Comparison of expression patterns of orthologues between two chironomid species provides evidence for the existence of desiccation-specific gene expression systems in P. vanderplanki. PMID:25216354

  10. Regulation of cytochrome P450 expression in Drosophila: Genomic insights.

    PubMed

    Giraudo, Maeva; Unnithan, G Chandran; Le Goff, Gaëlle; Feyereisen, René

    2010-06-01

    Genomic tools such as the availability of the Drosophila genome sequence, the relative ease of stable transformation, and DNA microarrays have made the fruit fly a powerful model in insecticide toxicology research. We have used transgenic promoter-GFP constructs to document the detailed pattern of induced Cyp6a2 gene expression in larval and adult Drosophila tissues. We also compared various insecticides and xenobiotics for their ability to induce this cytochrome P450 gene, and show that the pattern of Cyp6a2 inducibility is comparable to that of vertebrate CYP2B genes, and different from that of vertebrate CYP1A genes, suggesting a degree of evolutionary conservation for the "phenobarbital-type" induction mechanism. Our results are compared to the increasingly diverse reports on P450 induction that can be gleaned from whole genome or from "detox" microarray experiments in Drosophila. These suggest that only a third of the genomic repertoire of CYP genes is inducible by xenobiotics, and that there are distinct subsets of inducers / induced genes, suggesting multiple xenobiotic transduction mechanisms. A relationship between induction and resistance is not supported by expression data from the literature. The relative abundance of expression data now available is in contrast to the paucity of studies on functional expression of P450 enzymes, and this remains a challenge for our understanding of the toxicokinetic aspects of insecticide action. PMID:20582327

  11. Regulation of cytochrome P450 expression in Drosophila: Genomic insights

    PubMed Central

    Giraudo, Maeva; Unnithan, G. Chandran; Le Goff, Gaëlle; Feyereisen, René

    2009-01-01

    Genomic tools such as the availability of the Drosophila genome sequence, the relative ease of stable transformation, and DNA microarrays have made the fruit fly a powerful model in insecticide toxicology research. We have used transgenic promoter-GFP constructs to document the detailed pattern of induced Cyp6a2 gene expression in larval and adult Drosophila tissues. We also compared various insecticides and xenobiotics for their ability to induce this cytochrome P450 gene, and show that the pattern of Cyp6a2 inducibility is comparable to that of vertebrate CYP2B genes, and different from that of vertebrate CYP1A genes, suggesting a degree of evolutionary conservation for the “phenobarbital-type” induction mechanism. Our results are compared to the increasingly diverse reports on P450 induction that can be gleaned from whole genome or from “detox” microarray experiments in Drosophila. These suggest that only a third of the genomic repertoire of CYP genes is inducible by xenobiotics, and that there are distinct subsets of inducers / induced genes, suggesting multiple xenobiotic transduction mechanisms. A relationship between induction and resistance is not supported by expression data from the literature. The relative abundance of expression data now available is in contrast to the paucity of studies on functional expression of P450 enzymes, and this remains a challenge for our understanding of the toxicokinetic aspects of insecticide action. PMID:20582327

  12. Quantitative analysis of comparative genomic hybridization

    SciTech Connect

    Manoir, S. du; Bentz, M.; Joos, S. |

    1995-01-01

    Comparative genomic hybridization (CGH) is a new molecular cytogenetic method for the detection of chromosomal imbalances. Following cohybridization of DNA prepared from a sample to be studied and control DNA to normal metaphase spreads, probes are detected via different fluorochromes. The ratio of the test and control fluorescence intensities along a chromosome reflects the relative copy number of segments of a chromosome in the test genome. Quantitative evaluation of CGH experiments is required for the determination of low copy changes, e.g., monosomy or trisomy, and for the definition of the breakpoints involved in unbalanced rearrangements. In this study, a program for quantitation of CGH preparations is presented. This program is based on the extraction of the fluorescence ratio profile along each chromosome, followed by averaging of individual profiles from several metaphase spreads. Objective parameters critical for quantitative evaluations were tested, and the criteria for selection of suitable CGH preparations are described. The granularity of the chromosome painting and the regional inhomogeneity of fluorescence intensities in metaphase spreads proved to be crucial parameters. The coefficient of variation of the ratio value for chromosomes in balanced state (CVBS) provides a general quality criterion for CGH experiments. Different cutoff levels (thresholds) of average fluorescence ratio values were compared for their specificity and sensitivity with regard to the detection of chromosomal imbalances. 27 refs., 15 figs., 1 tab.

  13. Comparative genomics of ten solanaceous plastomes.

    PubMed

    Kaur, Harpreet; Singh, Bhupinder Pal; Singh, Harpreet; Nagpal, Avinash Kaur

    2014-01-01

    Availability of complete plastid genomes of ten solanaceous species, Atropa belladonna, Capsicum annuum, Datura stramonium, Nicotiana sylvestris, Nicotiana tabacum, Nicotiana tomentosiformis, Nicotiana undulata, Solanum bulbocastanum, Solanum lycopersicum, and Solanum tuberosum provided us with an opportunity to conduct their in silico comparative analysis in depth. The size of complete chloroplast genomes and LSC and SSC regions of three species of Solanum is comparatively smaller than that of any other species studied till date (exception: SSC region of A. belladonna). AT content of coding regions was found to be less than noncoding regions. A duplicate copy of trnH gene in C. annuum and two alternative tRNA genes for proline in D. stramonium were observed for the first time in this analysis. Further, homology search revealed the presence of rps19 pseudogene and infA genes in A. belladonna and D. stramonium, a region identical to rps19 pseudogene in C. annum and orthologues of sprA gene in another six species. Among the eighteen intron-containing genes, 3 genes have two introns and 15 genes have one intron. The longest insertion was found in accD gene in C. annuum. Phylogenetic analysis using concatenated protein coding sequences gave two clades, one for Nicotiana species and another for Solanum, Capsicum, Atropa, and Datura. PMID:25477958

  14. Comparative Genomics of Ten Solanaceous Plastomes

    PubMed Central

    Kaur, Harpreet; Singh, Bhupinder Pal; Singh, Harpreet; Nagpal, Avinash Kaur

    2014-01-01

    Availability of complete plastid genomes of ten solanaceous species, Atropa belladonna, Capsicum annuum, Datura stramonium, Nicotiana sylvestris, Nicotiana tabacum, Nicotiana tomentosiformis, Nicotiana undulata, Solanum bulbocastanum, Solanum lycopersicum, and Solanum tuberosum provided us with an opportunity to conduct their in silico comparative analysis in depth. The size of complete chloroplast genomes and LSC and SSC regions of three species of Solanum is comparatively smaller than that of any other species studied till date (exception: SSC region of A. belladonna). AT content of coding regions was found to be less than noncoding regions. A duplicate copy of trnH gene in C. annuum and two alternative tRNA genes for proline in D. stramonium were observed for the first time in this analysis. Further, homology search revealed the presence of rps19 pseudogene and infA genes in A. belladonna and D. stramonium, a region identical to rps19 pseudogene in C. annum and orthologues of sprA gene in another six species. Among the eighteen intron-containing genes, 3 genes have two introns and 15 genes have one intron. The longest insertion was found in accD gene in C. annuum. Phylogenetic analysis using concatenated protein coding sequences gave two clades, one for Nicotiana species and another for Solanum, Capsicum, Atropa, and Datura. PMID:25477958

  15. Comparative Genome Analysis in the Integrated Microbial Genomes(IMG) System

    SciTech Connect

    Kyrpides, Nikos C.; Markowitz, Victor M.

    2006-03-01

    Comparative genome analysis is critical for the effectiveexploration of a rapidly growing number of complete and draft sequencesfor microbial genomes. The Integrated Microbial Genomes (IMG) system(img.jgi.doe.gov) has been developed as a community resource thatprovides support for comparative analysis of microbial genomes in anintegrated context. IMG allows users to navigate the multidimensionalmicrobial genome data space and focus their analysis on a subset ofgenes, genomes, and functions of interest. IMG provides graphicalviewers, summaries and occurrence profile tools for comparing genes,pathways and functions (terms) across specific genomes. Genes can befurther examined using gene neighborhoods and compared with sequencealignment tools.

  16. Retroviral enhancer detection insertions in zebrafish combined with comparative genomics reveal genomic regulatory blocks - a fundamental feature of vertebrate genomes

    PubMed Central

    Kikuta, Hiroshi; Fredman, David; Rinkwitz, Silke; Lenhard, Boris; Becker, Thomas S

    2007-01-01

    A large-scale enhancer detection screen was performed in the zebrafish using a retroviral vector carrying a basal promoter and a fluorescent protein reporter cassette. Analysis of insertional hotspots uncovered areas around developmental regulatory genes in which an insertion results in the same global expression pattern, irrespective of exact position. These areas coincide with vertebrate chromosomal segments containing identical gene order; a phenomenon known as conserved synteny and thought to be a vestige of evolution. Genomic comparative studies have found large numbers of highly conserved noncoding elements (HCNEs) spanning these and other loci. HCNEs are thought to act as transcriptional enhancers based on the finding that many of those that have been tested direct tissue specific expression in transient or transgenic assays. Although gene order in hox and other gene clusters has long been known to be conserved because of shared regulatory sequences or overlapping transcriptional units, the chromosomal areas found through insertional hotspots contain only one or a few developmental regulatory genes as well as phylogenetically unrelated genes. We have termed these regions genomic regulatory blocks (GRBs), and show that they underlie the phenomenon of conserved synteny through all sequenced vertebrate genomes. After teleost whole genome duplication, a subset of GRBs were retained in two copies, underwent degenerative changes compared with tetrapod loci that exist as single copy, and that therefore can be viewed as representing the ancestral form. We discuss these findings in light of evolution of vertebrate chromosomal architecture and the identification of human disease mutations. PMID:18047696

  17. Comparative genomics of mutualistic viruses of Glyptapanteles parasitic wasps

    PubMed Central

    Desjardins, Christopher A; Gundersen-Rindal, Dawn E; Hostetler, Jessica B; Tallon, Luke J; Fadrosh, Douglas W; Fuester, Roger W; Pedroni, Monica J; Haas, Brian J; Schatz, Michael C; Jones, Kristine M; Crabtree, Jonathan; Forberger, Heather; Nene, Vishvanath

    2008-01-01

    Background Polydnaviruses, double-stranded DNA viruses with segmented genomes, have evolved as obligate endosymbionts of parasitoid wasps. Virus particles are replication deficient and produced by female wasps from proviral sequences integrated into the wasp genome. These particles are co-injected with eggs into caterpillar hosts, where viral gene expression facilitates parasitoid survival and, thereby, survival of proviral DNA. Here we characterize and compare the encapsidated viral genome sequences of bracoviruses in the family Polydnaviridae associated with Glyptapanteles gypsy moth parasitoids, along with near complete proviral sequences from which both viral genomes are derived. Results The encapsidated Glyptapanteles indiensis and Glyptapanteles flavicoxis bracoviral genomes, each composed of 29 different size segments, total approximately 517 and 594 kbp, respectively. They are generated from a minimum of seven distinct loci in the wasp genome. Annotation of these sequences revealed numerous novel features for polydnaviruses, including insect-like sugar transporter genes and transposable elements. Evolutionary analyses suggest that positive selection is widespread among bracoviral genes. Conclusions The structure and organization of G. indiensis and G. flavicoxis bracovirus proviral segments as multiple loci containing one to many viral segments, flanked and separated by wasp gene-encoding DNA, is confirmed. Rapid evolution of bracovirus genes supports the hypothesis of bracovirus genes in an 'arms race' between bracovirus and caterpillar. Phylogenetic analyses of the bracoviral genes encoding sugar transporters provides the first robust evidence of a wasp origin for some polydnavirus genes. We hypothesize transposable elements, such as those described here, could facilitate transfer of genes between proviral segments and host DNA. PMID:19116010

  18. Comparative Genomic Indexing Reveals the Phylogenomics of Escherichia coli Pathogens

    PubMed Central

    Anjum, Muna F.; Lucchini, Sacha; Thompson, Arthur; Hinton, Jay C. D.; Woodward, Martin J.

    2003-01-01

    The Escherichia coli O26 serogroup includes important food-borne pathogens associated with human and animal diarrheal disease. Current typing methods have revealed great genetic heterogeneity within the O26 group; the data are often inconsistent and focus only on verotoxin (VT)-positive O26 isolates. To improve current understanding of diversity within this serogroup, the genomic relatedness of VT-positive and -negative O26 strains was assessed by comparative genomic indexing. Our results clearly demonstrate that irrespective of virulence characteristics and pathotype designation, the O26 strains show greater genomic similarity to each other than to any other strain included in this study. Our data suggest that enteropathogenic and VT-expressing E. coli O26 strains represent the same clonal lineage and that VT-expressing E. coli O26 strains have gained additional virulence characteristics. Using this approach, we established the core genes which are central to the E. coli species and identified regions of variation from the E. coli K-12 chromosomal backbone. PMID:12874348

  19. A New System for Comparative Functional Genomics of Saccharomyces Yeasts

    PubMed Central

    Caudy, Amy A.; Guan, Yuanfang; Jia, Yue; Hansen, Christina; DeSevo, Chris; Hayes, Alicia P.; Agee, Joy; Alvarez-Dominguez, Juan R.; Arellano, Hugo; Barrett, Daniel; Bauerle, Cynthia; Bisaria, Namita; Bradley, Patrick H.; Breunig, J. Scott; Bush, Erin; Cappel, David; Capra, Emily; Chen, Walter; Clore, John; Combs, Peter A.; Doucette, Christopher; Demuren, Olukunle; Fellowes, Peter; Freeman, Sam; Frenkel, Evgeni; Gadala-Maria, Daniel; Gawande, Richa; Glass, David; Grossberg, Samuel; Gupta, Anita; Hammonds-Odie, Latanya; Hoisos, Aaron; Hsi, Jenny; Hsu, Yu-Han Huang; Inukai, Sachi; Karczewski, Konrad J.; Ke, Xiaobo; Kojima, Mina; Leachman, Samuel; Lieber, Danny; Liebowitz, Anna; Liu, Julia; Liu, Yufei; Martin, Trevor; Mena, Jose; Mendoza, Rosa; Myhrvold, Cameron; Millian, Christian; Pfau, Sarah; Raj, Sandeep; Rich, Matt; Rokicki, Joe; Rounds, William; Salazar, Michael; Salesi, Matthew; Sharma, Rajani; Silverman, Sanford; Singer, Cara; Sinha, Sandhya; Staller, Max; Stern, Philip; Tang, Hanlin; Weeks, Sharon; Weidmann, Maxwell; Wolf, Ashley; Young, Carmen; Yuan, Jie; Crutchfield, Christopher; McClean, Megan; Murphy, Coleen T.; Llinás, Manuel; Botstein, David; Troyanskaya, Olga G.; Dunham, Maitreya J.

    2013-01-01

    Whole-genome sequencing, particularly in fungi, has progressed at a tremendous rate. More difficult, however, is experimental testing of the inferences about gene function that can be drawn from comparative sequence analysis alone. We present a genome-wide functional characterization of a sequenced but experimentally understudied budding yeast, Saccharomyces bayanus var. uvarum (henceforth referred to as S. bayanus), allowing us to map changes over the 20 million years that separate this organism from S. cerevisiae. We first created a suite of genetic tools to facilitate work in S. bayanus. Next, we measured the gene-expression response of S. bayanus to a diverse set of perturbations optimized using a computational approach to cover a diverse array of functionally relevant biological responses. The resulting data set reveals that gene-expression patterns are largely conserved, but significant changes may exist in regulatory networks such as carbohydrate utilization and meiosis. In addition to regulatory changes, our approach identified gene functions that have diverged. The functions of genes in core pathways are highly conserved, but we observed many changes in which genes are involved in osmotic stress, peroxisome biogenesis, and autophagy. A surprising number of genes specific to S. bayanus respond to oxidative stress, suggesting the organism may have evolved under different selection pressures than S. cerevisiae. This work expands the scope of genome-scale evolutionary studies from sequence-based analysis to rapid experimental characterization and could be adopted for functional mapping in any lineage of interest. Furthermore, our detailed characterization of S. bayanus provides a valuable resource for comparative functional genomics studies in yeast. PMID:23852385

  20. Comparative genomics reveals insights into avian genome evolution and adaptation

    PubMed Central

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M.; Lee, Chul; Storz, Jay F.; Antunes, Agostinho; Greenwold, Matthew J.; Meredith, Robert W.; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R.; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T.; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V.; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S.; Gatesy, John; Hoffmann, Federico G.; Opazo, Juan C.; Håstad, Olle; Sawyer, Roger H.; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W.; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F.; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A.; Green, Richard E.; O’Brien, Stephen J.; Griffin, Darren; Johnson, Warren E.; Haussler, David; Ryder, Oliver A.; Willerslev, Eske; Graves, Gary R.; Alström, Per; Fjeldså, Jon; Mindell, David P.; Edwards, Scott V.; Braun, Edward L.; Rahbek, Carsten; Burt, David W.; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D.; Gilbert, M. Thomas P.; Wang, Jun

    2015-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. PMID:25504712

  1. Comparative genomics reveals insights into avian genome evolution and adaptation.

    PubMed

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M; Lee, Chul; Storz, Jay F; Antunes, Agostinho; Greenwold, Matthew J; Meredith, Robert W; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S; Gatesy, John; Hoffmann, Federico G; Opazo, Juan C; Håstad, Olle; Sawyer, Roger H; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A; Green, Richard E; O'Brien, Stephen J; Griffin, Darren; Johnson, Warren E; Haussler, David; Ryder, Oliver A; Willerslev, Eske; Graves, Gary R; Alström, Per; Fjeldså, Jon; Mindell, David P; Edwards, Scott V; Braun, Edward L; Rahbek, Carsten; Burt, David W; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D; Gilbert, M Thomas P; Wang, Jun

    2014-12-12

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. PMID:25504712

  2. Comparative genomics of the lactic acid bacteria

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Lactic acid-producing bacteria are associated with various plant and animal niches and play a key role in the production of fermented foods and beverages. We report nine genome sequences representing the phylogenetic and functional diversity of these bacteria. The small genomes of lactic acid bacter...

  3. Floral gene resources from basal angiosperms for comparative genomics research

    PubMed Central

    Albert, Victor A; Soltis, Douglas E; Carlson, John E; Farmerie, William G; Wall, P Kerr; Ilut, Daniel C; Solow, Teri M; Mueller, Lukas A; Landherr, Lena L; Hu, Yi; Buzgo, Matyas; Kim, Sangtae; Yoo, Mi-Jeong; Frohlich, Michael W; Perl-Treves, Rafael; Schlarbaum, Scott E; Bliss, Barbara J; Zhang, Xiaohong; Tanksley, Steven D; Oppenheimer, David G; Soltis, Pamela S; Ma, Hong; dePamphilis, Claude W; Leebens-Mack, James H

    2005-01-01

    Background The Floral Genome Project was initiated to bridge the genomic gap between the most broadly studied plant model systems. Arabidopsis and rice, although now completely sequenced and under intensive comparative genomic investigation, are separated by at least 125 million years of evolutionary time, and cannot in isolation provide a comprehensive perspective on structural and functional aspects of flowering plant genome dynamics. Here we discuss new genomic resources available to the scientific community, comprising cDNA libraries and Expressed Sequence Tag (EST) sequences for a suite of phylogenetically basal angiosperms specifically selected to bridge the evolutionary gaps between model plants and provide insights into gene content and genome structure in the earliest flowering plants. Results Random sequencing of cDNAs from representatives of phylogenetically important eudicot, non-grass monocot, and gymnosperm lineages has so far (as of 12/1/04) generated 70,514 ESTs and 48,170 assembled unigenes. Efficient sorting of EST sequences into putative gene families based on whole Arabidopsis/rice proteome comparison has permitted ready identification of cDNA clones for finished sequencing. Preliminarily, (i) proportions of functional categories among sequenced floral genes seem representative of the entire Arabidopsis transcriptome, (ii) many known floral gene homologues have been captured, and (iii) phylogenetic analyses of ESTs are providing new insights into the process of gene family evolution in relation to the origin and diversification of the angiosperms. Conclusion Initial comparisons illustrate the utility of the EST data sets toward discovery of the basic floral transcriptome. These first findings also afford the opportunity to address a number of conspicuous evolutionary genomic questions, including reproductive organ transcriptome overlap between angiosperms and gymnosperms, genome-wide duplication history, lineage-specific gene duplication and

  4. Comparative Omics-Driven Genome Annotation Refinement: Application across Yersiniae

    SciTech Connect

    Rutledge, Alexandra C.; Jones, Marcus B.; Chauhan, Sadhana; Purvine, Samuel O.; Sanford, James; Monroe, Matthew E.; Brewer, Heather M.; Payne, Samuel H.; Ansong, Charles; Frank, Bryan C.; Smith, Richard D.; Peterson, Scott; Motin, Vladimir L.; Adkins, Joshua N.

    2012-03-27

    Genome sequencing continues to be a rapidly evolving technology, yet most downstream aspects of genome annotation pipelines remain relatively stable or are even being abandoned. To date, the perceived value of manual curation for genome annotations is not offset by the real cost and time associated with the process. In order to balance the large number of sequences generated, the annotation process is now performed almost exclusively in an automated fashion for most genome sequencing projects. One possible way to reduce errors inherent to automated computational annotations is to apply data from 'omics' measurements (i.e. transcriptional and proteomic) to the un-annotated genome with a proteogenomic-based approach. This approach does require additional experimental and bioinformatics methods to include omics technologies; however, the approach is readily automatable and can benefit from rapid developments occurring in those research domains as well. The annotation process can be improved by experimental validation of transcription and translation and aid in the discovery of annotation errors. Here the concept of annotation refinement has been extended to include a comparative assessment of genomes across closely related species, as is becoming common in sequencing efforts. Transcriptomic and proteomic data derived from three highly similar pathogenic Yersiniae (Y. pestis CO92, Y. pestis pestoides F, and Y. pseudotuberculosis PB1/+) was used to demonstrate a comprehensive comparative omic-based annotation methodology. Peptide and oligo measurements experimentally validated the expression of nearly 40% of each strain's predicted proteome and revealed the identification of 28 novel and 68 previously incorrect protein-coding sequences (e.g., observed frameshifts, extended start sites, and translated pseudogenes) within the three current Yersinia genome annotations. Gene loss is presumed to play a major role in Y. pestis acquiring its niche as a virulent pathogen, thus

  5. Comparative genomic hybridization in clinical cytogenetics

    SciTech Connect

    Bryndorf, T.; Kirchhoff, M.; Rose, H.

    1995-11-01

    We report the results of applying comparative genomic hybridization (CGH) in a cytogenetic service laboratory for (1) determination of the origin of extra and missing chromosomal material in intricate cases of unbalanced aberrations and (2) detection of common prenatal numerical chromosome aberrations. A total of 11 fetal samples were analyzed. Seven cases of complex unbalanced aberrations that could not be identified reliably by conventional cytogenetics were successfully resolved by CGH analysis. CGH results were validated by using FISH with chromosome-specific probes. Four cases representing common prenatal numerical aberrations (trisomy 21, 18, and 13 and monosomy X) were also successfully diagnosed by CGH. We conclude that CGH is a powerful adjunct to traditional cytogenetic techniques that makes it possible to solve clinical cases of intricate unbalanced aberrations in a single hybridization. CGH may also be a useful adjunct to screen for euchromatic involvement in marker chromosomes. Further technical development may render CGH applicable for routine aberration screening. 16 refs., 4 figs., 2 tabs.

  6. Comparative analysis of genomic signal processing for microarray data clustering.

    PubMed

    Istepanian, Robert S H; Sungoor, Ala; Nebel, Jean-Christophe

    2011-12-01

    Genomic signal processing is a new area of research that combines advanced digital signal processing methodologies for enhanced genetic data analysis. It has many promising applications in bioinformatics and next generation of healthcare systems, in particular, in the field of microarray data clustering. In this paper we present a comparative performance analysis of enhanced digital spectral analysis methods for robust clustering of gene expression across multiple microarray data samples. Three digital signal processing methods: linear predictive coding, wavelet decomposition, and fractal dimension are studied to provide a comparative evaluation of the clustering performance of these methods on several microarray datasets. The results of this study show that the fractal approach provides the best clustering accuracy compared to other digital signal processing and well known statistical methods. PMID:22157075

  7. Comparative genomic hybridization: Detection of segmental aneusomies

    SciTech Connect

    Cronin, J.E.; Magrane, G.G.; Gray, J.W.

    1994-09-01

    Comparative genomic hybridization (CGH) has been used successfully to detect whole chromosome and segmental aneusomies. However, its sensitivity for detection of segmental aneusomies is still not well known. We present here an analysis of CGH sensitivity with emphasis on detection of abnormalities commonly found during pre-and neo-natal diagnosis. CGH is performed by hybridizing green and red fluorescing test and normal DNA samples, respectively, to normal metaphase spreads and measuring green:red fluorescence ratios along all chromosomes. The ratios are normalized such that 2 copies of a normal chromosome region in the test sample gives a ratio of 1.0. Alterations in test vs. control gene copy number range from 1.5 [trisomy] to 0.5 [monosomy]. Clinical samples analyzed included Wolf Hirschhorn (4p-), Cri du Chat (5p-) and DiGeorge (22q-). In addition, 7 cell lines with chromosome 21 segmental aneusomies were analyzed. These included 3 with terminal duplications, 1 with a terminal deletion, 1 with an interstitial deletion and 2 with interstitial amplifications. The DiGeorge deletion was the only deletion not deleted by CGH. This is not surprising as standard G banding does not routinely detect this 1-2 megabase deletion. The 4p- and 5p- monosomies were detected and breakpoints correctly assigned prospectively. Proximal alterations involving 21q22.11 are unambiguously defined. Specifically, two interstitial aneusomies involving this region are detected. Studies involving late prophase chromosome normal spreads gave identical breakpoints. Thus, analysis of extended chromosomes did not improve the sensitivity of the technique. Taken together, these data suggest that CGH can detect segmental aneusomies greater than 8 megabases in extent. Smaller aneusomies can, at times, be detected. Work is now underway to modify the analysis software to increase sensitivity and to decrease the amount of material needed for analysis.

  8. Comparative and Functional Genomics in Identifying Aflatoxin Biosynthetic Genes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Identification of genes involved in aflatoxin biosynthesis through Aspergillus flavus genomics has been actively pursued. A. flavus Expressed Sequence Tags (EST’s) and whole genome sequencing have been completed. Groups of genes that are potentially involved in aflatoxin production have been profi...

  9. Initial sequencing and comparative analysis of the mouse genome

    SciTech Connect

    Waterston, Robert H.; Lindblad-Toh, Kerstin; Birney, Ewan; Rogers, Jane; Abril, Josep F.; Agarwal, Pankaj; Agarwala, Richa; Ainscough, Rachel; Alexandersson, Marina; An, Peter; Antonarakis, Stylianos E.; Attwood, John; Baertsch, Robert; Bailey, Jonathon; Barlow, Karen; Beck, Stephan; Berry, Eric; Birren, Bruce; Bloom, Toby; Bork, Peer; Botcherby, Marc; Bray, Nicolas; Brent, Michael R.; Brown, Daniel G.; Brown, Stephen D.; Bult, Carol; Burton, John; Butler, Jonathan; Campbell, Robert D.; Carninci, Piero; Cawley, Simon; Chiaromonte, Francesca; Chinwalla, Asif T.; Church, Deanna M.; Clamp, Michele; Clee, Christopher; Collins, Francis S.; Cook, Lisa L.; Copley, Richard R.; Coulson, Alan; Couronne, Olivier; Cuff, James; Curwen, Val; Cutts, Tim; Daly, Mark; David, Robert; Davies, Joy; Delehaunty, Kimberly D.; Deri, Justin; Dermitzakis, Emmanouil T.; Dewey, Colin; Dickens, Nicholas J.; Diekhans, Mark; Dodge, Sheila; Dubchak, Inna; Dunn, Diane M.; Eddy, Sean R.; Elnitski, Laura; Emes, Richard D.; Eswara, Pallavi; Eyras, Eduardo; Felsenfeld, Adam; Fewell, Ginger A.; Flicek, Paul; Foley, Karen; Frankel, Wayne N.; Fulton, Lucinda A.; Fulton, Robert S.; Furey, Terrence S.; Gage, Diane; Gibbs, Richard A.; Glusman, Gustavo; Gnerre, Sante; Goldman, Nick; Goodstadt, Leo; Grafham, Darren; Graves, Tina A.; Green, Eric D.; Gregory, Simon; Guigo, Roderic; Guyer, Mark; Hardison, Ross C.; Haussler, David; Hayashizaki, Yoshihide; Hillier, LaDeana W.; Hinrichs, Angela; Hlavina, Wratko; Holzer, Timothy; Hsu, Fan; Hua, Axin; Hubbard, Tim; Hunt, Adrienne; Jackson, Ian; Jaffe, David B.; Johnson, L. Steven; Jones, Matthew; Jones, Thomas A.; Joy, Ann; Kamal, Michael; Karlsson, Elinor K.; Karolchik, Donna; Kasprzyk, Arkadiusz; Kawai, Jun; Keibler, Evan; Kells, Cristyn; Kent, W. James; Kirby, Andrew; Kolbe, Diana L.; Korf, Ian; Kucherlapati, Raju S.; Kulbokas III, Edward J.; Kulp, David; Landers, Tom; Leger, J.P.; Leonard, Steven; Letunic, Ivica; Levine, Rosie; et al.

    2002-12-15

    The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

  10. Widespread expression of conserved small RNAs in small symbiont genomes

    PubMed Central

    Hansen, Allison K; Degnan, Patrick H

    2014-01-01

    Genome architecture of a microbe markedly changes when it transitions from a free-living lifestyle to an obligate symbiotic association within eukaryotic cells. These symbiont genomes experience numerous rearrangements and massive gene loss, which is expected to radically alter gene regulatory networks compared with those of free-living relatives. As such, it remains unclear whether and how these small symbiont genomes regulate gene expression. Here, using a label-free mass-spec quantification approach we found that differential protein regulation occurs in Buchnera, a model symbiont with a reduced genome, when it transitions between two distinct life stages. However, differential mRNA expression could not be detected between Buchnera life stages, despite the presence of a small number of putative transcriptional regulators. Instead a comparative analysis of small RNA expression profiles among five divergent Buchnera lineages, spanning a variety of Buchnera life stages, reveals 140 novel intergenic and antisense small RNAs and 517 untranslated regions that were significantly expressed, some of which have been conserved for ∼65 million years. In addition, the majority of these small RNAs exhibit both sequence covariation and thermodynamic stability, indicators of a potential structural RNA role. Together, these data suggest that gene regulation at the post-transcriptional level may be important in Buchnera. This is the first study to empirically identify Buchnera small RNAs, and we propose that these novel small RNAs may facilitate post-transcriptional regulation through translational inhibition/activation, and/or transcript stability. Ultimately, post-transcriptional regulation may shape metabolic complementation between Buchnera and its aphid host, thus impacting the animal's ecology and evolution. PMID:25012903

  11. Widespread expression of conserved small RNAs in small symbiont genomes.

    PubMed

    Hansen, Allison K; Degnan, Patrick H

    2014-12-01

    Genome architecture of a microbe markedly changes when it transitions from a free-living lifestyle to an obligate symbiotic association within eukaryotic cells. These symbiont genomes experience numerous rearrangements and massive gene loss, which is expected to radically alter gene regulatory networks compared with those of free-living relatives. As such, it remains unclear whether and how these small symbiont genomes regulate gene expression. Here, using a label-free mass-spec quantification approach we found that differential protein regulation occurs in Buchnera, a model symbiont with a reduced genome, when it transitions between two distinct life stages. However, differential mRNA expression could not be detected between Buchnera life stages, despite the presence of a small number of putative transcriptional regulators. Instead a comparative analysis of small RNA expression profiles among five divergent Buchnera lineages, spanning a variety of Buchnera life stages, reveals 140 novel intergenic and antisense small RNAs and 517 untranslated regions that were significantly expressed, some of which have been conserved for ∼65 million years. In addition, the majority of these small RNAs exhibit both sequence covariation and thermodynamic stability, indicators of a potential structural RNA role. Together, these data suggest that gene regulation at the post-transcriptional level may be important in Buchnera. This is the first study to empirically identify Buchnera small RNAs, and we propose that these novel small RNAs may facilitate post-transcriptional regulation through translational inhibition/activation, and/or transcript stability. Ultimately, post-transcriptional regulation may shape metabolic complementation between Buchnera and its aphid host, thus impacting the animal's ecology and evolution. PMID:25012903

  12. Comparative Genomic Analysis of Mannheimia haemolytica from Bovine Sources.

    PubMed

    Klima, Cassidy L; Cook, Shaun R; Zaheer, Rahat; Laing, Chad; Gannon, Vick P; Xu, Yong; Rasmussen, Jay; Potter, Andrew; Hendrick, Steve; Alexander, Trevor W; McAllister, Tim A

    2016-01-01

    Bovine respiratory disease is a common health problem in beef production. The primary bacterial agent involved, Mannheimia haemolytica, is a target for antimicrobial therapy and at risk for associated antimicrobial resistance development. The role of M. haemolytica in pathogenesis is linked to serotype with serotypes 1 (S1) and 6 (S6) isolated from pneumonic lesions and serotype 2 (S2) found in the upper respiratory tract of healthy animals. Here, we sequenced the genomes of 11 strains of M. haemolytica, representing all three serotypes and performed comparative genomics analysis to identify genetic features that may contribute to pathogenesis. Possible virulence associated genes were identified within 14 distinct prophage, including a periplasmic chaperone, a lipoprotein, peptidoglycan glycosyltransferase and a stress response protein. Prophage content ranged from 2-8 per genome, but was higher in S1 and S6 strains. A type I-C CRISPR-Cas system was identified in each strain with spacer diversity and organization conserved among serotypes. The majority of spacers occur in S1 and S6 strains and originate from phage suggesting that serotypes 1 and 6 may be more resistant to phage predation. However, two spacers complementary to the host chromosome targeting a UDP-N-acetylglucosamine 2-epimerase and a glycosyl transferases group 1 gene are present in S1 and S6 strains only indicating these serotypes may employ CRISPR-Cas to regulate gene expression to avoid host immune responses or enhance adhesion during infection. Integrative conjugative elements are present in nine of the eleven genomes. Three of these harbor extensive multi-drug resistance cassettes encoding resistance against the majority of drugs used to combat infection in beef cattle, including macrolides and tetracyclines used in human medicine. The findings here identify key features that are likely contributing to serotype related pathogenesis and specific targets for vaccine design intended to reduce the

  13. Comparative Genomic Analysis of Mannheimia haemolytica from Bovine Sources

    PubMed Central

    Klima, Cassidy L.; Cook, Shaun R.; Zaheer, Rahat; Laing, Chad; Gannon, Vick P.; Xu, Yong; Rasmussen, Jay; Potter, Andrew; Hendrick, Steve; Alexander, Trevor W.; McAllister, Tim A.

    2016-01-01

    Bovine respiratory disease is a common health problem in beef production. The primary bacterial agent involved, Mannheimia haemolytica, is a target for antimicrobial therapy and at risk for associated antimicrobial resistance development. The role of M. haemolytica in pathogenesis is linked to serotype with serotypes 1 (S1) and 6 (S6) isolated from pneumonic lesions and serotype 2 (S2) found in the upper respiratory tract of healthy animals. Here, we sequenced the genomes of 11 strains of M. haemolytica, representing all three serotypes and performed comparative genomics analysis to identify genetic features that may contribute to pathogenesis. Possible virulence associated genes were identified within 14 distinct prophage, including a periplasmic chaperone, a lipoprotein, peptidoglycan glycosyltransferase and a stress response protein. Prophage content ranged from 2–8 per genome, but was higher in S1 and S6 strains. A type I-C CRISPR-Cas system was identified in each strain with spacer diversity and organization conserved among serotypes. The majority of spacers occur in S1 and S6 strains and originate from phage suggesting that serotypes 1 and 6 may be more resistant to phage predation. However, two spacers complementary to the host chromosome targeting a UDP-N-acetylglucosamine 2-epimerase and a glycosyl transferases group 1 gene are present in S1 and S6 strains only indicating these serotypes may employ CRISPR-Cas to regulate gene expression to avoid host immune responses or enhance adhesion during infection. Integrative conjugative elements are present in nine of the eleven genomes. Three of these harbor extensive multi-drug resistance cassettes encoding resistance against the majority of drugs used to combat infection in beef cattle, including macrolides and tetracyclines used in human medicine. The findings here identify key features that are likely contributing to serotype related pathogenesis and specific targets for vaccine design intended to reduce the

  14. Comparative Genome Mapping of Sorghum and Maize

    PubMed Central

    Whitkus, R.; Doebley, J.; Lee, M.

    1992-01-01

    Linkage relationships were determined among 85 maize low copy number nuclear DNA probes and seven isozyme loci in an F(2) population derived from a cross of Sorghum bicolor ssp. bicolor X S. bicolor ssp. arundinaceum. Thirteen linkage groups were defined, three more than the 10 chromosomes of sorghum. Use of maize DNA probes to produce the sorghum linkage map allowed us to make several inferences concerning processes involved in the evolutionary divergence of the maize and sorghum genomes. The results show that many linkage groups are conserved between these two genomes and that the amount of recombination in these conserved linkage groups is roughly equivalent in maize and sorghum. Estimates of the proportions of duplicated loci suggest that a larger proportion of the loci are duplicated in the maize genome than in the sorghum genome. This result concurs with a prior estimate that the nuclear DNA content of maize is three to four times greater than that of sorghum. The pattern of conserved linkages between maize and sorghum is such that most sorghum linkage groups are composed of loci that map to two maize chromosomes. This pattern is consistent with the hypothesized ancient polyploid origin of maize and sorghum. There are nine cases in which locus order within shared linkage groups is inverted in sorghum relative to maize. These may have arisen from either inversions or intrachromosomal translocations. We found no evidence for large interchromosomal translocations. Overall, the data suggest that the primary processes involved in divergence of the maize and sorghum genomes were duplications (either by polyploidy or segmental duplication) and inversions or intrachromosomal translocations. PMID:1360933

  15. Analysis of the allohexaploid bread wheat genome (Triticum aestivum) using comparative whole genome shotgun sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The large 17 Gb allopolyploid genome of bread wheat is a major challenge for genome analysis because it is composed of three closely- related and independently maintained genomes, with genes dispersed as small “islands” separated by vast tracts of repetitive DNA. We used a novel comparative genomi...

  16. Comparative Genomic Analyses of Attenuated Strains of Mycoplasma gallisepticum▿ †

    PubMed Central

    Szczepanek, S. M.; Tulman, E. R.; Gorton, T. S.; Liao, X.; Lu, Z.; Zinski, J.; Aziz, F.; Frasca, S.; Kutish, G. F.; Geary, S. J.

    2010-01-01

    Mycoplasma gallisepticum is a significant respiratory and reproductive pathogen of domestic poultry. While the complete genomic sequence of the virulent, low-passage M. gallisepticum strain R (Rlow) has been reported, genomic determinants responsible for differences in virulence and host range remain to be completely identified. Here, we utilize genome sequencing and microarray-based comparative genomic data to identify these genomic determinants of virulence and to elucidate genomic variability among strains of M. gallisepticum. Analysis of the high-passage, attenuated derivative of Rlow, Rhigh, indicated that relatively few total genomic changes (64 loci) occurred, yet they are potentially responsible for the observed attenuation of this strain. In addition to previously characterized mutations in cytadherence-related proteins, changes included those in coding sequences of genes involved in sugar metabolism. Analyses of the genome of the M. gallisepticum vaccine strain F revealed numerous differences relative to strain R, including a highly divergent complement of vlhA surface lipoprotein genes, and at least 16 genes absent or significantly fragmented relative to strain R. Notably, an Rlow isogenic mutant in one of these genes (MGA_1107) caused significantly fewer severe tracheal lesions in the natural host compared to virulent M. gallisepticum Rlow. Comparative genomic hybridizations indicated few genetic loci commonly affected in F and vaccine strains ts-11 and 6/85, which would correlate with proteins affecting strain R virulence. Together, these data provide novel insights into inter- and intrastrain M. gallisepticum genomic variability and the genetic basis of M. gallisepticum virulence. PMID:20123709

  17. Comparative genomics of the lactic acid bacteria

    SciTech Connect

    Makarova, K.; Slesarev, A.; Wolf, Y.; Sorokin, A.; Mirkin, B.; Koonin, E.; Pavlov, A.; Pavlova, N.; Karamychev, V.; Polouchine, N.; Shakhova, V.; Grigoriev, I.; Lou, Y.; Rokhsar, D.; Lucas, S.; Huang, K.; Goodstein, D. M.; Hawkins, T.; Plengvidhya, V.; Welker, D.; Hughes, J.; Goh, Y.; Benson, A.; Baldwin, K.; Lee, J. -H.; Diaz-Muniz, I.; Dosti, B.; Smeianov, V; Wechter, W.; Barabote, R.; Lorca, G.; Altermann, E.; Barrangou, R.; Ganesan, B.; Xie, Y.; Rawsthorne, H.; Tamir, D.; Parker, C.; Breidt, F.; Broadbent, J.; Hutkins, R.; O'Sullivan, D.; Steele, J.; Unlu, G.; Saier, M.; Klaenhammer, T.; Richardson, P.; Kozyavkin, S.; Weimer, B.; Mills, D.

    2006-06-01

    Lactic acid-producing bacteria are associated with various plant and animal niches and play a key role in the production of fermented foods and beverages. We report nine genome sequences representing the phylogenetic and functional diversity of these bacteria. The small genomes of lactic acid bacteria encode a broad repertoire of transporters for efficient carbon and nitrogen acquisition from the nutritionally rich environments they inhabit and reflect a limited range of biosynthetic capabilities that indicate both prototrophic and auxotrophic strains. Phylogenetic analyses, comparison of gene content across the group, and reconstruction of ancestral gene sets indicate a combination of extensive gene loss and key gene acquisitions via horizontal gene transfer during the coevolution of lactic acid bacteria with their habitats.

  18. GenColors-based comparative genome databases for small eukaryotic genomes

    PubMed Central

    Felder, Marius; Romualdi, Alessandro; Petzold, Andreas; Platzer, Matthias; Sühnel, Jürgen; Glöckner, Gernot

    2013-01-01

    Many sequence data repositories can give a quick and easily accessible overview on genomes and their annotations. Less widespread is the possibility to compare related genomes with each other in a common database environment. We have previously described the GenColors database system (http://gencolors.fli-leibniz.de) and its applications to a number of bacterial genomes such as Borrelia, Legionella, Leptospira and Treponema. This system has an emphasis on genome comparison. It combines data from related genomes and provides the user with an extensive set of visualization and analysis tools. Eukaryote genomes are normally larger than prokaryote genomes and thus pose additional challenges for such a system. We have, therefore, adapted GenColors to also handle larger datasets of small eukaryotic genomes and to display eukaryotic gene structures. Further recent developments include whole genome views, genome list options and, for bacterial genome browsers, the display of horizontal gene transfer predictions. Two new GenColors-based databases for two fungal species (http://fgb.fli-leibniz.de) and for four social amoebas (http://sacgb.fli-leibniz.de) were set up. Both new resources open up a single entry point for related genomes for the amoebozoa and fungal research communities and other interested users. Comparative genomics approaches are greatly facilitated by these resources. PMID:23193285

  19. Identification of Genomic Alterations in Pancreatic Cancer Using Array-Based Comparative Genomic Hybridization

    PubMed Central

    Liang, Jian-Wei; Shi, Zhi-Zhou; Shen, Tian-Yun; Che, Xu; Wang, Zheng; Shi, Su-Sheng; Xu, Xin; Cai, Yan; Zhao, Ping; Wang, Cheng-Feng; Zhou, Zhi-Xiang; Wang, Ming-Rong

    2014-01-01

    Background Genomic aberration is a common feature of human cancers and also is one of the basic mechanisms that lead to overexpression of oncogenes and underexpression of tumor suppressor genes. Our study aims to identify frequent genomic changes in pancreatic cancer. Materials and Methods We used array comparative genomic hybridization (array CGH) to identify recurrent genomic alterations and validated the protein expression of selected genes by immunohistochemistry. Results Sixteen gains and thirty-two losses occurred in more than 30% and 60% of the tumors, respectively. High-level amplifications at 7q21.3–q22.1 and 19q13.2 and homozygous deletions at 1p33–p32.3, 1p22.1, 1q22, 3q27.2, 6p22.3, 6p21.31, 12q13.2, 17p13.2, 17q21.31 and 22q13.1 were identified. Especially, amplification of AKT2 was detected in two carcinomas and homozygous deletion of CDKN2C in other two cases. In 15 independent validation samples, we found that AKT2 (19q13.2) and MCM7 (7q22.1) were amplified in 6 and 9 cases, and CAMTA2 (17p13.2) and PFN1 (17p13.2) were homozygously deleted in 3 and 1 cases. AKT2 and MCM7 were overexpressed, and CAMTA2 and PFN1 were underexpressed in pancreatic cancer tissues than in morphologically normal operative margin tissues. Both GISTIC and Genomic Workbench software identified 22q13.1 containing APOBEC3A and APOBEC3B as the only homozygous deletion region. And the expression levels of APOBEC3A and APOBEC3B were significantly lower in tumor tissues than in morphologically normal operative margin tissues. Further validation showed that overexpression of PSCA was significantly associated with lymph node metastasis, and overexpression of HMGA2 was significantly associated with invasive depth of pancreatic cancer. Conclusion These recurrent genomic changes may be useful for revealing the mechanism of pancreatic carcinogenesis and providing candidate biomarkers. PMID:25502777

  20. Comparative genetics and genomics of nematodes: genome structure, development, and lifestyle.

    PubMed

    Sommer, Ralf J; Streit, Adrian

    2011-01-01

    Nematodes are found in virtually all habitats on earth. Many of them are parasites of plants and animals, including humans. The free-living nematode, Caenorhabditis elegans, is one of the genetically best-studied model organisms and was the first metazoan whose genome was fully sequenced. In recent years, the draft genome sequences of another six nematodes representing four of the five major clades of nematodes were published. Compared to mammalian genomes, all these genomes are very small. Nevertheless, they contain almost the same number of genes as the human genome. Nematodes are therefore a very attractive system for comparative genetic and genomic studies, with C. elegans as an excellent baseline. Here, we review the efforts that were made to extend genetic analysis to nematodes other than C. elegans, and we compare the seven available nematode genomes. One of the most striking findings is the unexpectedly high incidence of gene acquisition through horizontal gene transfer (HGT). PMID:21721943

  1. Prediction of microbial phenotypes based on comparative genomics

    PubMed Central

    2015-01-01

    The accessibility of almost complete genome sequences of uncultivable microbial species from metagenomes necessitates computational methods predicting microbial phenotypes solely based on genomic data. Here we investigate how comparative genomics can be utilized for the prediction of microbial phenotypes. The PICA framework facilitates application and comparison of different machine learning techniques for phenotypic trait prediction. We have improved and extended PICA's support vector machine plug-in and suggest its applicability to large-scale genome databases and incomplete genome sequences. We have demonstrated the stability of the predictive power for phenotypic traits, not perturbed by the rapid growth of genome databases. A new software tool facilitates the in-depth analysis of phenotype models, which associate expected and unexpected protein functions with particular traits. Most of the traits can be reliably predicted in only 60-70% complete genomes. We have established a new phenotypic model that predicts intracellular microorganisms. Thereby we could demonstrate that also independently evolved phenotypic traits, characterized by genome reduction, can be reliably predicted based on comparative genomics. Our results suggest that the extended PICA framework can be used to automatically annotate phenotypes in near-complete microbial genome sequences, as generated in large numbers in current metagenomics studies. PMID:26451672

  2. Ten years of bacterial genome sequencing: comparative-genomics-based discoveries.

    PubMed

    Binnewies, Tim T; Motro, Yair; Hallin, Peter F; Lund, Ole; Dunn, David; La, Tom; Hampson, David J; Bellgard, Matthew; Wassenaar, Trudy M; Ussery, David W

    2006-07-01

    It has been more than 10 years since the first bacterial genome sequence was published. Hundreds of bacterial genome sequences are now available for comparative genomics, and searching a given protein against more than a thousand genomes will soon be possible. The subject of this review will address a relatively straightforward question: "What have we learned from this vast amount of new genomic data?" Perhaps one of the most important lessons has been that genetic diversity, at the level of large-scale variation amongst even genomes of the same species, is far greater than was thought. The classical textbook view of evolution relying on the relatively slow accumulation of mutational events at the level of individual bases scattered throughout the genome has changed. One of the most obvious conclusions from examining the sequences from several hundred bacterial genomes is the enormous amount of diversity--even in different genomes from the same bacterial species. This diversity is generated by a variety of mechanisms, including mobile genetic elements and bacteriophages. An examination of the 20 Escherichia coli genomes sequenced so far dramatically illustrates this, with the genome size ranging from 4.6 to 5.5 Mbp; much of the variation appears to be of phage origin. This review also addresses mobile genetic elements, including pathogenicity islands and the structure of transposable elements. There are at least 20 different methods available to compare bacterial genomes. Metagenomics offers the chance to study genomic sequences found in ecosystems, including genomes of species that are difficult to culture. It has become clear that a genome sequence represents more than just a collection of gene sequences for an organism and that information concerning the environment and growth conditions for the organism are important for interpretation of the genomic data. The newly proposed Minimal Information about a Genome Sequence standard has been developed to obtain this

  3. Computational Methods for the Analysis of Array Comparative Genomic Hybridization

    PubMed Central

    Chari, Raj; Lockwood, William W.; Lam, Wan L.

    2006-01-01

    Array comparative genomic hybridization (array CGH) is a technique for assaying the copy number status of cancer genomes. The widespread use of this technology has lead to a rapid accumulation of high throughput data, which in turn has prompted the development of computational strategies for the analysis of array CGH data. Here we explain the principles behind array image processing, data visualization and genomic profile analysis, review currently available software packages, and raise considerations for future software development. PMID:17992253

  4. GenoSets: Visual Analytic Methods for Comparative Genomics

    PubMed Central

    Cain, Aurora A.; Kosara, Robert; Gibas, Cynthia J.

    2012-01-01

    Many important questions in biology are, fundamentally, comparative, and this extends to our analysis of a growing number of sequenced genomes. Existing genomic analysis tools are often organized around literal views of genomes as linear strings. Even when information is highly condensed, these views grow cumbersome as larger numbers of genomes are added. Data aggregation and summarization methods from the field of visual analytics can provide abstracted comparative views, suitable for sifting large multi-genome datasets to identify critical similarities and differences. We introduce a software system for visual analysis of comparative genomics data. The system automates the process of data integration, and provides the analysis platform to identify and explore features of interest within these large datasets. GenoSets borrows techniques from business intelligence and visual analytics to provide a rich interface of interactive visualizations supported by a multi-dimensional data warehouse. In GenoSets, visual analytic approaches are used to enable querying based on orthology, functional assignment, and taxonomic or user-defined groupings of genomes. GenoSets links this information together with coordinated, interactive visualizations for both detailed and high-level categorical analysis of summarized data. GenoSets has been designed to simplify the exploration of multiple genome datasets and to facilitate reasoning about genomic comparisons. Case examples are included showing the use of this system in the analysis of 12 Brucella genomes. GenoSets software and the case study dataset are freely available at http://genosets.uncc.edu. We demonstrate that the integration of genomic data using a coordinated multiple view approach can simplify the exploration of large comparative genomic data sets, and facilitate reasoning about comparisons and features of interest. PMID:23056299

  5. Comparative genomics of pectinacetylesterases: Insight on function and biology

    PubMed Central

    de Souza, Amancio José; Pauly, Markus

    2015-01-01

    Pectin acetylation influences the gelling ability of this important plant polysaccharide for the food industry. Plant apoplastic pectinacetylesterases (PAEs) play a key role in regulating the degree of pectin acetylation and modifying their expression thus represents one way to engineer plant polysaccharides for food applications. Identifying the major active enzymes within the PAE gene family will aid in our understanding of this biological phenomena as well as provide the tools for direct trait manipulation. Using comparative genomics we propose that there is a minimal set of 4 distinct PAEs in plants. Possible functional diversification of the PAE family in the grasses is also explored with the identification of 3 groups of PAE genes specific to grasses. PMID:26237162

  6. Comparative genomics of pectinacetylesterases: Insight on function and biology.

    PubMed

    de Souza, Amancio José; Pauly, Markus

    2015-01-01

    Pectin acetylation influences the gelling ability of this important plant polysaccharide for the food industry. Plant apoplastic pectinacetylesterases (PAEs) play a key role in regulating the degree of pectin acetylation and modifying their expression thus represents one way to engineer plant polysaccharides for food applications. Identifying the major active enzymes within the PAE gene family will aid in our understanding of this biological phenomena as well as provide the tools for direct trait manipulation. Using comparative genomics we propose that there is a minimal set of 4 distinct PAEs in plants. Possible functional diversification of the PAE family in the grasses is also explored with the identification of 3 groups of PAE genes specific to grasses. PMID:26237162

  7. Unclassified renal cell carcinoma: a clinicopathological, comparative genomic hybridization, and whole-genome exon sequencing study

    PubMed Central

    Hu, Zhen-Yan; Pang, Li-Juan; Qi, Yan; Kang, Xue-Ling; Hu, Jian-Ming; Wang, Lianghai; Liu, Kun-Peng; Ren, Yuan; Cui, Mei; Song, Li-Li; Li, Hong-An; Zou, Hong; Li, Feng

    2014-01-01

    Unclassified renal cell carcinoma (URCC) is a rare variant of RCC, accounting for only 3-5% of all cases. Studies on the molecular genetics of URCC are limited, and hence, we report on 2 cases of URCC analyzed using comparative genome hybridization (CGH) and the genome-wide human exon GeneChip technique to identify the genomic alterations of URCC. Both URCC patients (mean age, 72 years) presented at an advanced stage and died within 30 months post-surgery. Histologically, the URCCs were composed of undifferentiated, multinucleated, giant cells with eosinophilic cytoplasm. Immunostaining revealed that both URCC cases had strong p53 protein expression and partial expression of cluster of differentiation-10 and cytokeratin. The CGH profiles showed chromosomal imbalances in both URCC cases: gains were observed in chromosomes 1p11-12, 1q12-13, 2q20-23, 3q22-23, 8p12, and 16q11-15, whereas losses were detected on chromosomes 1q22-23, 3p12-22, 5p30-ter, 6p, 11q, 16q18-22, 17p12-14, and 20p. Compared with 18 normal renal tissues, 40 mutated genes were detected in the URCC tissues, including 32 missense and 8 silent mutations. Functional enrichment analysis revealed that the missense mutation genes were involved in 11 different biological processes and pathways, including cell cycle regulation, lipid localization and transport, neuropeptide signaling, organic ether metabolism, and ATP-binding cassette transporter signaling. Our findings indicate that URCC may be a highly aggressive cancer, and the genetic alterations identified herein may provide clues regarding the tumorigenesis of URCC and serve as a basis for the development of targeted therapies against URCC in the future. PMID:25120763

  8. The Effect of Human Genome Annotation Complexity on RNA-Seq Gene Expression Quantification

    PubMed Central

    Wu, Po-Yen; Phan, John H.; Wang, May D.

    2016-01-01

    Next-generation sequencing (NGS) has brought human genomic research to an unprecedented era. RNA-Seq is a branch of NGS that can be used to quantify gene expression and depends on accurate annotation of the human genome (i.e., the definition of genes and all of their variants or isoforms). Multiple annotations of the human genome exist with varying complexity. However, it is not clear how the choice of genome annotation influences RNA-Seq gene expression quantification. We assess the effect of different genome annotations in terms of (1) mapping quality, (2) quantification variation, (3) quantification accuracy (i.e., by comparing to qRT-PCR data), and (4) the concordance of detecting differentially expressed genes. External validation with qRT-PCR suggests that more complex genome annotations result in higher quantification variation.

  9. LKB1 preserves genome integrity by stimulating BRCA1 expression

    PubMed Central

    Gupta, Romi; Liu, Alex. Y.; Glazer, Peter M.; Wajapeyee, Narendra

    2015-01-01

    Serine/threonine kinase 11 (STK11, also known as LKB1) functions as a tumor suppressor in many human cancers. However, paradoxically loss of LKB1 in mouse embryonic fibroblast results in resistance to oncogene-induced transformation. Therefore, it is unclear why loss of LKB1 leads to increased predisposition to develop a wide variety of cancers. Here, we show that LKB1 protects cells from genotoxic stress. Cells lacking LKB1 display increased sensitivity to irradiation, accumulates more DNA double-strand breaks, display defective homology-directed DNA repair (HDR) and exhibit increased mutation rate, compared with that of LKB1-expressing cells. Conversely, the ectopic expression of LKB1 in cells lacking LKB1 protects them against genotoxic stress-induced DNA damage and prevents the accumulation of mutations. We find that LKB1 post-transcriptionally stimulates HDR gene BRCA1 expression by inhibiting the cytoplasmic localization of the RNA-binding protein, HU antigen R, in an AMP kinase-dependent manner and stabilizes BRCA1 mRNA. Cells lacking BRCA1 similar to the cell lacking LKB1 display increased genomic instability and ectopic expression of BRCA1 rescues LKB1 loss-induced sensitivity to genotoxic stress. Collectively, our results demonstrate that LKB1 is a crucial regulator of genome integrity and reveal a novel mechanism for LKB1-mediated tumor suppression with direct therapeutic implications for cancer prevention. PMID:25488815

  10. Comparative Genomics of an Emerging Amphibian Virus

    PubMed Central

    Epstein, Brendan; Storfer, Andrew

    2015-01-01

    Ranaviruses, a genus of the Iridoviridae, are large double-stranded DNA viruses that infect cold-blooded vertebrates worldwide. Ranaviruses have caused severe epizootics in commercial frog and fish populations, and are currently classified as notifiable pathogens in international trade. Previous work shows that a ranavirus that infects tiger salamanders throughout Western North America (Ambystoma tigrinum virus, or ATV) is in high prevalence among salamanders in the fishing bait trade. Bait ATV strains have elevated virulence and are transported long distances by humans, providing widespread opportunities for pathogen pollution. We sequenced the genomes of 15 strains of ATV collected from tiger salamanders across western North America and performed phylogenetic and population genomic analyses and tests for recombination. We find that ATV forms a monophyletic clade within the rest of the Ranaviruses and that it likely emerged within the last several thousand years, before human activities influenced its spread. We also identify several genes under strong positive selection, some of which appear to be involved in viral virulence and/or host immune evasion. In addition, we provide support for the pathogen pollution hypothesis with evidence of recombination among ATV strains, and potential bait-endemic strain recombination. PMID:26530419

  11. Comparative Genomics of an Emerging Amphibian Virus.

    PubMed

    Epstein, Brendan; Storfer, Andrew

    2016-01-01

    Ranaviruses, a genus of the Iridoviridae, are large double-stranded DNA viruses that infect cold-blooded vertebrates worldwide. Ranaviruses have caused severe epizootics in commercial frog and fish populations, and are currently classified as notifiable pathogens in international trade. Previous work shows that a ranavirus that infects tiger salamanders throughout Western North America (Ambystoma tigrinum virus, or ATV) is in high prevalence among salamanders in the fishing bait trade. Bait ATV strains have elevated virulence and are transported long distances by humans, providing widespread opportunities for pathogen pollution. We sequenced the genomes of 15 strains of ATV collected from tiger salamanders across western North America and performed phylogenetic and population genomic analyses and tests for recombination. We find that ATV forms a monophyletic clade within the rest of the Ranaviruses and that it likely emerged within the last several thousand years, before human activities influenced its spread. We also identify several genes under strong positive selection, some of which appear to be involved in viral virulence and/or host immune evasion. In addition, we provide support for the pathogen pollution hypothesis with evidence of recombination among ATV strains, and potential bait-endemic strain recombination. PMID:26530419

  12. Complete Genome Sequence and Comparative Genomics of a Novel Myxobacterium Myxococcus hansupus

    PubMed Central

    Sharma, Gaurav; Narwani, Tarun; Subramanian, Srikrishna

    2016-01-01

    Myxobacteria, a group of Gram-negative aerobes, belong to the class δ-proteobacteria and order Myxococcales. Unlike anaerobic δ-proteobacteria, they exhibit several unusual physiogenomic properties like gliding motility, desiccation-resistant myxospores and large genomes with high coding density. Here we report a 9.5 Mbp complete genome of Myxococcus hansupus that encodes 7,753 proteins. Phylogenomic and genome-genome distance based analysis suggest that Myxococcus hansupus is a novel member of the genus Myxococcus. Comparative genome analysis with other members of the genus Myxococcus was performed to explore their genome diversity. The variation in number of unique proteins observed across different species is suggestive of diversity at the genus level while the overrepresentation of several Pfam families indicates the extent and mode of genome expansion as compared to non-Myxococcales δ-proteobacteria. PMID:26900859

  13. Hemipteran genomics and psyllid gene expression

    Technology Transfer Automated Retrieval System (TEKTRAN)

    One of the best tools current available is the application of genomics to insect pest problems. Genomics provides rapid elucidation of the genetic basis of insect biology. Research efforts on psyllid genomics, while still in its infancy, is providing information which will aid strategies to suppress...

  14. Gramene 2016: comparative plant genomics and pathway resources

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the data...

  15. Cyberinfrastructure for (Comparative) Plant Genome Research Through PlantGDB

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Accurate and comprehensive gene structure annotation in emerging and assembled genomes is fundamental to comparative, functional, and translational genomics. We plan to build the cyberinfrastructure necessary for defining and accessing the plant gene space. Our Plant Genetic Data Base (PlantGDB) r...

  16. Functional and Comparative Genomics of Lignocellulose Degradation by Schizophyllum commune

    SciTech Connect

    Ohm, Robin A.; Lee, Hanbyul; Park, Hongjae; Brewer, Heather M.; Carver, Akiko; Copeland, Alex; Grimwood, Jane; Lindquist, Erika; Lipzen, Anna; Martin, Joel; Purvine, Samuel O.; Schackwitz, Wendy; Tegelaar, Martin; Tritt, Andrew; Baker, Scott; Choi, In-Geol; Lugones, Luis G.; Wosten, Han A. B.; Grigoriev, Igor V.

    2014-03-14

    The Basidiomycete fungus Schizophyllum commune is a wood-decaying fungus and is used as a model system to study lignocellulose degradation. Version 3.0 of the genome assembly filled 269 of 316 sequence gaps and added 680 kb of sequence. This new assembly was reannotated using RNAseq transcriptomics data, and this resulted in 3110 (24percent) more genes. Two additional S. commune strains with different wood-decaying properties were sequenced, from Tattone (France) and Loenen (The Netherlands). Sequence comparison shows remarkably high sequence diversity between the strains. The overall SNP rate of > 100 SNPs/kb is among the highest rates of within-species polymorphisms in Basidiomycetes. Some well-described proteins like hydrophobins and transcription factors have less than 70percent sequence identity among the strains. Some chromosomes are better conserved than others and in some cases large parts of chromosomes are missing from one or more strains. Gene expression on glucose, cellulose and wood was analyzed in two S. commune strains. Overall, gene expression correlated between the two strains, but there were some notable exceptions. Of particular interest are CAZymes (carbohydrate-active enzymes) that are regulated in different ways in the different strains. In both strains the transcription factor Fsp1 was strongly up-regulated during growth on cellulose and wood, when compared to glucose. Over-expression of Fsp1 using a constitutive promoter resulted in higher cellulose and xylose-degrading enzyme activity, which suggests that Fsp1 is involved in regulating CAZyme gene expression. Two CAZyme genes (of family GH61 and GH11) were shown to be strongly up-regulated during growth on cellulose, compared to glucose. Proteomics on the secreted proteins in the growth medium confirmed this. A promoter analysis revealed the shortest active promoters for these two genes, as well as putative transcription factor binding sites.

  17. Cognitive Endophenotypes Inform Genome-Wide Expression Profiling in Schizophrenia

    PubMed Central

    Zheutlin, Amanda B.; Viehman, Rachael W.; Fortgang, Rebecca; Borg, Jacqueline; Smith, Desmond J.; Suvisaari, Jaana; Therman, Sebastian; Hultman, Christina M.; Cannon, Tyrone D.

    2015-01-01

    OBJECTIVE We performed a whole-genome expression study to clarify the nature of the biological processes mediating between inherited genetic variations and cognitive dysfunction in schizophrenia. METHOD Gene expression was assayed from peripheral blood mononuclear cells using Illumina Human WG6 v3.0 chips in twins discordant for schizophrenia or bipolar disorder and control twins. After quality control, expression levels of 18,559 genes were screened for association with California Verbal Learning Test (CVLT) performance, and any memory-related probes were then evaluated for variation by diagnostic status in the discovery sample (N = 190), and in an independent replication sample (N = 73). Heritability of gene expression using the twin design was also assessed. RESULTS After Bonferroni correction (p < 2.69 × 10−6), CVLT performance was significantly related to expression levels for 76 genes, 43 of which were differentially expressed in schizophrenia patients, with comparable effect sizes in the same direction in the replication sample. For 41 of these 43 transcripts, expression levels were heritable. Nearly all identified genes contain common or de novo mutations associated with schizophrenia in prior studies. CONCLUSION Genes increasing risk for schizophrenia appear to do so in part via effects on signaling cascades influencing memory. The genes implicated in these processes are enriched for those related to RNA processing and DNA replication and include genes influencing G-protein coupled signal transduction, cytokine signaling, and oligodendrocyte function. PMID:26710095

  18. Regulation of methane genes and genome expression

    SciTech Connect

    John N. Reeve

    2009-09-09

    At the start of this project, it was known that methanogens were Archaeabacteria (now Archaea) and were therefore predicted to have gene expression and regulatory systems different from Bacteria, but few of the molecular biology details were established. The goals were then to establish the structures and organizations of genes in methanogens, and to develop the genetic technologies needed to investigate and dissect methanogen gene expression and regulation in vivo. By cloning and sequencing, we established the gene and operon structures of all of the “methane” genes that encode the enzymes that catalyze methane biosynthesis from carbon dioxide and hydrogen. This work identified unique sequences in the methane gene that we designated mcrA, that encodes the largest subunit of methyl-coenzyme M reductase, that could be used to identify methanogen DNA and establish methanogen phylogenetic relationships. McrA sequences are now the accepted standard and used extensively as hybridization probes to identify and quantify methanogens in environmental research. With the methane genes in hand, we used northern blot and then later whole-genome microarray hybridization analyses to establish how growth phase and substrate availability regulated methane gene expression in Methanobacterium thermautotrophicus ΔH (now Methanothermobacter thermautotrophicus). Isoenzymes or pairs of functionally equivalent enzymes catalyze several steps in the hydrogen-dependent reduction of carbon dioxide to methane. We established that hydrogen availability determine which of these pairs of methane genes is expressed and therefore which of the alternative enzymes is employed to catalyze methane biosynthesis under different environmental conditions. As were unable to establish a reliable genetic system for M. thermautotrophicus, we developed in vitro transcription as an alternative system to investigate methanogen gene expression and regulation. This led to the discovery that an archaeal protein

  19. Microbial NAD metabolism: lessons from comparative genomics.

    PubMed

    Gazzaniga, Francesca; Stebbins, Rebecca; Chang, Sheila Z; McPeek, Mark A; Brenner, Charles

    2009-09-01

    NAD is a coenzyme for redox reactions and a substrate of NAD-consuming enzymes, including ADP-ribose transferases, Sir2-related protein lysine deacetylases, and bacterial DNA ligases. Microorganisms that synthesize NAD from as few as one to as many as five of the six identified biosynthetic precursors have been identified. De novo NAD synthesis from aspartate or tryptophan is neither universal nor strictly aerobic. Salvage NAD synthesis from nicotinamide, nicotinic acid, nicotinamide riboside, and nicotinic acid riboside occurs via modules of different genes. Nicotinamide salvage genes nadV and pncA, found in distinct bacteria, appear to have spread throughout the tree of life via horizontal gene transfer. Biochemical, genetic, and genomic analyses have advanced to the point at which the precursors and pathways utilized by a microorganism can be predicted. Challenges remain in dissecting regulation of pathways. PMID:19721089

  20. Comparative genomics of autism and schizophrenia

    PubMed Central

    Crespi, Bernard; Stead, Philip; Elliot, Michael

    2010-01-01

    We used data from studies of copy-number variants (CNVs), single-gene associations, growth-signaling pathways, and intermediate phenotypes associated with brain growth to evaluate four alternative hypotheses for the genomic and developmental relationships between autism and schizophrenia: (i) autism subsumed in schizophrenia, (ii) independence, (iii) diametric, and (iv) partial overlap. Data from CNVs provides statistical support for the hypothesis that autism and schizophrenia are associated with reciprocal variants, such that at four loci, deletions predispose to one disorder, whereas duplications predispose to the other. Data from single-gene studies are inconsistent with a hypothesis based on independence, in that autism and schizophrenia share associated genes more often than expected by chance. However, differentiation between the partial overlap and diametric hypotheses using these data is precluded by limited overlap in the specific genetic markers analyzed in both autism and schizophrenia. Evidence from the effects of risk variants on growth-signaling pathways shows that autism-spectrum conditions tend to be associated with up-regulation of pathways due to loss of function mutations in negative regulators, whereas schizophrenia is associated with reduced pathway activation. Finally, data from studies of head and brain size phenotypes indicate that autism is commonly associated with developmentally-enhanced brain growth, whereas schizophrenia is characterized, on average, by reduced brain growth. These convergent lines of evidence appear most compatible with the hypothesis that autism and schizophrenia represent diametric conditions with regard to their genomic underpinnings, neurodevelopmental bases, and phenotypic manifestations as reflecting under-development versus dysregulated over-development of the human social brain. PMID:19955444

  1. Comparative Genomics and Extensive Recombinations in Phage Communities

    NASA Astrophysics Data System (ADS)

    Poisson, Guylaine; Belcaid, Mahdi; Bergeron, Anne

    Comparing the genomes of two closely related viruses often produces mosaics where nearly identical sequences alternate with sequences that are unique to each genome. When several closely related genomes are compared, the unique sequences are likely to be shared with third genomes, leading to virus mosaic communities. Here we present comparative analysis of sets of Staphylococcus aureus phages that share large identical sequences with up to three other genomes, and with different partners along their genomes. We introduce mosaic graphs to represent these complex recombination events, and use them to illustrate the breath and depth of sequence sharing: some genomes are almost completely made up of shared sequences, while genomes that share very large identical sequences can adopt alternate functional modules. Mosaic graphs also allow us to identify breakpoints that could eventually be used for the construction of recombination networks. These findings have several implications on phage metagenomics assembly, on the horizontal gene transfer paradigm, and more generally on the understanding of the composition and evolutionary dynamics of virus communities.

  2. Sinbase: an integrated database to study genomics, genetics and comparative genomics in Sesamum indicum.

    PubMed

    Wang, Linhai; Yu, Jingyin; Li, Donghua; Zhang, Xiurong

    2015-01-01

    Sesame (Sesamum indicum L.) is an ancient and important oilseed crop grown widely in tropical and subtropical areas. It belongs to the gigantic order Lamiales, which includes many well-known or economically important species, such as olive (Olea europaea), leonurus (Leonurus japonicus) and lavender (Lavandula spica), many of which have important pharmacological properties. Despite their importance, genetic and genomic analyses on these species have been insufficient due to a lack of reference genome information. The now available S. indicum genome will provide an unprecedented opportunity for studying both S. indicum genetic traits and comparative genomics. To deliver S. indicum genomic information to the worldwide research community, we designed Sinbase, a web-based database with comprehensive sesame genomic, genetic and comparative genomic information. Sinbase includes sequences of assembled sesame pseudomolecular chromosomes, protein-coding genes (27,148), transposable elements (372,167) and non-coding RNAs (1,748). In particular, Sinbase provides unique and valuable information on colinear regions with various plant genomes, including Arabidopsis thaliana, Glycine max, Vitis vinifera and Solanum lycopersicum. Sinbase also provides a useful search function and data mining tools, including a keyword search and local BLAST service. Sinbase will be updated regularly with new features, improvements to genome annotation and new genomic sequences, and is freely accessible at http://ocri-genomics.org/Sinbase/. PMID:25480115

  3. Plastic architecture of bacterial genome revealed by comparative genomics of Photorhabdus variants

    PubMed Central

    Gaudriault, Sophie; Pages, Sylvie; Lanois, Anne; Laroui, Christine; Teyssier, Corinne; Jumas-Bilak, Estelle; Givaudan, Alain

    2008-01-01

    Background The phenotypic consequences of large genomic architecture modifications within a clonal bacterial population are rarely evaluated because of the difficulties associated with using molecular approaches in a mixed population. Bacterial variants frequently arise among Photorhabdus luminescens, a nematode-symbiotic and insect-pathogenic bacterium. We therefore studied genome plasticity within Photorhabdus variants. Results We used a combination of macrorestriction and DNA microarray experiments to perform a comparative genomic study of different P. luminescens TT01 variants. Prolonged culturing of TT01 strain and a genomic variant, collected from the laboratory-maintained symbiotic nematode, generated bacterial lineages composed of primary and secondary phenotypic variants and colonial variants. The primary phenotypic variants exhibit several characteristics that are absent from the secondary forms. We identify substantial plasticity of the genome architecture of some variants, mediated mainly by deletions in the 'flexible' gene pool of the TT01 reference genome and also by genomic amplification. We show that the primary or secondary phenotypic variant status is independent from global genomic architecture and that the bacterial lineages are genomic lineages. We focused on two unusual genomic changes: a deletion at a new recombination hotspot composed of long approximate repeats; and a 275 kilobase single block duplication belonging to a new class of genomic duplications. Conclusion Our findings demonstrate that major genomic variations occur in Photorhabdus clonal populations. The phenotypic consequences of these genomic changes are cryptic. This study provides insight into the field of bacterial genome architecture and further elucidates the role played by clonal genomic variation in bacterial genome evolution. PMID:18647395

  4. DNA sequence copy number analysis by Comparative Genomic Hybridization (CGH)

    SciTech Connect

    Pinkel, D.; Kallioniemi, A.; Kallioniemi, O.; Waldman, F.; Sudar, D.; Gray, I. ); Rutovitz, D.; Piper, I. )

    1993-01-01

    Comparative Genomic Hybridization (CGH) uses the kinetics of in situ hybridization to compare the copy numbers of different DNA sequences within the same genome and the copy numbers of the same sequences among different genomes. In a typical application genomic DNA from a tumor and from normal cells are differentially labeled and simultaneously hybridized to normal metaphase chromosomes, and detected with different fluorochromes. Properly registered images of each fluorochrome are obtained using a microscope equipped with multi-band filters and a CCD camera. Digital image analysis permits measurement of intensity ratio profiles along each of the target chromosomes. Studies of cells with known aberrations indicate that the intensity ratio at each position is proportional to the ratio of the copy numbers of the sequences that bind there in the tumor and normal genomes. Analytical challenges posed by the need to efficiently obtain copy number karyotypes are discussed.

  5. Enhanced annotations and features for comparing thousands of Pseudomonas genomes in the Pseudomonas genome database.

    PubMed

    Winsor, Geoffrey L; Griffiths, Emma J; Lo, Raymond; Dhillon, Bhavjinder K; Shay, Julie A; Brinkman, Fiona S L

    2016-01-01

    The Pseudomonas Genome Database (http://www.pseudomonas.com) is well known for the application of community-based annotation approaches for producing a high-quality Pseudomonas aeruginosa PAO1 genome annotation, and facilitating whole-genome comparative analyses with other Pseudomonas strains. To aid analysis of potentially thousands of complete and draft genome assemblies, this database and analysis platform was upgraded to integrate curated genome annotations and isolate metadata with enhanced tools for larger scale comparative analysis and visualization. Manually curated gene annotations are supplemented with improved computational analyses that help identify putative drug targets and vaccine candidates or assist with evolutionary studies by identifying orthologs, pathogen-associated genes and genomic islands. The database schema has been updated to integrate isolate metadata that will facilitate more powerful analysis of genomes across datasets in the future. We continue to place an emphasis on providing high-quality updates to gene annotations through regular review of the scientific literature and using community-based approaches including a major new Pseudomonas community initiative for the assignment of high-quality gene ontology terms to genes. As we further expand from thousands of genomes, we plan to provide enhancements that will aid data visualization and analysis arising from whole-genome comparative studies including more pan-genome and population-based approaches. PMID:26578582

  6. Enhanced annotations and features for comparing thousands of Pseudomonas genomes in the Pseudomonas genome database

    PubMed Central

    Winsor, Geoffrey L.; Griffiths, Emma J.; Lo, Raymond; Dhillon, Bhavjinder K.; Shay, Julie A.; Brinkman, Fiona S. L.

    2016-01-01

    The Pseudomonas Genome Database (http://www.pseudomonas.com) is well known for the application of community-based annotation approaches for producing a high-quality Pseudomonas aeruginosa PAO1 genome annotation, and facilitating whole-genome comparative analyses with other Pseudomonas strains. To aid analysis of potentially thousands of complete and draft genome assemblies, this database and analysis platform was upgraded to integrate curated genome annotations and isolate metadata with enhanced tools for larger scale comparative analysis and visualization. Manually curated gene annotations are supplemented with improved computational analyses that help identify putative drug targets and vaccine candidates or assist with evolutionary studies by identifying orthologs, pathogen-associated genes and genomic islands. The database schema has been updated to integrate isolate metadata that will facilitate more powerful analysis of genomes across datasets in the future. We continue to place an emphasis on providing high-quality updates to gene annotations through regular review of the scientific literature and using community-based approaches including a major new Pseudomonas community initiative for the assignment of high-quality gene ontology terms to genes. As we further expand from thousands of genomes, we plan to provide enhancements that will aid data visualization and analysis arising from whole-genome comparative studies including more pan-genome and population-based approaches. PMID:26578582

  7. Evolutionary insights into scleractinian corals using comparative genomic hybridizations

    PubMed Central

    2012-01-01

    Background Coral reefs belong to the most ecologically and economically important ecosystems on our planet. Yet, they are under steady decline worldwide due to rising sea surface temperatures, disease, and pollution. Understanding the molecular impact of these stressors on different coral species is imperative in order to predict how coral populations will respond to this continued disturbance. The use of molecular tools such as microarrays has provided deep insight into the molecular stress response of corals. Here, we have performed comparative genomic hybridizations (CGH) with different coral species to an Acropora palmata microarray platform containing 13,546 cDNA clones in order to identify potentially rapidly evolving genes and to determine the suitability of existing microarray platforms for use in gene expression studies (via heterologous hybridization). Results Our results showed that the current microarray platform for A. palmata is able to provide biological relevant information for a wide variety of coral species covering both the complex clade as well the robust clade. Analysis of the fraction of highly diverged genes showed a significantly higher amount of genes without annotation corroborating previous findings that point towards a higher rate of divergence for taxonomically restricted genes. Among the genes with annotation, we found many mitochondrial genes to be highly diverged in M. faveolata when compared to A. palmata, while the majority of nuclear encoded genes maintained an average divergence rate. Conclusions The use of present microarray platforms for transcriptional analyses in different coral species will greatly enhance the understanding of the molecular basis of stress and health and highlight evolutionary differences between scleractinian coral species. On a genomic basis, we show that cDNA arrays can be used to identify patterns of divergence. Mitochondrion-encoded genes seem to have diverged faster than nuclear encoded genes in robust

  8. Mycobacterial species as case-study of comparative genome analysis.

    PubMed

    Zakham, F; Belayachi, L; Ussery, D; Akrim, M; Benjouad, A; El Aouad, R; Ennaji, M M

    2011-01-01

    The genus Mycobacterium represents more than 120 species including important pathogens of human and cause major public health problems and illnesses. Further, with more than 100 genome sequences from this genus, comparative genome analysis can provide new insights for better understanding the evolutionary events of these species and improving drugs, vaccines, and diagnostics tools for controlling Mycobacterial diseases. In this present study we aim to outline a comparative genome analysis of fourteen Mycobacterial genomes: M. avium subsp. paratuberculosis K—10, M. bovis AF2122/97, M. bovis BCG str. Pasteur 1173P2, M. leprae Br4923, M. marinum M, M. sp. KMS, M. sp. MCS, M. tuberculosis CDC1551, M. tuberculosis F11, M. tuberculosis H37Ra, M. tuberculosis H37Rv, M. tuberculosis KZN 1435 , M. ulcerans Agy99,and M. vanbaalenii PYR—1, For this purpose a comparison has been done based on their length of genomes, GC content, number of genes in different data bases (Genbank, Refseq, and Prodigal). The BLAST matrix of these genomes has been figured to give a lot of information about the similarity between species in a simple scheme. As a result of multiple genome analysis, the pan and core genome have been defined for twelve Mycobacterial species. We have also introduced the genome atlas of the reference strain M. tuberculosis H37Rv which can give a good overview of this genome. And for examining the phylogenetic relationships among these bacteria, a phylogenic tree has been constructed from 16S rRNA gene for tuberculosis and non tuberculosis Mycobacteria to understand the evolutionary events of these species. PMID:21396338

  9. Evolutionary and comparative analyses of the soybean genome

    PubMed Central

    Cannon, Steven B.; Shoemaker, Randy C.

    2012-01-01

    The soybean genome assembly has been available since the end of 2008. Significant features of the genome include large, gene-poor, repeat-dense pericentromeric regions, spanning roughly 57% of the genome sequence; a relatively large genome size of ~1.15 billion bases; remnants of a genome duplication that occurred ~13 million years ago (Mya); and fainter remnants of older polyploidies that occurred ~58 Mya and >130 Mya. The genome sequence has been used to identify the genetic basis for numerous traits, including disease resistance, nutritional characteristics, and developmental features. The genome sequence has provided a scaffold for placement of many genomic feature elements, both from within soybean and from related species. These may be accessed at several websites, including http://www.phytozome.net, http://soybase.org, http://comparative-legumes.org, and http://www.legumebase.brc.miyazaki-u.ac.jp. The taxonomic position of soybean in the Phaseoleae tribe of the legumes means that there are approximately two dozen other beans and relatives that have undergone independent domestication, and which may have traits that will be useful for transfer to soybean. Methods of translating information between species in the Phaseoleae range from design of markers for marker assisted selection, to transformation with Agrobacterium or with other experimental transformation methods. PMID:23136483

  10. Comparative genomics of vesicomyid clam (Bivalvia: Mollusca) chemosynthetic symbionts

    PubMed Central

    Newton, Irene LG; Girguis, Peter R; Cavanaugh, Colleen M

    2008-01-01

    Background The Vesicomyidae (Bivalvia: Mollusca) are a family of clams that form symbioses with chemosynthetic gamma-proteobacteria. They exist in environments such as hydrothermal vents and cold seeps and have a reduced gut and feeding groove, indicating a large dependence on their endosymbionts for nutrition. Recently, two vesicomyid symbiont genomes were sequenced, illuminating the possible nutritional contributions of the symbiont to the host and making genome-wide evolutionary analyses possible. Results To examine the genomic evolution of the vesicomyid symbionts, a comparative genomics framework, including the existing genomic data combined with heterologous microarray hybridization results, was used to analyze conserved gene content in four vesicomyid symbiont genomes. These four symbionts were chosen to include a broad phylogenetic sampling of the vesicomyid symbionts and represent distinct chemosynthetic environments: cold seeps and hydrothermal vents. Conclusion The results of this comparative genomics analysis emphasize the importance of the symbionts' chemoautotrophic metabolism within their hosts. The fact that these symbionts appear to be metabolically capable autotrophs underscores the extent to which the host depends on them for nutrition and reveals the key to invertebrate colonization of these challenging environments. PMID:19055818

  11. Comparative Genomics of a Parthenogenesis-Inducing Wolbachia Symbiont.

    PubMed

    Lindsey, Amelia R I; Werren, John H; Richards, Stephen; Stouthamer, Richard

    2016-01-01

    Wolbachia is an intracellular symbiont of invertebrates responsible for inducing a wide variety of phenotypes in its host. These host-Wolbachia relationships span the continuum from reproductive parasitism to obligate mutualism, and provide a unique system to study genomic changes associated with the evolution of symbiosis. We present the genome sequence from a parthenogenesis-inducing Wolbachia strain (wTpre) infecting the minute parasitoid wasp Trichogramma pretiosum The wTpre genome is the most complete parthenogenesis-inducing Wolbachia genome available to date. We used comparative genomics across 16 Wolbachia strains, representing five supergroups, to identify a core Wolbachia genome of 496 sets of orthologous genes. Only 14 of these sets are unique to Wolbachia when compared to other bacteria from the Rickettsiales. We show that the B supergroup of Wolbachia, of which wTpre is a member, contains a significantly higher number of ankyrin repeat-containing genes than other supergroups. In the wTpre genome, there is evidence for truncation of the protein coding sequences in 20% of ORFs, mostly as a result of frameshift mutations. The wTpre strain represents a conversion from cytoplasmic incompatibility to a parthenogenesis-inducing lifestyle, and is required for reproduction in the Trichogramma host it infects. We hypothesize that the large number of coding frame truncations has accompanied the change in reproductive mode of the wTpre strain. PMID:27194801

  12. Comparative rates of evolution in endosymbiotic nuclear genomes

    PubMed Central

    Patron, Nicola J; Rogers, Matthew B; Keeling, Patrick J

    2006-01-01

    Background The nucleomorphs associated with secondary plastids of cryptomonads and chlorarachniophytes are the sole examples of organelles with eukaryotic nuclear genomes. Although not as widespread as their prokaryotic equivalents in mitochondria and plastids, nucleomorph genomes share similarities in terms of reduction and compaction. They also differ in several aspects, not least in that they encode proteins that target to the plastid, and so function in a different compartment from that in which they are encoded. Results Here, we test whether the phylogenetically distinct nucleomorph genomes of the cryptomonad, Guillardia theta, and the chlorarachniophyte, Bigelowiella natans, have experienced similar evolutionary pressures during their transformation to reduced organelles. We compared the evolutionary rates of genes from nuclear, nucleomorph, and plastid genomes, all of which encode proteins that function in the same cellular compartment, the plastid, and are thus subject to similar selection pressures. Furthermore, we investigated the divergence of nucleomorphs within cryptomonads by comparing G. theta and Rhodomonas salina. Conclusion Chlorarachniophyte nucleomorph genes have accumulated errors at a faster rate than other genomes within the same cell, regardless of the compartment where the gene product functions. In contrast, most nucleomorph genes in cryptomonads have evolved faster than genes in other genomes on average, but genes for plastid-targeted proteins are not overly divergent, and it appears that cryptomonad nucleomorphs are not presently evolving rapidly and have therefore stabilized. Overall, these analyses suggest that the forces at work in the two lineages are different, despite the similarities between the structures of their genomes. PMID:16772046

  13. Comparative Genomics of a Parthenogenesis-Inducing Wolbachia Symbiont

    PubMed Central

    Lindsey, Amelia R. I.; Werren, John H.; Richards, Stephen; Stouthamer, Richard

    2016-01-01

    Wolbachia is an intracellular symbiont of invertebrates responsible for inducing a wide variety of phenotypes in its host. These host-Wolbachia relationships span the continuum from reproductive parasitism to obligate mutualism, and provide a unique system to study genomic changes associated with the evolution of symbiosis. We present the genome sequence from a parthenogenesis-inducing Wolbachia strain (wTpre) infecting the minute parasitoid wasp Trichogramma pretiosum. The wTpre genome is the most complete parthenogenesis-inducing Wolbachia genome available to date. We used comparative genomics across 16 Wolbachia strains, representing five supergroups, to identify a core Wolbachia genome of 496 sets of orthologous genes. Only 14 of these sets are unique to Wolbachia when compared to other bacteria from the Rickettsiales. We show that the B supergroup of Wolbachia, of which wTpre is a member, contains a significantly higher number of ankyrin repeat-containing genes than other supergroups. In the wTpre genome, there is evidence for truncation of the protein coding sequences in 20% of ORFs, mostly as a result of frameshift mutations. The wTpre strain represents a conversion from cytoplasmic incompatibility to a parthenogenesis-inducing lifestyle, and is required for reproduction in the Trichogramma host it infects. We hypothesize that the large number of coding frame truncations has accompanied the change in reproductive mode of the wTpre strain. PMID:27194801

  14. Genome sequence and comparative genome analysis of Pseudomonas syringae pv. syringae type strain ATCC 19310.

    PubMed

    Park, Yong-Soon; Jeong, Haeyoung; Sim, Young Mi; Yi, Hwe-Su; Ryu, Choong-Min

    2014-04-01

    Pseudomonas syringae pv. syringae (Psy) is a major bacterial pathogen of many economically important plant species. Despite the severity of its impact, the genome sequence of the type strain has not been reported. Here, we present the draft genome sequence of Psy ATCC 19310. Comparative genomic analysis revealed that Psy ATCC 19310 is closely related to Psy B728a. However, only a few type III effectors, which are key virulence factors, are shared by the two strains, indicating the possibility of host-pathogen specificity and genome dynamics, even under the pathovar level. PMID:24444998

  15. SNUGB: a versatile genome browser supporting comparative and functional fungal genomics

    PubMed Central

    Jung, Kyongyong; Park, Jongsun; Choi, Jaeyoung; Park, Bongsoo; Kim, Seungill; Ahn, Kyohun; Choi, Jaehyuk; Choi, Doil; Kang, Seogchan; Lee, Yong-Hwan

    2008-01-01

    Background Since the full genome sequences of Saccharomyces cerevisiae were released in 1996, genome sequences of over 90 fungal species have become publicly available. The heterogeneous formats of genome sequences archived in different sequencing centers hampered the integration of the data for efficient and comprehensive comparative analyses. The Comparative Fungal Genomics Platform (CFGP) was developed to archive these data via a single standardized format that can support multifaceted and integrated analyses of the data. To facilitate efficient data visualization and utilization within and across species based on the architecture of CFGP and associated databases, a new genome browser was needed. Results The Seoul National University Genome Browser (SNUGB) integrates various types of genomic information derived from 98 fungal/oomycete (137 datasets) and 34 plant and animal (38 datasets) species, graphically presents germane features and properties of each genome, and supports comparison between genomes. The SNUGB provides three different forms of the data presentation interface, including diagram, table, and text, and six different display options to support visualization and utilization of the stored information. Information for individual species can be quickly accessed via a new tool named the taxonomy browser. In addition, SNUGB offers four useful data annotation/analysis functions, including 'BLAST annotation.' The modular design of SNUGB makes its adoption to support other comparative genomic platforms easy and facilitates continuous expansion. Conclusion The SNUGB serves as a powerful platform supporting comparative and functional genomics within the fungal kingdom and also across other kingdoms. All data and functions are available at the web site . PMID:19055845

  16. The MicrobesOnline Web site for comparative genomics

    SciTech Connect

    Alm, Eric J.; Huang, Katherine H.; Price, Morgan N.; Koche,Richard P.; Keller, Keith; Dubchak, Inna L.; Arkin, Adam P.

    2004-11-05

    At present, hundreds of microbial genomes have been sequenced, and hundreds more are currently in the pipeline. The Virtual Institute for Microbial Stress and Survival has developed a publicly available suite of Web-based comparative genomic tools (http://www.microbesonline.org) designed to facilitate multispecies comparison among prokaryotes. Highlights of the Microbes Online Web site include operon and regulon predictions, a multispecies genome browser, a multispecies Gene Ontology browser, a comparative KEGG metabolic pathway viewer, a Bioinformatics Workbench for in-depth sequence analysis, and Gene Carts that allow users to save genes of interest for further study while they browse. In addition, we provide an interface for genome annotation, which like all of the tools reported here, is freely available to the scientific community.

  17. Sputnik: a database platform for comparative plant genomics.

    PubMed

    Rudd, Stephen; Mewes, Hans-Werner; Mayer, Klaus F X

    2003-01-01

    Two million plant ESTs, from 20 different plant species, and totalling more than one 1000 Mbp of DNA sequence, represents a formidable transcriptomic resource. Sputnik uses the potential of this sequence resource to fill some of the information gap in the un-sequenced plant genomes and to serve as the foundation for in silicio comparative plant genomics. The complexity of the individual EST collections has been reduced using optimised EST clustering techniques. Annotation of cluster sequences is performed by exploiting and transferring information from the comprehensive knowledgebase already produced for the completed model plant genome (Arabidopsis thaliana) and by performing additional state of-the-art sequence analyses relevant to today's plant biologist. Functional predictions, comparative analyses and associative annotations for 500 000 plant EST derived peptides make Sputnik (http://mips.gsf.de/proj/sputnik/) a valid platform for contemporary plant genomics. PMID:12519965

  18. PGSB PlantsDB: updates to the database framework for comparative plant genome research.

    PubMed

    Spannagl, Manuel; Nussbaumer, Thomas; Bader, Kai C; Martis, Mihaela M; Seidel, Michael; Kugler, Karl G; Gundlach, Heidrun; Mayer, Klaus F X

    2016-01-01

    PGSB (Plant Genome and Systems Biology: formerly MIPS) PlantsDB (http://pgsb.helmholtz-muenchen.de/plant/index.jsp) is a database framework for the comparative analysis and visualization of plant genome data. The resource has been updated with new data sets and types as well as specialized tools and interfaces to address user demands for intuitive access to complex plant genome data. In its latest incarnation, we have re-worked both the layout and navigation structure and implemented new keyword search options and a new BLAST sequence search functionality. Actively involved in corresponding sequencing consortia, PlantsDB has dedicated special efforts to the integration and visualization of complex triticeae genome data, especially for barley, wheat and rye. We enhanced CrowsNest, a tool to visualize syntenic relationships between genomes, with data from the wheat sub-genome progenitor Aegilops tauschii and added functionality to the PGSB RNASeqExpressionBrowser. GenomeZipper results were integrated for the genomes of barley, rye, wheat and perennial ryegrass and interactive access is granted through PlantsDB interfaces. Data exchange and cross-linking between PlantsDB and other plant genome databases is stimulated by the transPLANT project (http://transplantdb.eu/). PMID:26527721

  19. PGSB PlantsDB: updates to the database framework for comparative plant genome research

    PubMed Central

    Spannagl, Manuel; Nussbaumer, Thomas; Bader, Kai C.; Martis, Mihaela M.; Seidel, Michael; Kugler, Karl G.; Gundlach, Heidrun; Mayer, Klaus F.X.

    2016-01-01

    PGSB (Plant Genome and Systems Biology: formerly MIPS) PlantsDB (http://pgsb.helmholtz-muenchen.de/plant/index.jsp) is a database framework for the comparative analysis and visualization of plant genome data. The resource has been updated with new data sets and types as well as specialized tools and interfaces to address user demands for intuitive access to complex plant genome data. In its latest incarnation, we have re-worked both the layout and navigation structure and implemented new keyword search options and a new BLAST sequence search functionality. Actively involved in corresponding sequencing consortia, PlantsDB has dedicated special efforts to the integration and visualization of complex triticeae genome data, especially for barley, wheat and rye. We enhanced CrowsNest, a tool to visualize syntenic relationships between genomes, with data from the wheat sub-genome progenitor Aegilops tauschii and added functionality to the PGSB RNASeqExpressionBrowser. GenomeZipper results were integrated for the genomes of barley, rye, wheat and perennial ryegrass and interactive access is granted through PlantsDB interfaces. Data exchange and cross-linking between PlantsDB and other plant genome databases is stimulated by the transPLANT project (http://transplantdb.eu/). PMID:26527721

  20. Comparative Genome Analysis of Basidiomycete Fungi

    SciTech Connect

    Riley, Robert; Salamov, Asaf; Morin, Emmanuelle; Nagy, Laszlo; Manning, Gerard; Baker, Scott; Brown, Daren; Henrissat, Bernard; Levasseur, Anthony; Hibbett, David; Martin, Francis; Grigoriev, Igor

    2012-03-19

    Fungi of the phylum Basidiomycota (basidiomycetes), make up some 37percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, symbionts, and plant and animal pathogens. To better understand the diversity of phenotypes in basidiomycetes, we performed a comparative analysis of 35 basidiomycete fungi spanning the diversity of the phylum. Phylogenetic patterns of lignocellulose degrading genes suggest a continuum rather than a sharp dichotomy between the white rot and brown rot modes of wood decay. Patterns of secondary metabolic enzymes give additional insight into the broad array of phenotypes found in the basidiomycetes. We suggest that the profile of an organism in lignocellulose-targeting genes can be used to predict its nutritional mode, and predict Dacryopinax sp. as a brown rot; Botryobasidium botryosum and Jaapia argillacea as white rots.

  1. Kiwifruit Information Resource (KIR): a comparative platform for kiwifruit genomics.

    PubMed

    Yue, Junyang; Liu, Jian; Ban, Rongjun; Tang, Wei; Deng, Lin; Fei, Zhangjun; Liu, Yongsheng

    2015-01-01

    The Kiwifruit Information Resource (KIR) is dedicated to maintain and integrate comprehensive datasets on genomics, functional genomics and transcriptomics of kiwifruit (Actinidiaceae). KIR serves as a central access point for existing/new genomic and genetic data. KIR also provides researchers with a variety of visualization and analysis tools. Current developments include the updated genome structure of Actinidia chinensis cv. Hongyang and its newest genome annotation, putative transcripts, gene expression, physical markers of genetic traits as well as relevant publications based on the latest genome assembly. Nine thousand five hundred and forty-seven new transcripts are detected and 21 132 old transcripts are changed. At the present release, the next-generation transcriptome sequencing data has been incorporated into gene models and splice variants. Protein-protein interactions are also identified based on experimentally determined orthologous interactions. Furthermore, the experimental results reported in peer-reviewed literature are manually extracted and integrated within a well-developed query page. In total, 122 identifications are currently associated, including commonly used gene names and symbols. All KIR datasets are helpful to facilitate a broad range of kiwifruit research topics and freely available to the research community. Database URL: http://bdg.hfut.edu.cn/kir/index.html. PMID:26656885

  2. Kiwifruit Information Resource (KIR): a comparative platform for kiwifruit genomics

    PubMed Central

    Yue, Junyang; Liu, Jian; Ban, Rongjun; Tang, Wei; Deng, Lin; Fei, Zhangjun; Liu, Yongsheng

    2015-01-01

    The Kiwifruit Information Resource (KIR) is dedicated to maintain and integrate comprehensive datasets on genomics, functional genomics and transcriptomics of kiwifruit (Actinidiaceae). KIR serves as a central access point for existing/new genomic and genetic data. KIR also provides researchers with a variety of visualization and analysis tools. Current developments include the updated genome structure of Actinidia chinensis cv. Hongyang and its newest genome annotation, putative transcripts, gene expression, physical markers of genetic traits as well as relevant publications based on the latest genome assembly. Nine thousand five hundred and forty-seven new transcripts are detected and 21 132 old transcripts are changed. At the present release, the next-generation transcriptome sequencing data has been incorporated into gene models and splice variants. Protein–protein interactions are also identified based on experimentally determined orthologous interactions. Furthermore, the experimental results reported in peer-reviewed literature are manually extracted and integrated within a well-developed query page. In total, 122 identifications are currently associated, including commonly used gene names and symbols. All KIR datasets are helpful to facilitate a broad range of kiwifruit research topics and freely available to the research community. Database URL: http://bdg.hfut.edu.cn/kir/index.html. PMID:26656885

  3. Complete genome sequencing and comparative genomic analysis of functionally diverse Lysinibacillus sphaericus III(3)7.

    PubMed

    Rey, Andrés; Silva-Quintero, Laura; Dussán, Jenny

    2016-09-01

    Lysinibacillus sphaericus III(3)7 is a native Colombian strain, the first one isolated from soil samples. This strain has shown high levels of pathogenic activity against Culex quinquefaciatus larvae in laboratory assays compared to other members of the same species. Using Pacific Biosciences sequencing technology we sequenced, annotated (de novo) and described the genome of strain III(3)7, achieving a complete genome sequence status. We then performed a comparative analysis between the newly sequenced genome and the ones previously reported for Colombian isolates L. sphaericus OT4b.31, CBAM5 and OT4b.25, with the inclusion of L. sphaericus C3-41 that has been used as a reference genome for most of previous genome sequencing projects. We concluded that L. sphaericus III(3)7 is highly similar with strain OT4b.25 and shares high levels of synteny with isolates CBAM5 and C3-41. PMID:27419068

  4. Comparative Whole-Genome Hybridization Reveals Genomic Islands in Brucella Species†

    PubMed Central

    Rajashekara, Gireesh; Glasner, Jeremy D.; Glover, David A.; Splitter, Gary A.

    2004-01-01

    Brucella species are responsible for brucellosis, a worldwide zoonotic disease causing abortion in domestic animals and Malta fever in humans. Based on host preference, the genus is divided into six species. Brucella abortus, B. melitensis, and B. suis are pathogenic to humans, whereas B. ovis and B. neotomae are nonpathogenic to humans and B. canis human infections are rare. Limited genome diversity exists among Brucella species. Comparison of Brucella species whole genomes is, therefore, likely to identify factors responsible for differences in host preference and virulence restriction. To facilitate such studies, we used the complete genome sequence of B. melitensis 16M, the species highly pathogenic to humans, to construct a genomic microarray. Hybridization of labeled genomic DNA from Brucella species to this microarray revealed a total of 217 open reading frames (ORFs) altered in five Brucella species analyzed. These ORFs are often found in clusters (islands) in the 16M genome. Examination of the genomic context of these islands suggests that many are horizontally acquired. Deletions of genetic content identified in Brucella species are conserved in multiple strains of the same species, and genomic islands missing in a given species are often restricted to that particular species. These findings suggest that, whereas the loss or gain of genetic material may be related to the host range and virulence restriction of certain Brucella species for humans, independent mechanisms involving gene inactivation or altered expression of virulence determinants may also contribute to these differences. PMID:15262941

  5. Comparative genome analysis of Solanum lycopersicum and Solanum tuberosum

    PubMed Central

    Lall, Rohit; Thomas, George; Singh, Satendra; Singh, Archana; Wadhwa, Gulshan

    2013-01-01

    Solanum lycopersicum and Solanum tuberosum are agriculturally important crop species as they are rich sources of starch, protein, antioxidants, lycopene, beta-carotene, vitamin C, and fiber. The genomes of S. lycopersicum and S. tuberosum are currently available. However the linear strings of nucleotides that together comprise a genome sequence are of limited significance by themselves. Computational and bioinformatics approaches can be used to exploit the genomes for fundamental research for improving their varieties. The comparative genome analysis, Pfam analysis of predicted reviewed paralogous proteins was performed. It was found that S. lycopersicum proteins belong to more families, domains and clans in comparison with S. tuberosum. It was also found that mostly intergenic regions are conserved in two genomes followed by exons, intron and UTR. This can be exploited to predict regions between genomes that are similar to each other and to study the evolutionary relationship between two genomes, leading towards the development of disease resistance, stress tolerance and improved varieties of tomato. PMID:24307771

  6. Metagenome Skimming of Insect Specimen Pools: Potential for Comparative Genomics.

    PubMed

    Linard, Benjamin; Crampton-Platt, Alex; Gillett, Conrad P D T; Timmermans, Martijn J T N; Vogler, Alfried P

    2015-06-01

    Metagenomic analyses are challenging in metazoans, but high-copy number and repeat regions can be assembled from low-coverage sequencing by "genome skimming," which is applied here as a new way of characterizing metagenomes obtained in an ecological or taxonomic context. Illumina shotgun sequencing on two pools of Coleoptera (beetles) of approximately 200 species each were assembled into tens of thousands of scaffolds. Repeated low-coverage sequencing recovered similar scaffold sets consistently, although approximately 70% of scaffolds could not be identified against existing genome databases. Identifiable scaffolds included mitochondrial DNA, conserved sequences with hits to expressed sequence tag and protein databases, and known repeat elements of high and low complexity, including numerous copies of rRNA and histone genes. Assemblies of histones captured a diversity of gene order and primary sequence in Coleoptera. Scaffolds with similarity to multiple sites in available coleopteran genome sequences for Dendroctonus and Tribolium revealed high specificity of scaffolds to either of these genomes, in particular for high-copy number repeats. Numerous "clusters" of scaffolds mapped to the same genomic site revealed intra- and/or intergenomic variation within a metagenome pool. In addition to effect of taxonomic composition of the metagenomes, the number of mapped scaffolds also revealed structural differences between the two reference genomes, although the significance of this striking finding remains unclear. Finally, apparently exogenous sequences were recovered, including potential food plants, fungal pathogens, and bacterial symbionts. The "metagenome skimming" approach is useful for capturing the genomic diversity of poorly studied, species-rich lineages and opens new prospects in environmental genomics. PMID:25979752

  7. Metagenome Skimming of Insect Specimen Pools: Potential for Comparative Genomics

    PubMed Central

    Linard, Benjamin; Crampton-Platt, Alex; Gillett, Conrad P.D.T.; Timmermans, Martijn J.T.N.; Vogler, Alfried P.

    2015-01-01

    Metagenomic analyses are challenging in metazoans, but high-copy number and repeat regions can be assembled from low-coverage sequencing by “genome skimming,” which is applied here as a new way of characterizing metagenomes obtained in an ecological or taxonomic context. Illumina shotgun sequencing on two pools of Coleoptera (beetles) of approximately 200 species each were assembled into tens of thousands of scaffolds. Repeated low-coverage sequencing recovered similar scaffold sets consistently, although approximately 70% of scaffolds could not be identified against existing genome databases. Identifiable scaffolds included mitochondrial DNA, conserved sequences with hits to expressed sequence tag and protein databases, and known repeat elements of high and low complexity, including numerous copies of rRNA and histone genes. Assemblies of histones captured a diversity of gene order and primary sequence in Coleoptera. Scaffolds with similarity to multiple sites in available coleopteran genome sequences for Dendroctonus and Tribolium revealed high specificity of scaffolds to either of these genomes, in particular for high-copy number repeats. Numerous “clusters” of scaffolds mapped to the same genomic site revealed intra- and/or intergenomic variation within a metagenome pool. In addition to effect of taxonomic composition of the metagenomes, the number of mapped scaffolds also revealed structural differences between the two reference genomes, although the significance of this striking finding remains unclear. Finally, apparently exogenous sequences were recovered, including potential food plants, fungal pathogens, and bacterial symbionts. The “metagenome skimming” approach is useful for capturing the genomic diversity of poorly studied, species-rich lineages and opens new prospects in environmental genomics. PMID:25979752

  8. Comparative proteogenomics: combining mass spectrometry and comparative genomics to analyze multiple genomes

    SciTech Connect

    Gupta, Nitin; Benhamida, Jamal; Bhargava, Vipul; Goodman, Daniel; Kain , Elisabeth; Kerman, Ian; Nguyen , Ngan; Ollikainen, Noah; Rodriguez, Jesse; Wang, J.; Lipton, Mary S.; Romine, Margaret F.; Bafna, Vineet; Smith, Richard D.; Pevzner, Pavel A.

    2008-07-30

    While bacterial genome annotations have significantly improved in recent years, techniques for bacterial proteome annotation (including post-translational chemical modifications, signal peptides, proteolytic events, etc.) are still in their infancy. At the same time, the number of sequenced bacterial genomes is rising sharply, far outpacing our ability to validate the predicted genes, let alone annotate bacterial proteomes. In this study, we use tandem mass spectrometry (MS/MS) to annotate the proteome of Shewanella oneidensis MR-1, an important microbe for bioremediation. In particular, we provide the first comprehensive map of post-translational modifications in a bacterial genome, including a large number of chemical modifications, signal peptide cleavages and cleavage of N-terminal methionine residues. We also detect multiple genes that were missed or assigned incorrect start positions by gene prediction programs and suggest corrections to improve the gene annotation. This study demonstrates that complementing every genome sequencing project by an MS/MS project would significantly improve both genome and proteome annotations for a reasonable cost.

  9. Comparative Genomic and Phylogenomic Analyses Reveal a Conserved Core Genome Shared by Estuarine and Oceanic Cyanopodoviruses

    PubMed Central

    Huang, Sijun; Zhang, Si; Jiao, Nianzhi; Chen, Feng

    2015-01-01

    Podoviruses are among the major viral groups that infect marine picocyanobacteria Prochlorococcus and Synechococcus. Here, we reported the genome sequences of five Synechococcus podoviruses isolated from the estuarine environment, and performed comparative genomic and phylogenomic analyses based on a total of 20 cyanopodovirus genomes. The genomes of all the known marine cyanopodoviruses are highly syntenic. A pan-genome of 349 clustered orthologous groups was determined, among which 15 were core genes. These core genes make up nearly half of each genome in length, reflecting the high level of genome conservation among this cyanophage type. The whole genome phylogenies based on concatenated core genes and gene content were highly consistent and confirmed the separation of two discrete marine cyanopodovirus clusters MPP-A and MPP-B. The genomes within cluster MPP-B grouped into subclusters mainly corresponding to Prochlorococcus or Synechococcus host types. Auxiliary metabolic genes tend to occur in a specific phylogenetic group of these cyanopodoviruses. All the MPP-B phages analyzed here encode the photosynthesis gene psbA, which are absent in all the MPP-A genomes thus far. Interestingly, all the MPP-B and two MPP-A Synechococcus podoviruses encode the thymidylate synthase gene thyX, while at the same genome locus all the MPP-B Prochlorococcus podoviruses encode the transaldolase gene talC. Both genes are hypothesized to have the potential to facilitate the biosynthesis of deoxynucleotide for phage replication. Inheritance of specific functional genes could be important to the evolution and ecological fitness of certain cyanophage genotypes. Our analyses demonstrate that cyanopodoviruses of estuarine and oceanic origins share a conserved core genome and suggest that accessory genes may be related to environmental adaptation. PMID:26569403

  10. Genome Evolution in the Eremothecium Clade of the Saccharomyces Complex Revealed by Comparative Genomics

    PubMed Central

    Wendland, Jürgen; Walther, Andrea

    2011-01-01

    We used comparative genomics to elucidate the genome evolution within the pre–whole-genome duplication genus Eremothecium. To this end, we sequenced and assembled the complete genome of Eremothecium cymbalariae, a filamentous ascomycete representing the Eremothecium type strain. Genome annotation indicated 4712 gene models and 143 tRNAs. We compared the E. cymbalariae genome with that of its relative, the riboflavin overproducer Ashbya (Eremothecium) gossypii, and the reconstructed yeast ancestor. Decisive changes in the Eremothecium lineage leading to the evolution of the A. gossypii genome include the reduction from eight to seven chromosomes, the downsizing of the genome by removal of 10% or 900 kb of DNA, mostly in intergenic regions, the loss of a TY3-Gypsy–type transposable element, the re-arrangement of mating-type loci, and a massive increase of its GC content. Key species-specific events are the loss of MNN1-family of mannosyltransferases required to add the terminal fourth and fifth α-1,3-linked mannose residue to O-linked glycans and genes of the Ehrlich pathway in E. cymbalariae and the loss of ZMM-family of meiosis-specific proteins and acquisition of riboflavin overproduction in A. gossypii. This reveals that within the Saccharomyces complex genome, evolution is not only based on genome duplication with subsequent gene deletions and chromosomal rearrangements but also on fungi associated with specific environments (e.g. involving fungal-insect interactions as in Eremothecium), which have encountered challenges that may be reflected both in genome streamlining and their biosynthetic potential. PMID:22384365

  11. Genome evolution in the eremothecium clade of the Saccharomyces complex revealed by comparative genomics.

    PubMed

    Wendland, Jürgen; Walther, Andrea

    2011-12-01

    We used comparative genomics to elucidate the genome evolution within the pre-whole-genome duplication genus Eremothecium. To this end, we sequenced and assembled the complete genome of Eremothecium cymbalariae, a filamentous ascomycete representing the Eremothecium type strain. Genome annotation indicated 4712 gene models and 143 tRNAs. We compared the E. cymbalariae genome with that of its relative, the riboflavin overproducer Ashbya (Eremothecium) gossypii, and the reconstructed yeast ancestor. Decisive changes in the Eremothecium lineage leading to the evolution of the A. gossypii genome include the reduction from eight to seven chromosomes, the downsizing of the genome by removal of 10% or 900 kb of DNA, mostly in intergenic regions, the loss of a TY3-Gypsy-type transposable element, the re-arrangement of mating-type loci, and a massive increase of its GC content. Key species-specific events are the loss of MNN1-family of mannosyltransferases required to add the terminal fourth and fifth α-1,3-linked mannose residue to O-linked glycans and genes of the Ehrlich pathway in E. cymbalariae and the loss of ZMM-family of meiosis-specific proteins and acquisition of riboflavin overproduction in A. gossypii. This reveals that within the Saccharomyces complex genome, evolution is not only based on genome duplication with subsequent gene deletions and chromosomal rearrangements but also on fungi associated with specific environments (e.g. involving fungal-insect interactions as in Eremothecium), which have encountered challenges that may be reflected both in genome streamlining and their biosynthetic potential. PMID:22384365

  12. Sockeye: a 3D environment for comparative genomics.

    PubMed

    Montgomery, Stephen B; Astakhova, Tamara; Bilenky, Mikhail; Birney, Ewan; Fu, Tony; Hassel, Maik; Melsopp, Craig; Rak, Marcin; Robertson, A Gordon; Sleumer, Monica; Siddiqui, Asim S; Jones, Steven J M

    2004-05-01

    Comparative genomics techniques are used in bioinformatics analyses to identify the structural and functional properties of DNA sequences. As the amount of available sequence data steadily increases, the ability to perform large-scale comparative analyses has become increasingly relevant. In addition, the growing complexity of genomic feature annotation means that new approaches to genomic visualization need to be explored. We have developed a Java-based application called Sockeye that uses three-dimensional (3D) graphics technology to facilitate the visualization of annotation and conservation across multiple sequences. This software uses the Ensembl database project to import sequence and annotation information from several eukaryotic species. A user can additionally import their own custom sequence and annotation data. Individual annotation objects are displayed in Sockeye by using custom 3D models. Ensembl-derived and imported sequences can be analyzed by using a suite of multiple and pair-wise alignment algorithms. The results of these comparative analyses are also displayed in the 3D environment of Sockeye. By using the Java3D API to visualize genomic data in a 3D environment, we are able to compactly display cross-sequence comparisons. This provides the user with a novel platform for visualizing and comparing genomic feature organization. PMID:15123592

  13. Phytozome: a Tool for Green Plant Comparative Genomics

    DOE Data Explorer

    Phytozome is a joint project of the Department of Energy's Joint Genome Institute and the Center for Integrative Genomics to facilitate comparative genomic studies amongst green plants. Clusters of orthologous and paralogous genes that represent the modern descendents of ancestral gene sets are constructed at key phylogenetic nodes. These clusters allow easy access to clade specific orthology/paralogy relationships as well as clade specific genes and gene expansions. As of release v4.0, Phytozome provides access to nine sequenced and annotated green plant genomes, eight of which have been clustered into gene families at six evolutionarily significant nodes. Where possible, each gene has been annotated with PFAM, KOG, KEGG, and PANTHER assignments, and publicly available annotations from RefSeq, UniProt, TAIR, JGI are hyper-linked and searchable. [Copied from the Overview at http://www.phytozome.net/Phytozome_info.php

  14. Assigning protein functions by comparative genome analysis protein phylogenetic profiles

    DOEpatents

    Pellegrini, Matteo; Marcotte, Edward M.; Thompson, Michael J.; Eisenberg, David; Grothe, Robert; Yeates, Todd O.

    2003-05-13

    A computational method system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B. Another method compares the genomic sequence of two or more organisms to create a phylogenetic profile for each protein indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method a combination of the above two methods is used to predict functional links.

  15. Comparative genomics of drug resistance in Trypanosoma brucei rhodesiense.

    PubMed

    Graf, Fabrice E; Ludin, Philipp; Arquint, Christian; Schmidt, Remo S; Schaub, Nadia; Kunz Renggli, Christina; Munday, Jane C; Krezdorn, Jessica; Baker, Nicola; Horn, David; Balmer, Oliver; Caccone, Adalgisa; de Koning, Harry P; Mäser, Pascal

    2016-09-01

    Trypanosoma brucei rhodesiense is one of the causative agents of human sleeping sickness, a fatal disease that is transmitted by tsetse flies and restricted to Sub-Saharan Africa. Here we investigate two independent lines of T. b. rhodesiense that have been selected with the drugs melarsoprol and pentamidine over the course of 2 years, until they exhibited stable cross-resistance to an unprecedented degree. We apply comparative genomics and transcriptomics to identify the underlying mutations. Only few mutations have become fixed during selection. Three genes were affected by mutations in both lines: the aminopurine transporter AT1, the aquaporin AQP2, and the RNA-binding protein UBP1. The melarsoprol-selected line carried a large deletion including the adenosine transporter gene AT1, whereas the pentamidine-selected line carried a heterozygous point mutation in AT1, G430R, which rendered the transporter non-functional. Both resistant lines had lost AQP2, and both lines carried the same point mutation, R131L, in the RNA-binding motif of UBP1. The finding that concomitant deletion of the known resistance genes AT1 and AQP2 in T. b. brucei failed to phenocopy the high levels of resistance of the T. b. rhodesiense mutants indicated a possible role of UBP1 in melarsoprol-pentamidine cross-resistance. However, homozygous in situ expression of UBP1-Leu(131) in T. b. brucei did not affect the sensitivity to melarsoprol or pentamidine. PMID:26973180

  16. Comparative Genomics and stx Phage Characterization of LEE-Negative Shiga Toxin-Producing Escherichia coli

    PubMed Central

    Steyert, Susan R.; Sahl, Jason W.; Fraser, Claire M.; Teel, Louise D.; Scheutz, Flemming; Rasko, David A.

    2012-01-01

    Infection by Escherichia coli and Shigella species are among the leading causes of death due to diarrheal disease in the world. Shiga toxin-producing E. coli (STEC) that do not encode the locus of enterocyte effacement (LEE-negative STEC) often possess Shiga toxin gene variants and have been isolated from humans and a variety of animal sources. In this study, we compare the genomes of nine LEE-negative STEC harboring various stx alleles with four complete reference LEE-positive STEC isolates. Compared to a representative collection of prototype E. coli and Shigella isolates representing each of the pathotypes, the whole genome phylogeny demonstrated that these isolates are diverse. Whole genome comparative analysis of the 13 genomes revealed that in addition to the absence of the LEE pathogenicity island, phage-encoded genes including non-LEE encoded effectors, were absent from all nine LEE-negative STEC genomes. Several plasmid-encoded virulence factors reportedly identified in LEE-negative STEC isolates were identified in only a subset of the nine LEE-negative isolates further confirming the diversity of this group. In combination with whole genome analysis, we characterized the lambdoid phages harboring the various stx alleles and determined their genomic insertion sites. Although the integrase gene sequence corresponded with genomic location, it was not correlated with stx variant, further highlighting the mosaic nature of these phages. The transcription of these phages in different genomic backgrounds was examined. Expression of the Shiga toxin genes, stx1 and/or stx2, as well as the Q genes, were examined with quantitative reverse transcriptase polymerase chain reaction assays. A wide range of basal and induced toxin induction was observed. Overall, this is a first significant foray into the genome space of this unexplored group of emerging and divergent pathogens. PMID:23162798

  17. Comparative genomics boosts target prediction for bacterial small RNAs.

    PubMed

    Wright, Patrick R; Richter, Andreas S; Papenfort, Kai; Mann, Martin; Vogel, Jörg; Hess, Wolfgang R; Backofen, Rolf; Georg, Jens

    2013-09-10

    Small RNAs (sRNAs) constitute a large and heterogeneous class of bacterial gene expression regulators. Much like eukaryotic microRNAs, these sRNAs typically target multiple mRNAs through short seed pairing, thereby acting as global posttranscriptional regulators. In some bacteria, evidence for hundreds to possibly more than 1,000 different sRNAs has been obtained by transcriptome sequencing. However, the experimental identification of possible targets and, therefore, their confirmation as functional regulators of gene expression has remained laborious. Here, we present a strategy that integrates phylogenetic information to predict sRNA targets at the genomic scale and reconstructs regulatory networks upon functional enrichment and network analysis (CopraRNA, for Comparative Prediction Algorithm for sRNA Targets). Furthermore, CopraRNA precisely predicts the sRNA domains for target recognition and interaction. When applied to several model sRNAs, CopraRNA revealed additional targets and functions for the sRNAs CyaR, FnrS, RybB, RyhB, SgrS, and Spot42. Moreover, the mRNAs gdhA, lrp, marA, nagZ, ptsI, sdhA, and yobF-cspC were suggested as regulatory hubs targeted by up to seven different sRNAs. The verification of many previously undetected targets by CopraRNA, even for extensively investigated sRNAs, demonstrates its advantages and shows that CopraRNA-based analyses can compete with experimental target prediction approaches. A Web interface allows high-confidence target prediction and efficient classification of bacterial sRNAs. PMID:23980183

  18. The tiger genome and comparative analysis with lion and snow leopard genomes.

    PubMed

    Cho, Yun Sung; Hu, Li; Hou, Haolong; Lee, Hang; Xu, Jiaohui; Kwon, Soowhan; Oh, Sukhun; Kim, Hak-Min; Jho, Sungwoong; Kim, Sangsoo; Shin, Young-Ah; Kim, Byung Chul; Kim, Hyunmin; Kim, Chang-Uk; Luo, Shu-Jin; Johnson, Warren E; Koepfli, Klaus-Peter; Schmidt-Küntzel, Anne; Turner, Jason A; Marker, Laurie; Harper, Cindy; Miller, Susan M; Jacobs, Wilhelm; Bertola, Laura D; Kim, Tae Hyung; Lee, Sunghoon; Zhou, Qian; Jung, Hyun-Ju; Xu, Xiao; Gadhvi, Priyvrat; Xu, Pengwei; Xiong, Yingqi; Luo, Yadan; Pan, Shengkai; Gou, Caiyun; Chu, Xiuhui; Zhang, Jilin; Liu, Sanyang; He, Jing; Chen, Ying; Yang, Linfeng; Yang, Yulan; He, Jiaju; Liu, Sha; Wang, Junyi; Kim, Chul Hong; Kwak, Hwanjong; Kim, Jong-Soo; Hwang, Seungwoo; Ko, Junsu; Kim, Chang-Bae; Kim, Sangtae; Bayarlkhagva, Damdin; Paek, Woon Kee; Kim, Seong-Jin; O'Brien, Stephen J; Wang, Jun; Bhak, Jong

    2013-01-01

    Tigers and their close relatives (Panthera) are some of the world's most endangered species. Here we report the de novo assembly of an Amur tiger whole-genome sequence as well as the genomic sequences of a white Bengal tiger, African lion, white African lion and snow leopard. Through comparative genetic analyses of these genomes, we find genetic signatures that may reflect molecular adaptations consistent with the big cats' hypercarnivorous diet and muscle strength. We report a snow leopard-specific genetic determinant in EGLN1 (Met39>Lys39), which is likely to be associated with adaptation to high altitude. We also detect a TYR260G>A mutation likely responsible for the white lion coat colour. Tiger and cat genomes show similar repeat composition and an appreciably conserved synteny. Genomic data from the five big cats provide an invaluable resource for resolving easily identifiable phenotypes evident in very close, but distinct, species. PMID:24045858

  19. The tiger genome and comparative analysis with lion and snow leopard genomes

    PubMed Central

    Cho, Yun Sung; Hu, Li; Hou, Haolong; Lee, Hang; Xu, Jiaohui; Kwon, Soowhan; Oh, Sukhun; Kim, Hak-Min; Jho, Sungwoong; Kim, Sangsoo; Shin, Young-Ah; Kim, Byung Chul; Kim, Hyunmin; Kim, Chang-uk; Luo, Shu-Jin; Johnson, Warren E.; Koepfli, Klaus-Peter; Schmidt-Küntzel, Anne; Turner, Jason A.; Marker, Laurie; Harper, Cindy; Miller, Susan M.; Jacobs, Wilhelm; Bertola, Laura D.; Kim, Tae Hyung; Lee, Sunghoon; Zhou, Qian; Jung, Hyun-Ju; Xu, Xiao; Gadhvi, Priyvrat; Xu, Pengwei; Xiong, Yingqi; Luo, Yadan; Pan, Shengkai; Gou, Caiyun; Chu, Xiuhui; Zhang, Jilin; Liu, Sanyang; He, Jing; Chen, Ying; Yang, Linfeng; Yang, Yulan; He, Jiaju; Liu, Sha; Wang, Junyi; Kim, Chul Hong; Kwak, Hwanjong; Kim, Jong-Soo; Hwang, Seungwoo; Ko, Junsu; Kim, Chang-Bae; Kim, Sangtae; Bayarlkhagva, Damdin; Paek, Woon Kee; Kim, Seong-Jin; O’Brien, Stephen J.; Wang, Jun; Bhak, Jong

    2013-01-01

    Tigers and their close relatives (Panthera) are some of the world’s most endangered species. Here we report the de novo assembly of an Amur tiger whole-genome sequence as well as the genomic sequences of a white Bengal tiger, African lion, white African lion and snow leopard. Through comparative genetic analyses of these genomes, we find genetic signatures that may reflect molecular adaptations consistent with the big cats’ hypercarnivorous diet and muscle strength. We report a snow leopard-specific genetic determinant in EGLN1 (Met39>Lys39), which is likely to be associated with adaptation to high altitude. We also detect a TYR260G>A mutation likely responsible for the white lion coat colour. Tiger and cat genomes show similar repeat composition and an appreciably conserved synteny. Genomic data from the five big cats provide an invaluable resource for resolving easily identifiable phenotypes evident in very close, but distinct, species. PMID:24045858

  20. An Integrative Method for Accurate Comparative Genome Mapping

    PubMed Central

    Swidan, Firas; Rocha, Eduardo P. C; Shmoish, Michael; Pinter, Ron Y

    2006-01-01

    We present MAGIC, an integrative and accurate method for comparative genome mapping. Our method consists of two phases: preprocessing for identifying “maximal similar segments,” and mapping for clustering and classifying these segments. MAGIC's main novelty lies in its biologically intuitive clustering approach, which aims towards both calculating reorder-free segments and identifying orthologous segments. In the process, MAGIC efficiently handles ambiguities resulting from duplications that occurred before the speciation of the considered organisms from their most recent common ancestor. We demonstrate both MAGIC's robustness and scalability: the former is asserted with respect to its initial input and with respect to its parameters' values. The latter is asserted by applying MAGIC to distantly related organisms and to large genomes. We compare MAGIC to other comparative mapping methods and provide detailed analysis of the differences between them. Our improvements allow a comprehensive study of the diversity of genetic repertoires resulting from large-scale mutations, such as indels and duplications, including explicitly transposable and phagic elements. The strength of our method is demonstrated by detailed statistics computed for each type of these large-scale mutations. MAGIC enabled us to conduct a comprehensive analysis of the different forces shaping prokaryotic genomes from different clades, and to quantify the importance of novel gene content introduced by horizontal gene transfer relative to gene duplication in bacterial genome evolution. We use these results to investigate the breakpoint distribution in several prokaryotic genomes. PMID:16933978

  1. Initial sequence and comparative analysis of the cat genome

    PubMed Central

    Pontius, Joan U.; Mullikin, James C.; Smith, Douglas R.; Lindblad-Toh, Kerstin; Gnerre, Sante; Clamp, Michele; Chang, Jean; Stephens, Robert; Neelam, Beena; Volfovsky, Natalia; Schäffer, Alejandro A.; Agarwala, Richa; Narfström, Kristina; Murphy, William J.; Giger, Urs; Roca, Alfred L.; Antunes, Agostinho; Menotti-Raymond, Marilyn; Yuhki, Naoya; Pecon-Slattery, Jill; Johnson, Warren E.; Bourque, Guillaume; Tesler, Glenn; O’Brien, Stephen J.

    2007-01-01

    The genome sequence (1.9-fold coverage) of an inbred Abyssinian domestic cat was assembled, mapped, and annotated with a comparative approach that involved cross-reference to annotated genome assemblies of six mammals (human, chimpanzee, mouse, rat, dog, and cow). The results resolved chromosomal positions for 663,480 contigs, 20,285 putative feline gene orthologs, and 133,499 conserved sequence blocks (CSBs). Additional annotated features include repetitive elements, endogenous retroviral sequences, nuclear mitochondrial (numt) sequences, micro-RNAs, and evolutionary breakpoints that suggest historic balancing of translocation and inversion incidences in distinct mammalian lineages. Large numbers of single nucleotide polymorphisms (SNPs), deletion insertion polymorphisms (DIPs), and short tandem repeats (STRs), suitable for linkage or association studies were characterized in the context of long stretches of chromosome homozygosity. In spite of the light coverage capturing ∼65% of euchromatin sequence from the cat genome, these comparative insights shed new light on the tempo and mode of gene/genome evolution in mammals, promise several research applications for the cat, and also illustrate that a comparative approach using more deeply covered mammals provides an informative, preliminary annotation of a light (1.9-fold) coverage mammal genome sequence. PMID:17975172

  2. Brewing yeast genomes and genome-wide expression and proteome profiling during fermentation.

    PubMed

    Smart, Katherine A

    2007-11-01

    The genome structure, ancestry and instability of the brewing yeast strains have received considerable attention. The hybrid nature of brewing lager yeast strains provides adaptive potential but yields genome instability which can adversely affect fermentation performance. The requirement to differentiate between production strains and assess master cultures for genomic instability has led to significant adoption of specialized molecular tool kits by the industry. Furthermore, the development of genome-wide transcriptional and protein expression technologies has generated significant interest from brewers. The opportunity presented to explore, and the concurrent requirement to understand both, the constraints and potential of their strains to generate existing and new products during fermentation is discussed. PMID:17879324

  3. Sequencing and comparative analyses of the genomes of zoysiagrasses.

    PubMed

    Tanaka, Hidenori; Hirakawa, Hideki; Kosugi, Shunichi; Nakayama, Shinobu; Ono, Akiko; Watanabe, Akiko; Hashiguchi, Masatsugu; Gondo, Takahiro; Ishigaki, Genki; Muguerza, Melody; Shimizu, Katsuya; Sawamura, Noriko; Inoue, Takayasu; Shigeki, Yuichi; Ohno, Naoki; Tabata, Satoshi; Akashi, Ryo; Sato, Shusei

    2016-04-01

    Zoysiais a warm-season turfgrass, which comprises 11 allotetraploid species (2n= 4x= 40), each possessing different morphological and physiological traits. To characterize the genetic systems of Zoysia plants and to analyse their structural and functional differences in individual species and accessions, we sequenced the genomes of Zoysia species using HiSeq and MiSeq platforms. As a reference sequence of Zoysia species, we generated a high-quality draft sequence of the genome of Z. japonica accession 'Nagirizaki' (334 Mb) in which 59,271 protein-coding genes were predicted. In parallel, draft genome sequences of Z. matrella 'Wakaba' and Z. pacifica 'Zanpa' were also generated for comparative analyses. To investigate the genetic diversity among the Zoysia species, genome sequence reads of three additional accessions, Z. japonica'Kyoto', Z. japonica'Miyagi' and Z. matrella'Chiba Fair Green', were accumulated, and aligned against the reference genome of 'Nagirizaki' along with those from 'Wakaba' and 'Zanpa'. As a result, we detected 7,424,163 single-nucleotide polymorphisms and 852,488 short indels among these species. The information obtained in this study will be valuable for basic studies on zoysiagrass evolution and genetics as well as for the breeding of zoysiagrasses, and is made available in the 'Zoysia Genome Database' at http://zoysia.kazusa.or.jp. PMID:26975196

  4. Sequencing and comparative analyses of the genomes of zoysiagrasses

    PubMed Central

    Tanaka, Hidenori; Hirakawa, Hideki; Kosugi, Shunichi; Nakayama, Shinobu; Ono, Akiko; Watanabe, Akiko; Hashiguchi, Masatsugu; Gondo, Takahiro; Ishigaki, Genki; Muguerza, Melody; Shimizu, Katsuya; Sawamura, Noriko; Inoue, Takayasu; Shigeki, Yuichi; Ohno, Naoki; Tabata, Satoshi; Akashi, Ryo; Sato, Shusei

    2016-01-01

    Zoysia is a warm-season turfgrass, which comprises 11 allotetraploid species (2n = 4x = 40), each possessing different morphological and physiological traits. To characterize the genetic systems of Zoysia plants and to analyse their structural and functional differences in individual species and accessions, we sequenced the genomes of Zoysia species using HiSeq and MiSeq platforms. As a reference sequence of Zoysia species, we generated a high-quality draft sequence of the genome of Z. japonica accession ‘Nagirizaki’ (334 Mb) in which 59,271 protein-coding genes were predicted. In parallel, draft genome sequences of Z. matrella ‘Wakaba’ and Z. pacifica ‘Zanpa’ were also generated for comparative analyses. To investigate the genetic diversity among the Zoysia species, genome sequence reads of three additional accessions, Z. japonica ‘Kyoto’, Z. japonica ‘Miyagi’ and Z. matrella ‘Chiba Fair Green’, were accumulated, and aligned against the reference genome of ‘Nagirizaki’ along with those from ‘Wakaba’ and ‘Zanpa’. As a result, we detected 7,424,163 single-nucleotide polymorphisms and 852,488 short indels among these species. The information obtained in this study will be valuable for basic studies on zoysiagrass evolution and genetics as well as for the breeding of zoysiagrasses, and is made available in the ‘Zoysia Genome Database’ at http://zoysia.kazusa.or.jp. PMID:26975196

  5. What can whole genome expression data tell us about the ecology and evolution of personality?

    PubMed Central

    Bell, Alison M.; Aubin-Horth, Nadia

    2010-01-01

    Consistent individual differences in behaviour, aka personality, pose several evolutionary questions. For example, it is difficult to explain within-individual consistency in behaviour because behavioural plasticity is often advantageous. In addition, selection erodes heritable behavioural variation that is related to fitness, therefore we wish to know the mechanisms that can maintain between-individual variation in behaviour. In this paper, we argue that whole genome expression data can reveal new insights into the proximate mechanisms underlying personality, as well as its evolutionary consequences. After introducing the basics of whole genome expression analysis, we show how whole genome expression data can be used to understand whether behaviours in different contexts are affected by the same molecular mechanisms. We suggest strategies for using the power of genomics to understand what maintains behavioural variation, to study the evolution of behavioural correlations and to compare personality traits across diverse organisms. PMID:21078652

  6. Comparative genomics and functional annotation of bacterial transporters

    NASA Astrophysics Data System (ADS)

    Gelfand, Mikhail S.; Rodionov, Dmitry A.

    2008-03-01

    Transport proteins are difficult to study experimentally, and because of that their functional characterization trails that of enzymes. The comparative genomic analysis is a powerful approach to functional annotation of proteins, which makes it possible to utilize the genomic sequence data from thousands of organisms. The use of computational techniques allows one to identify candidate transporters, predict their structure and localization in the membrane, and perform detailed functional annotation, which includes substrate specificity and cellular role. We overview the main techniques of analysis of transporters' structure and function. We consider the most popular algorithms to identify transmembrane segments in protein sequences and to predict topology of multispanning proteins. We describe the main approaches of the comparative genomics, and how they may be applied to the analysis of transporters, and provide examples showing how combinations of these techniques is used for functional annotation of new transporter specificities in known families, characterization of new families, and prediction of novel transport mechanisms.

  7. Comparative analysis of rosaceous genomes and the reconstruction of a putative ancestral genome for the family

    PubMed Central

    2011-01-01

    Background Comparative genome mapping studies in Rosaceae have been conducted until now by aligning genetic maps within the same genus, or closely related genera and using a limited number of common markers. The growing body of genomics resources and sequence data for both Prunus and Fragaria permits detailed comparisons between these genera and the recently released Malus × domestica genome sequence. Results We generated a comparative analysis using 806 molecular markers that are anchored genetically to the Prunus and/or Fragaria reference maps, and physically to the Malus genome sequence. Markers in common for Malus and Prunus, and Malus and Fragaria, respectively were 784 and 148. The correspondence between marker positions was high and conserved syntenic blocks were identified among the three genera in the Rosaceae. We reconstructed a proposed ancestral genome for the Rosaceae. Conclusions A genome containing nine chromosomes is the most likely candidate for the ancestral Rosaceae progenitor. The number of chromosomal translocations observed between the three genera investigated was low. However, the number of inversions identified among Malus and Prunus was much higher than any reported genome comparisons in plants, suggesting that small inversions have played an important role in the evolution of these two genera or of the Rosaceae. PMID:21226921

  8. Genome-Based Comparative Analyses of Antarctic and Temperate Species of Paenibacillus

    PubMed Central

    Dsouza, Melissa; Taylor, Michael W.; Turner, Susan J.; Aislabie, Jackie

    2014-01-01

    Antarctic soils represent a unique environment characterised by extremes of temperature, salinity, elevated UV radiation, low nutrient and low water content. Despite the harshness of this environment, members of 15 bacterial phyla have been identified in soils of the Ross Sea Region (RSR). However, the survival mechanisms and ecological roles of these phyla are largely unknown. The aim of this study was to investigate whether strains of Paenibacillus darwinianus owe their resilience to substantial genomic changes. For this, genome-based comparative analyses were performed on three P. darwinianus strains, isolated from gamma-irradiated RSR soils, together with nine temperate, soil-dwelling Paenibacillus spp. The genome of each strain was sequenced to over 1,000-fold coverage, then assembled into contigs totalling approximately 3 Mbp per genome. Based on the occurrence of essential, single-copy genes, genome completeness was estimated at approximately 88%. Genome analysis revealed between 3,043–3,091 protein-coding sequences (CDSs), primarily associated with two-component systems, sigma factors, transporters, sporulation and genes induced by cold-shock, oxidative and osmotic stresses. These comparative analyses provide an insight into the metabolic potential of P. darwinianus, revealing potential adaptive mechanisms for survival in Antarctic soils. However, a large proportion of these mechanisms were also identified in temperate Paenibacillus spp., suggesting that these mechanisms are beneficial for growth and survival in a range of soil environments. These analyses have also revealed that the P. darwinianus genomes contain significantly fewer CDSs and have a lower paralogous content. Notwithstanding the incompleteness of the assemblies, the large differences in genome sizes, determined by the number of genes in paralogous clusters and the CDS content, are indicative of genome content scaling. Finally, these sequences are a resource for further investigations into

  9. CFGP: a web-based, comparative fungal genomics platform.

    PubMed

    Park, Jongsun; Park, Bongsoo; Jung, Kyongyong; Jang, Suwang; Yu, Kwangyul; Choi, Jaeyoung; Kong, Sunghyung; Park, Jaejin; Kim, Seryun; Kim, Hyojeong; Kim, Soonok; Kim, Jihyun F; Blair, Jaime E; Lee, Kwangwon; Kang, Seogchan; Lee, Yong-Hwan

    2008-01-01

    Since the completion of the Saccharomyces cerevisiae genome sequencing project in 1996, the genomes of over 80 fungal species have been sequenced or are currently being sequenced. Resulting data provide opportunities for studying and comparing fungal biology and evolution at the genome level. To support such studies, the Comparative Fungal Genomics Platform (CFGP; http://cfgp.snu.ac.kr), a web-based multifunctional informatics workbench, was developed. The CFGP comprises three layers, including the basal layer, middleware and the user interface. The data warehouse in the basal layer contains standardized genome sequences of 65 fungal species. The middleware processes queries via six analysis tools, including BLAST, ClustalW, InterProScan, SignalP 3.0, PSORT II and a newly developed tool named BLASTMatrix. The BLASTMatrix permits the identification and visualization of genes homologous to a query across multiple species. The Data-driven User Interface (DUI) of the CFGP was built on a new concept of pre-collecting data and post-executing analysis instead of the 'fill-in-the-form-and-press-SUBMIT' user interfaces utilized by most bioinformatics sites. A tool termed Favorite, which supports the management of encapsulated sequence data and provides a personalized data repository to users, is another novel feature in the DUI. PMID:17947331

  10. Using Comparative Genomics to Drive New Discoveries in Microbiology

    PubMed Central

    Haft, Daniel H.

    2015-01-01

    Bioinformatics looks to many microbiologists like a service industry. In this view, annotation starts with what is known from experiments in the lab, makes reasonable inferences of which genes match other genes in function, builds databases to make all that we know accessible, but creates nothing truly new. Experiments lead, then biocuration and computational biology follow. But the astounding success of genome sequencing is changing the annotation paradigm. Every genome sequenced is an intercepted coded message from the microbial world, and as all cryptographers know, it is easier to decode a thousand messages than a single message. Some biology is best discovered not by phenomenology, but by decoding genome content, forming hypotheses, and doing the first few rounds of validation computationally. Through such reasoning, a role and function may be assigned to a protein with no sequence similarity to any protein yet studied. Experimentation can follow after the discovery to cement and to extend the findings. Unfortunately, this approach remains so unfamiliar to most bench scientists that lab work and comparative genomics typically segregate to different teams working on unconnected projects. This review will discuss several themes in comparative genomics as a discovery method, including highly derived data, use of patterns of design to reason by analogy, and in silico testing of computationally generated hypotheses. PMID:25617609

  11. Comparative Analysis of Whole-Genome Gene Expression Changes in Cultured Human Embryonic Stem Cells in Response to Low, Clinical Diagnostic Relevant, and High Doses of Ionizing Radiation Exposure.

    PubMed

    Sokolov, Mykyta; Nguyen, Van; Neumann, Ronald

    2015-01-01

    The biological effects of low-dose ionizing radiation (LDIR) exposure in humans are not comprehensively understood, generating a high degree of controversy in published literature. The earliest stages of human development are known to be among the most sensitive to stress exposures, especially genotoxic stresses. However, the risks stemming from exposure to LDIR, particularly within the clinical diagnostic relevant dose range, have not been directly evaluated in human embryonic stem cells (hESCs). Here, we describe the dynamics of the whole genome transcriptional responses of different hESC lines to both LDIR and, as a reference, high-dose IR (HDIR). We found that even doses as low as 0.05 Gy could trigger statistically significant transient changes in a rather limited subset of genes in all hESCs lines examined. Gene expression signatures of hESCs exposed to IR appear to be highly dose-, time-, and cell line-dependent. We identified 50 genes constituting consensus gene expression signature as an early response to HDIR across all lines of hESC examined. We observed substantial differences in biological pathways affected by either LDIR or HDIR in hESCs, suggesting that the molecular mechanisms underpinning the responses of hESC may fundamentally differ depending on radiation doses. PMID:26133243

  12. Deregulated expression of DNA polymerase β is involved in the progression of genomic instability

    PubMed Central

    Luo, Qingying; Lai, Yanhao; Liu, Shukun; Wu, Mei; Liu, Yuan; Zhang, Zunzhen

    2013-01-01

    Deregulated expression of DNA polymerase beta (pol β) has been implicated in genomic instability that leads to tumorigenesis, yet the mechanisms underlying the pol β-mediated genetic instability remain elusive. In this study, we investigated the roles of deregulated expression of pol β in spontaneous and xenobiotic-induced genetic instability using mouse embryonic fibroblasts (MEFs) that express distinct pol β levels (wild-type, null and over-expression) as a model system. Three genetic instability endpoints, DNA strand breaks, chromosome breakage and gene mutation, were examined under various expression levels of pol β by comet assay, micronuclei test and hprt mutation assay. Our results demonstrate that neither pol β deficiency nor pol β over-expression is sufficient for accumulation of spontaneous DNA damage that promotes a hyper-proliferation phenotype. However, pol β null cells exhibit increased sensitivity to exogenous DNA damaging agents with increased genomic instability compared with pol β wild-type and over-expression cells. This finding suggests that a pol β deficiency may underlie genomic instability induced by exogenous DNA damaging agents. Interestingly, pol β over-expression cells exhibit less chromosomal or DNA damage, but display a higher hprt mutation frequency upon methyl methanesulfonate exposure compared with the other two cell types. Our results therefore indicate that an excessive amount of pol β may promote genomic instability, presumably through an error-prone repair response, although it enhances overall BER capacity for induced DNA damage. PMID:22576475

  13. Genome informatics and vaccine targets in Corynebacterium urealyticum using two whole genomes, comparative genomics, and reverse vaccinology

    PubMed Central

    2015-01-01

    Background Corynebacterium urealyticum is an opportunistic pathogen that normally lives on skin and mucous membranes in humans. This high Gram-positive bacteria can cause acute or encrusted cystitis, encrusted pyelitis, and pyelonephritis in immunocompromised patients. The bacteria is multi-drug resistant, and knowledge about the genes that contribute to its virulence is very limited. Two complete genome sequences were used in this comparative genomic study: C. urealyticum DSM 7109 and C. urealyticum DSM 7111. Results We used comparative genomics strategies to compare the two strains, DSM 7109 and DSM 7111, and to analyze their metabolic pathways, genome plasticity, and to predict putative antigenic targets. The genomes of these two strains together encode 2,115 non-redundant coding sequences, 1,823 of which are common to both genomes. We identified 188 strain-specific genes in DSM 7109 and 104 strain-specific genes in DSM 7111. The high number of strain-specific genes may be a result of horizontal gene transfer triggered by the large number of transposons in the genomes of these two strains. Screening for virulence factors revealed the presence of the spaDEF operon that encodes pili forming proteins. Therefore, spaDEF may play a pivotal role in facilitating the adhesion of the pathogen to the host tissue. Application of the reverse vaccinology method revealed 19 putative antigenic proteins that may be used in future studies as candidate drug or vaccine targets. Conclusions The genome features and the presence of virulence factors in genomic islands in the two strains of C. urealyticum provide insights in the lifestyle of this opportunistic pathogen and may be useful in developing future therapeutic strategies. PMID:26041051

  14. Comparative genomics of wild type yeast strains unveils important genome diversity

    PubMed Central

    Carreto, Laura; Eiriz, Maria F; Gomes, Ana C; Pereira, Patrícia M; Schuller, Dorit; Santos, Manuel AS

    2008-01-01

    Background Genome variability generates phenotypic heterogeneity and is of relevance for adaptation to environmental change, but the extent of such variability in natural populations is still poorly understood. For example, selected Saccharomyces cerevisiae strains are variable at the ploidy level, have gene amplifications, changes in chromosome copy number, and gross chromosomal rearrangements. This suggests that genome plasticity provides important genetic diversity upon which natural selection mechanisms can operate. Results In this study, we have used wild-type S. cerevisiae (yeast) strains to investigate genome variation in natural and artificial environments. We have used comparative genome hybridization on array (aCGH) to characterize the genome variability of 16 yeast strains, of laboratory and commercial origin, isolated from vineyards and wine cellars, and from opportunistic human infections. Interestingly, sub-telomeric instability was associated with the clinical phenotype, while Ty element insertion regions determined genomic differences of natural wine fermentation strains. Copy number depletion of ASP3 and YRF1 genes was found in all wild-type strains. Other gene families involved in transmembrane transport, sugar and alcohol metabolism or drug resistance had copy number changes, which also distinguished wine from clinical isolates. Conclusion We have isolated and genotyped more than 1000 yeast strains from natural environments and carried out an aCGH analysis of 16 strains representative of distinct genotype clusters. Important genomic variability was identified between these strains, in particular in sub-telomeric regions and in Ty-element insertion sites, suggesting that this type of genome variability is the main source of genetic diversity in natural populations of yeast. The data highlights the usefulness of yeast as a model system to unravel intraspecific natural genome diversity and to elucidate how natural selection shapes the yeast genome

  15. Whole genomic DNA sequencing and comparative genomic analysis of Arthrospira platensis: high genome plasticity and genetic diversity

    PubMed Central

    Xu, Teng; Qin, Song; Hu, Yongwu; Song, Zhijian; Ying, Jianchao; Li, Peizhen; Dong, Wei; Zhao, Fangqing; Yang, Huanming; Bao, Qiyu

    2016-01-01

    Arthrospira platensis is a multi-cellular and filamentous non-N2-fixing cyanobacterium that is capable of performing oxygenic photosynthesis. In this study, we determined the nearly complete genome sequence of A. platensis YZ. A. platensis YZ genome is a single, circular chromosome of 6.62 Mb in size. Phylogenetic and comparative genomic analyses revealed that A. platensis YZ was more closely related to A. platensis NIES-39 than Arthrospira sp. PCC 8005 and A. platensis C1. Broad gene gains were identified between A. platensis YZ and three other Arthrospira speices, some of which have been previously demonstrated that can be laterally transferred among different species, such as restriction-modification systems-coding genes. Moreover, unprecedented extensive chromosomal rearrangements among different strains were observed. The chromosomal rearrangements, particularly the chromosomal inversions, were analysed and estimated to be closely related to palindromes that involved long inverted repeat sequences and the extensively distributed type IIR restriction enzyme in the Arthrospira genome. In addition, species from genus Arthrospira unanimously contained the highest rate of repetitive sequence compared with the other species of order Oscillatoriales, suggested that sequence duplication significantly contributed to Arthrospira genome phylogeny. These results provided in-depth views into the genomic phylogeny and structural variation of A. platensis, as well as provide a valuable resource for functional genomics studies. PMID:27330141

  16. Whole genomic DNA sequencing and comparative genomic analysis of Arthrospira platensis: high genome plasticity and genetic diversity.

    PubMed

    Xu, Teng; Qin, Song; Hu, Yongwu; Song, Zhijian; Ying, Jianchao; Li, Peizhen; Dong, Wei; Zhao, Fangqing; Yang, Huanming; Bao, Qiyu

    2016-08-01

    Arthrospira platensis is a multi-cellular and filamentous non-N2-fixing cyanobacterium that is capable of performing oxygenic photosynthesis. In this study, we determined the nearly complete genome sequence of A. platensis YZ. A. platensis YZ genome is a single, circular chromosome of 6.62 Mb in size. Phylogenetic and comparative genomic analyses revealed that A. platensis YZ was more closely related to A. platensis NIES-39 than Arthrospira sp. PCC 8005 and A. platensis C1. Broad gene gains were identified between A. platensis YZ and three other Arthrospira speices, some of which have been previously demonstrated that can be laterally transferred among different species, such as restriction-modification systems-coding genes. Moreover, unprecedented extensive chromosomal rearrangements among different strains were observed. The chromosomal rearrangements, particularly the chromosomal inversions, were analysed and estimated to be closely related to palindromes that involved long inverted repeat sequences and the extensively distributed type IIR restriction enzyme in the Arthrospira genome. In addition, species from genus Arthrospira unanimously contained the highest rate of repetitive sequence compared with the other species of order Oscillatoriales, suggested that sequence duplication significantly contributed to Arthrospira genome phylogeny. These results provided in-depth views into the genomic phylogeny and structural variation of A. platensis, as well as provide a valuable resource for functional genomics studies. PMID:27330141

  17. African Relapsing Fever Borreliae Genomospecies Revealed by Comparative Genomics

    PubMed Central

    Elbir, Haitham; Abi-Rached, Laurent; Pontarotti, Pierre; Yoosuf, Niyaz; Drancourt, Michel

    2014-01-01

    Background: Relapsing fever borreliae are vector-borne bacteria responsible for febrile infection in humans in North America, Africa, Asia, and in the Iberian Peninsula in Europe. Relapsing fever borreliae are phylogenetically closely related, yet they differ in pathogenicity and vectors. Their long-term taxonomy, based on geography and vector grouping, needs to be re-apprised in a genomic context. We therefore embarked into genomic analyses of relapsing fever borreliae, focusing on species found in Africa. Results: Genome-wide phylogenetic analyses group Old World Borrelia crocidurae, Borrelia hispanica, B. duttonii, and B. recurrentis in one clade, and New World Borrelia turicatae and Borrelia hermsii in a second clade. Accordingly, average nucleotide identity is 99% among B. duttonii, B. recurrentis, and B. crocidurae and 96% between latter borreliae and B. hispanica while the similarity is 86% between Old World and New World borreliae. Comparative genomics indicates that the Old World relapsing fever B. duttonii, B. recurrentis, B. crocidurae, and B. hispanica have a 2,514-gene pan genome and a 933-gene core genome that includes 788 chromosomal and 145 plasmidic genes. Analyzing the role that natural selection has played in the evolution of Old World borreliae species revealed that 55 loci were under positive diversifying selection, including loci coding for membrane, flagellar, and chemotaxis proteins, three categories associated with adaption to specific niches. Conclusion: Genomic analyses led to a reappraisal of the taxonomy of relapsing fever borreliae in Africa. These analyses suggest that B. crocidurae, B. duttonii, and B. recurrentis are ecotypes of a unique genomospecies, while B. hispanica is a distinct species. PMID:25229054

  18. Comparative mapping of expressed sequence tags containing microsatellites in rainbow trout (Oncorhynchus mykiss)

    PubMed Central

    Rexroad, Caird E; Rodriguez, Maria F; Coulibaly, Issa; Gharbi, Karim; Danzmann, Roy G; DeKoning, Jenefer; Phillips, Ruth; Palti, Yniv

    2005-01-01

    Background Comparative genomics, through the integration of genetic maps from species of interest with whole genome sequences of other species, will facilitate the identification of genes affecting phenotypes of interest. The development of microsatellite markers from expressed sequence tags will serve to increase marker densities on current salmonid genetic maps and initiate in silico comparative maps with species whose genomes have been fully sequenced. Results Eighty-nine polymorphic microsatellite markers were generated for rainbow trout of which at least 74 amplify in other salmonids. Fifty-five have been associated with functional annotation and 30 were mapped on existing genetic maps. Homologous sequences were identified for 20 of the EST containing microsatellites to identify comparative assignments within the tetraodon, mouse, and/or human genomes. Conclusion The addition of microsatellite markers constructed from expressed sequence tag data will facilitate the development of high-density genetic maps for rainbow trout and comparative maps with other salmonids and better studied species. PMID:15836796

  19. Phylogeny and comparative genome analysis of a Basidiomycete fungi

    SciTech Connect

    Riley, Robert W.; Salamov, Asaf; Grigoriev, Igor; Hibbett, David

    2011-03-14

    Fungi of the phylum Basidiomycota, make up some 37percent of the described fungi, and are important from the perspectives of forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, plant pathogenic rusts and smuts, and some human pathogens. To better understand these important fungi, we have undertaken a comparative genomic analysis of the Basidiomycetes with available sequenced genomes. We report a phylogeny that sheds light on previously unclear evolutionary relationships among the Basidiomycetes. We also define a `core proteome? based on protein families conserved in all Basidiomycetes. We identify key expansions and contractions in protein families that may be responsible for the degradation of plant biomass such as cellulose, hemicellulose, and lignin. Finally, we speculate as to the genomic changes that drove such expansions and contractions.

  20. Utility of array comparative genomic hybridization in cytogenetic analysis.

    PubMed

    Singh, Rashmi R; Cheung, K-John J; Horsman, Douglas E

    2011-01-01

    Conventional comparative genomic hybridization (CGH), high-resolution oligonucleotide, and BAC array CGH have modernized the field of cytogenetics to enable access to unbalanced genomic aberrations such as whole or partial chromosomal gains and losses. The basic principle of array CGH involves hybridizing differentially labeled proband/test (e.g., tumor) and normal reference DNA on an array of oligonucleotide or BAC clones instead of normal metaphases as in conventional CGH. The sub-megabase resolution tiling BAC arrays are extremely useful for the analysis of acquired aberrations in cancer genomes. Array CGH can be extremely useful to identify the chromosomal makeup of marker and ring chromosomes, to define/delineate the precise location/bands involved in structural aberrations and the accurate localization of translocation breakpoints in both simple and complex karyotypes either alone or in combination with standard karyotype analysis. PMID:21431645

  1. Comparative genomics reveals mobile pathogenicity chromosomes in Fusarium

    PubMed Central

    Ma, Li-Jun; van der Does, H. Charlotte; Borkovich, Katherine A.; Coleman, Jeffrey J.; Daboussi, Marie-Josée; Di Pietro, Antonio; Dufresne, Marie; Freitag, Michael; Grabherr, Manfred; Henrissat, Bernard; Houterman, Petra M.; Kang, Seogchan; Shim, Won-Bo; Woloshuk, Charles; Xie, Xiaohui; Xu, Jin-Rong; Antoniw, John; Baker, Scott E.; Bluhm, Burton H.; Breakspear, Andrew; Brown, Daren W.; Butchko, Robert A. E.; Chapman, Sinead; Coulson, Richard; Coutinho, Pedro M.; Danchin, Etienne G. J.; Diener, Andrew; Gale, Liane R.; Gardiner, Donald M.; Goff, Stephen; Hammond-Kosack, Kim E.; Hilburn, Karen; Hua-Van, Aurélie; Jonkers, Wilfried; Kazan, Kemal; Kodira, Chinnappa D.; Koehrsen, Michael; Kumar, Lokesh; Lee, Yong-Hwan; Li, Liande; Manners, John M.; Miranda-Saavedra, Diego; Mukherjee, Mala; Park, Gyungsoon; Park, Jongsun; Park, Sook-Young; Proctor, Robert H.; Regev, Aviv; Ruiz-Roldan, M. Carmen; Sain, Divya; Sakthikumar, Sharadha; Sykes, Sean; Schwartz, David C.; Turgeon, B. Gillian; Wapinski, Ilan; Yoder, Olen; Young, Sarah; Zeng, Qiandong; Zhou, Shiguo; Galagan, James; Cuomo, Christina A.; Kistler, H. Corby; Rep, Martijn

    2011-01-01

    Fusarium species are among the most important phytopathogenic and toxigenic fungi. To understand the molecular underpinnings of pathogenicity in the genus Fusarium, we compared the genomes of three phenotypically diverse species: Fusarium graminearum, Fusarium verticillioides and Fusarium oxysporum f. sp. lycopersici. Our analysis revealed lineage-specific (LS) genomic regions in F. oxysporum that include four entire chromosomes and account for more than one-quarter of the genome. LS regions are rich in transposons and genes with distinct evolutionary profiles but related to pathogenicity, indicative of horizontal acquisition. Experimentally, we demonstrate the transfer of two LS chromosomes between strains of F. oxysporum, converting a non-pathogenic strain into a pathogen. Transfer of LS chromosomes between otherwise genetically isolated strains explains the polyphyletic origin of host specificity and the emergence of new pathogenic lineages in F. oxysporum. These findings put the evolution of fungal pathogenicity into a new perspective. PMID:20237561

  2. A web server for mining Comparative Genomic Hybridization (CGH) data

    NASA Astrophysics Data System (ADS)

    Liu, Jun; Ranka, Sanjay; Kahveci, Tamer

    2007-11-01

    Advances in cytogenetics and molecular biology has established that chromosomal alterations are critical in the pathogenesis of human cancer. Recurrent chromosomal alterations provide cytological and molecular markers for the diagnosis and prognosis of disease. They also facilitate the identification of genes that are important in carcinogenesis, which in the future may help in the development of targeted therapy. A large amount of publicly available cancer genetic data is now available and it is growing. There is a need for public domain tools that allow users to analyze their data and visualize the results. This chapter describes a web based software tool that will allow researchers to analyze and visualize Comparative Genomic Hybridization (CGH) datasets. It employs novel data mining methodologies for clustering and classification of CGH datasets as well as algorithms for identifying important markers (small set of genomic intervals with aberrations) that are potentially cancer signatures. The developed software will help in understanding the relationships between genomic aberrations and cancer types.

  3. Comparative analysis of methods for genome-wide nucleosome cartography.

    PubMed

    Quintales, Luis; Vázquez, Enrique; Antequera, Francisco

    2015-07-01

    Nucleosomes contribute to compacting the genome into the nucleus and regulate the physical access of regulatory proteins to DNA either directly or through the epigenetic modifications of the histone tails. Precise mapping of nucleosome positioning across the genome is, therefore, essential to understanding the genome regulation. In recent years, several experimental protocols have been developed for this purpose that include the enzymatic digestion, chemical cleavage or immunoprecipitation of chromatin followed by next-generation sequencing of the resulting DNA fragments. Here, we compare the performance and resolution of these methods from the initial biochemical steps through the alignment of the millions of short-sequence reads to a reference genome to the final computational analysis to generate genome-wide maps of nucleosome occupancy. Because of the lack of a unified protocol to process data sets obtained through the different approaches, we have developed a new computational tool (NUCwave), which facilitates their analysis, comparison and assessment and will enable researchers to choose the most suitable method for any particular purpose. NUCwave is freely available at http://nucleosome.usal.es/nucwave along with a step-by-step protocol for its use. PMID:25296770

  4. FusoBase: an online Fusobacterium comparative genomic analysis platform

    PubMed Central

    Ang, Mia Yang; Heydari, Hamed; Jakubovics, Nick S.; Mahmud, Mahafizul Imran; Dutta, Avirup; Wee, Wei Yee; Wong, Guat Jah; Mutha, Naresh V.R.; Tan, Shi Yang; Choo, Siew Woh

    2014-01-01

    Fusobacterium are anaerobic gram-negative bacteria that have been associated with a wide spectrum of human infections and diseases. As the biology of Fusobacterium is still not well understood, comparative genomic analysis on members of this species will provide further insights on their taxonomy, phylogeny, pathogenicity and other information that may contribute to better management of infections and diseases. To facilitate the ongoing genomic research on Fusobacterium, a specialized database with easy-to-use analysis tools is necessary. Here we present FusoBase, an online database providing access to genome-wide annotated sequences of Fusobacterium strains as well as bioinformatics tools, to support the expanding scientific community. Using our custom-developed Pairwise Genome Comparison tool, we demonstrate how differences between two user-defined genomes and how insertion of putative prophages can be identified. In addition, Pathogenomics Profiling Tool is capable of clustering predicted genes across Fusobacterium strains and visualizing the results in the form of a heat map with dendrogram. Database URL: http://fusobacterium.um.edu.my. PMID:25149689

  5. Comparative mitochondrial genomics within and among species of killifish

    PubMed Central

    Whitehead, Andrew

    2009-01-01

    Background This study was motivated by the observation of unusual mitochondrial haplotype distributions and associated physiological differences between populations of the killifish Fundulus heteroclitus distributed along the Atlantic coast of North America. A distinct "northern" haplotype is fixed in all populations north of New Jersey, and does not appear south of New Jersey except in extreme upper-estuary fresh water habitats, and northern individuals are known to be more tolerant of hyposmotic conditions than southern individuals. Complete mitochondrial genomes were sequenced from individuals from northern coastal, southern coastal, and fresh water populations (and from out-groups). Comparative genomics approaches were used to test multiple evolutionary hypotheses proposed to explain among-population genome variation including directional selection and hybridization. Results Structure and organization of the Fundulus mitochondrial genome is typical of animals, yet subtle differences in substitution patterns exist among populations. No signals of directional selection or hybridization were detected. Mitochondrial genes evolve at variable rates, but all genes exhibit very low dN/dS ratios across all lineages, and the southern population harbors more synonymous polymorphism than other populations. Conclusion Evolution of mitochondrial genomes within Fundulus is primarily governed by interaction between strong purifying selection and demographic influences, including larger historical population size in the south. Though directional selection and hybridization hypotheses were not supported, adaptive processes may indirectly contribute to partitioning of variation between populations. PMID:19144111

  6. Lactobacillus paracasei comparative genomics: towards species pan-genome definition and exploitation of diversity.

    PubMed

    Smokvina, Tamara; Wels, Michiel; Polka, Justyna; Chervaux, Christian; Brisse, Sylvain; Boekhorst, Jos; van Hylckama Vlieg, Johan E T; Siezen, Roland J

    2013-01-01

    Lactobacillus paracasei is a member of the normal human and animal gut microbiota and is used extensively in the food industry in starter cultures for dairy products or as probiotics. With the development of low-cost, high-throughput sequencing techniques it has become feasible to sequence many different strains of one species and to determine its "pan-genome". We have sequenced the genomes of 34 different L. paracasei strains, and performed a comparative genomics analysis. We analysed genome synteny and content, focussing on the pan-genome, core genome and variable genome. Each genome was shown to contain around 2800-3100 protein-coding genes, and comparative analysis identified over 4200 ortholog groups that comprise the pan-genome of this species, of which about 1800 ortholog groups make up the conserved core. Several factors previously associated with host-microbe interactions such as pili, cell-envelope proteinase, hydrolases p40 and p75 or the capacity to produce short branched-chain fatty acids (bkd operon) are part of the L. paracasei core genome present in all analysed strains. The variome consists mainly of hypothetical proteins, phages, plasmids, transposon/conjugative elements, and known functions such as sugar metabolism, cell-surface proteins, transporters, CRISPR-associated proteins, and EPS biosynthesis proteins. An enormous variety and variability of sugar utilization gene cassettes were identified, with each strain harbouring between 25-53 cassettes, reflecting the high adaptability of L. paracasei to different niches. A phylogenomic tree was constructed based on total genome contents, and together with an analysis of horizontal gene transfer events we conclude that evolution of these L. paracasei strains is complex and not always related to niche adaptation. The results of this genome content comparison was used, together with high-throughput growth experiments on various carbohydrates, to perform gene-trait matching analysis, in order to link

  7. Improvement of whole-genome annotation of cereals through comparative analyses

    PubMed Central

    Zhu, Wei; Buell, C. Robin

    2007-01-01

    Rice is an important model species for the Poaceae and other monocotyledonous plants. With the availability of a near-complete, finished, and annotated rice genome, we performed genome level comparisons between rice and all plant species in which large genomic or transcriptomic data sets are available to determine the utility of cross-species sequence for structural and functional annotation of the rice genome. Through comparative analyses with four plant genome sequence data sets and transcript assemblies from 185 plant species, we were able to confirm and improve the structural annotation of the rice genome. Support for 38,109 (89.3%) of the total 42,653 nontransposable element-related genes in the rice genome in the form of a rice expressed sequence tag, full-length cDNA, or plant homolog from our comparative analyses could be found. Although the majority of the putative homologs were obtained from Poaceae species, putative homologs were identified in dicotyledonous angiosperms, gymnosperms, and other plants such as algae, moss, and fern. A set of rice genes (7669) lacking a putative homolog was identified which may be lineage-specific genes that evolved after speciation and have a role in species diversity. Improvements to the current rice gene structural annotation could be identified from our comparative alignments and we were able to identify 487 genes which were mostly likely missed in the current rice genome annotation and another 500 genes for structural annotation review. We were able to demonstrate the utility of cross-species comparative alignments in the identification of noncoding sequences and in confirmation of gene nesting in rice. PMID:17284677

  8. MicroScope: a platform for microbial genome annotation and comparative genomics

    PubMed Central

    Vallenet, D.; Engelen, S.; Mornico, D.; Cruveiller, S.; Fleury, L.; Lajus, A.; Rouy, Z.; Roche, D.; Salvignol, G.; Scarpelli, C.; Médigue, C.

    2009-01-01

    The initial outcome of genome sequencing is the creation of long text strings written in a four letter alphabet. The role of in silico sequence analysis is to assist biologists in the act of associating biological knowledge with these sequences, allowing investigators to make inferences and predictions that can be tested experimentally. A wide variety of software is available to the scientific community, and can be used to identify genomic objects, before predicting their biological functions. However, only a limited number of biologically interesting features can be revealed from an isolated sequence. Comparative genomics tools, on the other hand, by bringing together the information contained in numerous genomes simultaneously, allow annotators to make inferences based on the idea that evolution and natural selection are central to the definition of all biological processes. We have developed the MicroScope platform in order to offer a web-based framework for the systematic and efficient revision of microbial genome annotation and comparative analysis (http://www.genoscope.cns.fr/agc/microscope). Starting with the description of the flow chart of the annotation processes implemented in the MicroScope pipeline, and the development of traditional and novel microbial annotation and comparative analysis tools, this article emphasizes the essential role of expert annotation as a complement of automatic annotation. Several examples illustrate the use of implemented tools for the review and curation of annotations of both new and publicly available microbial genomes within MicroScope’s rich integrated genome framework. The platform is used as a viewer in order to browse updated annotation information of available microbial genomes (more than 440 organisms to date), and in the context of new annotation projects (117 bacterial genomes). The human expertise gathered in the MicroScope database (about 280,000 independent annotations) contributes to improve the quality of

  9. Whole genome comparative studies between chicken and turkey and their implications for avian genome evolution

    PubMed Central

    Griffin, Darren K; Robertson, Lindsay B; Tempest, Helen G; Vignal, Alain; Fillon, Valérie; Crooijmans, Richard PMA; Groenen, Martien AM; Deryusheva, Svetlana; Gaginskaya, Elena; Carré, Wilfrid; Waddington, David; Talbot, Richard; Völker, Martin; Masabanda, Julio S; Burt, Dave W

    2008-01-01

    Background Comparative genomics is a powerful means of establishing inter-specific relationships between gene function/location and allows insight into genomic rearrangements, conservation and evolutionary phylogeny. The availability of the complete sequence of the chicken genome has initiated the development of detailed genomic information in other birds including turkey, an agriculturally important species where mapping has hitherto focused on linkage with limited physical information. No molecular study has yet examined conservation of avian microchromosomes, nor differences in copy number variants (CNVs) between birds. Results We present a detailed comparative cytogenetic map between chicken and turkey based on reciprocal chromosome painting and mapping of 338 chicken BACs to turkey metaphases. Two inter-chromosomal changes (both involving centromeres) and three pericentric inversions have been identified between chicken and turkey; and array CGH identified 16 inter-specific CNVs. Conclusion This is the first study to combine the modalities of zoo-FISH and array CGH between different avian species. The first insight into the conservation of microchromosomes, the first comparative cytogenetic map of any bird and the first appraisal of CNVs between birds is provided. Results suggest that avian genomes have remained relatively stable during evolution compared to mammalian equivalents. PMID:18410676

  10. Sequencing and Comparative Genome Analysis of Two Pathogenic Streptococcus gallolyticus Subspecies: Genome Plasticity, Adaptation and Virulence

    PubMed Central

    Teng, Yu-Ting; Wu, Hui-Lun; Liu, Yen-Ming; Wu, Keh-Ming; Chang, Chuan-Hsiung; Hsu, Ming-Ta

    2011-01-01

    Streptococcus gallolyticus infections in humans are often associated with bacteremia, infective endocarditis and colon cancers. The disease manifestations are different depending on the subspecies of S. gallolyticus causing the infection. Here, we present the complete genomes of S. gallolyticus ATCC 43143 (biotype I) and S. pasteurianus ATCC 43144 (biotype II.2). The genomic differences between the two biotypes were characterized with comparative genomic analyses. The chromosome of ATCC 43143 and ATCC 43144 are 2,36 and 2,10 Mb in length and encode 2246 and 1869 CDS respectively. The organization and genomic contents of both genomes were most similar to the recently published S. gallolyticus UCN34, where 2073 (92%) and 1607 (86%) of the ATCC 43143 and ATCC 43144 CDS were conserved in UCN34 respectively. There are around 600 CDS conserved in all Streptococcus genomes, indicating the Streptococcus genus has a small core-genome (constitute around 30% of total CDS) and substantial evolutionary plasticity. We identified eight and five regions of genome plasticity in ATCC 43143 and ATCC 43144 respectively. Within these regions, several proteins were recognized to contribute to the fitness and virulence of each of the two subspecies. We have also predicted putative cell-surface associated proteins that could play a role in adherence to host tissues, leading to persistent infections causing sub-acute and chronic diseases in humans. This study showed evidence that the S. gallolyticus still possesses genes making it suitable in a rumen environment, whereas the ability for S. pasteurianus to live in rumen is reduced. The genome heterogeneity and genetic diversity among the two biotypes, especially membrane and lipoproteins, most likely contribute to the differences in the pathogenesis of the two S. gallolyticus biotypes and the type of disease an infected patient eventually develops. PMID:21633709

  11. Trypanosomatid comparative genomics: Contributions to the study of parasite biology and different parasitic diseases

    PubMed Central

    Teixeira, Santuza M.; de Paiva, Rita Márcia Cardoso; Kangussu-Marcolino, Monica M.; DaRocha, Wanderson D.

    2012-01-01

    In 2005, draft sequences of the genomes of Trypanosoma brucei, Trypanosoma cruzi and Leishmania major, also known as the Tri-Tryp genomes, were published. These protozoan parasites are the causative agents of three distinct insect-borne diseases, namely sleeping sickness, Chagas disease and leishmaniasis, all with a worldwide distribution. Despite the large estimated evolutionary distance among them, a conserved core of ~6,200 trypanosomatid genes was found among the Tri-Tryp genomes. Extensive analysis of these genomic sequences has greatly increased our understanding of the biology of these parasites and their host-parasite interactions. In this article, we review the recent advances in the comparative genomics of these three species. This analysis also includes data on additional sequences derived from other trypanosmatid species, as well as recent data on gene expression and functional genomics. In addition to facilitating the identification of key parasite molecules that may provide a better understanding of these complex diseases, genome studies offer a rich source of new information that can be used to define potential new drug targets and vaccine candidates for controlling these parasitic infections. PMID:22481868

  12. Comparative genomics reveals diversity among xanthomonads infecting tomato and pepper

    PubMed Central

    2011-01-01

    Background Bacterial spot of tomato and pepper is caused by four Xanthomonas species and is a major plant disease in warm humid climates. The four species are distinct from each other based on physiological and molecular characteristics. The genome sequence of strain 85-10, a member of one of the species, Xanthomonas euvesicatoria (Xcv) has been previously reported. To determine the relationship of the four species at the genome level and to investigate the molecular basis of their virulence and differing host ranges, draft genomic sequences of members of the other three species were determined and compared to strain 85-10. Results We sequenced the genomes of X. vesicatoria (Xv) strain 1111 (ATCC 35937), X. perforans (Xp) strain 91-118 and X. gardneri (Xg) strain 101 (ATCC 19865). The genomes were compared with each other and with the previously sequenced Xcv strain 85-10. In addition, the molecular features were predicted that may be required for pathogenicity including the type III secretion apparatus, type III effectors, other secretion systems, quorum sensing systems, adhesins, extracellular polysaccharide, and lipopolysaccharide determinants. Several novel type III effectors from Xg strain 101 and Xv strain 1111 genomes were computationally identified and their translocation was validated using a reporter gene assay. A homolog to Ax21, the elicitor of XA21-mediated resistance in rice, and a functional Ax21 sulfation system were identified in Xcv. Genes encoding proteins with functions mediated by type II and type IV secretion systems have also been compared, including enzymes involved in cell wall deconstruction, as contributors to pathogenicity. Conclusions Comparative genomic analyses revealed considerable diversity among bacterial spot pathogens, providing new insights into differences and similarities that may explain the diverse nature of these strains. Genes specific to pepper pathogens, such as the O-antigen of the lipopolysaccharide cluster, and genes

  13. Complexity, Post-genomic Biology and Gene Expression Programs

    NASA Astrophysics Data System (ADS)

    Williams, Rohan B. H.; Luo, Oscar Junhong

    Gene expression represents the fundamental phenomenon by which information encoded in a genome is utilised for the overall biological objectives of the organism. Understanding this level of information transfer is therefore essential for dissecting the mechanistic basis of form and function of organisms. We survey recent developments in the methodology of the life sciences that is relevant for understanding the organisation and function of the genome and review our current understanding of the regulation of gene expression, and finally, outline some new approaches that may be useful in understanding the organisation of gene regulatory systems.

  14. Comparative genomics of transcriptional regulation of methionine metabolism in Proteobacteria.

    PubMed

    Leyn, Semen A; Suvorova, Inna A; Kholina, Tatiana D; Sherstneva, Sofia S; Novichkov, Pavel S; Gelfand, Mikhail S; Rodionov, Dmitry A

    2014-01-01

    Methionine metabolism and uptake genes in Proteobacteria are controlled by a variety of RNA and DNA regulatory systems. We have applied comparative genomics to reconstruct regulons for three known transcription factors, MetJ, MetR, and SahR, and three known riboswitch motifs, SAH, SAM-SAH, and SAM_alpha, in ∼ 200 genomes from 22 taxonomic groups of Proteobacteria. We also identified two novel regulons: a SahR-like transcription factor SamR controlling various methionine biosynthesis genes in the Xanthomonadales group, and a potential RNA regulatory element with terminator-antiterminator mechanism controlling the metX or metZ genes in beta-proteobacteria. For each analyzed regulator we identified the core, taxon-specific and genome-specific regulon members. By analyzing the distribution of these regulators in bacterial genomes and by comparing their regulon contents we elucidated possible evolutionary scenarios for the regulation of the methionine metabolism genes in Proteobacteria. PMID:25411846

  15. Comparative genomics of transcriptional regulation of methionine metabolism in proteobacteria

    DOE PAGESBeta

    Leyn, Semen A.; Suvorova, Inna A.; Kholina, Tatiana D.; Sherstneva, Sofia S.; Novichkov, Pavel S.; Gelfand, Mikhail S.; Rodionov, Dmitry A.; Kuipers, Oscar P.

    2014-11-20

    Methionine metabolism and uptake genes in Proteobacteria are controlled by a variety of RNA and DNA regulatory systems. We have applied comparative genomics to reconstruct regulons for three known transcription factors, MetJ, MetR, and SahR, and three known riboswitch motifs, SAH, SAM-SAH, and SAM_alpha, in ~200 genomes from 22 taxonomic groups of Proteobacteria. We also identified two novel regulons: a SahR-like transcription factor SamR controlling various methionine biosynthesis genes in the Xanthomonadales group, and a potential RNA regulatory element with terminator-antiterminator mechanism controlling the metX or metZ genes in beta-proteobacteria. For each analyzed regulator we identified the core, taxon-specific andmore » genome-specific regulon members. By analyzing the distribution of these regulators in bacterial genomes and by comparing their regulon contents we elucidated possible evolutionary scenarios for the regulation of the methionine metabolism genes in Proteobacteria.« less

  16. Comparative genomics of transcriptional regulation of methionine metabolism in proteobacteria

    SciTech Connect

    Leyn, Semen A.; Suvorova, Inna A.; Kholina, Tatiana D.; Sherstneva, Sofia S.; Novichkov, Pavel S.; Gelfand, Mikhail S.; Rodionov, Dmitry A.; Kuipers, Oscar P.

    2014-11-20

    Methionine metabolism and uptake genes in Proteobacteria are controlled by a variety of RNA and DNA regulatory systems. We have applied comparative genomics to reconstruct regulons for three known transcription factors, MetJ, MetR, and SahR, and three known riboswitch motifs, SAH, SAM-SAH, and SAM_alpha, in ~200 genomes from 22 taxonomic groups of Proteobacteria. We also identified two novel regulons: a SahR-like transcription factor SamR controlling various methionine biosynthesis genes in the Xanthomonadales group, and a potential RNA regulatory element with terminator-antiterminator mechanism controlling the metX or metZ genes in beta-proteobacteria. For each analyzed regulator we identified the core, taxon-specific and genome-specific regulon members. By analyzing the distribution of these regulators in bacterial genomes and by comparing their regulon contents we elucidated possible evolutionary scenarios for the regulation of the methionine metabolism genes in Proteobacteria.

  17. Comparative Genomics of Transcriptional Regulation of Methionine Metabolism in Proteobacteria

    PubMed Central

    Leyn, Semen A.; Suvorova, Inna A.; Kholina, Tatiana D.; Sherstneva, Sofia S.; Novichkov, Pavel S.; Gelfand, Mikhail S.; Rodionov, Dmitry A.

    2014-01-01

    Methionine metabolism and uptake genes in Proteobacteria are controlled by a variety of RNA and DNA regulatory systems. We have applied comparative genomics to reconstruct regulons for three known transcription factors, MetJ, MetR, and SahR, and three known riboswitch motifs, SAH, SAM-SAH, and SAM_alpha, in ∼200 genomes from 22 taxonomic groups of Proteobacteria. We also identified two novel regulons: a SahR-like transcription factor SamR controlling various methionine biosynthesis genes in the Xanthomonadales group, and a potential RNA regulatory element with terminator-antiterminator mechanism controlling the metX or metZ genes in beta-proteobacteria. For each analyzed regulator we identified the core, taxon-specific and genome-specific regulon members. By analyzing the distribution of these regulators in bacterial genomes and by comparing their regulon contents we elucidated possible evolutionary scenarios for the regulation of the methionine metabolism genes in Proteobacteria. PMID:25411846

  18. Comparative genomics and transcriptomics of trait-gene association

    PubMed Central

    2012-01-01

    Background The Order Rickettsiales includes important tick-borne pathogens, from Rickettsia rickettsii, which causes Rocky Mountain spotted fever, to Anaplasma marginale, the most prevalent vector-borne pathogen of cattle. Although most pathogens in this Order are transmitted by arthropod vectors, little is known about the microbial determinants of transmission. A. marginale provides unique tools for studying the determinants of transmission, with multiple strain sequences available that display distinct and reproducible transmission phenotypes. The closed core A. marginale genome suggests that any phenotypic differences are due to single nucleotide polymorphisms (SNPs). We combined DNA/RNA comparative genomic approaches using strains with different tick transmission phenotypes and identified genes that segregate with transmissibility. Results Comparison of seven strains with different transmission phenotypes generated a list of SNPs affecting 18 genes and nine promoters. Transcriptional analysis found two candidate genes downstream from promoter SNPs that were differentially transcribed. To corroborate the comparative genomics approach we used three RNA-seq platforms to analyze the transcriptomes from two A. marginale strains with different transmission phenotypes. RNA-seq analysis confirmed the comparative genomics data and found 10 additional genes whose transcription between strains with distinct transmission efficiencies was significantly different. Six regions of the genome that contained no annotation were found to be transcriptionally active, and two of these newly identified transcripts were differentially transcribed. Conclusions This approach identified 30 genes and two novel transcripts potentially involved in tick transmission. We describe the transcriptome of an obligate intracellular bacterium in depth, while employing massive parallel sequencing to dissect an important trait in bacterial pathogenesis. PMID:23181781

  19. Comparative genomics of Enterococcus faecalis from healthy Norwegian infants

    PubMed Central

    Solheim, Margrete; Aakra, Ågot; Snipen, Lars G; Brede, Dag A; Nes, Ingolf F

    2009-01-01

    Background Enterococcus faecalis, traditionally considered a harmless commensal of the intestinal tract, is now ranked among the leading causes of nosocomial infections. In an attempt to gain insight into the genetic make-up of commensal E. faecalis, we have studied genomic variation in a collection of community-derived E. faecalis isolated from the feces of Norwegian infants. Results The E. faecalis isolates were first sequence typed by multilocus sequence typing (MLST) and characterized with respect to antibiotic resistance and properties associated with virulence. A subset of the isolates was compared to the vancomycin resistant strain E. faecalis V583 (V583) by whole genome microarray comparison (comparative genomic hybridization (CGH)). Several of the putative enterococcal virulence factors were found to be highly prevalent among the commensal baby isolates. The genomic variation as observed by CGH was less between isolates displaying the same MLST sequence type than between isolates belonging to different evolutionary lineages. Conclusion The variations in gene content observed among the investigated commensal E. faecalis is comparable to the genetic variation previously reported among strains of various origins thought to be representative of the major E. faecalis lineages. Previous MLST analysis of E. faecalis have identified so-called high-risk enterococcal clonal complexes (HiRECC), defined as genetically distinct subpopulations, epidemiologically associated with enterococcal infections. The observed correlation between CGH and MLST presented here, may offer a method for the identification of lineage-specific genes, and may therefore add clues on how to distinguish pathogenic from commensal E. faecalis. In this work, information on the core genome of E. faecalis is also substantially extended. PMID:19393078

  20. Lactobacillus paracasei Comparative Genomics: Towards Species Pan-Genome Definition and Exploitation of Diversity

    PubMed Central

    Smokvina, Tamara; Wels, Michiel; Polka, Justyna; Chervaux, Christian; Brisse, Sylvain; Boekhorst, Jos; Vlieg, Johan E. T. van Hylckama; Siezen, Roland J.

    2013-01-01

    Lactobacillus paracasei is a member of the normal human and animal gut microbiota and is used extensively in the food industry in starter cultures for dairy products or as probiotics. With the development of low-cost, high-throughput sequencing techniques it has become feasible to sequence many different strains of one species and to determine its “pan-genome”. We have sequenced the genomes of 34 different L. paracasei strains, and performed a comparative genomics analysis. We analysed genome synteny and content, focussing on the pan-genome, core genome and variable genome. Each genome was shown to contain around 2800–3100 protein-coding genes, and comparative analysis identified over 4200 ortholog groups that comprise the pan-genome of this species, of which about 1800 ortholog groups make up the conserved core. Several factors previously associated with host-microbe interactions such as pili, cell-envelope proteinase, hydrolases p40 and p75 or the capacity to produce short branched-chain fatty acids (bkd operon) are part of the L. paracasei core genome present in all analysed strains. The variome consists mainly of hypothetical proteins, phages, plasmids, transposon/conjugative elements, and known functions such as sugar metabolism, cell-surface proteins, transporters, CRISPR-associated proteins, and EPS biosynthesis proteins. An enormous variety and variability of sugar utilization gene cassettes were identified, with each strain harbouring between 25–53 cassettes, reflecting the high adaptability of L. paracasei to different niches. A phylogenomic tree was constructed based on total genome contents, and together with an analysis of horizontal gene transfer events we conclude that evolution of these L. paracasei strains is complex and not always related to niche adaptation. The results of this genome content comparison was used, together with high-throughput growth experiments on various carbohydrates, to perform gene-trait matching analysis, in order to

  1. MicrobesOnline: an integrated portal for comparative and functional genomics

    SciTech Connect

    Dehal, Paramvir S.; Joachimiak, Marcin P.; Price, Morgan N.; Bates, John T.; Baumohl, Jason K.; Chivian, Dylan; Friedland, Greg D.; Huang, Katherine H.; Keller, Keith; Novichkov, Pavel S.; Dubchak, Inna L.; Alm, Eric J.; Arkin, Adam P.

    2009-09-17

    Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi and thousands of expression microarrays from diverse organisms ranging from model organisms such as Escherichia coli and Saccharomyces cerevisiae to environmental microbes such as Desulfovibrio vulgaris and Shewanella oneidensis. To assist in annotating genes and in reconstructing their evolutionary history, MicrobesOnline includes a comparative genome browser based on phylogenetic trees for every gene family as well as a species tree. To identify co-regulated genes, MicrobesOnline can search for genes based on their expression profile, and provides tools for identifying regulatory motifs and seeing if they are conserved. MicrobesOnline also includes fast phylogenetic profile searches, comparative views of metabolic pathways, operon predictions, a workbench for sequence analysis and integration with RegTransBase and other microbial genome resources. The next update of MicrobesOnline will contain significant new functionality, including comparative analysis of metagenomic sequence data. Programmatic access to the database, along with source code and documentation, is available at http://microbesonline.org/programmers.html.

  2. MicrobesOnline: an integrated portal for comparative and functional genomics

    SciTech Connect

    Dehal, Paramvir; Joachimiak, Marcin; Price, Morgan; Bates, John; Baumohl, Jason; Chivian, Dylan; Friedland, Greg; Huang, Kathleen; Keller, Keith; Novichkov, Pavel; Dubchak, Inna; Alm, Eric; Arkin, Adam

    2011-07-14

    Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi and thousands of expression microarrays from diverse organisms ranging from model organisms such as Escherichia coli and Saccharomyces cerevisiae to environmental microbes such as Desulfovibrio vulgaris and Shewanella oneidensis. To assist in annotating genes and in reconstructing their evolutionary history, MicrobesOnline includes a comparative genome browser based on phylogenetic trees for every gene family as well as a species tree. To identify co-regulated genes, MicrobesOnline can search for genes based on their expression profile, and provides tools for identifying regulatory motifs and seeing if they are conserved. MicrobesOnline also includes fast phylogenetic profile searches, comparative views of metabolic pathways, operon predictions, a workbench for sequence analysis and integration with RegTransBase and other microbial genome resources. The next update of MicrobesOnline will contain significant new functionality, including comparative analysis of metagenomic sequence data. Programmatic access to the database, along with source code and documentation, is available at http://microbesonline.org/programmers.html.

  3. MicrobesOnline: an integrated portal for comparative and functional genomics.

    PubMed

    Dehal, Paramvir S; Joachimiak, Marcin P; Price, Morgan N; Bates, John T; Baumohl, Jason K; Chivian, Dylan; Friedland, Greg D; Huang, Katherine H; Keller, Keith; Novichkov, Pavel S; Dubchak, Inna L; Alm, Eric J; Arkin, Adam P

    2010-01-01

    Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi and thousands of expression microarrays from diverse organisms ranging from model organisms such as Escherichia coli and Saccharomyces cerevisiae to environmental microbes such as Desulfovibrio vulgaris and Shewanella oneidensis. To assist in annotating genes and in reconstructing their evolutionary history, MicrobesOnline includes a comparative genome browser based on phylogenetic trees for every gene family as well as a species tree. To identify co-regulated genes, MicrobesOnline can search for genes based on their expression profile, and provides tools for identifying regulatory motifs and seeing if they are conserved. MicrobesOnline also includes fast phylogenetic profile searches, comparative views of metabolic pathways, operon predictions, a workbench for sequence analysis and integration with RegTransBase and other microbial genome resources. The next update of MicrobesOnline will contain significant new functionality, including comparative analysis of metagenomic sequence data. Programmatic access to the database, along with source code and documentation, is available at http://microbesonline.org/programmers.html. PMID:19906701

  4. Comparative Genomics of Multidrug Resistance in Acinetobacter baumannii

    PubMed Central

    2006-01-01

    Acinetobacter baumannii is a species of nonfermentative gram-negative bacteria commonly found in water and soil. This organism was susceptible to most antibiotics in the 1970s. It has now become a major cause of hospital-acquired infections worldwide due to its remarkable propensity to rapidly acquire resistance determinants to a wide range of antibacterial agents. Here we use a comparative genomic approach to identify the complete repertoire of resistance genes exhibited by the multidrug-resistant A. baumannii strain AYE, which is epidemic in France, as well as to investigate the mechanisms of their acquisition by comparison with the fully susceptible A. baumannii strain SDF, which is associated with human body lice. The assembly of the whole shotgun genome sequences of the strains AYE and SDF gave an estimated size of 3.9 and 3.2 Mb, respectively. A. baumannii strain AYE exhibits an 86-kb genomic region termed a resistance island—the largest identified to date—in which 45 resistance genes are clustered. At the homologous location, the SDF strain exhibits a 20 kb-genomic island flanked by transposases but devoid of resistance markers. Such a switching genomic structure might be a hotspot that could explain the rapid acquisition of resistance markers under antimicrobial pressure. Sequence similarity and phylogenetic analyses confirm that most of the resistance genes found in the A. baumannii strain AYE have been recently acquired from bacteria of the genera Pseudomonas, Salmonella, or Escherichia. This study also resulted in the discovery of 19 new putative resistance genes. Whole-genome sequencing appears to be a fast and efficient approach to the exhaustive identification of resistance genes in epidemic infectious agents of clinical significance. PMID:16415984

  5. High variability of genomic instability and gene expression profiling in different HeLa clones

    PubMed Central

    Frattini, Annalisa; Fabbri, Marco; Valli, Roberto; De Paoli, Elena; Montalbano, Giuseppe; Gribaldo, Laura; Pasquali, Francesco; Maserati, Emanuela

    2015-01-01

    The HeLa cell line is one of the most popular cell lines in biomedical research, despite its well-known chromosomal instability. We compared the genomic and transcriptomic profiles of 4 different HeLa batches and showed that the gain and loss of genomic material varies widely between batches, drastically affecting basal gene expression. Moreover, different pathways were activated in response to a hypoxic stimulus. Our study emphasizes the large genomic and transcriptomic variability among different batches, to the point that the same experiment performed with different batches can lead to distinct conclusions and irreproducible results. The HeLa cell line is thought to be a unique cell line but it is clear that substantial differences between the primary tumour and the human genome exist and that an indeterminate number of HeLa cell lines may exist, each with a unique genomic profile. PMID:26483214

  6. High variability of genomic instability and gene expression profiling in different HeLa clones.

    PubMed

    Frattini, Annalisa; Fabbri, Marco; Valli, Roberto; De Paoli, Elena; Montalbano, Giuseppe; Gribaldo, Laura; Pasquali, Francesco; Maserati, Emanuela

    2015-01-01

    The HeLa cell line is one of the most popular cell lines in biomedical research, despite its well-known chromosomal instability. We compared the genomic and transcriptomic profiles of 4 different HeLa batches and showed that the gain and loss of genomic material varies widely between batches, drastically affecting basal gene expression. Moreover, different pathways were activated in response to a hypoxic stimulus. Our study emphasizes the large genomic and transcriptomic variability among different batches, to the point that the same experiment performed with different batches can lead to distinct conclusions and irreproducible results. The HeLa cell line is thought to be a unique cell line but it is clear that substantial differences between the primary tumour and the human genome exist and that an indeterminate number of HeLa cell lines may exist, each with a unique genomic profile. PMID:26483214

  7. A Web-Based Comparative Genomics Tutorial for Investigating Microbial Genomes

    PubMed Central

    STRONG, MICHAEL; CASCIO, DUILIO; EISENBERG, DAVID

    2004-01-01

    As the number of completely sequenced microbial genomes continues to rise at an impressive rate, it is important to prepare students with the skills necessary to investigate microorganisms at the genomic level. As a part of the core curriculum for first-year graduate students in the biological sciences, we have implemented a web-based tutorial to introduce students to the fields of comparative and functional genomics. The tutorial focuses on recent computational methods for identifying functionally linked genes and proteins on a genome-wide scale and was used to introduce students to the Rosetta Stone, Phylogenetic Profile, conserved Gene Neighbor, and Operon computational methods. Students learned to use a number of publicly available web servers and databases to identify functionally linked genes in the Escherichia coli genome, with emphasis on genome organization and operon structure. The overall effectiveness of the tutorial was assessed based on student evaluations and homework assignments. The tutorial is available to other educators at http://www.doe-mbi.ucla.edu/~strong/m253.php. PMID:23653555

  8. Genomic copy number alterations in 33 malignant peritoneal mesothelioma analyzed by comparative genomic hybridization array.

    PubMed

    Chirac, Pierre; Maillet, Denis; Leprêtre, Frédéric; Isaac, Sylvie; Glehen, Olivier; Figeac, Martin; Villeneuve, Laurent; Péron, Julien; Gibson, Fernando; Galateau-Sallé, Françoise; Gilly, François-Noël; Brevet, Marie

    2016-09-01

    Malignant peritoneal mesotheliomas (MPM) are rare, accounting for approximately 8% of cases of mesothelioma in France. We performed comparative genomic hybridization (CGH) on frozen MPM samples using the Agilent Human Genome CGH 180 K array. Samples were taken from a total of 33 French patients, comprising 20 men and 13 women with a mean (range) age of 58.4 (17-76) years. Asbestos exposure was reported in 8 patients (24.2%). Median (range) overall survival (OS) was 39 (0-119) months. CGH analysis demonstrated the presence of chromosomal instability in patients with MPM, with a genomic pattern that was similar to that described for pleural mesothelioma, including the loss of chromosomal regions 3p21, 9p21, and 22q12. In addition, novel genomic copy number alterations were identified, including the 15q26.2 region and the 8p11.22 region. Median OS was associated with a low peritoneal cancer index (P=.011), epithelioid subtype (P=.038), and a low number of genomic aberrations (P=.015), all of which constitute good prognostic factors for MPM. Our results provide new insights into the genetic and genomic background of MPM. Although pleural and peritoneal mesotheliomas have different risk factors, different therapeutics, and different prognosis; these data provide support to combine pleural and peritoneal mesothelioma in same clinical assays. PMID:27184482

  9. The Whole Genome Assembly and Comparative Genomic Research of Thellungiella parvula (Extremophile Crucifer) Mitochondrion.

    PubMed

    Wang, Xuelin; Bi, Changwei; Xu, Yiqing; Wei, Suyun; Dai, Xiaogang; Yin, Tongming; Ye, Ning

    2016-01-01

    The complete nucleotide sequences of the mitochondrial (mt) genome of an extremophile species Thellungiella parvula (T. parvula) have been determined with the lengths of 255,773 bp. T. parvula mt genome is a circular sequence and contains 32 protein-coding genes, 19 tRNA genes, and three ribosomal RNA genes with a 11.5% coding sequence. The base composition of 27.5% A, 27.5% T, 22.7% C, and 22.3% G in descending order shows a slight bias of 55% AT. Fifty-three repeats were identified in the mitochondrial genome of T. parvula, including 24 direct repeats, 28 tandem repeats (TRs), and one palindromic repeat. Furthermore, a total of 199 perfect microsatellites have been mined with a high A/T content (83.1%) through simple sequence repeat (SSR) analysis and they were distributed unevenly within this mitochondrial genome. We also analyzed other plant mitochondrial genomes' evolution in general, providing clues for the understanding of the evolution of organelles genomes in plants. Comparing with other Brassicaceae species, T. parvula is related to Arabidopsis thaliana whose characters of low temperature resistance have been well documented. This study will provide important genetic tools for other Brassicaceae species research and improve yields of economically important plants. PMID:27148547

  10. The Whole Genome Assembly and Comparative Genomic Research of Thellungiella parvula (Extremophile Crucifer) Mitochondrion

    PubMed Central

    Wang, Xuelin; Bi, Changwei; Xu, Yiqing; Wei, Suyun; Dai, Xiaogang; Yin, Tongming; Ye, Ning

    2016-01-01

    The complete nucleotide sequences of the mitochondrial (mt) genome of an extremophile species Thellungiella parvula (T. parvula) have been determined with the lengths of 255,773 bp. T. parvula mt genome is a circular sequence and contains 32 protein-coding genes, 19 tRNA genes, and three ribosomal RNA genes with a 11.5% coding sequence. The base composition of 27.5% A, 27.5% T, 22.7% C, and 22.3% G in descending order shows a slight bias of 55% AT. Fifty-three repeats were identified in the mitochondrial genome of T. parvula, including 24 direct repeats, 28 tandem repeats (TRs), and one palindromic repeat. Furthermore, a total of 199 perfect microsatellites have been mined with a high A/T content (83.1%) through simple sequence repeat (SSR) analysis and they were distributed unevenly within this mitochondrial genome. We also analyzed other plant mitochondrial genomes' evolution in general, providing clues for the understanding of the evolution of organelles genomes in plants. Comparing with other Brassicaceae species, T. parvula is related to Arabidopsis thaliana whose characters of low temperature resistance have been well documented. This study will provide important genetic tools for other Brassicaceae species research and improve yields of economically important plants. PMID:27148547

  11. Reading Genomes and Controlling Gene Expression

    NASA Astrophysics Data System (ADS)

    Libchaber, Albert

    2000-03-01

    Molecular recognition of DNA sequences is achieved by DNA hybridization of complementary sequences. We present various scenarios for optimization, leading to microarrays and global measurement. Gene expression can be controlled using gene constructs immobilized on a template with micron scale temperature heaters. We will discuss and present results on protein microarrays.

  12. Methods for Genome-Wide Analysis of Gene Expression Changes in Polyploids

    PubMed Central

    Wang, Jianlin; Lee, Jinsuk J.; Tian, Lu; Lee, Hyeon-Se; Chen, Meng; Rao, Sheetal; Wei, Edward N.; Doerge, R. W.; Comai, Luca; Jeffrey Chen, Z.

    2007-01-01

    Polyploidy is an evolutionary innovation, providing extra sets of genetic material for phenotypic variation and adaptation. It is predicted that changes of gene expression by genetic and epigenetic mechanisms are responsible for novel variation in nascent and established polyploids (Liu and Wendel, 2002; Osborn et al., 2003; Pikaard, 2001). Studying gene expression changes in allopolyploids is more complicated than in autopolyploids, because allopolyploids contain more than two sets of genomes originating from divergent, but related, species. Here we describe two methods that are applicable to the genome-wide analysis of gene expression differences resulting from genome duplication in autopolyploids or interactions between homoeologous genomes in allopolyploids. First, we describe an amplified fragment length polymorphism (AFLP)–complementary DNA (cDNA) display method that allows the discrimination of homoeologous loci based on restriction polymorphisms between the progenitors. Second, we describe microarray analyses that can be used to compare gene expression differences between the allopolyploids and respective progenitors using appropriate experimental design and statistical analysis. We demonstrate the utility of these two complementary methods and discuss the pros and cons of using the methods to analyze gene expression changes in autopolyploids and allopolyploids. Furthermore, we describe these methods in general terms to be of wider applicability for comparative gene expression in a variety of evolutionary, genetic, biological, and physiological contexts. PMID:15865985

  13. Evidence for the expression of abundant microRNAs in the locust genome

    PubMed Central

    Wang, Yanli; Jiang, Feng; Wang, Huimin; Song, Tianqi; Wei, Yuanyuan; Yang, Meiling; Zhang, Jianzhen; Kang, Le

    2015-01-01

    Substantial accumulation of neutral sequences accounts for genome size expansion in animal genomes. Numerous novel microRNAs (miRNAs), which evolve in a birth and death manner, are considered evolutionary neutral sequences. The migratory locust is an ideal model to determine whether large genomes contain abundant neutral miRNAs because of its large genome size. A total of 833 miRNAs were discovered, and several miRNAs were randomly chosen for validation by Northern blot and RIP-qPCR. Three additional verification methods, namely, processing-dependent methods of miRNA biogenesis using RNAi, evolutionary comparison with closely related species, and evidence supported by tissue-specific expression, were applied to provide compelling results that support the authenticity of locust miRNAs. We observed that abundant local duplication events of miRNAs, which were unique in locusts compared with those in other insects with small genome sizes, may be responsible for the substantial acquisition of miRNAs in locusts. Together, multiple evidence showed that the locust genome experienced a burst of miRNA acquisition, suggesting that genome size expansion may have considerable influences of miRNA innovation. These results provide new insight into the genomic dynamics of miRNA repertoires under genome size evolution. PMID:26329925

  14. Whole-genome sequence of the Tibetan frog Nanorana parkeri and the comparative evolution of tetrapod genomes

    PubMed Central

    Sun, Yan-Bo; Xiong, Zi-Jun; Xiang, Xue-Yan; Liu, Shi-Ping; Zhou, Wei-Wei; Tu, Xiao-Long; Zhong, Li; Wang, Lu; Wu, Dong-Dong; Zhang, Bao-Lin; Zhu, Chun-Ling; Yang, Min-Min; Chen, Hong-Man; Li, Fang; Zhou, Long; Feng, Shao-Hong; Huang, Chao; Zhang, Guo-Jie; Irwin, David; Hillis, David M.; Murphy, Robert W.; Yang, Huan-Ming; Che, Jing; Wang, Jun; Zhang, Ya-Ping

    2015-01-01

    The development of efficient sequencing techniques has resulted in large numbers of genomes being available for evolutionary studies. However, only one genome is available for all amphibians, that of Xenopus tropicalis, which is distantly related from the majority of frogs. More than 96% of frogs belong to the Neobatrachia, and no genome exists for this group. This dearth of amphibian genomes greatly restricts genomic studies of amphibians and, more generally, our understanding of tetrapod genome evolution. To fill this gap, we provide the de novo genome of a Tibetan Plateau frog, Nanorana parkeri, and compare it to that of X. tropicalis and other vertebrates. This genome encodes more than 20,000 protein-coding genes, a number similar to that of Xenopus. Although the genome size of Nanorana is considerably larger than that of Xenopus (2.3 vs. 1.5 Gb), most of the difference is due to the respective number of transposable elements in the two genomes. The two frogs exhibit considerable conserved whole-genome synteny despite having diverged approximately 266 Ma, indicating a slow rate of DNA structural evolution in anurans. Multigenome synteny blocks further show that amphibians have fewer interchromosomal rearrangements than mammals but have a comparable rate of intrachromosomal rearrangements. Our analysis also identifies 11 Mb of anuran-specific highly conserved elements that will be useful for comparative genomic analyses of frogs. The Nanorana genome offers an improved understanding of evolution of tetrapod genomes and also provides a genomic reference for other evolutionary studies. PMID:25733869

  15. Whole-genome sequence of the Tibetan frog Nanorana parkeri and the comparative evolution of tetrapod genomes.

    PubMed

    Sun, Yan-Bo; Xiong, Zi-Jun; Xiang, Xue-Yan; Liu, Shi-Ping; Zhou, Wei-Wei; Tu, Xiao-Long; Zhong, Li; Wang, Lu; Wu, Dong-Dong; Zhang, Bao-Lin; Zhu, Chun-Ling; Yang, Min-Min; Chen, Hong-Man; Li, Fang; Zhou, Long; Feng, Shao-Hong; Huang, Chao; Zhang, Guo-Jie; Irwin, David; Hillis, David M; Murphy, Robert W; Yang, Huan-Ming; Che, Jing; Wang, Jun; Zhang, Ya-Ping

    2015-03-17

    The development of efficient sequencing techniques has resulted in large numbers of genomes being available for evolutionary studies. However, only one genome is available for all amphibians, that of Xenopus tropicalis, which is distantly related from the majority of frogs. More than 96% of frogs belong to the Neobatrachia, and no genome exists for this group. This dearth of amphibian genomes greatly restricts genomic studies of amphibians and, more generally, our understanding of tetrapod genome evolution. To fill this gap, we provide the de novo genome of a Tibetan Plateau frog, Nanorana parkeri, and compare it to that of X. tropicalis and other vertebrates. This genome encodes more than 20,000 protein-coding genes, a number similar to that of Xenopus. Although the genome size of Nanorana is considerably larger than that of Xenopus (2.3 vs. 1.5 Gb), most of the difference is due to the respective number of transposable elements in the two genomes. The two frogs exhibit considerable conserved whole-genome synteny despite having diverged approximately 266 Ma, indicating a slow rate of DNA structural evolution in anurans. Multigenome synteny blocks further show that amphibians have fewer interchromosomal rearrangements than mammals but have a comparable rate of intrachromosomal rearrangements. Our analysis also identifies 11 Mb of anuran-specific highly conserved elements that will be useful for comparative genomic analyses of frogs. The Nanorana genome offers an improved understanding of evolution of tetrapod genomes and also provides a genomic reference for other evolutionary studies. PMID:25733869

  16. Unlocking Holocentric Chromosomes: New Perspectives from Comparative and Functional Genomics?

    PubMed Central

    Mandrioli, Mauro; Manicardi, Gian Carlo

    2012-01-01

    The presence of chromosomes with diffuse centromeres (holocentric chromosomes) has been reported in several taxa since more than fifty years, but a full understanding of their origin is still lacking. Comparative and functional genomics are nowadays furnishing new data to better understand holocentric chromosome evolution thus opening new perspectives to analyse karyotype rearrangements in species with holocentric chromosomes in particular evidencing unusual common features, such as the uniform GC content and gene distribution along chromosomes. PMID:23372420

  17. Mosaic supernumerary ring chromosome 19 identified by comparative genomic hybridisation.

    PubMed Central

    Ghaffari, S R; Boyd, E; Connor, J M; Jones, A M; Tolmie, J L

    1998-01-01

    We report the use of comparative genomic hybridisation (CGH) to define the origin of a supernumerary ring chromosome which conventional cytogenetic banding and fluorescence in situ hybridisation (FISH) methods had failed to identify. Targeted FISH using whole chromosome 19 library arm and site specific probes then confirmed the CGH results. This study shows the feasibility of using CGH for the identification of supernumerary marker chromosomes, even in fewer than 50% of cells, where no clinical or cytogenetic clues are present. Images PMID:9783708

  18. Comparative genome analysis and genome-guided physiological analysis of Roseobacter litoralis

    PubMed Central

    2011-01-01

    Background Roseobacter litoralis OCh149, the type species of the genus, and Roseobacter denitrificans OCh114 were the first described organisms of the Roseobacter clade, an ecologically important group of marine bacteria. Both species were isolated from seaweed and are able to perform aerobic anoxygenic photosynthesis. Results The genome of R. litoralis OCh149 contains one circular chromosome of 4,505,211 bp and three plasmids of 93,578 bp (pRLO149_94), 83,129 bp (pRLO149_83) and 63,532 bp (pRLO149_63). Of the 4537 genes predicted for R. litoralis, 1122 (24.7%) are not present in the genome of R. denitrificans. Many of the unique genes of R. litoralis are located in genomic islands and on plasmids. On pRLO149_83 several potential heavy metal resistance genes are encoded which are not present in the genome of R. denitrificans. The comparison of the heavy metal tolerance of the two organisms showed an increased zinc tolerance of R. litoralis. In contrast to R. denitrificans, the photosynthesis genes of R. litoralis are plasmid encoded. The activity of the photosynthetic apparatus was confirmed by respiration rate measurements, indicating a growth-phase dependent response to light. Comparative genomics with other members of the Roseobacter clade revealed several genomic regions that were only conserved in the two Roseobacter species. One of those regions encodes a variety of genes that might play a role in host association of the organisms. The catabolism of different carbon and nitrogen sources was predicted from the genome and combined with experimental data. In several cases, e.g. the degradation of some algal osmolytes and sugars, the genome-derived predictions of the metabolic pathways in R. litoralis differed from the phenotype. Conclusions The genomic differences between the two Roseobacter species are mainly due to lateral gene transfer and genomic rearrangements. Plasmid pRLO149_83 contains predominantly recently acquired genetic material whereas pRLO149

  19. Genome Sequence and Comparative Genome Analysis of Lactobacillus casei: Insights into Their Niche-Associated Evolution

    PubMed Central

    Cai, Hui; Thompson, Rebecca; Budinich, Mateo F.; Broadbent, Jeff R.

    2009-01-01

    Lactobacillus casei is remarkably adaptable to diverse habitats and widely used in the food industry. To reveal the genomic features that contribute to its broad ecological adaptability and examine the evolution of the species, the genome sequence of L. casei ATCC 334 is analyzed and compared with other sequenced lactobacilli. This analysis reveals that ATCC 334 contains a high number of coding sequences involved in carbohydrate utilization and transcriptional regulation, reflecting its requirement for dealing with diverse environmental conditions. A comparison of the genome sequences of ATCC 334 to L. casei BL23 reveals 12 and 19 genomic islands, respectively. For a broader assessment of the genetic variability within L. casei, gene content of 21 L. casei strains isolated from various habitats (cheeses, n = 7; plant materials, n = 8; and human sources, n = 6) was examined by comparative genome hybridization with an ATCC 334-based microarray. This analysis resulted in identification of 25 hypervariable regions. One of these regions contains an overrepresentation of genes involved in carbohydrate utilization and transcriptional regulation and was thus proposed as a lifestyle adaptation island. Differences in L. casei genome inventory reveal both gene gain and gene decay. Gene gain, via acquisition of genomic islands, likely confers a fitness benefit in specific habitats. Gene decay, that is, loss of unnecessary ancestral traits, is observed in the cheese isolates and likely results in enhanced fitness in the dairy niche. This study gives the first picture of the stable versus variable regions in L. casei and provides valuable insights into evolution, lifestyle adaptation, and metabolic diversity of L. casei. PMID:20333194

  20. Genome sequence and comparative genome analysis of Lactobacillus casei: insights into their niche-associated evolution.

    PubMed

    Cai, Hui; Thompson, Rebecca; Budinich, Mateo F; Broadbent, Jeff R; Steele, James L

    2009-01-01

    Lactobacillus casei is remarkably adaptable to diverse habitats and widely used in the food industry. To reveal the genomic features that contribute to its broad ecological adaptability and examine the evolution of the species, the genome sequence of L. casei ATCC 334 is analyzed and compared with other sequenced lactobacilli. This analysis reveals that ATCC 334 contains a high number of coding sequences involved in carbohydrate utilization and transcriptional regulation, reflecting its requirement for dealing with diverse environmental conditions. A comparison of the genome sequences of ATCC 334 to L. casei BL23 reveals 12 and 19 genomic islands, respectively. For a broader assessment of the genetic variability within L. casei, gene content of 21 L. casei strains isolated from various habitats (cheeses, n = 7; plant materials, n = 8; and human sources, n = 6) was examined by comparative genome hybridization with an ATCC 334-based microarray. This analysis resulted in identification of 25 hypervariable regions. One of these regions contains an overrepresentation of genes involved in carbohydrate utilization and transcriptional regulation and was thus proposed as a lifestyle adaptation island. Differences in L. casei genome inventory reveal both gene gain and gene decay. Gene gain, via acquisition of genomic islands, likely confers a fitness benefit in specific habitats. Gene decay, that is, loss of unnecessary ancestral traits, is observed in the cheese isolates and likely results in enhanced fitness in the dairy niche. This study gives the first picture of the stable versus variable regions in L. casei and provides valuable insights into evolution, lifestyle adaptation, and metabolic diversity of L. casei. PMID:20333194

  1. Genome-wide comparative analysis of the Brassica rapa gene space reveals genome shrinkage and differential loss of duplicated genes after whole genome triplication

    PubMed Central

    2009-01-01

    Background Brassica rapa is one of the most economically important vegetable crops worldwide. Owing to its agronomic importance and phylogenetic position, B. rapa provides a crucial reference to understand polyploidy-related crop genome evolution. The high degree of sequence identity and remarkably conserved genome structure between Arabidopsis and Brassica genomes enables comparative tiling sequencing using Arabidopsis sequences as references to select the counterpart regions in B. rapa, which is a strong challenge of structural and comparative crop genomics. Results We assembled 65.8 megabase-pairs of non-redundant euchromatic sequence of B. rapa and compared this sequence to the Arabidopsis genome to investigate chromosomal relationships, macrosynteny blocks, and microsynteny within blocks. The triplicated B. rapa genome contains only approximately twice the number of genes as in Arabidopsis because of genome shrinkage. Genome comparisons suggest that B. rapa has a distinct organization of ancestral genome blocks as a result of recent whole genome triplication followed by a unique diploidization process. A lack of the most recent whole genome duplication (3R) event in the B. rapa genome, atypical of other Brassica genomes, may account for the emergence of B. rapa from the Brassica progenitor around 8 million years ago. Conclusions This work demonstrates the potential of using comparative tiling sequencing for genome analysis of crop species. Based on a comparative analysis of the B. rapa sequences and the Arabidopsis genome, it appears that polyploidy and chromosomal diploidization are ongoing processes that collectively stabilize the B. rapa genome and facilitate its evolution. PMID:19821981

  2. Databases of homologous gene families for comparative genomics

    PubMed Central

    Penel, Simon; Arigon, Anne-Muriel; Dufayard, Jean-François; Sertier, Anne-Sophie; Daubin, Vincent; Duret, Laurent; Gouy, Manolo; Perrière, Guy

    2009-01-01

    Background Comparative genomics is a central step in many sequence analysis studies, from gene annotation and the identification of new functional regions in genomes, to the study of evolutionary processes at the molecular level (speciation, single gene or whole genome duplications, etc.) and phylogenetics. In that context, databases providing users high quality homologous families and sequence alignments as well as phylogenetic trees based on state of the art algorithms are becoming indispensable. Methods We developed an automated procedure allowing massive all-against-all similarity searches, gene clustering, multiple alignments computation, and phylogenetic trees construction and reconciliation. The application of this procedure to a very large set of sequences is possible through parallel computing on a large computer cluster. Results Three databases were developed using this procedure: HOVERGEN, HOGENOM and HOMOLENS. These databases share the same architecture but differ in their content. HOVERGEN contains sequences from vertebrates, HOGENOM is mainly devoted to completely sequenced microbial organisms, and HOMOLENS is devoted to metazoan genomes from Ensembl. Access to the databases is provided through Web query forms, a general retrieval system and a client-server graphical interface. The later can be used to perform tree-pattern based searches allowing, among other uses, to retrieve sets of orthologous genes. The three databases, as well as the software required to build and query them, can be used or downloaded from the PBIL (Pôle Bioinformatique Lyonnais) site at . PMID:19534752

  3. Sequencing and comparative analysis of the gorilla MHC genomic sequence.

    PubMed

    Wilming, Laurens G; Hart, Elizabeth A; Coggill, Penny C; Horton, Roger; Gilbert, James G R; Clee, Chris; Jones, Matt; Lloyd, Christine; Palmer, Sophie; Sims, Sarah; Whitehead, Siobhan; Wiley, David; Beck, Stephan; Harrow, Jennifer L

    2013-01-01

    Major histocompatibility complex (MHC) genes play a critical role in vertebrate immune response and because the MHC is linked to a significant number of auto-immune and other diseases it is of great medical interest. Here we describe the clone-based sequencing and subsequent annotation of the MHC region of the gorilla genome. Because the MHC is subject to extensive variation, both structural and sequence-wise, it is not readily amenable to study in whole genome shotgun sequence such as the recently published gorilla genome. The variation of the MHC also makes it of evolutionary interest and therefore we analyse the sequence in the context of human and chimpanzee. In our comparisons with human and re-annotated chimpanzee MHC sequence we find that gorilla has a trimodular RCCX cluster, versus the reference human bimodular cluster, and additional copies of Class I (pseudo)genes between Gogo-K and Gogo-A (the orthologues of HLA-K and -A). We also find that Gogo-H (and Patr-H) is coding versus the HLA-H pseudogene and, conversely, there is a Gogo-DQB2 pseudogene versus the HLA-DQB2 coding gene. Our analysis, which is freely available through the VEGA genome browser, provides the research community with a comprehensive dataset for comparative and evolutionary research of the MHC. PMID:23589541

  4. Comparative evolutionary genomics of the STAT family of transcription factors

    PubMed Central

    Wang, Yaming; Levy, David E.

    2012-01-01

    The STAT signaling pathway is one of the seven common pathways that govern cell fate decisions during animal development. Comparative genomics revealed multiple incidences of stat gene duplications throughout metazoan evolutionary history. While pseudogenization is a frequent fate of duplicated genes, many of these STAT duplications evolved into novel genes through rapid sequence diversification and neofunctionalization. Additionally, the core of STAT gene regulatory networks, comprising stat1 through 4, stat5 and stat6, arose early in vertebrate evolution, probably through the two whole genome duplication events that occurred after the split of Cephalochordates but before the rise of Chondrichthyes. While another complete genome duplication event took place during the evolution of bony fish after their separation from the tetrapods about 450 million years ago (Mya), modern fish have only one set of these core stats, suggesting the rapid loss of most duplicated stat genes. The two stat5 genes in mammals likely arose from a duplication event in early Eutherian evolution, a period from about 310 Mya at the avian-mammal divergence to the separation of marsupials from other mammals about 130 Mya. These analyses indicate that whole genome duplications and gene duplications by unequal chromosomal crossing over were likely the major mechanisms underlying the evolution of STATs. PMID:24058748

  5. Comparative genomic analysis of seven Mycoplasma hyosynoviae strains

    PubMed Central

    Bumgardner, Eric A; Kittichotirat, Weerayuth; Bumgarner, Roger E; Lawrence, Paulraj K

    2015-01-01

    Infection with Mycoplasma hyosynoviae can result in debilitating arthritis in pigs, particularly those aged 10 weeks or older. Strategies for controlling this pathogen are becoming increasingly important due to the rise in the number of cases of arthritis that have been attributed to infection in recent years. In order to begin to develop interventions to prevent arthritis caused by M. hyosynoviae, more information regarding the specific proteins and potential virulence factors that its genome encodes was needed. However, the genome of this emerging swine pathogen had not been sequenced previously. In this report, we present a comparative analysis of the genomes of seven strains of M. hyosynoviae isolated from different locations in North America during the years 2010 to 2013. We identified several putative virulence factors that may contribute to the ability of this pathogen to adhere to host cells. Additionally, we discovered several prophage genes present within the genomes of three strains that show significant similarity to MAV1, a phage isolated from the related species, M. arthritidis. We also identified CRISPR-Cas and type III restriction and modification systems present in two strains that may contribute to their ability to defend against phage infection. PMID:25693846

  6. Exploring the zoonotic potential of Mycobacterium avium subspecies paratuberculosis through comparative genomics.

    PubMed

    Wynne, James W; Bull, Tim J; Seemann, Torsten; Bulach, Dieter M; Wagner, Josef; Kirkwood, Carl D; Michalski, Wojtek P

    2011-01-01

    A comparative genomics approach was utilised to compare the genomes of Mycobacterium avium subspecies paratuberculosis (MAP) isolated from early onset paediatric Crohn's disease (CD) patients as well as Johne's diseased animals. Draft genome sequences were produced for MAP isolates derived from four CD patients, one ulcerative colitis (UC) patient, and two non-inflammatory bowel disease (IBD) control individuals using Illumina sequencing, complemented by comparative genome hybridisation (CGH). MAP isolates derived from two bovine and one ovine host were also subjected to whole genome sequencing and CGH. All seven human derived MAP isolates were highly genetically similar and clustered together with one bovine type isolate following phylogenetic analysis. Three other sequenced isolates (including the reference bovine derived isolate K10) were genetically distinct. The human isolates contained two large tandem duplications, the organisations of which were confirmed by PCR. Designated vGI-17 and vGI-18 these duplications spanned 63 and 109 open reading frames, respectively. PCR screening of over 30 additional MAP isolates (3 human derived, 27 animal derived and one environmental isolate) confirmed that vGI-17 and vGI-18 are common across many isolates. Quantitative real-time PCR of vGI-17 demonstrated that the proportion of cells containing the vGI-17 duplication varied between 0.01 to 15% amongst isolates with human isolates containing a higher proportion of vGI-17 compared to most animal isolates. These findings suggest these duplications are transient genomic rearrangements. We hypothesise that the over-representation of vGI-17 in human derived MAP strains may enhance their ability to infect or persist within a human host by increasing genome redundancy and conferring crude regulation of protein expression across biologically important regions. PMID:21799786

  7. Comparative genome analysis of Pseudomonas genomes including Populus-associated isolates

    SciTech Connect

    Jun, Se Ran; Wassenaar, Trudy; Nookaew, Intawat; Hauser, Loren John; Wanchai, Visanu; Land, Miriam L.; Timm, Collin M.; Lu, Tse-Yuan S.; Schadt, Christopher Warren; Doktycz, Mitchel John; Pelletier, Dale A; Ussery, David W

    2016-01-01

    The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches including the rhizosphere and endosphere of many plants influencing phylogenetic diversity and heterogeneity. In this study, comparative genome analysis was performed on over one thousand Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides. Based on average amino acid identity, genomic clusters were identified within the Pseudomonas genus, which showed agreements with clades by NCBI and cliques by IMG. The P. fluorescens group was organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. The species P. aeruginosa showed clear distinction in their genomic relatedness compared to other Pseudomonas species groups based on the pan and core genome analysis. The 19 isolates of our 21 Populus-associated isolates formed three distinct subgroups within the P. fluorescens major group, supported by pathway profiles analysis, while two isolates were more closely related to P. chlororaphis and P. putida. The specific genes to Populus-associated subgroups were identified where genes specific to subgroup 1 include several sensory systems such as proteins which act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor; specific genes to subgroup 2 contain unique hypothetical genes; and genes specific to subgroup 3 organisms have a different hydrolase activity. IMPORTANCE The comparative genome analyses of the genus Pseudomonas that included Populus-associated isolates resulted in novel insights into high diversity of Pseudomonas. Consistent and robust genomic clusters with phylogenetic homogeneity were identified, which resolved species-clades that are not clearly defined by 16S rRNA gene sequence analysis alone. The genomic clusters may be reflective of distinct ecological niches to which the organisms have adapted, but this

  8. Comparative genome analysis of Pseudomonas genomes including Populus-associated isolates

    DOE PAGESBeta

    Jun, Se Ran; Wassenaar, Trudy; Nookaew, Intawat; Hauser, Loren John; Wanchai, Visanu; Land, Miriam L.; Timm, Collin M.; Lu, Tse-Yuan S.; Schadt, Christopher Warren; Doktycz, Mitchel John; et al

    2016-01-01

    The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches including the rhizosphere and endosphere of many plants influencing phylogenetic diversity and heterogeneity. In this study, comparative genome analysis was performed on over one thousand Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides. Based on average amino acid identity, genomic clusters were identified within the Pseudomonas genus, which showed agreements with clades by NCBI and cliques by IMG. The P. fluorescens group was organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. The speciesmore » P. aeruginosa showed clear distinction in their genomic relatedness compared to other Pseudomonas species groups based on the pan and core genome analysis. The 19 isolates of our 21 Populus-associated isolates formed three distinct subgroups within the P. fluorescens major group, supported by pathway profiles analysis, while two isolates were more closely related to P. chlororaphis and P. putida. The specific genes to Populus-associated subgroups were identified where genes specific to subgroup 1 include several sensory systems such as proteins which act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor; specific genes to subgroup 2 contain unique hypothetical genes; and genes specific to subgroup 3 organisms have a different hydrolase activity. IMPORTANCE The comparative genome analyses of the genus Pseudomonas that included Populus-associated isolates resulted in novel insights into high diversity of Pseudomonas. Consistent and robust genomic clusters with phylogenetic homogeneity were identified, which resolved species-clades that are not clearly defined by 16S rRNA gene sequence analysis alone. The genomic clusters may be reflective of distinct ecological niches to which the organisms have adapted, but

  9. Marine organism cell biology and regulatory sequence discoveryin comparative functional genomics.

    PubMed

    Barnes, David W; Mattingly, Carolyn J; Parton, Angela; Dowell, Lori M; Bayne, Christopher J; Forrest, John N

    2004-10-01

    The use of bioinformatics to integrate phenotypic and genomic data from mammalian models is well established as a means of understanding human biology and disease. Beyond direct biomedical applications of these approaches in predicting structure-function relationships between coding sequences and protein activities, comparative studies also promote understanding of molecular evolution and the relationship between genomic sequence and morphological and physiological specialization. Recently recognized is the potential of comparative studies to identify functionally significant regulatory regions and to generate experimentally testable hypotheses that contribute to understanding mechanisms that regulate gene expression, including transcriptional activity, alternative splicing and transcript stability. Functional tests of hypotheses generated by computational approaches require experimentally tractable in vitro systems, including cell cultures. Comparative sequence analysis strategies that use genomic sequences from a variety of evolutionarily diverse organisms are critical for identifying conserved regulatory motifs in the 5'-upstream, 3'-downstream and introns of genes. Genomic sequences and gene orthologues in the first aquatic vertebrate and protovertebrate organisms to be fully sequenced (Fugu rubripes, Ciona intestinalis, Tetraodon nigroviridis, Danio rerio) as well as in the elasmobranchs, spiny dogfish shark (Squalus acanthias) and little skate (Raja erinacea), and marine invertebrate models such as the sea urchin (Strongylocentrotus purpuratus) are valuable in the prediction of putative genomic regulatory regions. Cell cultures have been derived for these and other model species. Data and tools resulting from these kinds of studies will contribute to understanding transcriptional regulation of biomedically important genes and provide new avenues for medical therapeutics and disease prevention. PMID:19003267

  10. Comparative genomic reconstruction of transcriptional networks controlling central metabolism in the Shewanella genus

    PubMed Central

    2011-01-01

    Background Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in bacteria is one of the critical tasks of modern genomics. The Shewanella genus is comprised of metabolically versatile gamma-proteobacteria, whose lifestyles and natural environments are substantially different from Escherichia coli and other model bacterial species. The comparative genomics approaches and computational identification of regulatory sites are useful for the in silico reconstruction of transcriptional regulatory networks in bacteria. Results To explore conservation and variations in the Shewanella transcriptional networks we analyzed the repertoire of transcription factors and performed genomics-based reconstruction and comparative analysis of regulons in 16 Shewanella genomes. The inferred regulatory network includes 82 transcription factors and their DNA binding sites, 8 riboswitches and 6 translational attenuators. Forty five regulons were newly inferred from the genome context analysis, whereas others were propagated from previously characterized regulons in the Enterobacteria and Pseudomonas spp.. Multiple variations in regulatory strategies between the Shewanella spp. and E. coli include regulon contraction and expansion (as in the case of PdhR, HexR, FadR), numerous cases of recruiting non-orthologous regulators to control equivalent pathways (e.g. PsrA for fatty acid degradation) and, conversely, orthologous regulators to control distinct pathways (e.g. TyrR, ArgR, Crp). Conclusions We tentatively defined the first reference collection of ~100 transcriptional regulons in 16 Shewanella genomes. The resulting regulatory network contains ~600 regulated genes per genome that are mostly involved in metabolism of carbohydrates, amino acids, fatty acids, vitamins, metals, and stress responses. Several reconstructed regulons including NagR for N-acetylglucosamine catabolism were experimentally validated in S. oneidensis MR-1. Analysis of

  11. Comparative genomic analysis reveals a distant liver enhancer upstream of the COUP-TFII gene

    SciTech Connect

    Baroukh, Nadine; Ahituv, Nadav; Chang, Jessie; Shoukry, Malak; Afzal, Veena; Rubin, Edward M.; Pennacchio, Len A.

    2004-08-20

    COUP-TFII is a central nuclear hormone receptor that tightly regulates the expression of numerous target lipid metabolism genes in vertebrates. However, it remains unclear how COUP-TFII itself is transcriptionally controlled since studies with its promoter and upstream region fail to recapitulate the genes liver expression. In an attempt to identify liver enhancers in the vicinity of COUP-TFII, we employed a comparative genomic approach. Initial comparisons between humans and mice of the 3,470kb gene poor region surrounding COUP-TFII revealed 2,023 conserved non-coding elements. To prioritize a subset of these elements for functional studies, we performed further genomic comparisons with the orthologous pufferfish (Fugu rubripes) locus and uncovered two anciently conserved non-coding sequences (CNS) upstream of COUP-TFII (CNS-62kb and CNS-66kb). Testing these two elements using reporter constructs in liver (HepG2) cells revealed that CNS-66kb, but not CNS-62kb, yielded robust in vitro enhancer activity. In addition, an in vivo reporter assay using naked DNA transfer with CNS-66kb linked to luciferase displayed strong reproducible liver expression in adult mice, further supporting its role as a liver enhancer. Together, these studies further support the utility of comparative genomics to uncover gene regulatory sequences based on evolutionary conservation and provide the substrates to better understand the regulation and expression of COUP-TFII.

  12. Complete chloroplast genome sequences of Solanum bulbocastanum, Solanum lycopersicum and comparative analyses with other Solanaceae genomes.

    PubMed

    Daniell, Henry; Lee, Seung-Bum; Grevich, Justin; Saski, Christopher; Quesada-Vargas, Tania; Guda, Chittibabu; Tomkins, Jeffrey; Jansen, Robert K

    2006-05-01

    Despite the agricultural importance of both potato and tomato, very little is known about their chloroplast genomes. Analysis of the complete sequences of tomato, potato, tobacco, and Atropa chloroplast genomes reveals significant insertions and deletions within certain coding regions or regulatory sequences (e.g., deletion of repeated sequences within 16S rRNA, ycf2 or ribosomal binding sites in ycf2). RNA, photosynthesis, and atp synthase genes are the least divergent and the most divergent genes are clpP, cemA, ccsA, and matK. Repeat analyses identified 33-45 direct and inverted repeats >or=30 bp with a sequence identity of at least 90%; all but five of the repeats shared by all four Solanaceae genomes are located in the same genes or intergenic regions, suggesting a functional role. A comprehensive genome-wide analysis of all coding sequences and intergenic spacer regions was done for the first time in chloroplast genomes. Only four spacer regions are fully conserved (100% sequence identity) among all genomes; deletions or insertions within some intergenic spacer regions result in less than 25% sequence identity, underscoring the importance of choosing appropriate intergenic spacers for plastid transformation and providing valuable new information for phylogenetic utility of the chloroplast intergenic spacer regions. Comparison of coding sequences with expressed sequence tags showed considerable amount of variation, resulting in amino acid changes; none of the C-to-U conversions observed in potato and tomato were conserved in tobacco and Atropa. It is possible that there has been a loss of conserved editing sites in potato and tomato. PMID:16575560

  13. Comparative genomics of parasitic silkworm microsporidia reveal an association between genome expansion and host adaptation

    PubMed Central

    2013-01-01

    Background Microsporidian Nosema bombycis has received much attention because the pébrine disease of domesticated silkworms results in great economic losses in the silkworm industry. So far, no effective treatment could be found for pébrine. Compared to other known Nosema parasites, N. bombycis can unusually parasitize a broad range of hosts. To gain some insights into the underlying genetic mechanism of pathological ability and host range expansion in this parasite, a comparative genomic approach is conducted. The genome of two Nosema parasites, N. bombycis and N. antheraeae (an obligatory parasite to undomesticated silkworms Antheraea pernyi), were sequenced and compared with their distantly related species, N. ceranae (an obligatory parasite to honey bees). Results Our comparative genomics analysis show that the N. bombycis genome has greatly expanded due to the following three molecular mechanisms: 1) the proliferation of host-derived transposable elements, 2) the acquisition of many horizontally transferred genes from bacteria, and 3) the production of abundnant gene duplications. To our knowledge, duplicated genes derived not only from small-scale events (e.g., tandem duplications) but also from large-scale events (e.g., segmental duplications) have never been seen so abundant in any reported microsporidia genomes. Our relative dating analysis further indicated that these duplication events have arisen recently over very short evolutionary time. Furthermore, several duplicated genes involving in the cytotoxic metabolic pathway were found to undergo positive selection, suggestive of the role of duplicated genes on the adaptive evolution of pathogenic ability. Conclusions Genome expansion is rarely considered as the evolutionary outcome acting on those highly reduced and compact parasitic microsporidian genomes. This study, for the first time, demonstrates that the parasitic genomes can expand, instead of shrink, through several common molecular mechanisms

  14. Comparative Analysis of Codon Usage Bias Patterns in Microsporidian Genomes

    PubMed Central

    Xiang, Heng; Zhang, Ruizhi; Butler, Robert R.; Liu, Tie; Zhang, Li; Pombert, Jean-François; Zhou, Zeyang

    2015-01-01

    The sub-3 Mbp genomes from microsporidian species of the Encephalitozoon genus are the smallest known among eukaryotes and paragons of genomic reduction and compaction in parasites. However, their diminutive stature is not characteristic of all Microsporidia, whose genome sizes vary by an order of magnitude. This large variability suggests that different evolutionary forces are applied on the group as a whole. In this study, we have compared the codon usage bias (CUB) between eight taxonomically distinct microsporidian genomes: Encephalitozoon intestinalis, Encephalitozoon cuniculi, Spraguea lophii, Trachipleistophora hominis, Enterocytozoon bieneusi, Nematocida parisii, Nosema bombycis and Nosema ceranae. While the CUB was found to be weak in all eight Microsporidia, nearly all (98%) of the optimal codons in S. lophii, T. hominis, E. bieneusi, N. parisii, N. bombycis and N. ceranae are fond of A/U in third position whereas most (64.6%) optimal codons in the Encephalitozoon species E. intestinalis and E. cuniculi are biased towards G/C. Although nucleotide composition biases are likely the main factor driving the CUB in Microsporidia according to correlation analyses, directed mutational pressure also likely affects the CUB as suggested by ENc-plots, correspondence and neutrality analyses. Overall, the Encephalitozoon genomes were found to be markedly different from the other microsporidians and, despite being the first sequenced representatives of this lineage, are uncharacteristic of the group as a whole. The disparities observed cannot be attributed solely to differences in host specificity and we hypothesize that other forces are at play in the lineage leading to Encephalitozoon species. PMID:26057384

  15. Comparative Analysis of Codon Usage Bias Patterns in Microsporidian Genomes.

    PubMed

    Xiang, Heng; Zhang, Ruizhi; Butler, Robert R; Liu, Tie; Zhang, Li; Pombert, Jean-François; Zhou, Zeyang

    2015-01-01

    The sub-3 Mbp genomes from microsporidian species of the Encephalitozoon genus are the smallest known among eukaryotes and paragons of genomic reduction and compaction in parasites. However, their diminutive stature is not characteristic of all Microsporidia, whose genome sizes vary by an order of magnitude. This large variability suggests that different evolutionary forces are applied on the group as a whole. In this study, we have compared the codon usage bias (CUB) between eight taxonomically distinct microsporidian genomes: Encephalitozoon intestinalis, Encephalitozoon cuniculi, Spraguea lophii, Trachipleistophora hominis, Enterocytozoon bieneusi, Nematocida parisii, Nosema bombycis and Nosema ceranae. While the CUB was found to be weak in all eight Microsporidia, nearly all (98%) of the optimal codons in S. lophii, T. hominis, E. bieneusi, N. parisii, N. bombycis and N. ceranae are fond of A/U in third position whereas most (64.6%) optimal codons in the Encephalitozoon species E. intestinalis and E. cuniculi are biased towards G/C. Although nucleotide composition biases are likely the main factor driving the CUB in Microsporidia according to correlation analyses, directed mutational pressure also likely affects the CUB as suggested by ENc-plots, correspondence and neutrality analyses. Overall, the Encephalitozoon genomes were found to be markedly different from the other microsporidians and, despite being the first sequenced representatives of this lineage, are uncharacteristic of the group as a whole. The disparities observed cannot be attributed solely to differences in host specificity and we hypothesize that other forces are at play in the lineage leading to Encephalitozoon species. PMID:26057384

  16. A Comparative Analysis of Mitochondrial Genomes in Eustigmatophyte Algae

    PubMed Central

    Ševčíková, Tereza; Klimeš, Vladimír; Zbránková, Veronika; Strnad, Hynek; Hroudová, Miluše; Vlček, Čestmír; Eliáš, Marek

    2016-01-01

    Eustigmatophyceae (Ochrophyta, Stramenopiles) is a small algal group with species of the genus Nannochloropsis being its best studied representatives. Nuclear and organellar genomes have been recently sequenced for several Nannochloropsis spp., but phylogenetically wider genomic studies are missing for eustigmatophytes. We sequenced mitochondrial genomes (mitogenomes) of three species representing most major eustigmatophyte lineages, Monodopsis sp. MarTras21, Vischeria sp. CAUP Q 202 and Trachydiscus minutus, and carried out their comparative analysis in the context of available data from Nannochloropsis and other stramenopiles, revealing a number of noticeable findings. First, mitogenomes of most eustigmatophytes are highly collinear and similar in the gene content, but extensive rearrangements and loss of three otherwise ubiquitous genes happened in the Vischeria lineage; this correlates with an accelerated evolution of mitochondrial gene sequences in this lineage. Second, eustigmatophytes appear to be the only ochrophyte group with the Atp1 protein encoded by the mitogenome. Third, eustigmatophyte mitogenomes uniquely share a truncated nad11 gene encoding only the C-terminal part of the Nad11 protein, while the N-terminal part is encoded by a separate gene in the nuclear genome. Fourth, UGA as a termination codon and the cognate release factor mRF2 were lost from mitochondria independently by the Nannochloropsis and T. minutus lineages. Finally, the rps3 gene in the mitogenome of Vischeria sp. is interrupted by the UAG codon, but the genome includes a gene for an unusual tRNA with an extended anticodon loop that we speculate may serve as a suppressor tRNA to properly decode the rps3 gene. PMID:26872774

  17. A Comparative Analysis of Mitochondrial Genomes in Eustigmatophyte Algae.

    PubMed

    Ševčíková, Tereza; Klimeš, Vladimír; Zbránková, Veronika; Strnad, Hynek; Hroudová, Miluše; Vlček, Čestmír; Eliáš, Marek

    2016-03-01

    Eustigmatophyceae (Ochrophyta, Stramenopiles) is a small algal group with species of the genus Nannochloropsis being its best studied representatives. Nuclear and organellar genomes have been recently sequenced for several Nannochloropsis spp., but phylogenetically wider genomic studies are missing for eustigmatophytes. We sequenced mitochondrial genomes (mitogenomes) of three species representing most major eustigmatophyte lineages, Monodopsis sp. MarTras21, Vischeria sp. CAUP Q 202 and Trachydiscus minutus, and carried out their comparative analysis in the context of available data from Nannochloropsis and other stramenopiles, revealing a number of noticeable findings. First, mitogenomes of most eustigmatophytes are highly collinear and similar in the gene content, but extensive rearrangements and loss of three otherwise ubiquitous genes happened in the Vischeria lineage; this correlates with an accelerated evolution of mitochondrial gene sequences in this lineage. Second, eustigmatophytes appear to be the only ochrophyte group with the Atp1 protein encoded by the mitogenome. Third, eustigmatophyte mitogenomes uniquely share a truncated nad11 gene encoding only the C-terminal part of the Nad11 protein, while the N-terminal part is encoded by a separate gene in the nuclear genome. Fourth, UGA as a termination codon and the cognate release factor mRF2 were lost from mitochondria independently by the Nannochloropsis and T. minutus lineages. Finally, the rps3 gene in the mitogenome of Vischeria sp. is interrupted by the UAG codon, but the genome includes a gene for an unusual tRNA with an extended anticodon loop that we speculate may serve as a suppressor tRNA to properly decode the rps3 gene. PMID:26872774

  18. Comparative genomics of Cylindrospermopsis raciborskii strains with differential toxicities

    PubMed Central

    2014-01-01

    Background Cylindrospermopsis raciborskii is an invasive filamentous freshwater cyanobacterium, some strains of which produce toxins. Sporadic toxicity may be the result of gene deletion events, the horizontal transfer of toxin biosynthesis gene clusters, or other genomic variables, yet the evolutionary drivers for cyanotoxin production remain a mystery. Through examining the genomes of toxic and non-toxic strains of C. raciborskii, we hoped to gain a better understanding of the degree of similarity between these strains of common geographical origin, and what the primary differences between these strains might be. Additionally, we hoped to ascertain why some cyanobacteria possess the cylindrospermopsin biosynthesis (cyr) gene cluster and produce toxin, while others do not. It has been hypothesised that toxicity or lack thereof might confer a selective advantage to cyanobacteria under certain environmental conditions. Results In order to examine the fundamental differences between toxic and non-toxic C. raciborskii strains, we sequenced the genomes of two closely related isolates, CS-506 (CYN+) and CS-509 (CYN-) sourced from different lakes in tropical Queensland, Australia. These genomes were then compared to a third (reference) genome from C. raciborskii CS-505 (CYN+). Genome sizes were similar across all three strains and their G + C contents were almost identical. At least 2,767 genes were shared among all three strains, including the taxonomically important rpoc1, ssuRNA, lsuRNA, cpcA, cpcB, nifB and nifH, which exhibited 99.8-100% nucleotide identity. Strains CS-506 and CS-509 contained at least 176 and 101 strain-specific (or non-homologous) genes, respectively, most of which were associated with DNA repair and modification, nutrient uptake and transport, or adaptive measures such as osmoregulation. However, the only significant genetic difference observed between the two strains was the presence or absence of the cylindrospermopsin biosynthesis gene

  19. fPoxDB: fungal peroxidase database for comparative genomics

    PubMed Central

    2014-01-01

    -based prediction and diverse analysis toolkits with easy-to-follow web interface offer a useful workbench to study comparative and evolutionary genomics of peroxidases in fungi. PMID:24885079

  20. Comparative Analysis of Genomics and Proteomics in Bacillus thuringiensis 4.0718

    PubMed Central

    Rang, Jie; He, Hao; Wang, Ting; Ding, Xuezhi; Zuo, Mingxing; Quan, Meifang; Sun, Yunjun; Yu, Ziquan; Hu, Shengbiao; Xia, Liqiu

    2015-01-01

    Bacillus thuringiensis is a widely used biopesticide that produced various insecticidal active substances during its life cycle. Separation and purification of numerous insecticide active substances have been difficult because of the relatively short half-life of such substances. On the other hand, substances can be synthetized at different times during development, so samples at different stages have to be studied, further complicating the analysis. A dual genomic and proteomic approach would enhance our ability to identify such substances, and particularily using mass spectrometry-based proteomic methods. The comparative analysis for genomic and proteomic data have showed that not all of the products deduced from the annotated genome could be identified among the proteomic data. For instance, genome annotation results showed that 39 coding sequences in the whole genome were related to insect pathogenicity, including five cry genes. However, Cry2Ab, Cry1Ia, Cytotoxin K, Bacteriocin, Exoenzyme C3 and Alveolysin could not be detected in the proteomic data obtained. The sporulation-related proteins were also compared analysis, results showed that the great majority sporulation-related proteins can be detected by mass spectrometry. This analysis revealed Spo0A~P, SigF, SigE(+), SigK(+) and SigG(+), all known to play an important role in the process of spore formation regulatory network, also were displayed in the proteomic data. Through the comparison of the two data sets, it was possible to infer that some genes were silenced or were expressed at very low levels. For instance, found that cry2Ab seems to lack a functional promoter while cry1Ia may not be expressed due to the presence of transposons. With this comparative study a relatively complete database can be constructed and used to transform hereditary material, thereby prompting the high expression of toxic proteins. A theoretical basis is provided for constructing highly virulent engineered bacteria and for

  1. Evolutionary and comparative analyses of the soybean genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The soybean genome assembly has been available since the end of 2008. Significant features of the genome include large, gene-poor, repeat-dense pericentromeric regions, spanning roughly 57% of the genome sequence; a relatively large genome size of ~1.15 billion bases; remnants of a genome duplicatio...

  2. Enabling comparative modeling of closely related genomes: Example genus Brucella

    DOE PAGESBeta

    Faria, José P.; Edirisinghe, Janaka N.; Davis, James J.; Disz, Terrence; Hausmann, Anna; Henry, Christopher S.; Olson, Robert; Overbeek, Ross A.; Pusch, Gordon D.; Shukla, Maulik; et al

    2014-03-08

    For many scientific applications, it is highly desirable to be able to compare metabolic models of closely related genomes. In this study, we attempt to raise awareness to the fact that taking annotated genomes from public repositories and using them for metabolic model reconstructions is far from being trivial due to annotation inconsistencies. We are proposing a protocol for comparative analysis of metabolic models on closely related genomes, using fifteen strains of genus Brucella, which contains pathogens of both humans and livestock. This study lead to the identification and subsequent correction of inconsistent annotations in the SEED database, as wellmore » as the identification of 31 biochemical reactions that are common to Brucella, which are not originally identified by automated metabolic reconstructions. We are currently implementing this protocol for improving automated annotations within the SEED database and these improvements have been propagated into PATRIC, Model-SEED, KBase and RAST. This method is an enabling step for the future creation of consistent annotation systems and high-quality model reconstructions that will support in predicting accurate phenotypes such as pathogenicity, media requirements or type of respiration.« less

  3. A comparative encyclopedia of DNA elements in the mouse genome.

    PubMed

    Yue, Feng; Cheng, Yong; Breschi, Alessandra; Vierstra, Jeff; Wu, Weisheng; Ryba, Tyrone; Sandstrom, Richard; Ma, Zhihai; Davis, Carrie; Pope, Benjamin D; Shen, Yin; Pervouchine, Dmitri D; Djebali, Sarah; Thurman, Robert E; Kaul, Rajinder; Rynes, Eric; Kirilusha, Anthony; Marinov, Georgi K; Williams, Brian A; Trout, Diane; Amrhein, Henry; Fisher-Aylor, Katherine; Antoshechkin, Igor; DeSalvo, Gilberto; See, Lei-Hoon; Fastuca, Meagan; Drenkow, Jorg; Zaleski, Chris; Dobin, Alex; Prieto, Pablo; Lagarde, Julien; Bussotti, Giovanni; Tanzer, Andrea; Denas, Olgert; Li, Kanwei; Bender, M A; Zhang, Miaohua; Byron, Rachel; Groudine, Mark T; McCleary, David; Pham, Long; Ye, Zhen; Kuan, Samantha; Edsall, Lee; Wu, Yi-Chieh; Rasmussen, Matthew D; Bansal, Mukul S; Kellis, Manolis; Keller, Cheryl A; Morrissey, Christapher S; Mishra, Tejaswini; Jain, Deepti; Dogan, Nergiz; Harris, Robert S; Cayting, Philip; Kawli, Trupti; Boyle, Alan P; Euskirchen, Ghia; Kundaje, Anshul; Lin, Shin; Lin, Yiing; Jansen, Camden; Malladi, Venkat S; Cline, Melissa S; Erickson, Drew T; Kirkup, Vanessa M; Learned, Katrina; Sloan, Cricket A; Rosenbloom, Kate R; Lacerda de Sousa, Beatriz; Beal, Kathryn; Pignatelli, Miguel; Flicek, Paul; Lian, Jin; Kahveci, Tamer; Lee, Dongwon; Kent, W James; Ramalho Santos, Miguel; Herrero, Javier; Notredame, Cedric; Johnson, Audra; Vong, Shinny; Lee, Kristen; Bates, Daniel; Neri, Fidencio; Diegel, Morgan; Canfield, Theresa; Sabo, Peter J; Wilken, Matthew S; Reh, Thomas A; Giste, Erika; Shafer, Anthony; Kutyavin, Tanya; Haugen, Eric; Dunn, Douglas; Reynolds, Alex P; Neph, Shane; Humbert, Richard; Hansen, R Scott; De Bruijn, Marella; Selleri, Licia; Rudensky, Alexander; Josefowicz, Steven; Samstein, Robert; Eichler, Evan E; Orkin, Stuart H; Levasseur, Dana; Papayannopoulou, Thalia; Chang, Kai-Hsin; Skoultchi, Arthur; Gosh, Srikanta; Disteche, Christine; Treuting, Piper; Wang, Yanli; Weiss, Mitchell J; Blobel, Gerd A; Cao, Xiaoyi; Zhong, Sheng; Wang, Ting; Good, Peter J; Lowdon, Rebecca F; Adams, Leslie B; Zhou, Xiao-Qiao; Pazin, Michael J; Feingold, Elise A; Wold, Barbara; Taylor, James; Mortazavi, Ali; Weissman, Sherman M; Stamatoyannopoulos, John A; Snyder, Michael P; Guigo, Roderic; Gingeras, Thomas R; Gilbert, David M; Hardison, Ross C; Beer, Michael A; Ren, Bing

    2014-11-20

    The laboratory mouse shares the majority of its protein-coding genes with humans, making it the premier model organism in biomedical research, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases. PMID:25409824

  4. WormBase: methods for data mining and comparative genomics.

    PubMed

    Harris, Todd W; Stein, Lincoln D

    2006-01-01

    WormBase is a comprehensive repository for information on Caenorhabditis elegans and related nematodes. Although the primary web-based interface of WormBase (http:// www.wormbase.org/) is familiar to most C. elegans researchers, WormBase also offers powerful data-mining features for addressing questions of comparative genomics, genome structure, and evolution. In this chapter, we focus on data mining at WormBase through the use of flexible web interfaces, custom queries, and scripts. The intended audience includes users wishing to query the database beyond the confines of the web interface or fetch data en masse. No knowledge of programming is necessary or assumed, although users with intermediate skills in the Perl scripting language will be able to utilize additional data-mining approaches. PMID:16988424

  5. Ancient signals: comparative genomics of green plant CDPKs

    PubMed Central

    Hamel, Louis-Philippe; Sheen, Jen; Séguin, Armand

    2014-01-01

    Calcium-dependent protein kinases (CDPKs) are multifunctional proteins combining calcium-binding and signaling capabilities within a single gene product. This unique versatility enables multiple plant biological processes to be controlled, including developmental programs and stress responses. The genome of flowering plants typically encodes around 30 CDPK homologs that cluster in four conserved clades. In this Review, we take advantage of the recent availability of genome sequences from green algae and early land plants to examine how well the previously described CDPK family from angiosperms compares to the broader evolutionary states associated with early diverging green plant lineages. Our analysis suggests that the current architecture of the CDPK family was shaped during the colonization of the land by plants, whereas CDPKs from ancestor green algae have continued to evolve independently. PMID:24342084

  6. A Comparative Encyclopedia of DNA Elements in the Mouse Genome

    PubMed Central

    Yue, Feng; Cheng, Yong; Breschi, Alessandra; Vierstra, Jeff; Wu, Weisheng; Ryba, Tyrone; Sandstrom, Richard; Ma, Zhihai; Davis, Carrie; Pope, Benjamin D.; Shen, Yin; Pervouchine, Dmitri D.; Djebali, Sarah; Thurman, Bob; Kaul, Rajinder; Rynes, Eric; Kirilusha, Anthony; Marinov, Georgi K.; Williams, Brian A.; Trout, Diane; Amrhein, Henry; Fisher-Aylor, Katherine; Antoshechkin, Igor; DeSalvo, Gilberto; See, Lei-Hoon; Fastuca, Meagan; Drenkow, Jorg; Zaleski, Chris; Dobin, Alex; Prieto, Pablo; Lagarde, Julien; Bussotti, Giovanni; Tanzer, Andrea; Denas, Olgert; Li, Kanwei; Bender, M. A.; Zhang, Miaohua; Byron, Rachel; Groudine, Mark T.; McCleary, David; Pham, Long; Ye, Zhen; Kuan, Samantha; Edsall, Lee; Wu, Yi-Chieh; Rasmussen, Matthew D.; Bansal, Mukul S.; Keller, Cheryl A.; Morrissey, Christapher S.; Mishra, Tejaswini; Jain, Deepti; Dogan, Nergiz; Harris, Robert S.; Cayting, Philip; Kawli, Trupti; Boyle, Alan P.; Euskirchen, Ghia; Kundaje, Anshul; Lin, Shin; Lin, Yiing; Jansen, Camden; Malladi, Venkat S.; Cline, Melissa S.; Erickson, Drew T.; Kirkup, Vanessa M; Learned, Katrina; Sloan, Cricket A.; Rosenbloom, Kate R.; de Sousa, Beatriz Lacerda; Beal, Kathryn; Pignatelli, Miguel; Flicek, Paul; Lian, Jin; Kahveci, Tamer; Lee, Dongwon; Kent, W. James; Santos, Miguel Ramalho; Herrero, Javier; Notredame, Cedric; Johnson, Audra; Vong, Shinny; Lee, Kristen; Bates, Daniel; Neri, Fidencio; Diegel, Morgan; Canfield, Theresa; Sabo, Peter J.; Wilken, Matthew S.; Reh, Thomas A.; Giste, Erika; Shafer, Anthony; Kutyavin, Tanya; Haugen, Eric; Dunn, Douglas; Reynolds, Alex P.; Neph, Shane; Humbert, Richard; Hansen, R. Scott; De Bruijn, Marella; Selleri, Licia; Rudensky, Alexander; Josefowicz, Steven; Samstein, Robert; Eichler, Evan E.; Orkin, Stuart H.; Levasseur, Dana; Papayannopoulou, Thalia; Chang, Kai-Hsin; Skoultchi, Arthur; Gosh, Srikanta; Disteche, Christine; Treuting, Piper; Wang, Yanli; Weiss, Mitchell J.; Blobel, Gerd A.; Good, Peter J.; Lowdon, Rebecca F.; Adams, Leslie B.; Zhou, Xiao-Qiao; Pazin, Michael J.; Feingold, Elise A.; Wold, Barbara; Taylor, James; Kellis, Manolis; Mortazavi, Ali; Weissman, Sherman M.; Stamatoyannopoulos, John; Snyder, Michael P.; Guigo, Roderic; Gingeras, Thomas R.; Gilbert, David M.; Hardison, Ross C.; Beer, Michael A.; Ren, Bing

    2014-01-01

    Summary As the premier model organism in biomedical research, the laboratory mouse shares the majority of protein-coding genes with humans, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications, and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of other sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases. PMID:25409824

  7. Comparative Genomics Analyses Reveal Extensive Chromosome Colinearity and Novel Quantitative Trait Loci in Eucalyptus.

    PubMed

    Li, Fagen; Zhou, Changpin; Weng, Qijie; Li, Mei; Yu, Xiaoli; Guo, Yong; Wang, Yu; Zhang, Xiaohong; Gan, Siming

    2015-01-01

    Dense genetic maps, along with quantitative trait loci (QTLs) detected on such maps, are powerful tools for genomics and molecular breeding studies. In the important woody genus Eucalyptus, the recent release of E. grandis genome sequence allows for sequence-based genomic comparison and searching for positional candidate genes within QTL regions. Here, dense genetic maps were constructed for E. urophylla and E. tereticornis using genomic simple sequence repeats (SSR), expressed sequence tag (EST) derived SSR, EST-derived cleaved amplified polymorphic sequence (EST-CAPS), and diversity arrays technology (DArT) markers. The E. urophylla and E. tereticornis maps comprised 700 and 585 markers across 11 linkage groups, totaling at 1,208.2 and 1,241.4 cM in length, respectively. Extensive synteny and colinearity were observed as compared to three earlier DArT-based eucalypt maps (two maps with E. grandis × E. urophylla and one map of E. globulus) and with the E. grandis genome sequence. Fifty-three QTLs for growth (10-56 months of age) and wood density (56 months) were identified in 22 discrete regions on both maps, in which only one colocalizaiton was found between growth and wood density. Novel QTLs were revealed as compared with those previously detected on DArT-based maps for similar ages in Eucalyptus. Eleven to 585 positional candidate genes were obained for a 56-month-old QTL through aligning QTL confidence interval with the E. grandis genome. These results will assist in comparative genomics studies, targeted gene characterization, and marker-assisted selection in Eucalyptus and the related taxa. PMID:26695430

  8. Comparative Genomics Analyses Reveal Extensive Chromosome Colinearity and Novel Quantitative Trait Loci in Eucalyptus

    PubMed Central

    Weng, Qijie; Li, Mei; Yu, Xiaoli; Guo, Yong; Wang, Yu; Zhang, Xiaohong; Gan, Siming

    2015-01-01

    Dense genetic maps, along with quantitative trait loci (QTLs) detected on such maps, are powerful tools for genomics and molecular breeding studies. In the important woody genus Eucalyptus, the recent release of E. grandis genome sequence allows for sequence-based genomic comparison and searching for positional candidate genes within QTL regions. Here, dense genetic maps were constructed for E. urophylla and E. tereticornis using genomic simple sequence repeats (SSR), expressed sequence tag (EST) derived SSR, EST-derived cleaved amplified polymorphic sequence (EST-CAPS), and diversity arrays technology (DArT) markers. The E. urophylla and E. tereticornis maps comprised 700 and 585 markers across 11 linkage groups, totaling at 1,208.2 and 1,241.4 cM in length, respectively. Extensive synteny and colinearity were observed as compared to three earlier DArT-based eucalypt maps (two maps with E. grandis × E. urophylla and one map of E. globulus) and with the E. grandis genome sequence. Fifty-three QTLs for growth (10–56 months of age) and wood density (56 months) were identified in 22 discrete regions on both maps, in which only one colocalizaiton was found between growth and wood density. Novel QTLs were revealed as compared with those previously detected on DArT-based maps for similar ages in Eucalyptus. Eleven to 585 positional candidate genes were obained for a 56-month-old QTL through aligning QTL confidence interval with the E. grandis genome. These results will assist in comparative genomics studies, targeted gene characterization, and marker-assisted selection in Eucalyptus and the related taxa. PMID:26695430

  9. Industrial Acetogenic Biocatalysts: A Comparative Metabolic and Genomic Analysis

    PubMed Central

    Bengelsdorf, Frank R.; Poehlein, Anja; Linder, Sonja; Erz, Catarina; Hummel, Tim; Hoffmeister, Sabrina; Daniel, Rolf; Dürre, Peter

    2016-01-01

    Synthesis gas (syngas) fermentation by anaerobic acetogenic bacteria employing the Wood–Ljungdahl pathway is a bioprocess for production of biofuels and biocommodities. The major fermentation products of the most relevant biocatalytic strains (Clostridium ljungdahlii, C. autoethanogenum, C. ragsdalei, and C. coskatii) are acetic acid and ethanol. A comparative metabolic and genomic analysis using the mentioned biocatalysts might offer targets for metabolic engineering and thus improve the production of compounds apart from ethanol. Autotrophic growth and product formation of the four wild type (WT) strains were compared in uncontrolled batch experiments. The genomes of C. ragsdalei and C. coskatii were sequenced and the genome sequences of all four biocatalytic strains analyzed in comparative manner. Growth and product spectra (acetate, ethanol, 2,3-butanediol) of C. autoethanogenum, C. ljungdahlii, and C. ragsdalei were rather similar. In contrast, C. coskatii produced significantly less ethanol and its genome sequence lacks two genes encoding aldehyde:ferredoxin oxidoreductases (AOR). Comparative genome sequence analysis of the four WT strains revealed high average nucleotide identity (ANI) of C. ljungdahlii and C. autoethanogenum (99.3%) and C. coskatii (98.3%). In contrast, C. ljungdahlii WT and C. ragsdalei WT showed an ANI-based similarity of only 95.8%. Additionally, recombinant C. ljungdahlii strains were constructed that harbor an artificial acetone synthesis operon (ASO) consisting of the following genes: adc, ctfA, ctfB, and thlA (encoding acetoacetate decarboxylase, acetoacetyl-CoA:acetate/butyrate:CoA-transferase subunits A and B, and thiolase) under the control of thlA promoter (PthlA) from C. acetobutylicum or native pta-ack promoter (Ppta-ack) from C. ljungdahlii. Respective recombinant strains produced 2-propanol rather than acetone, due to the presence of a NADPH-dependent primary-secondary alcohol dehydrogenase that converts acetone to 2

  10. Industrial Acetogenic Biocatalysts: A Comparative Metabolic and Genomic Analysis.

    PubMed

    Bengelsdorf, Frank R; Poehlein, Anja; Linder, Sonja; Erz, Catarina; Hummel, Tim; Hoffmeister, Sabrina; Daniel, Rolf; Dürre, Peter

    2016-01-01

    Synthesis gas (syngas) fermentation by anaerobic acetogenic bacteria employing the Wood-Ljungdahl pathway is a bioprocess for production of biofuels and biocommodities. The major fermentation products of the most relevant biocatalytic strains (Clostridium ljungdahlii, C. autoethanogenum, C. ragsdalei, and C. coskatii) are acetic acid and ethanol. A comparative metabolic and genomic analysis using the mentioned biocatalysts might offer targets for metabolic engineering and thus improve the production of compounds apart from ethanol. Autotrophic growth and product formation of the four wild type (WT) strains were compared in uncontrolled batch experiments. The genomes of C. ragsdalei and C. coskatii were sequenced and the genome sequences of all four biocatalytic strains analyzed in comparative manner. Growth and product spectra (acetate, ethanol, 2,3-butanediol) of C. autoethanogenum, C. ljungdahlii, and C. ragsdalei were rather similar. In contrast, C. coskatii produced significantly less ethanol and its genome sequence lacks two genes encoding aldehyde:ferredoxin oxidoreductases (AOR). Comparative genome sequence analysis of the four WT strains revealed high average nucleotide identity (ANI) of C. ljungdahlii and C. autoethanogenum (99.3%) and C. coskatii (98.3%). In contrast, C. ljungdahlii WT and C. ragsdalei WT showed an ANI-based similarity of only 95.8%. Additionally, recombinant C. ljungdahlii strains were constructed that harbor an artificial acetone synthesis operon (ASO) consisting of the following genes: adc, ctfA, ctfB, and thlA (encoding acetoacetate decarboxylase, acetoacetyl-CoA:acetate/butyrate:CoA-transferase subunits A and B, and thiolase) under the control of thlA promoter (P thlA ) from C. acetobutylicum or native pta-ack promoter (P pta-ack ) from C. ljungdahlii. Respective recombinant strains produced 2-propanol rather than acetone, due to the presence of a NADPH-dependent primary-secondary alcohol dehydrogenase that converts acetone to 2

  11. Comparative genome analysis of Bacillus cereus group genomes withBacillus subtilis

    SciTech Connect

    Anderson, Iain; Sorokin, Alexei; Kapatral, Vinayak; Reznik, Gary; Bhattacharya, Anamitra; Mikhailova, Natalia; Burd, Henry; Joukov, Victor; Kaznadzey, Denis; Walunas, Theresa; D'Souza, Mark; Larsen, Niels; Pusch,Gordon; Liolios, Konstantinos; Grechkin, Yuri; Lapidus, Alla; Goltsman,Eugene; Chu, Lien; Fonstein, Michael; Ehrlich, S. Dusko; Overbeek, Ross; Kyrpides, Nikos; Ivanova, Natalia

    2005-09-14

    Genome features of the Bacillus cereus group genomes (representative strains of Bacillus cereus, Bacillus anthracis and Bacillus thuringiensis sub spp israelensis) were analyzed and compared with the Bacillus subtilis genome. A core set of 1,381 protein families among the four Bacillus genomes, with an additional set of 933 families common to the B. cereus group, was identified. Differences in signal transduction pathways, membrane transporters, cell surface structures, cell wall, and S-layer proteins suggesting differences in their phenotype were identified. The B. cereus group has signal transduction systems including a tyrosine kinase related to two-component system histidine kinases from B. subtilis. A model for regulation of the stress responsive sigma factor sigmaB in the B. cereus group different from the well studied regulation in B. subtilis has been proposed. Despite a high degree of chromosomal synteny among these genomes, significant differences in cell wall and spore coat proteins that contribute to the survival and adaptation in specific hosts has been identified.

  12. Comparative Genomics of the Ubiquitous, Hydrocarbon-degrading Genus Marinobacter

    NASA Astrophysics Data System (ADS)

    Singer, E.; Webb, E.; Edwards, K. J.

    2012-12-01

    The genus Marinobacter is amongst the most ubiquitous in the global oceans and strains have been isolated from a wide variety of marine environments, including offshore oil-well heads, coastal thermal springs, Antarctic sea water, saline soils and associations with diatoms and dinoflagellates. Many strains have been recognized to be important hydrocarbon degraders in various marine habitats presenting sometimes extreme pH or salinity conditions. Analysis of the genome of M. aquaeolei revealed enormous adaptation versatility with an assortment of strategies for carbon and energy acquisition, sensation, and defense. In an effort to elucidate the ecological and biogeochemical significance of the Marinobacters, seven Marinobacter strains from diverse environments were included in a comparative genomics study. Genomes were screened for metabolic and adaptation potential to elucidate the strategies responsible for the omnipresence of the Marinobacter genus and their remedial action potential in hydrocarbon-polluted waters. The core genome predominantly encodes for key genes involved in hydrocarbon degradation, biofilm-relevant processes, including utilization of external DNA, halotolerance, as well as defense mechanisms against heavy metals, antibiotics, and toxins. All Marinobacter strains were observed to degrade a wide spectrum of hydrocarbon species, including aliphatic, polycyclic aromatic as well as acyclic isoprenoid compounds. Various genes predicted to facilitate hydrocarbon degradation, e.g. alkane 1-monooxygenase, appear to have originated from lateral gene transfer as they are located on gene clusters of 10-20% lower GC-content compared to genome averages and are flanked by transposases. Top ortholog hits are found in other hydrocarbon degrading organisms, e.g. Alcanivorax borkumensis. Strategies for hydrocarbon uptake encoded by various Marinobacter strains include cell surface hydrophobicity adaptation via capsular polysaccharide biosynthesis and attachment

  13. Comparative genomics of Serratia spp.: two paths towards endosymbiotic life.

    PubMed

    Manzano-Marín, Alejandro; Lamelas, Araceli; Moya, Andrés; Latorre, Amparo

    2012-01-01

    Symbiosis is a widespread phenomenon in nature, in which insects show a great number of these associations. Buchnera aphidicola, the obligate endosymbiont of aphids, coexists in some species with another intracellular bacterium, Serratia symbiotica. Of particular interest is the case of the cedar aphid Cinara cedri, where B. aphidicola BCc and S. symbiotica SCc need each other to fulfil their symbiotic role with the insect. Moreover, various features seem to indicate that S. symbiotica SCc is closer to an obligate endosymbiont than to other facultative S. symbiotica, such as the one described for the aphid Acirthosyphon pisum (S. symbiotica SAp). This work is based on the comparative genomics of five strains of Serratia, three free-living and two endosymbiotic ones (one facultative and one obligate) which should allow us to dissect the genome reduction taking place in the adaptive process to an intracellular life-style. Using a pan-genome approach, we have identified shared and strain-specific genes from both endosymbiotic strains and gained insight into the different genetic reduction both S. symbiotica have undergone. We have identified both retained and reduced functional categories in S. symbiotica compared to the Free-Living Serratia (FLS) that seem to be related with its endosymbiotic role in their specific host-symbiont systems. By means of a phylogenomic reconstruction we have solved the position of both endosymbionts with confidence, established the probable insect-pathogen origin of the symbiotic clade as well as the high amino-acid substitution rate in S. symbiotica SCc. Finally, we were able to quantify the minimal number of rearrangements suffered in the endosymbiotic lineages and reconstruct a minimal rearrangement phylogeny. All these findings provide important evidence for the existence of at least two distinctive S. symbiotica lineages that are characterized by different rearrangements, gene content, genome size and branch lengths. PMID:23077583

  14. Comparative 3D Genome Structure Analysis of the Fission and the Budding Yeast

    PubMed Central

    Gong, Ke; Tjong, Harianto; Zhou, Xianghong Jasmine; Alber, Frank

    2015-01-01

    We studied the 3D structural organization of the fission yeast genome, which emerges from the tethering of heterochromatic regions in otherwise randomly configured chromosomes represented as flexible polymer chains in an nuclear environment. This model is sufficient to explain in a statistical manner many experimentally determined distinctive features of the fission yeast genome, including chromatin interaction patterns from Hi-C experiments and the co-locations of functionally related and co-expressed genes, such as genes expressed by Pol-III. Our findings demonstrate that some previously described structure-function correlations can be explained as a consequence of random chromatin collisions driven by a few geometric constraints (mainly due to centromere-SPB and telomere-NE tethering) combined with the specific gene locations in the chromosome sequence. We also performed a comparative analysis between the fission and budding yeast genome structures, for which we previously detected a similar organizing principle. However, due to the different chromosome sizes and numbers, substantial differences are observed in the 3D structural genome organization between the two species, most notably in the nuclear locations of orthologous genes, and the extent of nuclear territories for genes and chromosomes. However, despite those differences, remarkably, functional similarities are maintained, which is evident when comparing spatial clustering of functionally related genes in both yeasts. Functionally related genes show a similar spatial clustering behavior in both yeasts, even though their nuclear locations are largely different between the yeast species. PMID:25799503

  15. Genome distribution of differential homoeologue contributions to leaf gene expression in bread wheat.

    PubMed

    Harper, Andrea L; Trick, Martin; He, Zhesi; Clissold, Leah; Fellgett, Alison; Griffiths, Simon; Bancroft, Ian

    2016-05-01

    Using a combination of de novo transcriptome assembly, a newly developed 9495-marker transcriptome SNP genetic linkage map and comparative genomics approaches, we developed an ordered set of nonredundant transcripts for each of the subgenomes of hexaploid wheat: A (47 160 unigenes), B (59 663 unigenes) and D (40 588 unigenes). We used these as reference sequences against which to map Illumina mRNA-Seq reads derived from young leaf tissue. Transcript abundance was quantified for each unigene. Using a three-way reciprocal BLAST approach, 15 527 triplet sets of homoeologues (one from each genome) were identified. Differential expression (P < 0.05) was identified for 5248 unigenes, with 2906 represented at greater abundance than their two homoeologues and 2342 represented at lower abundance than their two homoeologues. Analysis of gene ontology terms revealed no biases between homoeologues. There was no evidence of genomewide dominance effects, rather the more highly transcribed individual genes were distributed throughout all three genomes. Transcriptome display tile plot, a visualization approach based on CMYK colour space, was developed and used to assess the genome for regions of skewed homoeologue transcript abundance. Extensive striation was revealed, indicative of many small regions of genome dominance (transcripts of homoeologues from one genome more abundant than the others) and many larger regions of genome repression (transcripts of homoeologues from one genome less abundant than the others). PMID:26442792

  16. The infectious BAC genomic DNA expression library: a high capacity vector system for functional genomics.

    PubMed

    Lufino, Michele M P; Edser, Pauline A H; Quail, Michael A; Rice, Stephen; Adams, David J; Wade-Martins, Richard

    2016-01-01

    Gene dosage plays a critical role in a range of cellular phenotypes, yet most cellular expression systems use heterologous cDNA-based vectors which express proteins well above physiological levels. In contrast, genomic DNA expression vectors generate physiologically-relevant levels of gene expression by carrying the whole genomic DNA locus of a gene including its regulatory elements. Here we describe the first genomic DNA expression library generated using the high-capacity herpes simplex virus-1 amplicon technology to deliver bacterial artificial chromosomes (BACs) into cells by viral transduction. The infectious BAC (iBAC) library contains 184,320 clones with an average insert size of 134.5 kb. We show in a Chinese hamster ovary (CHO) disease model cell line and mouse embryonic stem (ES) cells that this library can be used for genetic rescue studies in a range of contexts including the physiological restoration of Ldlr deficiency, and viral receptor expression. The iBAC library represents an important new genetic analysis tool openly available to the research community. PMID:27353647

  17. The infectious BAC genomic DNA expression library: a high capacity vector system for functional genomics

    PubMed Central

    Lufino, Michele M. P.; Edser, Pauline A. H.; Quail, Michael A.; Rice, Stephen; Adams, David J.; Wade-Martins, Richard

    2016-01-01

    Gene dosage plays a critical role in a range of cellular phenotypes, yet most cellular expression systems use heterologous cDNA-based vectors which express proteins well above physiological levels. In contrast, genomic DNA expression vectors generate physiologically-relevant levels of gene expression by carrying the whole genomic DNA locus of a gene including its regulatory elements. Here we describe the first genomic DNA expression library generated using the high-capacity herpes simplex virus-1 amplicon technology to deliver bacterial artificial chromosomes (BACs) into cells by viral transduction. The infectious BAC (iBAC) library contains 184,320 clones with an average insert size of 134.5 kb. We show in a Chinese hamster ovary (CHO) disease model cell line and mouse embryonic stem (ES) cells that this library can be used for genetic rescue studies in a range of contexts including the physiological restoration of Ldlr deficiency, and viral receptor expression. The iBAC library represents an important new genetic analysis tool openly available to the research community. PMID:27353647

  18. Comparative analysis of essential genes in prokaryotic genomic islands

    PubMed Central

    Zhang, Xi; Peng, Chong; Zhang, Ge; Gao, Feng

    2015-01-01

    Essential genes are thought to encode proteins that carry out the basic functions to sustain a cellular life, and genomic islands (GIs) usually contain clusters of horizontally transferred genes. It has been assumed that essential genes are not likely to be located in GIs, but systematical analysis of essential genes in GIs has not been explored before. Here, we have analyzed the essential genes in 28 prokaryotes by statistical method and reached a conclusion that essential genes in GIs are significantly fewer than those outside GIs. The function of 362 essential genes found in GIs has been explored further by BLAST against the Virulence Factor Database (VFDB) and the phage/prophage sequence database of PHAge Search Tool (PHAST). Consequently, 64 and 60 eligible essential genes are found to share the sequence similarity with the virulence factors and phage/prophages-related genes, respectively. Meanwhile, we find several toxin-related proteins and repressors encoded by these essential genes in GIs. The comparative analysis of essential genes in genomic islands will not only shed new light on the development of the prediction algorithm of essential genes, but also give a clue to detect the functionality of essential genes in genomic islands. PMID:26223387

  19. Comparative genomics of the mimicry switch in Papilio dardanus

    PubMed Central

    Timmermans, Martijn J. T. N.; Baxter, Simon W.; Clark, Rebecca; Heckel, David G.; Vogel, Heiko; Collins, Steve; Papanicolaou, Alexie; Fukova, Iva; Joron, Mathieu; Thompson, Martin J.; Jiggins, Chris D.; ffrench-Constant, Richard H.; Vogler, Alfried P.

    2014-01-01

    The African Mocker Swallowtail, Papilio dardanus, is a textbook example in evolutionary genetics. Classical breeding experiments have shown that wing pattern variation in this polymorphic Batesian mimic is determined by the polyallelic H locus that controls a set of distinct mimetic phenotypes. Using bacterial artificial chromosome (BAC) sequencing, recombination analyses and comparative genomics, we show that H co-segregates with an interval of less than 500 kb that is collinear with two other Lepidoptera genomes and contains 24 genes, including the transcription factor genes engrailed (en) and invected (inv). H is located in a region of conserved gene order, which argues against any role for genomic translocations in the evolution of a hypothesized multi-gene mimicry locus. Natural populations of P. dardanus show significant associations of specific morphs with single nucleotide polymorphisms (SNPs), centred on en. In addition, SNP variation in the H region reveals evidence of non-neutral molecular evolution in the en gene alone. We find evidence for a duplication potentially driving physical constraints on recombination in the lamborni morph. Absence of perfect linkage disequilibrium between different genes in the other morphs suggests that H is limited to nucleotide positions in the regulatory and coding regions of en. Our results therefore support the hypothesis that a single gene underlies wing pattern variation in P. dardanus. PMID:24920480

  20. Comparative genomic analysis of ten Streptococcus pneumoniae temperate bacteriophages.

    PubMed

    Romero, Patricia; Croucher, Nicholas J; Hiller, N Luisa; Hu, Fen Z; Ehrlich, Garth D; Bentley, Stephen D; García, Ernesto; Mitchell, Tim J

    2009-08-01

    Streptococcus pneumoniae is an important human pathogen that often carries temperate bacteriophages. As part of a program to characterize the genetic makeup of prophages associated with clinical strains and to assess the potential roles that they play in the biology and pathogenesis in their host, we performed comparative genomic analysis of 10 temperate pneumococcal phages. All of the genomes are organized into five major gene clusters: lysogeny, replication, packaging, morphogenesis, and lysis clusters. All of the phage particles observed showed a Siphoviridae morphology. The only genes that are well conserved in all the genomes studied are those involved in the integration and the lysis of the host in addition to two genes, of unknown function, within the replication module. We observed that a high percentage of the open reading frames contained no similarities to any sequences catalogued in public databases; however, genes that were homologous to known phage virulence genes, including the pblB gene of Streptococcus mitis and the vapE gene of Dichelobacter nodosus, were also identified. Interestingly, bioinformatic tools showed the presence of a toxin-antitoxin system in the phage phiSpn_6, and this represents the first time that an addition system in a pneumophage has been identified. Collectively, the temperate pneumophages contain a diverse set of genes with various levels of similarity among them. PMID:19502408

  1. Comparative genomics reveals mobile pathogenicity chromosomes in Fusarium

    SciTech Connect

    Ma, Li Jun; van der Does, H. C.; Borkovich, Katherine A.; Coleman, Jeffrey J.; Daboussi, Marie-Jose; Di Pietro, Antonio; Dufresne, Marie; Freitag, Michael; Grabherr, Manfred; Henrissat, Bernard; Houterman, Petra M.; Kang, Seogchan; Shim, Won-Bo; Wolochuk, Charles; Xie, Xiaohui; Xu, Jin Rong; Antoniw, John; Baker, Scott E.; Bluhm, Burton H.; Breakspear, Andrew; Brown, Daren W.; Butchko, Robert A.; Chapman, Sinead; Coulson, Richard; Coutinho, Pedro M.; Danchin, Etienne G.; Diener, Andrew; Gale, Liane R.; Gardiner, Donald; Goff, Steven; Hammond-Kossack, Kim; Hilburn, Karen; Hua-Van, Aurelie; Jonkers, Wilfried; Kazan, Kemal; Kodira, Chinnappa D.; Koehrsen, Michael; Kumar, Lokesh; Lee, Yong Hwan; Li, Liande; Manners, John M.; Miranda-Saavedra, Diego; Mukherjee, Mala; Park, Gyungsoon; Park, Jongsun; Park, Sook Young; Proctor, Robert H.; Regev, Aviv; Ruiz-Roldan, M. C.; Sain, Divya; Sakthikumar, Sharadha; Sykes, Sean; Schwartz, David C.; Turgeon, Barbara G.; Wapinski, Ilan; Yoder, Olen; Young, Sarah; Zeng, Qiandong; Zhou, Shiguo; Galagan, James; Cuomo, Christina A.; Kistler, H. Corby; Rep, Martijn

    2010-03-18

    Fusarium species are among the most important phytopathogenic and toxigenic fungi, having significant impact on crop production and animal health. Distinctively, members of the F. oxysporum species complex exhibit wide host range but discontinuously distributed host specificity, reflecting remarkable genetic adaptability. To understand the molecular underpinnings of diverse phenotypic traits and their evolution in Fusarium, we compared the genomes of three economically important and phylogenetically related, yet phenotypically diverse plant-pathogenic species, F. graminearum, F. verticillioides and F. oxysporum f. sp. lycopersici. Our analysis revealed greatly expanded lineage-specific (LS) genomic regions in F. oxysporum that include four entire chromosomes, accounting for more than one-quarter of the genome. LS regions are rich in transposons and genes with distinct evolutionary profiles but related to pathogenicity. Experimentally, we demonstrate for the first time the transfer of two LS chromosomes between strains of F. oxysporum, resulting in the conversion of a non-pathogenic strain into a pathogen. Transfer of LS chromosomes between otherwise genetically isolated strains explains the polyphyletic origin of host specificity and the emergence of new pathogenic lineages in the F. oxysporum species complex, putting the evolution of fungal pathogenicity into a new perspective.

  2. Genome-wide discovery of functional transcription factor binding sites by comparative genomics: The case of Stat3

    PubMed Central

    Vallania, Francesco; Schiavone, Davide; Dewilde, Sarah; Pupo, Emanuela; Garbay, Serge; Calogero, Raffaele; Pontoglio, Marco; Provero, Paolo; Poli, Valeria

    2009-01-01

    The identification of direct targets of transcription factors is a key problem in the study of gene regulatory networks. However, the use of high throughput experimental methods, such as ChIP-chip and ChIP-sequencing, is limited by their high cost and strong dependence on cellular type and context. We developed a computational method for the genome-wide identification of functional transcription factor binding sites based on positional weight matrices, comparative genomics, and gene expression profiling. The method was applied to Stat3, a transcription factor playing crucial roles in inflammation, immunity and oncogenesis, and able to induce distinct subsets of target genes in different cell types or conditions. A newly generated positional weight matrix enabled us to assign affinity scores of high specificity, as measured by EMSA competition assays. Phylogenetic conservation with 7 vertebrate species was used to select the binding sites most likely to be functional. Validation was carried out on predicted sites within genes identified as differentially expressed in the presence or absence of Stat3 by microarray analysis. Twelve of the fourteen sites tested were bound by Stat3 in vivo, as assessed by Chromatin Immunoprecipitation, allowing us to identify 9 Stat3 transcriptional targets. Given its high validation rate, and the availability of large transcription factor-dependent gene expression datasets obtained under diverse experimental conditions, our approach appears to be a valid alternative to high-throughput experimental assays for the discovery of novel direct targets of transcription factors. PMID:19282476

  3. Comparative Genomics of Flatworms (Platyhelminthes) Reveals Shared Genomic Features of Ecto- and Endoparastic Neodermata

    PubMed Central

    Hahn, Christoph; Fromm, Bastian; Bachmann, Lutz

    2014-01-01

    The ectoparasitic Monogenea comprise a major part of the obligate parasitic flatworm diversity. Although genomic adaptations to parasitism have been studied in the endoparasitic tapeworms (Cestoda) and flukes (Trematoda), no representative of the Monogenea has been investigated yet. We present the high-quality draft genome of Gyrodactylus salaris, an economically important monogenean ectoparasite of wild Atlantic salmon (Salmo salar). A total of 15,488 gene models were identified, of which 7,102 were functionally annotated. The controversial phylogenetic relationships within the obligate parasitic Neodermata were resolved in a phylogenomic analysis using 1,719 gene models (alignment length of >500,000 amino acids) for a set of 16 metazoan taxa. The Monogenea were found basal to the Cestoda and Trematoda, which implies ectoparasitism being plesiomorphic within the Neodermata and strongly supports a common origin of complex life cycles. Comparative analysis of seven parasitic flatworm genomes identified shared genomic features for the ecto- and endoparasitic lineages, such as a substantial reduction of the core bilaterian gene complement, including the homeodomain-containing genes, and a loss of the piwi and vasa genes, which are considered essential for animal development. Furthermore, the shared loss of functional fatty acid biosynthesis pathways and the absence of peroxisomes, the latter organelles presumed ubiquitous in eukaryotes except for parasitic protozoans, were inferred. The draft genome of G. salaris opens for future in-depth analyses of pathogenicity and host specificity of poorly characterized G. salaris strains, and will enhance studies addressing the genomics of host–parasite interactions and speciation in the highly diverse monogenean flatworms. PMID:24732282

  4. The Complete Genome Sequence and Comparative Genome Analysis of the High Pathogenicity Yersinia enterocolitica Strain 8081

    PubMed Central

    Thomson, Nicholas R; Howard, Sarah; Wren, Brendan W; Holden, Matthew T. G; Crossman, Lisa; Challis, Gregory L; Churcher, Carol; Mungall, Karen; Brooks, Karen; Chillingworth, Tracey; Feltwell, Theresa; Abdellah, Zahra; Hauser, Heidi; Jagels, Kay; Maddison, Mark; Moule, Sharon; Sanders, Mandy; Whitehead, Sally; Quail, Michael A; Dougan, Gordon; Parkhill, Julian; Prentice, Michael B

    2006-01-01

    The human enteropathogen, Yersinia enterocolitica, is a significant link in the range of Yersinia pathologies extending from mild gastroenteritis to bubonic plague. Comparison at the genomic level is a key step in our understanding of the genetic basis for this pathogenicity spectrum. Here we report the genome of Y. enterocolitica strain 8081 (serotype 0:8; biotype 1B) and extensive microarray data relating to the genetic diversity of the Y. enterocolitica species. Our analysis reveals that the genome of Y. enterocolitica strain 8081 is a patchwork of horizontally acquired genetic loci, including a plasticity zone of 199 kb containing an extraordinarily high density of virulence genes. Microarray analysis has provided insights into species-specific Y. enterocolitica gene functions and the intraspecies differences between the high, low, and nonpathogenic Y. enterocolitica biotypes. Through comparative genome sequence analysis we provide new information on the evolution of the Yersinia. We identify numerous loci that represent ancestral clusters of genes potentially important in enteric survival and pathogenesis, which have been lost or are in the process of being lost, in the other sequenced Yersinia lineages. Our analysis also highlights large metabolic operons in Y. enterocolitica that are absent in the related enteropathogen, Yersinia pseudotuberculosis, indicating major differences in niche and nutrients used within the mammalian gut. These include clusters directing, the production of hydrogenases, tetrathionate respiration, cobalamin synthesis, and propanediol utilisation. Along with ancestral gene clusters, the genome of Y. enterocolitica has revealed species-specific and enteropathogen-specific loci. This has provided important insights into the pathology of this bacterium and, more broadly, into the evolution of the genus. Moreover, wider investigations looking at the patterns of gene loss and gain in the Yersinia have highlighted common themes in the

  5. The complete genome sequence and comparative genome analysis of the high pathogenicity Yersinia enterocolitica strain 8081.

    PubMed

    Thomson, Nicholas R; Howard, Sarah; Wren, Brendan W; Holden, Matthew T G; Crossman, Lisa; Challis, Gregory L; Churcher, Carol; Mungall, Karen; Brooks, Karen; Chillingworth, Tracey; Feltwell, Theresa; Abdellah, Zahra; Hauser, Heidi; Jagels, Kay; Maddison, Mark; Moule, Sharon; Sanders, Mandy; Whitehead, Sally; Quail, Michael A; Dougan, Gordon; Parkhill, Julian; Prentice, Michael B

    2006-12-15

    The human enteropathogen, Yersinia enterocolitica, is a significant link in the range of Yersinia pathologies extending from mild gastroenteritis to bubonic plague. Comparison at the genomic level is a key step in our understanding of the genetic basis for this pathogenicity spectrum. Here we report the genome of Y. enterocolitica strain 8081 (serotype 0:8; biotype 1B) and extensive microarray data relating to the genetic diversity of the Y. enterocolitica species. Our analysis reveals that the genome of Y. enterocolitica strain 8081 is a patchwork of horizontally acquired genetic loci, including a plasticity zone of 199 kb containing an extraordinarily high density of virulence genes. Microarray analysis has provided insights into species-specific Y. enterocolitica gene functions and the intraspecies differences between the high, low, and nonpathogenic Y. enterocolitica biotypes. Through comparative genome sequence analysis we provide new information on the evolution of the Yersinia. We identify numerous loci that represent ancestral clusters of genes potentially important in enteric survival and pathogenesis, which have been lost or are in the process of being lost, in the other sequenced Yersinia lineages. Our analysis also highlights large metabolic operons in Y. enterocolitica that are absent in the related enteropathogen, Yersinia pseudotuberculosis, indicating major differences in niche and nutrients used within the mammalian gut. These include clusters directing, the production of hydrogenases, tetrathionate respiration, cobalamin synthesis, and propanediol utilisation. Along with ancestral gene clusters, the genome of Y. enterocolitica has revealed species-specific and enteropathogen-specific loci. This has provided important insights into the pathology of this bacterium and, more broadly, into the evolution of the genus. Moreover, wider investigations looking at the patterns of gene loss and gain in the Yersinia have highlighted common themes in the

  6. Comparative genomic reconstruction of transcriptional networks controlling central metabolism in the Shewanella genus

    SciTech Connect

    Rodionov, Dmitry A.; Novichkov, Pavel; Stavrovskaya, Elena D.; Rodionova, Irina A.; Li, Xiaoqing; Kazanov, Marat D.; Ravcheev, Dmitry A.; Gerasimova, Anna V.; Kazakov, Alexey E.; Kovaleva, Galina Y.; Permina, Elizabeth A.; Laikova, Olga N.; Overbeek, Ross; Romine, Margaret F.; Fredrickson, Jim K.; Arkin, Adam P.; Dubchak, Inna; Osterman, Andrei L.; Gelfand, Mikhail S.

    2011-06-15

    Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in bacteria is one of the critical tasks of modern genomics. Despite the growing number of genome-scale gene expression studies, our abilities to convert the results of these studies into accurate regulatory annotations and to project them from model to other organisms are extremely limited. The comparative genomics approaches and computational identification of regulatory sites are useful for the in silico reconstruction of transcriptional regulatory networks in bacteria. The Shewanella genus is comprised of metabolically versatile gamma-proteobacteria, whose lifestyles and natural environments are substantially different from Escherichia coli and other model bacterial species. To explore conservation and variations in the Shewanella transcriptional networks we analyzed the repertoire of transcription factors and performed genomics-based reconstruction and comparative analysis of regulons in 16 Shewanella genomes. The inferred regulatory network includes 82 transcription factors and their DNA binding sites, 8 riboswitches and 6 translational attenuators. Forty five regulons were newly inferred from the genome context analysis, whereas others were propagated from previously characterized regulons in the Enterobacteria and Pseudomonas spp.. However, even orthologous regulators with conserved DNA-binding motifs may control substantially different gene sets, revealing striking differences in regulatory strategies between the Shewanella spp. and E. coli. Multiple examples of regulatory network rewiring include regulon contraction and expansion (as in the case of PdhR, HexR, FadR), and numerous cases of recruiting non-orthologous regulators to control equivalent pathways (e.g. NagR for N-acetylglucosamine catabolism and PsrA for fatty acid degradation) and, conversely, orthologous regulators to control distinct pathways (e.g. TyrR, ArgR, Crp).

  7. Reconstructing the Evolution of Brachypodium Genomes Using Comparative Chromosome Painting

    PubMed Central

    Betekhtin, Alexander; Jenkins, Glyn; Hasterok, Robert

    2014-01-01

    Brachypodium distachyon is a model for the temperate cereals and grasses and has a biology, genomics infrastructure and cytogenetic platform fit for purpose. It is a member of a genus with fewer than 20 species, which have different genome sizes, basic chromosome numbers and ploidy levels. The phylogeny and interspecific relationships of this group have not to date been resolved by sequence comparisons and karyotypical studies. The aims of this study are not only to reconstruct the evolution of Brachypodium karyotypes to resolve the phylogeny, but also to highlight the mechanisms that shape the evolution of grass genomes. This was achieved through the use of comparative chromosome painting (CCP) which hybridises fluorescent, chromosome-specific probes derived from B. distachyon to homoeologous meiotic chromosomes of its close relatives. The study included five diploids (B. distachyon 2n = 10, B. sylvaticum 2n = 18, B. pinnatum 2n = 16; 2n = 18, B. arbuscula 2n = 18 and B. stacei 2n = 20) three allotetraploids (B. pinnatum 2n = 28, B. phoenicoides 2n = 28 and B. hybridum 2n = 30), and two species of unknown ploidy (B. retusum 2n = 38 and B. mexicanum 2n = 40). On the basis of the patterns of hybridisation and incorporating published data, we propose two alternative, but similar, models of karyotype evolution in the genus Brachypodium. According to the first model, the extant genome of B. distachyon derives from B. mexicanum or B. stacei by several rounds of descending dysploidy, and the other diploids evolve from B. distachyon via ascending dysploidy. The allotetraploids arise by interspecific hybridisation and chromosome doubling between B. distachyon and other diploids. The second model differs from the first insofar as it incorporates an intermediate 2n = 18 species between the B. mexicanum or B. stacei progenitors and the dysploidic B. distachyon. PMID:25493646

  8. Reconstructing the Evolution of Brachypodium Genomes Using Comparative Chromosome Painting.

    PubMed

    Betekhtin, Alexander; Jenkins, Glyn; Hasterok, Robert

    2014-01-01

    Brachypodium distachyon is a model for the temperate cereals and grasses and has a biology, genomics infrastructure and cytogenetic platform fit for purpose. It is a member of a genus with fewer than 20 species, which have different genome sizes, basic chromosome numbers and ploidy levels. The phylogeny and interspecific relationships of this group have not to date been resolved by sequence comparisons and karyotypical studies. The aims of this study are not only to reconstruct the evolution of Brachypodium karyotypes to resolve the phylogeny, but also to highlight the mechanisms that shape the evolution of grass genomes. This was achieved through the use of comparative chromosome painting (CCP) which hybridises fluorescent, chromosome-specific probes derived from B. distachyon to homoeologous meiotic chromosomes of its close relatives. The study included five diploids (B. distachyon 2n = 10, B. sylvaticum 2n = 18, B. pinnatum 2n = 16; 2n = 18, B. arbuscula 2n = 18 and B. stacei 2n = 20) three allotetraploids (B. pinnatum 2n = 28, B. phoenicoides 2n = 28 and B. hybridum 2n = 30), and two species of unknown ploidy (B. retusum 2n = 38 and B. mexicanum 2n = 40). On the basis of the patterns of hybridisation and incorporating published data, we propose two alternative, but similar, models of karyotype evolution in the genus Brachypodium. According to the first model, the extant genome of B. distachyon derives from B. mexicanum or B. stacei by several rounds of descending dysploidy, and the other diploids evolve from B. distachyon via ascending dysploidy. The allotetraploids arise by interspecific hybridisation and chromosome doubling between B. distachyon and other diploids. The second model differs from the first insofar as it incorporates an intermediate 2n = 18 species between the B. mexicanum or B. stacei progenitors and the dysploidic B. distachyon. PMID:25493646

  9. Genomic Sequencing of Orientia tsutsugamushi Strain Karp, an Assembly Comparable to the Genome Size of the Strain Ikeda

    PubMed Central

    Liao, Hsiao-Mei; Chao, Chien-Chung; Lei, Haiyan; Li, Bingjie; Tsai, Shien; Hung, Guo-Chiuan

    2016-01-01

    Orientia tsutsugamushi, an intracellular bacterium, belongs to the family Rickettsiaceae. This study presents the draft genome sequence of strain Karp, with 2.0 Mb as the size of the completed genome. This nearly finished draft genome sequence was annotated with the RAST server and the contents compared to those of the other strains. PMID:27540052

  10. Complementary DNA sequencing: Expressed sequence tags and human genome project

    SciTech Connect

    Adams, M.D.; Kelley, J.M.; Gocayne, J.D.; Dubnick, M.; Wu, A.; Olde, B.; Moreno, R.F.; Kerlavage, A.R.; McCombie, W.R.; Venter, J.C. ); Polymeropoulos, M.H.; Hong Xiao; Merril, C.R. )

    1991-06-21

    Automated partial DNA sequencing was conducted on more than 600 randomly selected human brain complementary DNA (cDNA) clones to generate expressed sequence tags (ESTs). ESTs have applications in the discovery of new human genes, mapping of the human genome, and identification of coding regions in genomic sequences. Of the sequences generated, 337 represent new genes, including 48 with significant similarity to genes from other organisms, such as a yeast RNA polymerase II subunit; Drosophila kinesin, Notch, and Enhancer of split; and a murine tyrosine kinase receptor. Forty-six ESTs were mapped to chromosomes after amplification by the polymerase chain reaction. This fast approach to cDNA characterization will facilitate the tagging of most human genes in a few years at a fraction of the cost of complete genomic sequencing, provide new genetic markers, and serve as a resource in diverse biological research fields.

  11. Comparative genomics of Blattabacterium cuenoti: the frozen legacy of an ancient endosymbiont genome.

    PubMed

    Patiño-Navarrete, Rafael; Moya, Andrés; Latorre, Amparo; Peretó, Juli

    2013-01-01

    Many insect species have established long-term symbiotic relationships with intracellular bacteria. Symbiosis with bacteria has provided insects with novel ecological capabilities, which have allowed them colonize previously unexplored niches. Despite its importance to the understanding of the emergence of biological complexity, the evolution of symbiotic relationships remains hitherto a mystery in evolutionary biology. In this study, we contribute to the investigation of the evolutionary leaps enabled by mutualistic symbioses by sequencing the genome of Blattabacterium cuenoti, primary endosymbiont of the omnivorous cockroach Blatta orientalis, and one of the most ancient symbiotic associations. We perform comparative analyses between the Blattabacterium cuenoti genome and that of previously sequenced endosymbionts, namely those from the omnivorous hosts the Blattella germanica (Blattelidae) and Periplaneta americana (Blattidae), and the endosymbionts harbored by two wood-feeding hosts, the subsocial cockroach Cryptocercus punctulatus (Cryptocercidae) and the termite Mastotermes darwiniensis (Termitidae). Our study shows a remarkable evolutionary stasis of this symbiotic system throughout the evolutionary history of cockroaches and the deepest branching termite M. darwiniensis, in terms of not only chromosome architecture but also gene content, as revealed by the striking conservation of the Blattabacterium core genome. Importantly, the architecture of central metabolic network inferred from the endosymbiont genomes was established very early in Blattabacterium evolutionary history and could be an outcome of the essential role played by this endosymbiont in the host's nitrogen economy. PMID:23355305

  12. The SOL Genomics Network. A Comparative Resource for Solanaceae Biology and Beyond1

    PubMed Central

    Mueller, Lukas A.; Solow, Teri H.; Taylor, Nicolas; Skwarecki, Beth; Buels, Robert; Binns, John; Lin, Chenwei; Wright, Mark H.; Ahrens, Robert; Wang, Ying; Herbst, Evan V.; Keyder, Emil R.; Menda, Naama; Zamir, Dani; Tanksley, Steven D.

    2005-01-01

    The SOL Genomics Network (SGN; http://sgn.cornell.edu) is a rapidly evolving comparative resource for the plants of the Solanaceae family, which includes important crop and model plants such as potato (Solanum tuberosum), eggplant (Solanum melongena), pepper (Capsicum annuum), and tomato (Solanum lycopersicum). The aim of SGN is to relate these species to one another using a comparative genomics approach and to tie them to the other dicots through the fully sequenced genome of Arabidopsis (Arabidopsis thaliana). SGN currently houses map and marker data for Solanaceae species, a large expressed sequence tag collection with computationally derived unigene sets, an extensive database of phenotypic information for a mutagenized tomato population, and associated tools such as real-time quantitative trait loci. Recently, the International Solanaceae Project (SOL) was formed as an umbrella organization for Solanaceae research in over 30 countries to address important questions in plant biology. The first cornerstone of the SOL project is the sequencing of the entire euchromatic portion of the tomato genome. SGN is collaborating with other bioinformatics centers in building the bioinformatics infrastructure for the tomato sequencing project and implementing the bioinformatics strategy of the larger SOL project. The overarching goal of SGN is to make information available in an intuitive comparative format, thereby facilitating a systems approach to investigations into the basis of adaptation and phenotypic diversity in the Solanaceae family, other species in the Asterid clade such as coffee (Coffea arabica), Rubiaciae, and beyond. PMID:16010005

  13. Genetical and comparative genomics of Brassica under altered Ca supply identifies Arabidopsis Ca-transporter orthologs.

    PubMed

    Graham, Neil S; Hammond, John P; Lysenko, Artem; Mayes, Sean; O Lochlainn, Seosamh; Blasco, Bego; Bowen, Helen C; Rawlings, Chris J; Rios, Juan J; Welham, Susan; Carion, Pierre W C; Dupuy, Lionel X; King, Graham J; White, Philip J; Broadley, Martin R

    2014-07-01

    Although Ca transport in plants is highly complex, the overexpression of vacuolar Ca(2+) transporters in crops is a promising new technology to improve dietary Ca supplies through biofortification. Here, we sought to identify novel targets for increasing plant Ca accumulation using genetical and comparative genomics. Expression quantitative trait locus (eQTL) mapping to 1895 cis- and 8015 trans-loci were identified in shoots of an inbred mapping population of Brassica rapa (IMB211 × R500); 23 cis- and 948 trans-eQTLs responded specifically to altered Ca supply. eQTLs were screened for functional significance using a large database of shoot Ca concentration phenotypes of Arabidopsis thaliana. From 31 Arabidopsis gene identifiers tagged to robust shoot Ca concentration phenotypes, 21 mapped to 27 B. rapa eQTLs, including orthologs of the Ca(2+) transporters At-CAX1 and At-ACA8. Two of three independent missense mutants of BraA.cax1a, isolated previously by targeting induced local lesions in genomes, have allele-specific shoot Ca concentration phenotypes compared with their segregating wild types. BraA.CAX1a is a promising target for altering the Ca composition of Brassica, consistent with prior knowledge from Arabidopsis. We conclude that multiple-environment eQTL analysis of complex crop genomes combined with comparative genomics is a powerful technique for novel gene identification/prioritization. PMID:25082855

  14. High resolution comparative genomic hybridisation in clinical cytogenetics

    PubMed Central

    Kirchhoff, M.; Rose, H.; Lundsteen, C.

    2001-01-01

    High resolution comparative genomic hybridisation (HR-CGH) is a diagnostic tool in our clinical cytogenetics laboratory. The present survey reports the results of 253 clinical cases in which 47 abnormalities were detected. Among 144 dysmorphic and mentally retarded subjects with a normal conventional karyotype, 15 (10%) had small deletions or duplications, of which 11 were interstitial. In addition, a case of mosaic trisomy 9 was detected. Among 25 dysmorphic and mentally retarded subjects carrying apparently balanced de novo translocations, four had deletions at translocation breakpoints and two had deletions elsewhere in the genome. Seventeen of 19 complex rearrangements were clarified by HR-CGH. A small supernumerary marker chromosome occurring with low frequency and the breakpoint of a mosaic r(18) case could not be clarified. Three of 19 other abnormalities could not be confirmed by HR-CGH. One was a Williams syndrome deletion and two were DiGeorge syndrome deletions, which were apparently below the resolution of HR-CGH. However, we were able to confirm Angelman and Prader-Willi syndrome deletions, which are about 3-5 Mb. We conclude that HR-CGH should be used for the evaluation of (1) dysmorphic and mentally retarded subjects where normal karyotyping has failed to show abnormalities, (2) dysmorphic and mentally retarded subjects carrying apparently balanced de novo translocations, (3) apparently balanced de novo translocations detected prenatally, and (4) for clarification of complex structural rearrangements.


Keywords: comparative genomic hybridisation; chromosome analysis; chromosome aberrations; dysmorphism PMID:11694545

  15. Canine urothelial carcinoma: genomically aberrant and comparatively relevant

    PubMed Central

    Shapiro, S. G.; Raghunath, S.; Williams, C.; Motsinger-Reif, A. A.; Cullen, J. M.; Liu, T.; Albertson, D.; Ruvolo, M.; Lucas, A. Bergstrom; Jin, J.; Knapp, D. W.; Schiffman, J. D.

    2015-01-01

    Urothelial carcinoma (UC), also referred to as transitional cell carcinoma (TCC), is the most common bladder malignancy in both human and canine populations. In human UC, numerous studies have demonstrated the prevalence of chromosomal imbalances. Although the histopathology of the disease is similar in both species, studies evaluating the genomic profile of canine UC are lacking, limiting the discovery of key comparative molecular markers associated with driving UC pathogenesis. In the present study, we evaluated 31 primary canine UC biopsies by oligonucleotide array comparative genomic hybridization (oaCGH). Results highlighted the presence of three highly recurrent numerical aberrations: gain of dog chromosome (CFA) 13 and 36 and loss of CFA 19. Regional gains of CFA 13 and 36 were present in 97% and 84% of cases, respectively, and losses on CFA 19 were present in 77% of cases. Fluorescence in situ hybridization (FISH), using targeted bacterial artificial chromosome (BAC) clones and custom Agilent SureFISH probes, was performed to detect and quantify these regions in paraffin-embedded biopsy sections and urine-derived urothelial cells. The data indicate that these three aberrations are potentially diagnostic of UC. Comparison of our canine oaCGH data with that of 285 human cases identified a series of shared copy number aberrations. Using an informatics approach to interrogate the frequency of copy number aberrations across both species, we identified those that had the highest joint probability of association with UC. The most significant joint region contained the gene PABPC1, which should be considered further for its role in UC progression. In addition, cross-species filtering of genome-wide copy number data highlighted several genes as high-profile candidates for further analysis, including CDKN2A, S100A8/9, and LRP1B. We propose that these common aberrations are indicative of an evolutionarily conserved mechanism of pathogenesis and harbor genes key to

  16. Comparative genomic characterization of citrus-associated Xylella fastidiosa strains

    PubMed Central

    da Silva, Vivian S; Shida, Cláudio S; Rodrigues, Fabiana B; Ribeiro, Diógenes CD; de Souza, Alessandra A; Coletta-Filho, Helvécio D; Machado, Marcos A; Nunes, Luiz R; de Oliveira, Regina Costa

    2007-01-01

    Background The xylem-inhabiting bacterium Xylella fastidiosa (Xf) is the causal agent of Pierce's disease (PD) in vineyards and citrus variegated chlorosis (CVC) in orange trees. Both of these economically-devastating diseases are caused by distinct strains of this complex group of microorganisms, which has motivated researchers to conduct extensive genomic sequencing projects with Xf strains. This sequence information, along with other molecular tools, have been used to estimate the evolutionary history of the group and provide clues to understand the capacity of Xf to infect different hosts, causing a variety of symptoms. Nonetheless, although significant amounts of information have been generated from Xf strains, a large proportion of these efforts has concentrated on the study of North American strains, limiting our understanding about the genomic composition of South American strains – which is particularly important for CVC-associated strains. Results This paper describes the first genome-wide comparison among South American Xf strains, involving 6 distinct citrus-associated bacteria. Comparative analyses performed through a microarray-based approach allowed identification and characterization of large mobile genetic elements that seem to be exclusive to South American strains. Moreover, a large-scale sequencing effort, based on Suppressive Subtraction Hybridization (SSH), identified 290 new ORFs, distributed in 135 Groups of Orthologous Elements, throughout the genomes of these bacteria. Conclusion Results from microarray-based comparisons provide further evidence concerning activity of horizontally transferred elements, reinforcing their importance as major mediators in the evolution of Xf. Moreover, the microarray-based genomic profiles showed similarity between Xf strains 9a5c and Fb7, which is unexpected, given the geographical and chronological differences associated with the isolation of these microorganisms. The newly identified ORFs, obtained by

  17. Comparative Genomics of Interreplichore Translocations in Bacteria: A Measure of Chromosome Topology?

    PubMed Central

    Khedkar, Supriya; Seshasayee, Aswin Sai Narain

    2016-01-01

    Genomes evolve not only in base sequence but also in terms of their architecture, defined by gene organization and chromosome topology. Whereas genome sequence data inform us about the changes in base sequences for a large variety of organisms, the study of chromosome topology is restricted to a few model organisms studied using microscopy and chromosome conformation capture techniques. Here, we exploit whole genome sequence data to study the link between gene organization and chromosome topology in bacteria. Using comparative genomics across ∼250 pairs of closely related bacteria we show that: (a) many organisms show a high degree of interreplichore translocations throughout the chromosome and not limited to the inversion-prone terminus (ter) or the origin of replication (oriC); (b) translocation maps may reflect chromosome topologies; and (c) symmetric interreplichore translocations do not disrupt the distance of a gene from oriC or affect gene expression states or strand biases in gene densities. In summary, we suggest that translocation maps might be a first line in defining a gross chromosome topology given a pair of closely related genome sequences. PMID:27172194

  18. Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene, and cis-element evolution

    PubMed Central

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Hubisz, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Zhang, Peili; Liu, Jing; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catharine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenée; Verduzco, Daniel; Clerc-Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

    2005-01-01

    We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each arm gene order has been extensively reshuffled, leading to a minimum of 921 syntenic blocks shared between the species. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 25–55 million years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences between the species—but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila. PMID:15632085

  19. Comparative Genomics Provides Insight into the Diversity of the Attaching and Effacing Escherichia coli Virulence Plasmids

    PubMed Central

    Hazen, Tracy H.; Kaper, James B.; Nataro, James P.

    2015-01-01

    Attaching and effacing Escherichia coli (AEEC) strains are a genomically diverse group of diarrheagenic E. coli strains that are characterized by the presence of the locus of enterocyte effacement (LEE) genomic island, which encodes a type III secretion system that is essential to virulence. AEEC strains can be further classified as either enterohemorrhagic E. coli (EHEC), typical enteropathogenic E. coli (EPEC), or atypical EPEC, depending on the presence or absence of the Shiga toxin genes or bundle-forming pilus (BFP) genes. Recent AEEC genomic studies have focused on the diversity of the core genome, and less is known regarding the genetic diversity and relatedness of AEEC plasmids. Comparative genomic analyses in this study demonstrated genetic similarity among AEEC plasmid genes involved in plasmid replication conjugative transfer and maintenance, while the remainder of the plasmids had sequence variability. Investigation of the EPEC adherence factor (EAF) plasmids, which carry the BFP genes, demonstrated significant plasmid diversity even among isolates within the same phylogenomic lineage, suggesting that these EAF-like plasmids have undergone genetic modifications or have been lost and acquired multiple times. Global transcriptional analyses of the EPEC prototype isolate E2348/69 and two EAF plasmid mutants of this isolate demonstrated that the plasmid genes influence the expression of a number of chromosomal genes in addition to the LEE. This suggests that the genetic diversity of the EAF plasmids could contribute to differences in the global virulence regulons of EPEC isolates. PMID:26238712

  20. Use of methylation filtration and C(0)t fractionation for analysis of genome composition and comparative genomics in bread wheat.

    PubMed

    Bandopadhyay, Rajib; Rustgi, Sachin; Chaudhuri, Rajat Kanti; Khurana, Paramjit; Khurana, Jitendra Paul; Tyagi, Akhilesh Kumar; Balyan, Harindra Singh; Houben, Andreas; Gupta, Pushpendra Kumar

    2011-07-20

    We investigated the compositional and structural differences in sequences derived from different fractions of wheat genomic DNA obtained using methylation filtration and C(0)t fractionation. Comparative analysis of these sequences revealed large compositional and structural variations in terms of GC content, different structural elements including repeat sequences (e.g., transposable elements and simple sequence repeats), protein coding genes, and non-coding RNA genes. A correlation between methylation status [determined on the basis of selective inclusion/exclusion in methylation-filtered (MF) library] of different repeat elements and expression level was observed. The expression levels were determined by comparing MF sequences with expressed sequence tags (ESTs) available in the public domain. Only a limited overlap among MF, high C(0)t (HC), and ESTs was observed, suggesting that these sequences may largely either represent the low-copy non-transcribed sequences or include genes with low expression levels. Thus, these results indicated a need to study MF and HC sequences along with ESTs to fully appreciate complexity of wheat gene space. PMID:21777856

  1. Mitochondrial and nuclear genomic responses to loss of LRPPRC expression.

    PubMed

    Gohil, Vishal M; Nilsson, Roland; Belcher-Timme, Casey A; Luo, Biao; Root, David E; Mootha, Vamsi K

    2010-04-30

    Rapid advances in genotyping and sequencing technology have dramatically accelerated the discovery of genes underlying human disease. Elucidating the function of such genes and understanding their role in pathogenesis, however, remain challenging. Here, we introduce a genomic strategy to characterize such genes functionally, and we apply it to LRPPRC, a poorly studied gene that is mutated in Leigh syndrome, French-Canadian type (LSFC). We utilize RNA interference to engineer an allelic series of cellular models in which LRPPRC has been stably silenced to different levels of knockdown efficiency. We then combine genome-wide expression profiling with gene set enrichment analysis to identify cellular responses that correlate with the loss of LRPPRC. Using this strategy, we discovered a specific role for LRPPRC in the expression of all mitochondrial DNA-encoded mRNAs, but not the rRNAs, providing mechanistic insights into the enzymatic defects observed in the disease. Our analysis shows that nuclear genes encoding mitochondrial proteins are not collectively affected by the loss of LRPPRC. We do observe altered expression of genes related to hexose metabolism, prostaglandin synthesis, and glycosphingolipid biology that may either play an adaptive role in cell survival or contribute to pathogenesis. The combination of genetic perturbation, genomic profiling, and pathway analysis represents a generic strategy for understanding disease pathogenesis. PMID:20220140

  2. Comparative genomics of mitochondria in chlorarachniophyte algae: endosymbiotic gene transfer and organellar genome dynamics

    PubMed Central

    Tanifuji, Goro; Archibald, John M.; Hashimoto, Tetsuo

    2016-01-01

    Chlorarachniophyte algae possess four DNA-containing compartments per cell, the nucleus, mitochondrion, plastid and nucleomorph, the latter being a relic nucleus derived from a secondary endosymbiont. While the evolutionary dynamics of plastid and nucleomorph genomes have been investigated, a comparative investigation of mitochondrial genomes (mtDNAs) has not been carried out. We have sequenced the complete mtDNA of Lotharella oceanica and compared it to that of another chlorarachniophyte, Bigelowiella natans. The linear mtDNA of L. oceanica is 36.7 kbp in size and contains 35 protein genes, three rRNAs and 24 tRNAs. The codons GUG and UUG appear to be capable of acting as initiation codons in the chlorarachniophyte mtDNAs, in addition to AUG. Rpl16, rps4 and atp8 genes are missing in L.oceanica mtDNA, despite being present in B. natans mtDNA. We searched for, and found, mitochondrial rpl16 and rps4 genes with spliceosomal introns in the L. oceanica nuclear genome, indicating that mitochondrion-to-host-nucleus gene transfer occurred after the divergence of these two genera. Despite being of similar size and coding capacity, the level of synteny between L. oceanica and B. natans mtDNA is low, suggesting frequent rearrangements. Overall, our results suggest that chlorarachniophyte mtDNAs are more evolutionarily dynamic than their plastid counterparts. PMID:26888293

  3. Comparative genomics of mitochondria in chlorarachniophyte algae: endosymbiotic gene transfer and organellar genome dynamics

    NASA Astrophysics Data System (ADS)

    Tanifuji, Goro; Archibald, John M.; Hashimoto, Tetsuo

    2016-02-01

    Chlorarachniophyte algae possess four DNA-containing compartments per cell, the nucleus, mitochondrion, plastid and nucleomorph, the latter being a relic nucleus derived from a secondary endosymbiont. While the evolutionary dynamics of plastid and nucleomorph genomes have been investigated, a comparative investigation of mitochondrial genomes (mtDNAs) has not been carried out. We have sequenced the complete mtDNA of Lotharella oceanica and compared it to that of another chlorarachniophyte, Bigelowiella natans. The linear mtDNA of L. oceanica is 36.7 kbp in size and contains 35 protein genes, three rRNAs and 24 tRNAs. The codons GUG and UUG appear to be capable of acting as initiation codons in the chlorarachniophyte mtDNAs, in addition to AUG. Rpl16, rps4 and atp8 genes are missing in L.oceanica mtDNA, despite being present in B. natans mtDNA. We searched for, and found, mitochondrial rpl16 and rps4 genes with spliceosomal introns in the L. oceanica nuclear genome, indicating that mitochondrion-to-host-nucleus gene transfer occurred after the divergence of these two genera. Despite being of similar size and coding capacity, the level of synteny between L. oceanica and B. natans mtDNA is low, suggesting frequent rearrangements. Overall, our results suggest that chlorarachniophyte mtDNAs are more evolutionarily dynamic than their plastid counterparts.

  4. Comparative expression pathway analysis of human and canine mammary tumors

    PubMed Central

    Uva, Paolo; Aurisicchio, Luigi; Watters, James; Loboda, Andrey; Kulkarni, Amit; Castle, John; Palombo, Fabio; Viti, Valentina; Mesiti, Giuseppe; Zappulli, Valentina; Marconato, Laura; Abramo, Francesca; Ciliberto, Gennaro; Lahm, Armin; La Monica, Nicola; de Rinaldis, Emanuele

    2009-01-01

    Background Spontaneous tumors in dog have been demonstrated to share many features with their human counterparts, including relevant molecular targets, histological appearance, genetics, biological behavior and response to conventional treatments. Mammary tumors in dog therefore provide an attractive alternative to more classical mouse models, such as transgenics or xenografts, where the tumour is artificially induced. To assess the extent to which dog tumors represent clinically significant human phenotypes, we performed the first genome-wide comparative analysis of transcriptional changes occurring in mammary tumors of the two species, with particular focus on the molecular pathways involved. Results We analyzed human and dog gene expression data derived from both tumor and normal mammary samples. By analyzing the expression levels of about ten thousand dog/human orthologous genes we observed a significant overlap of genes deregulated in the mammary tumor samples, as compared to their normal counterparts. Pathway analysis of gene expression data revealed a great degree of similarity in the perturbation of many cancer-related pathways, including the 'PI3K/AKT', 'KRAS', 'PTEN', 'WNT-beta catenin' and 'MAPK cascade'. Moreover, we show that the transcriptional relationships between different gene signatures observed in human breast cancer are largely maintained in the canine model, suggesting a close interspecies similarity in the network of cancer signalling circuitries. Conclusion Our data confirm and further strengthen the value of the canine mammary cancer model and open up new perspectives for the evaluation of novel cancer therapeutics and the development of prognostic and diagnostic biomarkers to be used in clinical studies. PMID:19327144

  5. Genome Sequencing and Comparative Genomics of the Broad Host-Range Pathogen Rhizoctonia solani AG8

    PubMed Central

    Hane, James K.; Anderson, Jonathan P.; Williams, Angela H.; Sperschneider, Jana; Singh, Karam B.

    2014-01-01

    Rhizoctonia solani is a soil-borne basidiomycete fungus with a necrotrophic lifestyle which is classified into fourteen reproductively incompatible anastomosis groups (AGs). One of these, AG8, is a devastating pathogen causing bare patch of cereals, brassicas and legumes. R. solani is a multinucleate heterokaryon containing significant heterozygosity within a single cell. This complexity posed significant challenges for the assembly of its genome. We present a high quality genome assembly of R. solani AG8 and a manually curated set of 13,964 genes supported by RNA-seq. The AG8 genome assembly used novel methods to produce a haploid representation of its heterokaryotic state. The whole-genomes of AG8, the rice pathogen AG1-IA and the potato pathogen AG3 were observed to be syntenic and co-linear. Genes and functions putatively relevant to pathogenicity were highlighted by comparing AG8 to known pathogenicity genes, orthology databases spanning 197 phytopathogenic taxa and AG1-IA. We also observed SNP-level “hypermutation” of CpG dinucleotides to TpG between AG8 nuclei, with similarities to repeat-induced point mutation (RIP). Interestingly, gene-coding regions were widely affected along with repetitive DNA, which has not been previously observed for RIP in mononuclear fungi of the Pezizomycotina. The rate of heterozygous SNP mutations within this single isolate of AG8 was observed to be higher than SNP mutation rates observed across populations of most fungal species compared. Comparative analyses were combined to predict biological processes relevant to AG8 and 308 proteins with effector-like characteristics, forming a valuable resource for further study of this pathosystem. Predicted effector-like proteins had elevated levels of non-synonymous point mutations relative to synonymous mutations (dN/dS), suggesting that they may be under diversifying selection pressures. In addition, the distant relationship to sequenced necrotrophs of the Ascomycota suggests the

  6. Genome sequencing and comparative genomics of the broad host-range pathogen Rhizoctonia solani AG8.

    PubMed

    Hane, James K; Anderson, Jonathan P; Williams, Angela H; Sperschneider, Jana; Singh, Karam B

    2014-05-01

    Rhizoctonia solani is a soil-borne basidiomycete fungus with a necrotrophic lifestyle which is classified into fourteen reproductively incompatible anastomosis groups (AGs). One of these, AG8, is a devastating pathogen causing bare patch of cereals, brassicas and legumes. R. solani is a multinucleate heterokaryon containing significant heterozygosity within a single cell. This complexity posed significant challenges for the assembly of its genome. We present a high quality genome assembly of R. solani AG8 and a manually curated set of 13,964 genes supported by RNA-seq. The AG8 genome assembly used novel methods to produce a haploid representation of its heterokaryotic state. The whole-genomes of AG8, the rice pathogen AG1-IA and the potato pathogen AG3 were observed to be syntenic and co-linear. Genes and functions putatively relevant to pathogenicity were highlighted by comparing AG8 to known pathogenicity genes, orthology databases spanning 197 phytopathogenic taxa and AG1-IA. We also observed SNP-level "hypermutation" of CpG dinucleotides to TpG between AG8 nuclei, with similarities to repeat-induced point mutation (RIP). Interestingly, gene-coding regions were widely affected along with repetitive DNA, which has not been previously observed for RIP in mononuclear fungi of the Pezizomycotina. The rate of heterozygous SNP mutations within this single isolate of AG8 was observed to be higher than SNP mutation rates observed across populations of most fungal species compared. Comparative analyses were combined to predict biological processes relevant to AG8 and 308 proteins with effector-like characteristics, forming a valuable resource for further study of this pathosystem. Predicted effector-like proteins had elevated levels of non-synonymous point mutations relative to synonymous mutations (dN/dS), suggesting that they may be under diversifying selection pressures. In addition, the distant relationship to sequenced necrotrophs of the Ascomycota suggests the R

  7. ReplicationDomain: a visualization tool and comparative database for genome-wide replication timing data

    PubMed Central

    Weddington, Nodin; Stuy, Alexander; Hiratani, Ichiro; Ryba, Tyrone; Yokochi, Tomoki; Gilbert, David M

    2008-01-01

    Background Eukaryotic DNA replication is regulated at the level of large chromosomal domains (0.5–5 megabases in mammals) within which replicons are activated relatively synchronously. These domains replicate in a specific temporal order during S-phase and our genome-wide analyses of replication timing have demonstrated that this temporal order of domain replication is a stable property of specific cell types. Results We have developed ReplicationDomain as a web-based database for analysis of genome-wide replication timing maps (replication profiles) from various cell lines and species. This database also provides comparative information of transcriptional expression and is configured to display any genome-wide property (for instance, ChIP-Chip or ChIP-Seq data) via an interactive web interface. Our published microarray data sets are publicly available. Users may graphically display these data sets for a selected genomic region and download the data displayed as text files, or alternatively, download complete genome-wide data sets. Furthermore, we have implemented a user registration system that allows registered users to upload their own data sets. Upon uploading, registered users may choose to: (1) view their data sets privately without sharing; (2) share with other registered users; or (3) make their published or "in press" data sets publicly available, which can fulfill journal and funding agencies' requirements for data sharing. Conclusion ReplicationDomain is a novel and powerful tool to facilitate the comparative visualization of replication timing in various cell types as well as other genome-wide chromatin features and is considerably faster and more convenient than existing browsers when viewing multi-megabase segments of chromosomes. Furthermore, the data upload function with the option of private viewing or sharing of data sets between registered users should be a valuable resource for the scientific community. PMID:19077204

  8. Comparative genomics of xylose-fermenting fungi for enhanced biofuel production

    SciTech Connect

    Wohlbach, Dana J.; Kuo, Alan; Sato, Trey K.; Potts, Katlyn M.; Salamov, Asaf A.; LaButti, Kurt M.; Sun, Hui; Clum, Alicia; Pangilinan, Jasmyn L.; Lindquist, Erika A.; Lucas, Susan; Lapidus, Alla; Jin, Mingjie; Gunawan, Christa; Balan, Venkatesh; Dale, Bruce E.; Jeffries, Thomas W.; Zinkel, Robert; Barry, Kerrie W.; Grigoriev, Igor V.; Gasch, Audrey P.

    2011-02-24

    Cellulosic biomass is an abundant and underused substrate for biofuel production. The inability of many microbes to metabolize the pentose sugars abundant within hemicellulose creates specific challenges for microbial biofuel production from cellulosic material. Although engineered strains of Saccharomyces cerevisiae can use the pentose xylose, the fermentative capacity pales in comparison with glucose, limiting the economic feasibility of industrial fermentations. To better understand xylose utilization for subsequent microbial engineering, we sequenced the genomes of two xylose-fermenting, beetle-associated fungi, Spathaspora passalidarum and Candida tenuis. To identify genes involved in xylose metabolism, we applied a comparative genomic approach across 14 Ascomycete genomes, mapping phenotypes and genotypes onto the fungal phylogeny, and measured genomic expression across five Hemiascomycete species with different xylose-consumption phenotypes. This approach implicated many genes and processes involved in xylose assimilation. Several of these genes significantly improved xylose utilization when engineered into S. cerevisiae, demonstrating the power of comparative methods in rapidly identifying genes for biomass conversion while reflecting on fungal ecology.

  9. Integrated Genomic and Gene Expression Profiling Identifies Two Major Genomic Circuits in Urothelial Carcinoma

    PubMed Central

    Lindgren, David; Sjödahl, Gottfrid; Lauss, Martin; Staaf, Johan; Chebil, Gunilla; Lövgren, Kristina; Gudjonsson, Sigurdur; Liedberg, Fredrik; Patschan, Oliver; Månsson, Wiking; Fernö, Mårten; Höglund, Mattias

    2012-01-01

    Similar to other malignancies, urothelial carcinoma (UC) is characterized by specific recurrent chromosomal aberrations and gene mutations. However, the interconnection between specific genomic alterations, and how patterns of chromosomal alterations adhere to different molecular subgroups of UC, is less clear. We applied tiling resolution array CGH to 146 cases of UC and identified a number of regions harboring recurrent focal genomic amplifications and deletions. Several potential oncogenes were included in the amplified regions, including known oncogenes like E2F3, CCND1, and CCNE1, as well as new candidate genes, such as SETDB1 (1q21), and BCL2L1 (20q11). We next combined genome profiling with global gene expression, gene mutation, and protein expression data and identified two major genomic circuits operating in urothelial carcinoma. The first circuit was characterized by FGFR3 alterations, overexpression of CCND1, and 9q and CDKN2A deletions. The second circuit was defined by E3F3 amplifications and RB1 deletions, as well as gains of 5p, deletions at PTEN and 2q36, 16q, 20q, and elevated CDKN2A levels. TP53/MDM2 alterations were common for advanced tumors within the two circuits. Our data also suggest a possible RAS/RAF circuit. The tumors with worst prognosis showed a gene expression profile that indicated a keratinized phenotype. Taken together, our integrative approach revealed at least two separate networks of genomic alterations linked to the molecular diversity seen in UC, and that these circuits may reflect distinct pathways of tumor development. PMID:22685613

  10. Genome-wide Comparative Analysis of Annexin Superfamily in Plants

    PubMed Central

    Jami, Sravan Kumar; Clark, Greg B.; Ayele, Belay T.; Ashe, Paula; Kirti, Pulugurtha Bharadwaja

    2012-01-01

    Most annexins are calcium-dependent, phospholipid-binding proteins with suggested functions in response to environmental stresses and signaling during plant growth and development. They have previously been identified and characterized in Arabidopsis and rice, and constitute a multigene family in plants. In this study, we performed a comparative analysis of annexin gene families in the sequenced genomes of Viridiplantae ranging from unicellular green algae to multicellular plants, and identified 149 genes. Phylogenetic studies of these deduced annexins classified them into nine different arbitrary groups. The occurrence and distribution of bona fide type II calcium binding sites within the four annexin domains were found to be different in each of these groups. Analysis of chromosomal distribution of annexin genes in rice, Arabidopsis and poplar revealed their localization on various chromosomes with some members also found on duplicated chromosomal segments leading to gene family expansion. Analysis of gene structure suggests sequential or differential loss of introns during the evolution of land plant annexin genes. Intron positions and phases are well conserved in annexin genes from representative genomes ranging from Physcomitrella to higher plants. The occurrence of alternative motifs such as K/R/HGD was found to be overlapping or at the mutated regions of the type II calcium binding sites indicating potential functional divergence in certain plant annexins. This study provides a basis for further functional analysis and characterization of annexin multigene families in the plant lineage. PMID:23133603

  11. Abnormal expression of DNA methyltransferases and genomic imprinting in cloned goat fibroblasts.

    PubMed

    Wan, Yongjie; Deng, Mingtian; Zhang, Guomin; Ren, Caifang; Zhang, Hao; Zhang, Yanli; Wang, Lizhong; Wang, Feng

    2016-01-01

    Somatic cell nuclear transfer (SCNT) is a useful way to produce cloned animals. However, SCNT animals exhibit DNA methylation and genomic imprinting abnormalities. These abnormalities may be due to the faulty epigenetic reprogramming of donor cells. To investigate the consequence of SCNT on the genomic imprinting and global methylation in the donor cells, growth patterns and apoptosis of cloned goat fibroblast cells (CGFCs) at passage 7 were determined. Growth patterns in CGFCs were similar to the controls; however, the growth rate in log phase was lower and apoptosis in CGFCs were significantly higher (P < 0.01). In addition, quantitative expression analysis of three DNA methyltransferases (Dnmt) and two imprinted genes (H19, IGF2R) was conducted in CGFCs: Dnmt1 and Dnmt3b expression was significantly reduced (P < 0.01), and H19 expression was decreased sixfold (P < 0.01); however, the expression of Dnmt3a was unaltered and IGF2R expression was significantly increased (P < 0.05). Finally, we used bisulfite sequencing PCR to compare the DNA methylation patterns in differentially methylated regions (DMRs) of H19 and IGF2R. The DMRs of H19 (P < 0.01) and IGF2R (P < 0.01) were both highly methylated in CGFCs. These results indicate that the global genome might be hypomethylated. Moreover, there is an aberrant expression of imprinted genes and DMR methylation in CGFCs. PMID:26314395

  12. Characterization of Three Mycobacterium spp. with Potential Use in Bioremediation by Genome Sequencing and Comparative Genomics

    PubMed Central

    Das, Sarbashis; Pettersson, B.M. Fredrik; Behra, Phani Rama Krishna; Ramesh, Malavika; Dasgupta, Santanu; Bhattacharya, Alok; Kirsebom, Leif A.

    2015-01-01

    We provide the genome sequences of the type strains of the polychlorophenol-degrading Mycobacterium chlorophenolicum (DSM43826), the degrader of chlorinated aliphatics Mycobacterium chubuense (DSM44219) and Mycobacterium obuense (DSM44075) that has been tested for use in cancer immunotherapy. The genome sizes of M. chlorophenolicum, M. chubuense, and M. obuense are 6.93, 5.95, and 5.58 Mb with GC-contents of 68.4%, 69.2%, and 67.9%, respectively. Comparative genomic analysis revealed that 3,254 genes are common and we predicted approximately 250 genes acquired through horizontal gene transfer from different sources including proteobacteria. The data also showed that the biodegrading Mycobacterium spp. NBB4, also referred to as M. chubuense NBB4, is distantly related to the M. chubuense type strain and should be considered as a separate species, we suggest it to be named Mycobacterium ethylenense NBB4. Among different categories we identified genes with potential roles in: biodegradation of aromatic compounds and copper homeostasis. These are the first nonpathogenic Mycobacterium spp. found harboring genes involved in copper homeostasis. These findings would therefore provide insight into the role of this group of Mycobacterium spp. in bioremediation as well as the evolution of copper homeostasis within the Mycobacterium genus. PMID:26079817

  13. Establishing a framework for comparative analysis of genome sequences

    SciTech Connect

    Bansal, A.K.

    1995-06-01

    This paper describes a framework and a high-level language toolkit for comparative analysis of genome sequence alignment The framework integrates the information derived from multiple sequence alignment and phylogenetic tree (hypothetical tree of evolution) to derive new properties about sequences. Multiple sequence alignments are treated as an abstract data type. Abstract operations have been described to manipulate a multiple sequence alignment and to derive mutation related information from a phylogenetic tree by superimposing parsimonious analysis. The framework has been applied on protein alignments to derive constrained columns (in a multiple sequence alignment) that exhibit evolutionary pressure to preserve a common property in a column despite mutation. A Prolog toolkit based on the framework has been implemented and demonstrated on alignments containing 3000 sequences and 3904 columns.

  14. High-resolution genomic profiles of breast cancer cell lines assessed by tiling BAC array comparative genomic hybridization.

    PubMed

    Jönsson, Göran; Staaf, Johan; Olsson, Eleonor; Heidenblad, Markus; Vallon-Christersson, Johan; Osoegawa, Kazutoyo; de Jong, Pieter; Oredsson, Stina; Ringnér, Markus; Höglund, Mattias; Borg, Ake

    2007-06-01

    A BAC-array platform for comparative genomic hybridization was constructed from a library of 32,433 clones providing complete genome coverage, and evaluated by screening for DNA copy number changes in 10 breast cancer cell lines (BT474, MCF7, HCC1937, SK-BR-3, L56Br-C1, ZR-75-1, JIMT1, MDA-MB-231, MDA-MB-361, and HCC2218) and one cell line derived from fibrocystic disease of the breast (MCF10A). These were also characterized by gene expression analysis and found to represent all five recently described breast cancer subtypes using the "intrinsic gene set" and centroid correlation. Three cell lines, HCC1937 and L56BrC1 derived from BRCA1 mutation carriers and MDA-MB-231, were of basal-like subtype and characterized by a high frequency of low-level gains and losses of typical pattern, including limited deletions on 5q. Four estrogen receptor positive cell lines were of luminal A subtype and characterized by a different pattern of aberrations and high-level amplifications, including ERBB2 and other 17q amplicons in BT474 and MDA-MB-361. SK-BR-3 cells, characterized by a complex genome including ERBB2 amplification, massive high-level amplifications on 8q and a homozygous deletion of CDH1 at 16q22, had an expression signature closest to luminal B subtype. The effects of gene amplifications were verified by gene expression analysis to distinguish targeted genes from silent amplicon passengers. JIMT1, derived from an ERBB2 amplified trastuzumab resistant tumor, was of the ERBB2 subtype. Homozygous deletions included other known targets such as PTEN (HCC1937) and CDKN2A (MDA-MB-231, MCF10A), but also new candidate suppressor genes such as FUSSEL18 (HCC1937) and WDR11 (L56Br-C1) as well as regions without known genes. The tiling BAC-arrays constitute a powerful tool for high-resolution genomic profiling suitable for cancer research and clinical diagnostics. PMID:17334996

  15. Design of a Seven-Genome Escherichia coli Microarray for Comparative Genomic Profiling▿

    PubMed Central

    Willenbrock, Hanni; Petersen, Anne; Sekse, Camilla; Kiil, Kristoffer; Wasteson, Yngvild; Ussery, David W.

    2006-01-01

    We describe the design and evaluate the use of a high-density oligonucleotide microarray covering seven sequenced Escherichia coli genomes in addition to several sequenced E. coli plasmids, bacteriophages, pathogenicity islands, and virulence genes. Its utility is demonstrated for comparative genomic profiling of two unsequenced strains, O175:H16 D1 and O157:H7 3538 (Δstx2::cat) as well as two well-known control strains, K-12 W3110 and O157:H7 EDL933. By using fluorescently labeled genomic DNA to query the microarrays and subsequently analyze common virulence genes and phage elements and perform whole-genome comparisons, we observed that O175:H16 D1 is a K-12-like strain and confirmed that its φ3538 (Δstx2::cat) phage element originated from the E. coli 3538 (Δstx2::cat) strain, with which it shares a substantial proportion of phage elements. Moreover, a number of genes involved in DNA transfer and recombination was identified in both new strains, providing a likely explanation for their capability to transfer φ3538 (Δstx2::cat) between them. Analyses of control samples demonstrated that results using our custom-designed microarray were representative of the true biology, e.g., by confirming the presence of all known chromosomal phage elements as well as 98.8 and 97.7% of queried chromosomal genes for the two control strains. Finally, we demonstrate that use of spatial information, in terms of the physical chromosomal locations of probes, improves the analysis. PMID:16963574

  16. Comparative genomics of metabolic capacities of regulons controlled by cis-regulatory RNA motifs in bacteria

    PubMed Central

    2013-01-01

    Background In silico comparative genomics approaches have been efficiently used for functional prediction and reconstruction of metabolic and regulatory networks. Riboswitches are metabolite-sensing structures often found in bacterial mRNA leaders controlling gene expression on transcriptional or translational levels. An increasing number of riboswitches and other cis-regulatory RNAs have been recently classified into numerous RNA families in the Rfam database. High conservation of these RNA motifs provides a unique advantage for their genomic identification and comparative analysis. Results A comparative genomics approach implemented in the RegPredict tool was used for reconstruction and functional annotation of regulons controlled by RNAs from 43 Rfam families in diverse taxonomic groups of Bacteria. The inferred regulons include ~5200 cis-regulatory RNAs and more than 12000 target genes in 255 microbial genomes. All predicted RNA-regulated genes were classified into specific and overall functional categories. Analysis of taxonomic distribution of these categories allowed us to establish major functional preferences for each analyzed cis-regulatory RNA motif family. Overall, most RNA motif regulons showed predictable functional content in accordance with their experimentally established effector ligands. Our results suggest that some RNA motifs (including thiamin pyrophosphate and cobalamin riboswitches that control the cofactor metabolism) are widespread and likely originated from the last common ancestor of all bacteria. However, many more analyzed RNA motifs are restricted to a narrow taxonomic group of bacteria and likely represent more recent evolutionary innovations. Conclusions The reconstructed regulatory networks for major known RNA motifs substantially expand the existing knowledge of transcriptional regulation in bacteria. The inferred regulons can be used for genetic experiments, functional annotations of genes, metabolic reconstruction and

  17. Development of peanut EST (expressed sequence tag)-based genomic resources and tools

    Technology Transfer Automated Retrieval System (TEKTRAN)

    U.S. Peanut Genome Initiative (PGI) has widely recognized the need for peanut genome tools and resources development for mitigating peanut allergens and food safety. Genomics such as Expressed Sequence Tag (EST), microarray technologies, and whole genome sequencing provides robotic tools for profili...

  18. Expression genomics and drug development: towards predictive pharmacology.

    PubMed

    Liu, Edison T

    2005-02-01

    Expression genomics can be defined as the study of the dynamic transciptome and its regulatory elements. Technologies are available that can assess transcripts on a genome-wide scale over time and across many samples. This comprehensive and dynamic database is being used to decipher signalling pathways and to identify new biomarkers and targets. Biomarkers emerging from these studies have prognostic potential and can be used to predict therapeutic outcome. The multiplex nature of this approach not only telescopes the time to discovery, but also allows for detection of complex interactions. Taken together, these capabilities, if carefully used, can speed drug development, enhance the identification of potent drug combinations and identify patient populations that will benefit from these new drugs. PMID:15814022

  19. Genome organization and gene expression of saguaro cactus carmovirus.

    PubMed

    Weng, Z; Xiong, Z

    1997-03-01

    The complete sequence of the single-stranded, (+)-sense RNA genome of saguaro cactus carmovirus (SCV) has been determined. The 3879 nucleotide genome contains five open reading frames (ORFs). The 5'-proximal ORF encodes a 26 kDa protein (p26) and terminates with an amber codon which is readthrough into an in-frame p57 ORF to generate an 86 kDa fusion protein (p86). Two small, centrally located ORFs encode a 6 kDa protein (p6) and a 9 kDa protein (p9), respectively. The 3'-proximal ORF encodes a 37 kDa (p37) capsid protein (CP). Analysis of the nucleotide and predicted amino acid sequences supports the classification of SCV in the genus Carmovirus in the family Tombusviridae. All predicted SCV proteins are expressed in an in vitro translation system. SCV p26 and the readthrough fusion protein p86 are synthesized from the genomic RNA while p6, p9 and p37 CP ORFs at the 3' half of the genome are expressed from two subgenomic (sg) RNAs. The 5' termini of both sg RNAs have been mapped. The large 1614 nucleotide sg RNA contains the p6 and p9 ORFs as the first and the second ORFs respectively from its 5' end. It directs the synthesis of abundant p6 but a small amount of p9. While a synthetic transcript with the p9 ORF at the 5' end is a more efficient messenger for p9, no corresponding sg RNA has been identified in vivo. The smaller 1396 nucleotide sg RNA contains only the p37 ORF and directs the synthesis of SCV CP. PMID:9049400

  20. Automated comparative auditing of NCIT genomic roles using NCBI.

    PubMed

    Cohen, Barry; Oren, Marc; Min, Hua; Perl, Yehoshua; Halper, Michael

    2008-12-01

    Biomedical research has identified many human genes and various knowledge about them. The National Cancer Institute Thesaurus (NCIT) represents such knowledge as concepts and roles (relationships). Due to the rapid advances in this field, it is to be expected that the NCIT's Gene hierarchy will contain role errors. A comparative methodology to audit the Gene hierarchy with the use of the National Center for Biotechnology Information's (NCBI's) Entrez Gene database is presented. The two knowledge sources are accessed via a pair of Web crawlers to ensure up-to-date data. Our algorithms then compare the knowledge gathered from each, identify discrepancies that represent probable errors, and suggest corrective actions. The primary focus is on two kinds of gene-roles: (1) the chromosomal locations of genes, and (2) the biological processes in which genes play a role. Regarding chromosomal locations, the discrepancies revealed are striking and systematic, suggesting a structurally common origin. In regard to the biological processes, difficulties arise because genes frequently play roles in multiple processes, and processes may have many designations (such as synonymous terms). Our algorithms make use of the roles defined in the NCIT Biological Process hierarchy to uncover many probable gene-role errors in the NCIT. These results show that automated comparative auditing is a promising technique that can identify a large number of probable errors and corrections for them in a terminological genomic knowledge repository, thus facilitating its overall maintenance. PMID:18486558

  1. Genetic aberrations detected by comparative genomic hybridisation in vulvar cancers.

    PubMed

    Allen, D G; Hutchins, A-M; Hammet, F; White, D J; Scurry, J P; Tabrizi, S N; Garland, S M; Armes, J E

    2002-03-18

    Squamous cell carcinoma of the vulva is a disease of significant clinical importance, which arises in the presence or absence of human papillomavirus. We used comparative genomic hybridisation to document non-random chromosomal gains and losses within human papillomavirus positive and negative vulvar cancers. Gain of 3q was significantly more common in human papillomavirus-positive cancers compared to human papillomavirus-negative cancers. The smallest area of gain was 3q22-25, a chromosome region which is frequently gained in other human papillomavirus-related cancers. Chromosome 8q was more commonly gained in human papillomavirus-negative compared to human papillomavirus-positive cancers. 8q21 was the smallest region of gain, which has been identified in other, non-human papillomavirus-related cancers. Chromosome arms 3p and 11q were lost in both categories of vulvar cancer. This study has demonstrated chromosome locations important in the development of vulvar squamous cell carcinoma. Additionally, taken together with previous studies of human papillomavirus-positive cancers of other anogenital sites, the data indicate that one or more oncogenes important in the development and progression of human papillomavirus-induced carcinomas are located on 3q. The different genetic changes seen in human papillomavirus-positive and negative vulvar squamous cell carcinomas support the clinicopathological data indicating that these are different cancer types. PMID:11953825

  2. Automated Comparative Auditing of NCIT Genomic Roles Using NCBI

    PubMed Central

    Cohen, Barry; Oren, Marc; Min, Hua; Perl, Yehoshua; Halper, Michael

    2008-01-01

    Biomedical research has identified many human genes and various knowledge about them. The National Cancer Institute Thesaurus (NCIT) represents such knowledge as concepts and roles (relationships). Due to the rapid advances in this field, it is to be expected that the NCIT’s Gene hierarchy will contain role errors. A comparative methodology to audit the Gene hierarchy with the use of the National Center for Biotechnology Information’s (NCBI’s) Entrez Gene database is presented. The two knowledge sources are accessed via a pair of Web crawlers to ensure up-to-date data. Our algorithms then compare the knowledge gathered from each, identify discrepancies that represent probable errors, and suggest corrective actions. The primary focus is on two kinds of gene-roles: (1) the chromosomal locations of genes, and (2) the biological processes in which genes plays a role. Regarding chromosomal locations, the discrepancies revealed are striking and systematic, suggesting a structurally common origin. In regard to the biological processes, difficulties arise because genes frequently play roles in multiple processes, and processes may have many designations (such as synonymous terms). Our algorithms make use of the roles defined in the NCIT Biological Process hierarchy to uncover many probable gene-role errors in the NCIT. These results show that automated comparative auditing is a promising technique that can identify a large number of probable errors and corrections for them in a terminological genomic knowledge repository, thus facilitating its overall maintenance. PMID:18486558

  3. Comparative genomic analysis of hyperthermophilic archaeal fuselloviridae viruses

    SciTech Connect

    B. Wiedenheft; K. Stedman; F. Roberto; D. Willits; A. K. Gleske; L. Zoeller; J. Snyder; T. Douglas; M. Young

    2004-02-01

    The complete genome sequences of two Sulfolobus spindle-shaped viruses (SSVs) from acidic hot springs in Kamchatka (Russia) and Yellowstone National Park (United States) have been determined. These nonlytic temperate viruses were isolated from hyperthermophilic Sulfolobus hosts, and both viruses share the spindleshaped morphology characteristic of the Fuselloviridae family. These two genomes, in combination with the previously determined SSV1 genome from Japan and the SSV2 genome from Iceland, have allowed us to carry out a phylogenetic comparison of these geographically distributed hyperthermal viruses. Each virus contains a circular double-stranded DNA genome of _15 kbp with approximately 34 open reading frames (ORFs). These Fusellovirus ORFs show little or no similarity to genes in the public databases. In contrast, 18 ORFs are common to all four isolates and may represent the minimal gene set defining this viral group. In general, ORFs on one half of the genome are colinear and highly conserved, while ORFs on the other half are not. One shared ORF among all four genomes is an integrase of the tyrosine recombinase family. All four viral genomes integrate into their host tRNA genes. The specific tRNA gene used for integration varies, and one genome integrates into multiple loci. Several unique ORFs are found in the genome of each isolate.

  4. Comparative genomic analysis of hyperthermophilic archaeal Fuselloviridae viruses.

    PubMed

    Wiedenheft, Blake; Stedman, Kenneth; Roberto, Francisco; Willits, Deborah; Gleske, Anne-Kathrin; Zoeller, Luisa; Snyder, Jamie; Douglas, Trevor; Young, Mark

    2004-02-01

    The complete genome sequences of two Sulfolobus spindle-shaped viruses (SSVs) from acidic hot springs in Kamchatka (Russia) and Yellowstone National Park (United States) have been determined. These nonlytic temperate viruses were isolated from hyperthermophilic Sulfolobus hosts, and both viruses share the spindle-shaped morphology characteristic of the Fuselloviridae family. These two genomes, in combination with the previously determined SSV1 genome from Japan and the SSV2 genome from Iceland, have allowed us to carry out a phylogenetic comparison of these geographically distributed hyperthermal viruses. Each virus contains a circular double-stranded DNA genome of approximately 15 kbp with approximately 34 open reading frames (ORFs). These Fusellovirus ORFs show little or no similarity to genes in the public databases. In contrast, 18 ORFs are common to all four isolates and may represent the minimal gene set defining this viral group. In general, ORFs on one half of the genome are colinear and highly conserved, while ORFs on the other half are not. One shared ORF among all four genomes is an integrase of the tyrosine recombinase family. All four viral genomes integrate into their host tRNA genes. The specific tRNA gene used for integration varies, and one genome integrates into multiple loci. Several unique ORFs are found in the genome of each isolate. PMID:14747560

  5. Detection and mapping of amplified DNA sequences in breast cancer by comparative genomic hybridization

    SciTech Connect

    Kallioniemi, A.; Tanner, M.; Kallioniemi, O.P.; Piper, J.; Stokke, T.; Pinkel, D.; Gray, J.W.; Waldman, F.M.; Chen, L.; Smith, H.S.

    1994-03-15

    Comparative genomic hybridization was applied to 5 breast cancer cell lines and 33 primary tumors to discover and map regions of the genome with increased DNA-sequence copy-number. Two-thirds of primary tumors and almost all cell lines showed increased DNA-sequence copy-number affecting a total of 26 chromosomal subregions. Most of these loci were distinct from those of currently known amplified genes in breast cancer, with sequences originating from 17q22-q24 and 20q13 showing the highest frequency of amplification. The results indicate that these chromosomal regions may contain previously unknown genes whose increased expression contributes to breast cancer progression. Chromosomal regions with increased copy-number often spanned tens of Mb, suggesting involvement of more than one gene in each region.

  6. Phylogenetic Analysis and Comparative Genomics of Purine Riboswitch Distribution in Prokaryotes

    PubMed Central

    Singh, Payal; Sengupta, Supratim

    2012-01-01

    Riboswitches are regulatory RNA that control gene expression by undergoing conformational changes on ligand binding. Using phylogenetic analysis and comparative genomics we have been able to identify the class of genes/operons regulated by the purine riboswitch and obtain a high-resolution map of purine riboswitch distribution across all bacterial groups. In the process, we are able to explain the absence of purine riboswitches upstream to specific genes in certain genomes. We also identify the point of origin of various purine riboswitches and argue that not all purine riboswitches are of primordial origin, and that some purine riboswitches must have originated after the divergence of certain Firmicute orders in the course of evolution. Our study also reveals the role of horizontal transfer events in accounting for the presence of purine riboswitches in some gammaproteobacterial species. Our work provides significant insights into the origin, distribution and regulatory role of purine riboswitches in prokaryotes. PMID:23170063

  7. Comparative Genomic Analyses of the Human NPHP1 Locus Reveal Complex Genomic Architecture and Its Regional Evolution in Primates

    PubMed Central

    Yuan, Bo; Liu, Pengfei; Gupta, Aditya; Beck, Christine R.; Tejomurtula, Anusha; Campbell, Ian M.; Gambin, Tomasz; Simmons, Alexandra D.; Withers, Marjorie A.; Harris, R. Alan; Rogers, Jeffrey; Schwartz, David C.; Lupski, James R.

    2015-01-01

    Many loci in the human genome harbor complex genomic structures that can result in susceptibility to genomic rearrangements leading to various genomic disorders. Nephronophthisis 1 (NPHP1, MIM# 256100) is an autosomal recessive disorder that can be caused by defects of NPHP1; the gene maps within the human 2q13 region where low copy repeats (LCRs) are abundant. Loss of function of NPHP1 is responsible for approximately 85% of the NPHP1 cases—about 80% of such individuals carry a large recurrent homozygous NPHP1 deletion that occurs via nonallelic homologous recombination (NAHR) between two flanking directly oriented ~45 kb LCRs. Published data revealed a non-pathogenic inversion polymorphism involving the NPHP1 gene flanked by two inverted ~358 kb LCRs. Using optical mapping and array-comparative genomic hybridization, we identified three potential novel structural variant (SV) haplotypes at the NPHP1 locus that may protect a haploid genome from the NPHP1 deletion. Inter-species comparative genomic analyses among primate genomes revealed massive genomic changes during evolution. The aggregated data suggest that dynamic genomic rearrangements occurred historically within the NPHP1 locus and generated SV haplotypes observed in the human population today, which may confer differential susceptibility to genomic instability and the NPHP1 deletion within a personal genome. Our study documents diverse SV haplotypes at a complex LCR-laden human genomic region. Comparative analyses provide a model for how this complex region arose during primate evolution, and studies among humans suggest that intra-species polymorphism may potentially modulate an individual’s susceptibility to acquiring disease-associated alleles. PMID:26641089

  8. Array Comparative Genomic Hybridizations: Assessing the ability to recapture evolutionary relationships using an in silico approach

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Comparative Genomic Hybridization (CGH) with DNA microarrays has many biological applications including surveys of copy number changes in tumorigenesis, species detection and identification, and functional genomics studies among related organisms. Array CGH has also been used to infer phylogenetic r...

  9. A Glimpse of Streptococcal Toxic Shock Syndrome from Comparative Genomics of S. suis 2 Chinese Isolates

    PubMed Central

    Wang, Jing; Zheng, Feng; Pan, Xiuzhen; Liu, Di; Li, Ming; Song, Yajun; Zhu, Xinxing; Sun, Haibo; Feng, Tao; Guo, Zhaobiao; Ju, Aiping; Ge, Junchao; Dong, Yaqing; Sun, Wen; Jiang, Yongqiang; Wang, Jun; Yan, Jinghua; Yang, Huanming; Wang, Xiaoning; Gao, George F.; Yang, Ruifu; Wang, Jian; Yu, Jun

    2007-01-01

    Background Streptococcus suis serotype 2 (SS2) is an important zoonotic pathogen, causing more than 200 cases of severe human infection worldwide, with the hallmarks of meningitis, septicemia, arthritis, etc. Very recently, SS2 has been recognized as an etiological agent for streptococcal toxic shock syndrome (STSS), which was originally associated with Streptococcus pyogenes (GAS) in Streptococci. However, the molecular mechanisms underlying STSS are poorly understood. Methods and Findings To elucidate the genetic determinants of STSS caused by SS2, whole genome sequencing of 3 different Chinese SS2 strains was undertaken. Comparative genomics accompanied by several lines of experiments, including experimental animal infection, PCR assay, and expression analysis, were utilized to further dissect a candidate pathogenicity island (PAI). Here we show, for the first time, a novel molecular insight into Chinese isolates of highly invasive SS2, which caused two large-scale human STSS outbreaks in China. A candidate PAI of ∼89 kb in length, which is designated 89K and specific for Chinese SS2 virulent isolates, was investigated at the genomic level. It shares the universal properties of PAIs such as distinct GC content, consistent with its pivotal role in STSS and high virulence. Conclusions To our knowledge, this is the first PAI candidate from S. suis worldwide. Our finding thus sheds light on STSS triggered by SS2 at the genomic level, facilitates further understanding of its pathogenesis and points to directions of development on some effective strategies to combat highly pathogenic SS2 infections. PMID:17375201

  10. Evaluation of Apis mellifera syriaca Levant region honeybee conservation using comparative genome hybridization.

    PubMed

    Haddad, Nizar Jamal; Batainh, Ahmed; Saini, Deepti; Migdadi, Osama; Aiyaz, Mohamed; Manchiganti, Rushiraj; Krishnamurthy, Venkatesh; Al-Shagour, Banan; Brake, Mohammad; Bourgeois, Lelania; De Guzman, Lilia; Rinderer, Thomas; Hamouri, Zayed Mahoud

    2016-06-01

    Apis mellifera syriaca is the native honeybee subspecies of Jordan and much of the Levant region. It expresses behavioral adaptations to a regional climate with very high temperatures, nectar dearth in summer, attacks of the Oriental wasp and is resistant to Varroa mites. The A. m. syriaca control reference sample (CRS) in this study was originally collected and stored since 2001 from "Wadi Ben Hammad", a remote valley in the southern region of Jordan. Morphometric and mitochondrial DNA markers of these honeybees had shown highest similarity to reference A. m. syriaca samples collected in 1952 by Brother Adam of samples collected from the Middle East. Samples 1-5 were collected from the National Center for Agricultural Research and Extension breeding apiary which was established for the conservation of A. m. syriaca. Our objective was to determine the success of an A. m. syriaca honey bee conservation program using genomic information from an array-based comparative genomic hybridization platform to evaluate genetic similarities to a historic reference collection (CRS). Our results had shown insignificant genomic differences between the current population in the conservation program and the CRS indicated that program is successfully conserving A. m. syriaca. Functional genomic variations were identified which are useful for conservation monitoring and may be useful for breeding programs designed to improve locally adapted strains of A. m. syriaca. PMID:27010806

  11. Comparative genome sequencing of drosophila pseudoobscura: Chromosomal, gene and cis-element evolution

    SciTech Connect

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Todd, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catherine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenee; Verduzco, Daniel; Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

    2004-04-01

    The genome sequence of a second fruit fly, D. pseudoobscura, presents an opportunity for comparative analysis of a primary model organism D. melanogaster. The vast majority of Drosophila genes have remained on the same arm, but within each arm gene order has been extensively reshuffled leading to the identification of approximately 1300 syntenic blocks. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 35 My since divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome wide average consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than control sequences between the species but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a picture of repeat mediated chromosomal rearrangement, and high co-adaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila.

  12. The non-human primate reference transcriptome resource (NHPRTR) for comparative functional genomics

    PubMed Central

    Pipes, Lenore; Li, Sheng; Bozinoski, Marjan; Palermo, Robert; Peng, Xinxia; Blood, Phillip; Kelly, Sara; Weiss, Jeffrey M.; Thierry-Mieg, Jean; Thierry-Mieg, Danielle; Zumbo, Paul; Chen, Ronghua; Schroth, Gary P.; Mason, Christopher E.; Katze, Michael G.

    2013-01-01

    RNA-based next-generation sequencing (RNA-Seq) provides a tremendous amount of new information regarding gene and transcript structure, expression and regulation. This is particularly true for non-coding RNAs where whole transcriptome analyses have revealed that the much of the genome is transcribed and that many non-coding transcripts have widespread functionality. However, uniform resources for raw, cleaned and processed RNA-Seq data are sparse for most organisms and this is especially true for non-human primates (NHPs). Here, we describe a large-scale RNA-Seq data and analysis infrastructure, the NHP reference transcriptome resource (http://nhprtr.org); it presently hosts data from12 species of primates, to be expanded to 15 species/subspecies spanning great apes, old world monkeys, new world monkeys and prosimians. Data are collected for each species using pools of RNA from comparable tissues. We provide data access in advance of its deposition at NCBI, as well as browsable tracks of alignments against the human genome using the UCSC genome browser. This resource will continue to host additional RNA-Seq data, alignments and assemblies as they are generated over the coming years and provide a key resource for the annotation of NHP genomes as well as informing primate studies on evolution, reproduction, infection, immunity and pharmacology. PMID:23203872

  13. Expression profiling and comparative sequence derived insights into lipid metabolism

    SciTech Connect

    Callow, Matthew J.; Rubin, Edward M.

    2001-12-19

    Expression profiling and genomic DNA sequence comparisons are increasingly being applied to the identification and analysis of the genes involved in lipid metabolism. Not only has genome-wide expression profiling aided in the identification of novel genes involved in important processes in lipid metabolism such as sterol efflux, but the utilization of information from these studies has added to our understanding of the regulation of pathways participating in the process. Coupled with these gene expression studies, cross species comparison, searching for sequences conserved through evolution, has proven to be a powerful tool to identify important non-coding regulatory sequences as well as the discovery of novel genes relevant to lipid biology. An example of the value of this approach was the recent chance discovery of a new apolipoprotein gene (apo AV) that has dramatic effects upon triglyceride metabolism in mice and humans.

  14. Comparative in-silico genome analysis of Leishmania (Leishmania) donovani: A step towards its species specificity

    PubMed Central

    S., Satheesh Kumar; R.K., Gokulasuriyan; Ghosh, Monidipa

    2014-01-01

    Comparative genome analysis of recently sequenced Leishmania (L.) donovani was unexplored so far. The present study deals with the complete scanning of L. (L.) donovani genome revealing its interspecies variations. 60 distinctly present genes in L. (L.) donovani were identified when the whole genome was compared with Leishmania (L.) infantum. Similarly 72, 159, and 265 species specific genes were identified in L. (L.) donovani when compared to Leishmania (L.) major, Leishmania (L.) mexicana and Leishmania (Viannia) braziliensis respectively. The cross comparison of L. (L.) donovani in parallel with the other sequenced species of leishmanial led to the identification of 55 genes which are highly specific and expressed exclusively in L. (L.) donovani. We found mainly the discrepancies of surface proteins such as amastins, proteases, and peptidases. Also 415 repeat containing proteins in L. (L.) donovani and their differential distribution in other leishmanial species were identified which might have a potential role during pathogenesis. The genes identified can be evaluated as drug targets for anti-leishmanial treatment, exploring the scope for extensive future investigations. PMID:25606461

  15. The HLA genomic loci map: expression, interaction, diversity and disease.

    PubMed

    Shiina, Takashi; Hosomichi, Kazuyoshi; Inoko, Hidetoshi; Kulski, Jerzy K

    2009-01-01

    The human leukocyte antigen (HLA) super-locus is a genomic region in the chromosomal position 6p21 that encodes the six classical transplantation HLA genes and at least 132 protein coding genes that have important roles in the regulation of the immune system as well as some other fundamental molecular and cellular processes. This small segment of the human genome has been associated with more than 100 different diseases, including common diseases, such as diabetes, rheumatoid arthritis, psoriasis, asthma and various other autoimmune disorders. The first complete and continuous HLA 3.6 Mb genomic sequence was reported in 1999 with the annotation of 224 gene loci, including coding and non-coding genes that were reviewed extensively in 2004. In this review, we present (1) an updated list of all the HLA gene symbols, gene names, expression status, Online Mendelian Inheritance in Man (OMIM) numbers, including new genes, and latest changes to gene names and symbols, (2) a regional analysis of the extended class I, class I, class III, class II and extended class II subregions, (3) a summary of the interspersed repeats (retrotransposons and transposons), (4) examples of the sequence diversity between different HLA haplotypes, (5) intra- and extra-HLA gene interactions and (6) some of the HLA gene expression profiles and HLA genes associated with autoimmune and infectious diseases. Overall, the degrees and types of HLA super-locus coordinated gene expression profiles and gene variations have yet to be fully elucidated, integrated and defined for the processes involved with normal cellular and tissue physiology, inflammatory and immune responses, and autoimmune and infectious diseases. PMID:19158813

  16. Comparison of genomic abnormalities between BRCAX and sporadic breast cancers studied by comparative genomic hybridization.

    PubMed

    Gronwald, Jacek; Jauch, Anna; Cybulski, Cezary; Schoell, Brigitte; Böhm-Steuer, Barbara; Lener, Marcin; Grabowska, Ewa; Górski, Bohdan; Jakubowska, Anna; Domagała, Wenancjusz; Chosia, Maria; Scott, Rodney J; Lubiński, Jan

    2005-03-20

    Very little is known about the chromosomal regions harbouring genes involved in initiation and progression of BRCAX-associated breast cancers. We applied comparative genomic hybridization (CGH) to identify the most frequent genomic imbalances in 18 BRCAX hereditary breast cancers and compared them to chromosomal aberrations detected in a group of 27 sporadic breast cancers. The aberrations observed most frequently in BRCAX tumours were gains of 8q (83%), 19q (67%), 19p (61%), 20q (61%), 1q (56%), 17q (56%) and losses of 8p (56%), 11q (44%) and 13q (33%). The sporadic cases most frequently showed gains of 1q (67%), 8q (48%), 17q (37%), 16p (33%), 19q (33%) and losses of 11q (26%), 8p (22%) and 16q (19%). Losses of 8p and gains 8q, 19 as well as gains of 20q (with respect to ductal tumours only) were detected significantly more often in BRCAX than in sporadic breast cancers. Analysis of 8p-losses and 8q-gains showed that these aberrations are early events in the tumorigenesis of BRCAX tumors. The findings of this report indicate similarities between BRCAX and BRCA2 tumours, possibly suggesting a common pathway of disease. These findings need confirmation by more extensive studies because only a limited number of cases were analysed and there are relatively few reports published. PMID:15540206

  17. Expanding the repertoire of secretory peptides controlling root development with comparative genome analysis and functional assays

    PubMed Central

    Ghorbani, Sarieh; Lin, Yao-Cheng; Parizot, Boris; Fernandez, Ana; Njo, Maria Fransiska; Van de Peer, Yves; Beeckman, Tom; Hilson, Pierre

    2015-01-01

    Plant genomes encode numerous small secretory peptides (SSPs) whose functions have yet to be explored. Based on structural features that characterize SSP families known to take part in postembryonic development, this comparative genome analysis resulted in the identification of genes coding for oligopeptides potentially involved in cell-to-cell communication. Because genome annotation based on short sequence homology is difficult, the criteria for the de novo identification and aggregation of conserved SSP sequences were first benchmarked across five reference plant species. The resulting gene families were then extended to 32 genome sequences, including major crops. The global phylogenetic pattern common to the functionally characterized SSP families suggests that their apparition and expansion coincide with that of the land plants. The SSP families can be searched online for members, sequences and consensus (http://bioinformatics.psb.ugent.be/webtools/PlantSSP/). Looking for putative regulators of root development, Arabidopsis thaliana SSP genes were further selected through transcriptome meta-analysis based on their expression at specific stages and in specific cell types in the course of the lateral root formation. As an additional indication that formerly uncharacterized SSPs may control development, this study showed that root growth and branching were altered by the application of synthetic peptides matching conserved SSP motifs, sometimes in very specific ways. The strategy used in the study, combining comparative genomics, transcriptome meta-analysis and peptide functional assays in planta, pinpoints factors potentially involved in non-cell-autonomous regulatory mechanisms. A similar approach can be implemented in different species for the study of a wide range of developmental programmes. PMID:26195730

  18. Comparative paleogenomics of crucifers: ancestral genomic blocks revisited.

    PubMed

    Lysak, Martin A; Mandáková, Terezie; Schranz, M Eric

    2016-04-01

    A decade ago the concept of the Ancestral Crucifer Karyotype (ACK) and the definition of 24 conserved genomic blocks was presented. Subsequently, 35 cytogenetic reconstructions and/or draft genome sequences of crucifer species (members of the Brassicaceae family) have been analyzed in the context of this system; placing crucifers at the forefront of plant phylogenomics. In this review, we highlight how the ACK and genomic blocks have facilitated and guided genomic analysis of crucifers in the last 10 years and provide an update of this robust model. PMID:26945766

  19. CrusView: a Java-based visualization platform for comparative genomics analyses in Brassicaceae species.

    PubMed

    Chen, Hao; Wang, Xiangfeng

    2013-09-01

    In plants and animals, chromosomal breakage and fusion events based on conserved syntenic genomic blocks lead to conserved patterns of karyotype evolution among species of the same family. However, karyotype information has not been well utilized in genomic comparison studies. We present CrusView, a Java-based bioinformatic application utilizing Standard Widget Toolkit/Swing graphics libraries and a SQLite database for performing visualized analyses of comparative genomics data in Brassicaceae (crucifer) plants. Compared with similar software and databases, one of the unique features of CrusView is its integration of karyotype information when comparing two genomes. This feature allows users to perform karyotype-based genome assembly and karyotype-assisted genome synteny analyses with preset karyotype patterns of the Brassicaceae genomes. Additionally, CrusView is a local program, which gives its users high flexibility when analyzing unpublished genomes and allows users to upload self-defined genomic information so that they can visually study the associations between genome structural variations and genetic elements, including chromosomal rearrangements, genomic macrosynteny, gene families, high-frequency recombination sites, and tandem and segmental duplications between related species. This tool will greatly facilitate karyotype, chromosome, and genome evolution studies using visualized comparative genomics approaches in Brassicaceae species. CrusView is freely available at http://www.cmbb.arizona.edu/CrusView/. PMID:23898041

  20. Profiling of chicken adipose tissue gene expression by genome array

    PubMed Central

    Wang, Hong-Bao; Li, Hui; Wang, Qi-Gui; Zhang, Xin-Yu; Wang, Shou-Zhi; Wang, Yu-Xiang; Wang, Xiu-Ping

    2007-01-01

    Background Excessive accumulation of lipids in the adipose tissue is a major problem in the present-day broiler industry. However, few studies have analyzed the expression of adipose tissue genes that are involved in pathways and mechanisms leading to adiposity in chickens. Gene expression profiling of chicken adipose tissue could provide key information about the ontogenesis of fatness and clarify the molecular mechanisms underlying obesity. In this study, Chicken Genome Arrays were used to construct an adipose tissue gene expression profile of 7-week-old broilers, and to screen adipose tissue genes that are differentially expressed in lean and fat lines divergently selected over eight generations for high and low abdominal fat weight. Results The gene expression profiles detected 13,234–16,858 probe sets in chicken adipose tissue at 7 weeks, and genes involved in lipid metabolism and immunity such as fatty acid binding protein (FABP), thyroid hormone-responsive protein (Spot14), lipoprotein lipase(LPL), insulin-like growth factor binding protein 7(IGFBP7) and major histocompatibility complex (MHC), were highly expressed. In contrast, some genes related to lipogenesis, such as leptin receptor, sterol regulatory element binding proteins1 (SREBP1), apolipoprotein B(ApoB) and insulin-like growth factor 2(IGF2), were not detected. Moreover, 230 genes that were differentially expressed between the two lines were screened out; these were mainly involved in lipid metabolism, signal transduction, energy metabolism, tumorigenesis and immunity. Subsequently, real-time RT-PCR was performed to validate fifteen differentially expressed genes screened out by the microarray approach and high consistency was observed between the two methods. Conclusion Our results establish the groundwork for further studies of the basic genetic control of growth and development of chicken adipose tissue, and will be beneficial in clarifying the molecular mechanism of obesity in chickens. PMID

  1. Complete nucleotide sequence of the Cryptomeria japonica D. Don. chloroplast genome and comparative chloroplast genomics: diversified genomic structure of coniferous species

    PubMed Central

    Hirao, Tomonori; Watanabe, Atsushi; Kurita, Manabu; Kondo, Teiji; Takata, Katsuhiko

    2008-01-01

    Background The recent determination of complete chloroplast (cp) genomic sequences of various plant species has enabled numerous comparative analyses as well as advances in plant and genome evolutionary studies. In angiosperms, the complete cp genome sequences of about 70 species have been determined, whereas those of only three gymnosperm species, Cycas taitungensis, Pinus thunbergii, and Pinus koraiensis have been established. The lack of information regarding the gene content and genomic structure of gymnosperm cp genomes may severely hamper further progress of plant and cp genome evolutionary studies. To address this need, we report here the complete nucleotide sequence of the cp genome of Cryptomeria japonica, the first in the Cupressaceae sensu lato of gymnosperms, and provide a comparative analysis of their gene content and genomic structure that illustrates the unique genomic features of gymnosperms. Results The C. japonica cp genome is 131,810 bp in length, with 112 single copy genes and two duplicated (trnI-CAU, trnQ-UUG) genes that give a total of 116 genes. Compared to other land plant cp genomes, the C. japonica cp has lost one of the relevant large inverted repeats (IRs) found in angiosperms, fern, liverwort, and gymnosperms, such as Cycas and Gingko, and additionally has completely lost its trnR-CCG, partially lost its trnT-GGU, and shows diversification of accD. The genomic structure of the C. japonica cp genome also differs significantly from those of other plant species. For example, we estimate that a minimum of 15 inversions would be required to transform the gene organization of the Pinus thunbergii cp genome into that of C. japonica. In the C. japonica cp genome, direct repeat and inverted repeat sequences are observed at the inversion and translocation endpoints, and these sequences may be associated with the genomic rearrangements. Conclusion The observed differences in genomic structure between C. japonica and other land plants, including

  2. Genome-scale long noncoding RNA expression pattern in squamous cell lung cancer

    PubMed Central

    Wang, Ying; Qian, Chen-Yue; Li, Xiang-Ping; Zhang, Yu; He, Hui; Wang, Jing; Chen, Juan; Cui, Jia-Jia; Liu, Rong; Zhou, Hui; Xiao, Lin; Xu, Xiao-Jing; Zheng, Yi; Fu, Yi-Lan; Chen, Zi-Yu; Chen, Xiang; Zhang, Wei; Ye, Cheng-Cheng; Zhou, Hong-Hao; Yin, Ji-Ye; Liu, Zhao-Qian

    2015-01-01

    In this study, we aimed to explore the long noncoding RNA expression pattern in squamous cell lung cancer (SQCC) on a genome-wide scale. Total RNAs were extracted from 16 lung SQCC patients’ normal and matched lung cancer tissues by Trizol reagent. The expression level of genome-wide scale lncRNA and mRNA was determined by microarray. qRT-PCR was used to validate the lncRNA expression level in 47 patients. Data analyses were performed using R and Bioconductor. A total of 2,748 up and 852 down regulated probes were identified to be significantly and differentially expressed in tumor tissues. The annotation result of their co-expressed mRNAs showed that the most significantly related category of GO analysis was development and differentiation, while the most significantly related pathway was cell cycle. Subgroup analysis identified that 46 and 18 probes were specifically differentially expressed in smoking and moderately differentiated tumors, respectively. Our study indicated that clusters of lncRNAs were significantly and differentially expressed in SQCC compared with normal tissues in the same subject. They may exert a significant role in lung cancer development and could be potential targets for future treatment of SQCC. PMID:26159226

  3. Genomic Characterization of Prenatally Detected Chromosomal Structural Abnormalities Using Oligonucleotide Array Comparative Genomic Hybridization

    PubMed Central

    Li, Peining; Pomianowski, Pawel; DiMaio, Miriam S.; Florio, Joanne R.; Rossi, Michael R.; Xiang, Bixia; Xu, Fang; Yang, Hui; Geng, Qian; Xie, Jiansheng; Mahoney, Maurice J.

    2013-01-01

    Detection of chromosomal structural abnormalities using conventional cytogenetic methods poses a challenge for prenatal genetic counseling due to unpredictable clinical outcomes and risk of recurrence. Of the 1,726 prenatal cases in a 3-year period, we performed oligonucleotide array comparative genomic hybridization (aCGH) analysis on 11 cases detected with various structural chromosomal abnormalities. In nine cases, genomic aberrations and gene contents involving a 3p distal deletion, a marker chromosome from chromosome 4, a derivative chromosome 5 from a 5p/7q translocation, a de novo distal 6q deletion, a recombinant chromosome 8 comprised of an 8p duplication and an 8q deletion, an extra derivative chromosome 9 from an 8p/9q translocation, mosaicism for chromosome 12q with added material of initially unknown origin, an unbalanced 13q/15q rearrangement, and a distal 18q duplication and deletion were delineated. An absence of pathogenic copy number changes was noted in one case with a de novo 11q/14q translocation and in another with a familial insertion of 21q into a 19q. Genomic characterization of the structural abnormalities aided in the prediction of clinical outcomes. These results demonstrated the value of aCGH analysis in prenatal cases with subtle or complex chromosomal rearrangements. Furthermore, a retrospective analysis of clinical indications of our prenatal cases showed that approximately 20% of them had abnormal ultrasound findings and should be considered as high risk pregnancies for a combined chromosome and aCGH analysis. PMID:21671377

  4. Genome Based Phylogeny and Comparative Genomic Analysis of Intra-Mammary Pathogenic Escherichia coli

    PubMed Central

    Richards, Vincent P.; Lefébure, Tristan; Pavinski Bitar, Paulina D.; Dogan, Belgin; Simpson, Kenneth W.; Schukken, Ynte H.; Stanhope, Michael J.

    2015-01-01

    Escherichia coli is an important cause of bovine mastitis and can cause both severe inflammation with a short-term transient infection, as well as less severe, but more chronic inflammation and infection persistence. E. coli is a highly diverse organism that has been classified into a number of different pathotypes or pathovars, and mammary pathogenic E. coli (MPEC) has been proposed as a new such pathotype. The purpose of this study was to use genome sequence data derived from both transient and persistent MPEC isolates (two isolates of each phenotype) to construct a genome-based phylogeny that places MPEC in its phylogenetic context with other E. coli pathovars. A subsidiary goal was to conduct comparative genomic analyses of these MPEC isolates with other E. coli pathovars to provide a preliminary perspective on loci that might be correlated with the MPEC phenotype. Both concatenated and consensus tree phylogenies did not support MPEC monophyly or the monophyly of either transient or persistent phenotypes. Three of the MPEC isolates (ECA-727, ECC-Z, and ECA-O157) originated from within the predominately commensal clade of E. coli, referred to as phylogroup A. The fourth MPEC isolate, of the persistent phenotype (ECC-1470), was sister group to an isolate of ETEC, falling within the E. coli B1 clade. This suggests that the MPEC phenotype has arisen on numerous independent occasions and that this has often, although not invariably, occurred from commensal ancestry. Examination of the genes present in the MPEC strains relative to the commensal strains identified a consistent presence of the type VI secretion system (T6SS) in the MPEC strains, with only occasional representation in commensal strains, suggesting that T6SS may be associated with MPEC pathogenesis and/or as an inter-bacterial competitive attribute and therefore could represent a useful target to explore for the development of MPEC specific inhibitors. PMID:25807497

  5. Genome Stability of Lyme Disease Spirochetes: Comparative Genomics of Borrelia burgdorferi Plasmids

    PubMed Central

    Casjens, Sherwood R.; Mongodin, Emmanuel F.; Qiu, Wei-Gang; Luft, Benjamin J.; Schutzer, Steven E.; Gilcrease, Eddie B.; Huang, Wai Mun; Vujadinovic, Marija; Aron, John K.; Vargas, Levy C.; Freeman, Sam; Radune, Diana; Weidman, Janice F.; Dimitrov, George I.; Khouri, Hoda M.; Sosa, Julia E.; Halpin, Rebecca A.; Dunn, John J.; Fraser, Claire M.

    2012-01-01

    Lyme disease is the most common tick-borne human illness in North America. In order to understand the molecular pathogenesis, natural diversity, population structure and epizootic spread of the North American Lyme agent, Borrelia burgdorferi sensu stricto, a much better understanding of the natural diversity of its genome will be required. Towards this end we present a comparative analysis of the nucleotide sequences of the numerous plasmids of B. burgdorferi isolates B31, N40, JD1 and 297. These strains were chosen because they include the three most commonly studied laboratory strains, and because they represent different major genetic lineages and so are informative regarding the genetic diversity and evolution of this organism. A unique feature of Borrelia genomes is that they carry a large number of linear and circular plasmids, and this work shows that strains N40, JD1, 297 and B31 carry related but non-identical sets of 16, 20, 19 and 21 plasmids, respectively, that comprise 33–40% of their genomes. We deduce that there are at least 28 plasmid compatibility types among the four strains. The B. burgdorferi ∼900 Kbp linear chromosomes are evolutionarily exceptionally stable, except for a short ≤20 Kbp plasmid-like section at the right end. A few of the plasmids, including the linear lp54 and circular cp26, are also very stable. We show here that the other plasmids, especially the linear ones, are considerably more variable. Nearly all of the linear plasmids have undergone one or more substantial inter-plasmid rearrangements since their last common ancestor. In spite of these rearrangements and differences in plasmid contents, the overall gene complement of the different isolates has remained relatively constant. PMID:22432010

  6. Genome Stability of Lyme Disease Spirochetes: Comparative Genomics of Borrelia burgdorferi Plasmids

    SciTech Connect

    Casjens S. R.; Dunn J.; Mongodin, E. F.; Qiu, W.-G.; Luft, B. J.; Schutzer, S. E.; Gilcrease, E. B.; Huang, W. M.; Vujadinovic, M.; Aron, J. K.; Vargas, L. C.; Freeman, S.; Radune, D.; Weidman, J. F.; Dimitrov, G. I.; Khouri, H. M.; Sosa, J. E.; Halpin, R. A.; Fraser, C. M.

    2012-03-14

    Lyme disease is the most common tick-borne human illness in North America. In order to understand the molecular pathogenesis, natural diversity, population structure and epizootic spread of the North American Lyme agent, Borrelia burgdorferi sensu stricto, a much better understanding of the natural diversity of its genome will be required. Towards this end we present a comparative analysis of the nucleotide sequences of the numerous plasmids of B. burgdorferi isolates B31, N40, JD1 and 297. These strains were chosen because they include the three most commonly studied laboratory strains, and because they represent different major genetic lineages and so are informative regarding the genetic diversity and evolution of this organism. A unique feature of Borrelia genomes is that they carry a large number of linear and circular plasmids, and this work shows that strains N40, JD1, 297 and B31 carry related but non-identical sets of 16, 20, 19 and 21 plasmids, respectively, that comprise 33-40% of their genomes. We deduce that there are at least 28 plasmid compatibility types among the four strains. The B. burgdorferi {approx}900 Kbp linear chromosomes are evolutionarily exceptionally stable, except for a short {le}20 Kbp plasmid-like section at the right end. A few of the plasmids, including the linear lp54 and circular cp26, are also very stable. We show here that the other plasmids, especially the linear ones, are considerably more variable. Nearly all of the linear plasmids have undergone one or more substantial inter-plasmid rearrangements since their last common ancestor. In spite of these rearrangements and differences in plasmid contents, the overall gene complement of the different isolates has remained relatively constant.

  7. PlantGDB: A Resource for Comparative Plant Genomics

    Technology Transfer Automated Retrieval System (TEKTRAN)

    PlantGDB (http://www.plantgdb.org/) is a genomics database encompassing sequence data for green plants (Viridiplantae). PlantGDB provides annotated transcript assemblies for >100 plant species, with transcripts mapped to their cognate genomic context where available, integrated with a variety of seq...

  8. Comparative Genomic Analysis of Meningitis- and Bacteremia-Causing Pneumococci Identifies a Common Core Genome.

    PubMed

    Kulohoma, Benard W; Cornick, Jennifer E; Chaguza, Chrispin; Yalcin, Feyruz; Harris, Simon R; Gray, Katherine J; Kiran, Anmol M; Molyneux, Elizabeth; French, Neil; Parkhill, Julian; Faragher, Brian E; Everett, Dean B; Bentley, Stephen D; Heyderman, Robert S

    2015-10-01

    Streptococcus pneumoniae is a nasopharyngeal commensal that occasionally invades normally sterile sites to cause bloodstream infection and meningitis. Although the pneumococcal population structure and evolutionary genetics are well defined, it is not clear whether pneumococci that cause meningitis are genetically distinct from those that do not. Here, we used whole-genome sequencing of 140 isolates of S. pneumoniae recovered from bloodstream infection (n = 70) and meningitis (n = 70) to compare their genetic contents. By fitting a double-exponential decaying-function model, we show that these isolates share a core of 1,427 genes (95% confidence interval [CI], 1,425 to 1,435 genes) and that there is no difference in the core genome or accessory gene content from these disease manifestations. Gene presence/absence alone therefore does not explain the virulence behavior of pneumococci that reach the meninges. Our analysis, however, supports the requirement of a range of previously described virulence factors and vaccine candidates for both meningitis- and bacteremia-causing pneumococci. This high-resolution view suggests that, despite considerable competency for genetic exchange, all pneumococci are under considerable pressure to retain key components advantageous for colonization and transmission and that these components are essential for access to and survival in sterile sites. PMID:26259813

  9. Rapid Identification of Potential Drugs for Diabetic Nephropathy Using Whole-Genome Expression Profiles of Glomeruli

    PubMed Central

    Shi, Jingsong; Jiang, Song; Qiu, Dandan; Le, Weibo; Wang, Xiao; Lu, Yinhui; Liu, Zhihong

    2016-01-01

    Objective. To investigate potential drugs for diabetic nephropathy (DN) using whole-genome expression profiles and the Connectivity Map (CMAP). Methodology. Eighteen Chinese Han DN patients and six normal controls were included in this study. Whole-genome expression profiles of microdissected glomeruli were measured using the Affymetrix human U133 plus 2.0 chip. Differentially expressed genes (DEGs) between late stage and early stage DN samples and the CMAP database were used to identify potential drugs for DN using bioinformatics methods. Results. (1) A total of 1065 DEGs (FDR < 0.05 and fold change > 1.5) were found in late stage DN patients compared with early stage DN patients. (2) Piperlongumine, 15d-PGJ2 (15-delta prostaglandin J2), vorinostat, and trichostatin A were predicted to be the most promising potential drugs for DN, acting as NF-κB inhibitors, histone deacetylase inhibitors (HDACIs), PI3K pathway inhibitors, or PPARγ agonists, respectively. Conclusion. Using whole-genome expression profiles and the CMAP database, we rapidly predicted potential DN drugs, and therapeutic potential was confirmed by previously published studies. Animal experiments and clinical trials are needed to confirm both the safety and efficacy of these drugs in the treatment of DN. PMID:27069916

  10. Sequence and comparative genomic analysis of actin-related proteins.

    PubMed

    Muller, Jean; Oma, Yukako; Vallar, Laurent; Friederich, Evelyne; Poch, Olivier; Winsor, Barbara

    2005-12-01

    Actin-related proteins (ARPs) are key players in cytoskeleton activities and nuclear functions. Two complexes, ARP2/3 and ARP1/11, also known as dynactin, are implicated in actin dynamics and in microtubule-based trafficking, respectively. ARP4 to ARP9 are components of many chromatin-modulating complexes. Conventional actins and ARPs codefine a large family of homologous proteins, the actin superfamily, with a tertiary structure known as the actin fold. Because ARPs and actin share high sequence conservation, clear family definition requires distinct features to easily and systematically identify each subfamily. In this study we performed an in depth sequence and comparative genomic analysis of ARP subfamilies. A high-quality multiple alignment of approximately 700 complete protein sequences homologous to actin, including 148 ARP sequences, allowed us to extend the ARP classification to new organisms. Sequence alignments revealed conserved residues, motifs, and inserted sequence signatures to define each ARP subfamily. These discriminative characteristics allowed us to develop ARPAnno (http://bips.u-strasbg.fr/ARPAnno), a new web server dedicated to the annotation of ARP sequences. Analyses of sequence conservation among actins and ARPs highlight part of the actin fold and suggest interactions between ARPs and actin-binding proteins. Finally, analysis of ARP distribution across eukaryotic phyla emphasizes the central importance of nuclear ARPs, particularly the multifunctional ARP4. PMID:16195354

  11. openSputnik--a database to ESTablish comparative plant genomics using unsaturated sequence collections.

    PubMed

    Rudd, Stephen

    2005-01-01

    The public expressed sequence tag collections are continually being enriched with high-quality sequences that represent an ever-expanding range of taxonomically diverse plant species. While these sequence collections provide biased insight into the populations of expressed genes available within individual species and their associated tissues, the information is conceivably of wider relevance in a comparative context. When we consider the available expressed sequence tag (EST) collections of summer 2004, most of the major plant taxonomic clades are at least superficially represented. Investigation of the five million available plant ESTs provides a wealth of information that has applications in modelling the routes of plant genome evolution and the identification of lineage-specific genes and gene families. Over four million ESTs from over 50 distinct plant species have been collated within an EST analysis pipeline called openSputnik. The ESTs were resolved down into approximately one million unigene sequences. These have been annotated using orthology-based annotation transfer from reference plant genomes and using a variety of contemporary bioinformatics methods to assign peptide, structural and functional attributes. The openSputnik database is available at http://sputnik.btk.fi. PMID:15608275

  12. Comparative Genomic and Functional Analysis of Lactobacillus casei and Lactobacillus rhamnosus Strains Marketed as Probiotics

    PubMed Central

    Douillard, François P.; Ribbera, Angela; Järvinen, Hanna M.; Kant, Ravi; Pietilä, Taija E.; Randazzo, Cinzia; Paulin, Lars; Laine, Pia K.; Caggia, Cinzia; von Ossowski, Ingemar; Reunanen, Justus; Satokari, Reetta; Salminen, Seppo; Palva, Airi

    2013-01-01

    Four Lactobacillus strains were isolated from marketed probiotic products, including L. rhamnosus strains from Vifit (Friesland Campina) and Idoform (Ferrosan) and L. casei strains from Actimel (Danone) and Yakult (Yakult Honsa Co.). Their genomes and phenotypes were characterized and compared in detail with L. casei strain BL23 and L. rhamnosus strain GG. Phenotypic analysis of the new isolates indicated differences in carbohydrate utilization between L. casei and L. rhamnosus strains, which could be linked to their genotypes. The two isolated L. rhamnosus strains had genomes that were virtually identical to that of L. rhamnosus GG, testifying to their genomic stability and integrity in food products. The L. casei strains showed much greater genomic heterogeneity. Remarkably, all strains contained an intact spaCBA pilus gene cluster. However, only the L. rhamnosus strains produced mucus-binding SpaCBA pili under the conditions tested. Transcription initiation mapping demonstrated that the insertion of an iso-IS30 element upstream of the pilus gene cluster in L. rhamnosus strains but absent in L. casei strains had constituted a functional promoter driving pilus gene expression. All L. rhamnosus strains triggered an NF-κB response via Toll-like receptor 2 (TLR2) in a reporter cell line, whereas the L. casei strains did not or did so to a much lesser extent. This study demonstrates that the two L. rhamnosus strains isolated from probiotic products are virtually identical to L. rhamnosus GG and further highlights the differences between these and L. casei strains widely marketed as probiotics, in terms of genome content, mucus-binding and metabolic capacities, and host signaling capabilities. PMID:23315726

  13. Comparative genomic and functional analysis of Lactobacillus casei and Lactobacillus rhamnosus strains marketed as probiotics.

    PubMed

    Douillard, François P; Ribbera, Angela; Järvinen, Hanna M; Kant, Ravi; Pietilä, Taija E; Randazzo, Cinzia; Paulin, Lars; Laine, Pia K; Caggia, Cinzia; von Ossowski, Ingemar; Reunanen, Justus; Satokari, Reetta; Salminen, Seppo; Palva, Airi; de Vos, Willem M

    2013-03-01

    Four Lactobacillus strains were isolated from marketed probiotic products, including L. rhamnosus strains from Vifit (Friesland Campina) and Idoform (Ferrosan) and L. casei strains from Actimel (Danone) and Yakult (Yakult Honsa Co.). Their genomes and phenotypes were characterized and compared in detail with L. casei strain BL23 and L. rhamnosus strain GG. Phenotypic analysis of the new isolates indicated differences in carbohydrate utilization between L. casei and L. rhamnosus strains, which could be linked to their genotypes. The two isolated L. rhamnosus strains had genomes that were virtually identical to that of L. rhamnosus GG, testifying to their genomic stability and integrity in food products. The L. casei strains showed much greater genomic heterogeneity. Remarkably, all strains contained an intact spaCBA pilus gene cluster. However, only the L. rhamnosus strains produced mucus-binding SpaCBA pili under the conditions tested. Transcription initiation mapping demonstrated that the insertion of an iso-IS30 element upstream of the pilus gene cluster in L. rhamnosus strains but absent in L. casei strains had constituted a functional promoter driving pilus gene expression. All L. rhamnosus strains triggered an NF-κB response via Toll-like receptor 2 (TLR2) in a reporter cell line, whereas the L. casei strains did not or did so to a much lesser extent. This study demonstrates that the two L. rhamnosus strains isolated from probiotic products are virtually identical to L. rhamnosus GG and further highlights the differences between these and L. casei strains widely marketed as probiotics, in terms of genome content, mucus-binding and metabolic capacities, and host signaling capabilities. PMID:23315726

  14. Comparative ruminant genomics highlights segmental duplication and mobile element insertion diversity

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We have expanded upon a previously reported comparative genomics approach using a read-depth (JaRMs) and a hybrid read-pair, split-read (RAPTR-SV) copy number variation (CNV) detection method that uses read alignments to the cattle reference genome in order to identify species-specific genomic rearr...

  15. Genomic-associated Markers and comparative Genome Maps of Xanthomonas oryzae pv. oryzae and X. oryzae pv. oryzicola.

    PubMed

    Feng, Wenjie; Wang, Yi; Huang, Lisha; Feng, Chuanshun; Chu, Zhaohui; Ding, Xinhua; Yang, Long

    2015-09-01

    Xanthomonas oryzae pv. oryzae (Xoo) and X. oryzae pv. oryzicola (Xoc) cause two major seed quarantine diseases in rice, bacterial blight and bacterial leaf streak, respectively. Xoo and Xoc share high similarity in genomic sequence, which results in hard differentiation of the two pathogens. Genomic-associated Markers and comparative Genome Maps database (GMGM) is an integrated database providing comprehensive information including compared genome maps and full genomic-coverage molecular makers of Xoo and Xoc. This database was established based on bioinformatic analysis of complete sequenced genomes of several X. oryzae pathovars of which the similarity of the genomes was up to 91.39 %. The program was designed with a series of specific PCR primers, including 286 pairs of Xoo dominant markers, 288 pairs of Xoc dominant markers, and 288 pairs of Xoo and Xoc co-dominant markers, which were predicted to distinguish two pathovars. Test on a total of 40 donor pathogen strains using randomly selected 120 pairs of primers demonstrated that over 52.5 % of the primers were efficacious. The GMGM web portal ( http://biodb.sdau.edu.cn/gmgm/ ) will be a powerful tool that can present highly specific diagnostic markers, and it also provides information about comparative genome maps of the two pathogens for future evolution study. PMID:26093644

  16. Genomic modulators of gene expression in human neutrophils.

    PubMed

    Naranbhai, Vivek; Fairfax, Benjamin P; Makino, Seiko; Humburg, Peter; Wong, Daniel; Ng, Esther; Hill, Adrian V S; Knight, Julian C

    2015-01-01

    Neutrophils form the most abundant leukocyte subset and are central to many disease processes. Technical challenges in transcriptomic profiling have prohibited genomic approaches to date. Here we map expression quantitative trait loci (eQTL) in peripheral blood CD16+ neutrophils from 101 healthy European adults. We identify cis-eQTL for 3281 neutrophil-expressed genes including many implicated in neutrophil function, with 450 of these not previously observed in myeloid or lymphoid cells. Paired comparison with monocyte eQTL demonstrates nuanced conditioning of genetic regulation of gene expression by cellular context, which relates to cell-type-specific DNA methylation and histone modifications. Neutrophil eQTL are markedly enriched for trait-associated variants particularly autoimmune, allergy and infectious disease. We further demonstrate how eQTL in PADI4 and NOD2 delineate risk variant function in rheumatoid arthritis, leprosy and Crohn's disease. Taken together, these data help advance understanding of the genetics of gene expression, neutrophil biology and immune-related diseases. PMID:26151758

  17. Genomic modulators of gene expression in human neutrophils

    PubMed Central

    Naranbhai, Vivek; Fairfax, Benjamin P.; Makino, Seiko; Humburg, Peter; Wong, Daniel; Ng, Esther; Hill, Adrian V. S.; Knight, Julian C.

    2015-01-01

    Neutrophils form the most abundant leukocyte subset and are central to many disease processes. Technical challenges in transcriptomic profiling have prohibited genomic approaches to date. Here we map expression quantitative trait loci (eQTL) in peripheral blood CD16+ neutrophils from 101 healthy European adults. We identify cis-eQTL for 3281 neutrophil-expressed genes including many implicated in neutrophil function, with 450 of these not previously observed in myeloid or lymphoid cells. Paired comparison with monocyte eQTL demonstrates nuanced conditioning of genetic regulation of gene expression by cellular context, which relates to cell-type-specific DNA methylation and histone modifications. Neutrophil eQTL are markedly enriched for trait-associated variants particularly autoimmune, allergy and infectious disease. We further demonstrate how eQTL in PADI4 and NOD2 delineate risk variant function in rheumatoid arthritis, leprosy and Crohn's disease. Taken together, these data help advance understanding of the genetics of gene expression, neutrophil biology and immune-related diseases. PMID:26151758

  18. Genomic subtypes of breast cancer identified by array-comparative genomic hybridization display distinct molecular and clinical characteristics

    PubMed Central

    2010-01-01

    Introduction Breast cancer is a profoundly heterogeneous disease with respect to biologic and clinical behavior. Gene-expression profiling has been used to dissect this complexity and to stratify tumors into intrinsic gene-expression subtypes, associated with distinct biology, patient outcome, and genomic alterations. Additionally, breast tumors occurring in individuals with germline BRCA1 or BRCA2 mutations typically fall into distinct subtypes. Methods We applied global DNA copy number and gene-expression profiling in 359 breast tumors. All tumors were classified according to intrinsic gene-expression subtypes and included cases from genetically predisposed women. The Genomic Identification of Significant Targets in Cancer (GISTIC) algorithm was used to identify significant DNA copy-number aberrations and genomic subgroups of breast cancer. Results We identified 31 genomic regions that were highly amplified in > 1% of the 359 breast tumors. Several amplicons were found to co-occur, the 8p12 and 11q13.3 regions being the most frequent combination besides amplicons on the same chromosomal arm. Unsupervised hierarchical clustering with 133 significant GISTIC regions revealed six genomic subtypes, termed 17q12, basal-complex, luminal-simple, luminal-complex, amplifier, and mixed subtypes. Four of them had striking similarity to intrinsic gene-expression subtypes and showed associations to conventional tumor biomarkers and clinical outcome. However, luminal A-classified tumors were distributed in two main genomic subtypes, luminal-simple and luminal-complex, the former group having a better prognosis, whereas the latter group included also luminal B and the majority of BRCA2-mutated tumors. The basal-complex subtype displayed extensive genomic homogeneity and harbored the majority of BRCA1-mutated tumors. The 17q12 subtype comprised mostly HER2-amplified and HER2-enriched subtype tumors and had the worst prognosis. The amplifier and mixed subtypes contained tumors

  19. Comparative genomics of eukaryotic small nucleolar RNAs reveals deep evolutionary ancestry amidst ongoing intragenomic mobility

    PubMed Central

    2012-01-01

    Background Small nucleolar (sno)RNAs are required for posttranscriptional processing and modification of ribosomal, spliceosomal and messenger RNAs. Their presence in both eukaryotes and archaea indicates that snoRNAs are evolutionarily ancient. The location of some snoRNAs within the introns of ribosomal protein genes has been suggested to belie an RNA world origin, with the exons of the earliest protein-coding genes having evolved around snoRNAs after the advent of templated protein synthesis. Alternatively, this intronic location may reflect more recent selection for coexpression of snoRNAs and ribosomal components, ensuring rRNA modification by snoRNAs during ribosome synthesis. To gain insight into the evolutionary origins of this genetic organization, we examined the antiquity of snoRNA families and the stability of their genomic location across 44 eukaryote genomes. Results We report that dozens of snoRNA families are traceable to the Last Eukaryotic Common Ancestor (LECA), but find only weak similarities between the oldest eukaryotic snoRNAs and archaeal snoRNA-like genes. Moreover, many of these LECA snoRNAs are located within the introns of host genes independently traceable to the LECA. Comparative genomic analyses reveal the intronic location of LECA snoRNAs is not ancestral however, suggesting the pattern we observe is the result of ongoing intragenomic mobility. Analysis of human transcriptome data indicates that the primary requirement for hosting intronic snoRNAs is a broad expression profile. Consistent with ongoing mobility across broadly-expressed genes, we report a case of recent migration of a non-LECA snoRNA from the intron of a ubiquitously expressed non-LECA host gene into the introns of two LECA genes during the evolution of primates. Conclusions Our analyses show that snoRNAs were a well-established family of RNAs at the time when eukaryotes began to diversify. While many are intronic, this association is not evolutionarily stable across

  20. Genome-wide analysis reveals gene expression and metabolic network dynamics during embryo development in Arabidopsis.

    PubMed

    Xiang, Daoquan; Venglat, Prakash; Tibiche, Chabane; Yang, Hui; Risseeuw, Eddy; Cao, Yongguo; Babic, Vivijan; Cloutier, Mathieu; Keller, Wilf; Wang, Edwin; Selvaraj, Gopalan; Datla, Raju

    2011-05-01

    Embryogenesis is central to the life cycle of most plant species. Despite its importance, because of the difficulty associated with embryo isolation, global gene expression programs involved in plant embryogenesis, especially the early events following fertilization, are largely unknown. To address this gap, we have developed methods to isolate whole live Arabidopsis (Arabidopsis thaliana) embryos as young as zygote and performed genome-wide profiling of gene expression. These studies revealed insights into patterns of gene expression relating to: maternal and paternal contributions to zygote development, chromosomal level clustering of temporal expression in embryogenesis, and embryo-specific functions. Functional analysis of some of the modulated transcription factor encoding genes from our data sets confirmed that they are critical for embryogenesis. Furthermore, we constructed stage-specific metabolic networks mapped with differentially regulated genes by combining the microarray data with the available Kyoto Encyclopedia of Genes and Genomes metabolic data sets. Comparative analysis of these networks revealed the network-associated structural and topological features, pathway interactions, and gene expression with reference to the metabolic activities during embryogenesis. Together, these studies have generated comprehensive gene expression data sets for embryo development in Arabidopsis and may serve as an important foundational resource for other seed plants. PMID:21402797

  1. Comparative genomics reveals functional transcriptional control sequences in the Prop1 gene

    PubMed Central

    Ward, Robert D.; Davis, Shannon W.; Cho, MinChul; Esposito, Constance; Lyons, Robert H.; Cheng, Jan-Fang; Rubin, Edward M.; Rhodes, Simon J.; Raetzman, Lori T.; Smith, Timothy P. L.

    2007-01-01

    Mutations in PROP1 are a common genetic cause of multiple pituitary hormone deficiency (MPHD). We used a comparative genomics approach to predict the transcriptional regulatory domains of Prop1 and tested them in cell culture and mice. A BAC transgene containing Prop1 completely rescues the Prop1 mutant phenotype, demonstrating that the regulatory elements necessary for proper PROP1 transcription are contained within the BAC. We generated DNA sequences from the PROP1 genes in lemur, pig, and five different primate species. Comparison of these with available human and mouse PROP1 sequences identified three putative regulatory sequences that are highly conserved. These are located in the PROP1 promoter proximal region, within the first intron of PROP1, and downstream of PROP1. Each of the conserved elements elicited orientation-specific enhancer activity in the context of the Drosophila alcohol dehydrogenase minimal promoter in both heterologous and pituitary-derived cells lines. The intronic element is sufficient to confer dorsal expansion of the pituitary expression domain of a transgene, suggesting that this element is important for the normal spatial expression of endogenous Prop1 during pituitary development. This study illustrates the usefulness of a comparative genomics approach in the identification of regulatory elements that may be the site of mutations responsible for some cases of MPHD. PMID:17557180

  2. A complete Sec61 complex in Nosema bombycis and its comparative genomics analyses.

    PubMed

    Wu, Zhengli; Li, Yanhong; Pan, Guoqing; Li, Chunfeng; Hu, Junhua; Liu, Handeng; Zhou, Zeyang; Xiang, Zhonghuai

    2007-01-01

    We characterized a complete Sec61 complex in Nosema bombycis, which has been shown to consist of Sec61alpha, Sec61beta, and Sec61gamma genes. Comparing the genomic regions that harbor the respective subunit genes between N. bombycis, Encephalitozoon cuniculi, and Antonospora locustae, we found that microsporidian genomes have high degree of synteny in short genomic fragment, and that this gene synteny in general might exist throughout microsporidian genomes. PMID:17669164

  3. Comparative genomic survey of microbial arylamine N-acetyltransferases

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Introduction: Microorganisms are constantly exposed to exogenous chemical influences. Our previous genomic surveys have identified putative NAT genes across a phylogenetic spectrum of prokaryotic and eukaryotic microorganisms. We are currently pursuing two lines of investigation: The first looks int...

  4. Genome-scale protein expression and structural biology of Plasmodium falciparum and related Apicomplexan organisms.

    PubMed

    Vedadi, Masoud; Lew, Jocelyne; Artz, Jennifer; Amani, Mehrnaz; Zhao, Yong; Dong, Aiping; Wasney, Gregory A; Gao, Mian; Hills, Tanya; Brokx, Stephen; Qiu, Wei; Sharma, Sujata; Diassiti, Angelina; Alam, Zahoor; Melone, Michelle; Mulichak, Anne; Wernimont, Amy; Bray, James; Loppnau, Peter; Plotnikova, Olga; Newberry, Kate; Sundararajan, Emayavaram; Houston, Simon; Walker, John; Tempel, Wolfram; Bochkarev, Alexey; Kozieradzki, Ivona; Edwards, Aled; Arrowsmith, Cheryl; Roos, David; Kain, Kevin; Hui, Raymond

    2007-01-01

    Parasites from the protozoan phylum Apicomplexa are responsible for diseases, such as malaria, toxoplasmosis and cryptosporidiosis, all of which have significantly higher rates of mortality and morbidity in economically underdeveloped regions of the world. Advances in vaccine development and drug discovery are urgently needed to control these diseases and can be facilitated by production of purified recombinant proteins from Apicomplexan genomes and determination of their 3D structures. To date, both heterologous expression and crystallization of Apicomplexan proteins have seen only limited success. In an effort to explore the effectiveness of producing and crystallizing proteins on a genome-scale using a standardized methodology, over 400 distinct Plasmodium falciparum target genes were chosen representing different cellular classes, along with select orthologues from four other Plasmodium species as well as Cryptosporidium parvum and Toxoplasma gondii. From a total of 1008 genes from the seven genomes, 304 (30.2%) produced purified soluble proteins and 97 (9.6%) crystallized, culminating in 36 crystal structures. These results demonstrate that, contrary to previous findings, a standardized platform using Escherichia coli can be effective for genome-scale production and crystallography of Apicomplexan proteins. Predictably, orthologous proteins from different Apicomplexan genomes behaved differently in expression, purification and crystallization, although the overall success rates of Plasmodium orthologues do not differ significantly. Their differences were effectively exploited to elevate the overall productivity to levels comparable to the most successful ongoing structural genomics projects: 229 of the 468 target genes produced purified soluble protein from one or more organisms, with 80 and 32 of the purified targets, respectively, leading to crystals and ultimately structures from one or more orthologues. PMID:17125854

  5. Genomicus update 2015: KaryoView and MatrixView provide a genome-wide perspective to multispecies comparative genomics

    PubMed Central

    Louis, Alexandra; Nguyen, Nga Thi Thuy; Muffato, Matthieu; Roest Crollius, Hugues

    2015-01-01

    The Genomicus web server (http://www.genomicus.biologie.ens.fr/genomicus) is a visualization tool allowing comparative genomics in four different phyla (Vertebrate, Fungi, Metazoan and Plants). It provides access to genomic information from extant species, as well as ancestral gene content and gene order for vertebrates and flowering plants. Here we present the new features available for vertebrate genome with a focus on new graphical tools. The interface to enter the database has been improved, two pairwise genome comparison tools are now available (KaryoView and MatrixView) and the multiple genome comparison tools (PhyloView and AlignView) propose three new kinds of representation and a more intuitive menu. These new developments have been implemented for Genomicus portal dedicated to vertebrates. This allows the analysis of 68 extant animal genomes, as well as 58 ancestral reconstructed genomes. The Genomicus server also provides access to ancestral gene orders, to facilitate evolutionary and comparative genomics studies, as well as computationally predicted regulatory interactions, thanks to the representation of conserved non-coding elements with their putative gene targets. PMID:25378326

  6. Comparative genomics of pathogenic lineages of Vibrio nigripulchritudo identifies virulence-associated traits

    PubMed Central

    Goudenège, David; Labreuche, Yannick; Krin, Evelyne; Ansquer, Dominique; Mangenot, Sophie; Calteau, Alexandra; Médigue, Claudine; Mazel, Didier; Polz, Martin F; Le Roux, Frédérique

    2013-01-01

    Vibrio nigripulchritudo is an emerging pathogen of farmed shrimp in New Caledonia and other regions in the Indo-Pacific. The molecular determinants of V. nigripulchritudo pathogenicity are unknown; however, molecular epidemiological studies have suggested that pathogenicity is linked to particular lineages. Here, we performed high-throughput sequencing-based comparative genome analysis of 16 V. nigripulchritudo strains to explore the genomic diversity and evolutionary history of pathogen-containing lineages and to identify pathogen-specific genetic elements. Our phylogenetic analysis revealed three pathogen-containing V. nigripulchritudo clades, including two clades previously identified from New Caledonia and one novel clade comprising putatively pathogenic isolates from septicemic shrimp in Madagascar. The similar genetic distance between the three clades indicates that they have diverged from an ancestral population roughly at the same time and recombination analysis indicates that these genomes have, in the past, shared a common gene pool and exchanged genes. As each contemporary lineage is comprised of nearly identical strains, comparative genomics allowed differentiation of genetic elements specific to shrimp pathogenesis of varying severity. Notably, only a large plasmid present in all highly pathogenic (HP) strains encodes a toxin. Although less/non-pathogenic strains contain related plasmids, these are differentiated by a putative toxin locus. Expression of this gene by a non-pathogenic V. nigripulchritudo strain resulted in production of toxic culture supernatant, normally an exclusive feature of HP strains. Thus, this protein, here termed ‘nigritoxin', is implicated to an extent that remains to be precisely determined in the toxicity of V. nigripulchritudo. PMID:23739050

  7. Comparative Genome Analysis of the Pathogenic Spirochetes Borrelia burgdorferi and Treponema pallidum

    PubMed Central

    Subramanian, G.; Koonin, Eugene V.; Aravind, L.

    2000-01-01

    A comparative analysis of the predicted protein sequences encoded in the complete genomes of Borrelia burgdorferi and Treponema pallidum provides a number of insights into evolutionary trends and adaptive strategies of the two spirochetes. A measure of orthologous relationships between gene sets, termed the orthology coefficient (OC), was developed. The overall OC value for the gene sets of the two spirochetes is about 0.43, which means that less than one-half of the genes show readily detectable orthologous relationships. This emphasizes significant divergence between the two spirochetes, apparently driven by different biological niches. Different functional categories of proteins as well as different protein families show a broad distribution of OC values, from near 1 (a perfect, one-to-one correspondence) to near 0. The proteins involved in core biological functions, such as genome replication and expression, typically show high OC values. In contrast, marked variability is seen among proteins that are involved in specific processes, such as nutrient transport, metabolism, gene-specific transcription regulation, signal transduction, and host response. Differences in the gene complements encoded in the two spirochete genomes suggest active adaptive evolution for their distinct niches. Comparative analysis of the spirochete genomes produced evidence of gene exchanges with other bacteria, archaea, and eukaryotic hosts that seem to have occurred at different points in the evolution of the spirochetes. Examples are presented of the use of sequence profile analysis to predict proteins that are likely to play a role in pathogenesis, including secreted proteins that contain specific protein-protein interaction domains, such as von Willebrand A, YWTD, TPR, and PR1, some of which hitherto have been reported only in eukaryotes. We tentatively reconstruct the likely evolutionary process that has led to the divergence of the two spirochete lineages; this reconstruction seems

  8. De novo Transcriptome Assemblies of Rana (Lithobates) catesbeiana and Xenopus laevis Tadpole Livers for Comparative Genomics without Reference Genomes

    PubMed Central

    Birol, Inanc; Behsaz, Bahar; Hammond, S. Austin; Kucuk, Erdi; Veldhoen, Nik; Helbing, Caren C.

    2015-01-01

    In this work we studied the liver transcriptomes of two frog species, the American bullfrog (Rana (Lithobates) catesbeiana) and the African clawed frog (Xenopus laevis). We used high throughput RNA sequencing (RNA-seq) data to assemble and annotate these transcriptomes, and compared how their baseline expression profiles change when tadpoles of the two species are exposed to thyroid hormone. We generated more than 1.5 billion RNA-seq reads in total for the two species under two conditions as treatment/control pairs. We de novo assembled these reads using Trans-ABySS to reconstruct reference transcriptomes, obtaining over 350,000 and 130,000 putative transcripts for R. catesbeiana and X. laevis, respectively. Using available genomics resources for X. laevis, we annotated over 97% of our X. laevis transcriptome contigs, demonstrating the utility and efficacy of our methodology. Leveraging this validated analysis pipeline, we also annotated the assembled R. catesbeiana transcriptome. We used the expression profiles of the annotated genes of the two species to examine the similarities and differences between the tadpole liver transcriptomes. We also compared the gene ontology terms of expressed genes to measure how the animals react to a challenge by thyroid hormone. Our study reports three main conclusions. First, de novo assembly of RNA-seq data is a powerful method for annotating and establishing transcriptomes of non-model organisms. Second, the liver transcriptomes of the two frog species, R. catesbeiana and X. laevis, show many common features, and the distribution of their gene ontology profiles are statistically indistinguishable. Third, although they broadly respond the same way to the presence of thyroid hormone in their environment, their receptor/signal transduction pathways display marked differences. PMID:26121473

  9. Comparative genomics of mammalian hibernators using gene networks.

    PubMed

    Villanueva-Cañas, José Luis; Faherty, Sheena L; Yoder, Anne D; Albà, M Mar

    2014-09-01

    In recent years, the study of the molecular processes involved in mammalian hibernation has shifted from investigating a few carefully selected candidate genes to large-scale analysis of differential gene expression. The availability of high-throughput data provides an unprecedented opportunity to ask whether phylogenetically distant species show similar mechanisms of genetic control, and how these relate to particular genes and pathways involved in the hibernation phenotype. In order to address these questions, we compare 11 datasets of differentially expressed (DE) genes from two ground squirrel species, one bat species, and the American black bear, as well as a list of genes extracted from the literature that previously have been correlated with the drastic physiological changes associated with hibernation. We identify several genes that are DE in different species, indicating either ancestral adaptations or evolutionary convergence. When we use a network approach to expand the original datasets of DE genes to large gene networks using available interactome data, a higher agreement between datasets is achieved. This indicates that the same key pathways are important for activating and maintaining the hibernation phenotype. Functional-term-enrichment analysis identifies several important metabolic and mitochondrial processes that are critical for hibernation, such as fatty acid beta-oxidation and mitochondrial transport. We do not detect any enrichment of positive selection signatures in the coding sequences of genes from the networks of hibernation-associated genes, supporting the hypothesis that the genetic processes shaping the hibernation phenotype are driven primarily by changes in gene regulation. PMID:24881044

  10. Identification of DNA Methyltransferase Genes in Human Pathogenic Bacteria by Comparative Genomics.

    PubMed

    Brambila-Tapia, Aniel Jessica Leticia; Poot-Hernández, Augusto Cesar; Perez-Rueda, Ernesto; Rodríguez-Vázquez, Katya

    2016-06-01

    DNA methylation plays an important role in gene expression and virulence in some pathogenic bacteria. In this report, we describe DNA methyltransferases (MTases) present in human pathogenic bacteria and compared them with related species, which are not pathogenic or less pathogenic, based in comparative genomics. We performed a search in the KEGG database of the KEGG database orthology groups associated with adenine and cytosine DNA MTase activities (EC: 2.1.1.37, EC: 2.1.1.113 and EC: 2.1.1.72) in 37 human pathogenic species and 18 non/less pathogenic relatives and performed comparisons of the number of these MTases sequences according to their genome size, the DNA MTase type and with their non-less pathogenic relatives. We observed that Helicobacter pylori and Neisseria spp. presented the highest number of MTases while ten different species did not present a predicted DNA MTase. We also detected a significant increase of adenine MTases over cytosine MTases (2.19 vs. 1.06, respectively, p < 0.001). Adenine MTases were the only MTases associated with restriction modification systems and DNA MTases associated with type I restriction modification systems were more numerous than those associated with type III restriction modification systems (0.84 vs. 0.17, p < 0.001); additionally, there was no correlation with the genome size and the total number of DNA MTases, indicating that the number of DNA MTases is related to the particular evolution and lifestyle of specific species, regulating the expression of virulence genes in some pathogenic bacteria. PMID:27570304

  11. Evolution of tryptophan biosynthetic pathway in microbial genomes: a comparative genetic study.

    PubMed

    Priya, V K; Sarkar, Susmita; Sinha, Somdatta

    2014-03-01

    Biosynthetic pathway evolution needs to consider the evolution of a group of genes that code for enzymes catalysing the multiple chemical reaction steps leading to the final end product. Tryptophan biosynthetic pathway has five chemical reaction steps that are highly conserved in diverse microbial genomes, though the genes of the pathway enzymes show considerable variations in arrangements, operon structure (gene fusion and splitting) and regulation. We use a combined bioinformatic and statistical analyses approach to address the question if the pathway genes from different microbial genomes, belonging to a wide range of groups, show similar evolutionary relationships within and between them. Our analyses involved detailed study of gene organization (fusion/splitting events), base composition, relative synonymous codon usage pattern of the genes, gene expressivity, amino acid usage, etc. to assess inter- and intra-genic variations, between and within the pathway genes, in diverse group of microorganisms. We describe these genetic and genomic variations in the tryptophan pathway genes in different microorganisms to show the similarities across organisms, and compare the same genes across different organisms to find the possible variability arising possibly due to horizontal gene transfers. Such studies form the basis for moving from single gene evolution to pathway evolutionary studies that are important steps towards understanding the systems biology of intracellular pathways. PMID:24592292

  12. The plant ontology as a tool for comparative plant anatomy and genomic analyses

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Plant science is now a major player in the fields of genomics, gene expression analysis, phenomics and metabolomics. Recent advances in sequencing technologies have led to a windfall of data, with new species being added rapidly to the list of species whose genomes have been decoded. The Plant Ontol...

  13. IMG 4 version of the integrated microbial genomes comparative analysis system.

    PubMed

    Markowitz, Victor M; Chen, I-Min A; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N; Kyrpides, Nikos C

    2014-01-01

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG's data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG's annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu). PMID:24165883

  14. IMG 4 version of the integrated microbial genomes comparative analysis system

    SciTech Connect

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2013-10-27

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Finally, different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).

  15. Genome wide transcriptome profiling reveals differential gene expression in secondary metabolite pathway of Cymbopogon winterianus.

    PubMed

    Devi, Kamalakshi; Mishra, Surajit K; Sahu, Jagajjit; Panda, Debashis; Modi, Mahendra K; Sen, Priyabrata

    2016-01-01

    Advances in transcriptome sequencing provide fast, cost-effective and reliable approach to generate large expression datasets especially suitable for non-model species to identify putative genes, key pathway and regulatory mechanism. Citronella (Cymbopogon winterianus) is an aromatic medicinal grass used for anti-tumoral, antibacterial, anti-fungal, antiviral, detoxifying and natural insect repellent properties. Despite of having number of utilities, the genes involved in terpenes biosynthetic pathway is not yet clearly elucidated. The present study is a pioneering attempt to generate an exhaustive molecular information of secondary metabolite pathway and to increase genomic resources in Citronella. Using high-throughput RNA-Seq technology, root and leaf transcriptome was analysed at an unprecedented depth (11.7 Gb). Targeted searches identified majority of the genes associated with metabolic pathway and other natural product pathway viz. antibiotics synthesis along with many novel genes. Terpenoid biosynthesis genes comparative expression results were validated for 15 unigenes by RT-PCR and qRT-PCR. Thus the coverage of these transcriptome is comprehensive enough to discover all known genes of major metabolic pathways. This transcriptome dataset can serve as important public information for gene expression, genomics and function genomics studies in Citronella and shall act as a benchmark for future improvement of the crop. PMID:26877149

  16. Genome wide transcriptome profiling reveals differential gene expression in secondary metabolite pathway of Cymbopogon winterianus

    PubMed Central

    Devi, Kamalakshi; Mishra, Surajit K.; Sahu, Jagajjit; Panda, Debashis; Modi, Mahendra K.; Sen, Priyabrata

    2016-01-01

    Advances in transcriptome sequencing provide fast, cost-effective and reliable approach to generate large expression datasets especially suitable for non-model species to identify putative genes, key pathway and regulatory mechanism. Citronella (Cymbopogon winterianus) is an aromatic medicinal grass used for anti-tumoral, antibacterial, anti-fungal, antiviral, detoxifying and natural insect repellent properties. Despite of having number of utilities, the genes involved in terpenes biosynthetic pathway is not yet clearly elucidated. The present study is a pioneering attempt to generate an exhaustive molecular information of secondary metabolite pathway and to increase genomic resources in Citronella. Using high-throughput RNA-Seq technology, root and leaf transcriptome was analysed at an unprecedented depth (11.7 Gb). Targeted searches identified majority of the genes associated with metabolic pathway and other natural product pathway viz. antibiotics synthesis along with many novel genes. Terpenoid biosynthesis genes comparative expression results were validated for 15 unigenes by RT-PCR and qRT-PCR. Thus the coverage of these transcriptome is comprehensive enough to discover all known genes of major metabolic pathways. This transcriptome dataset can serve as important public information for gene expression, genomics and function genomics studies in Citronella and shall act as a benchmark for future improvement of the crop. PMID:26877149

  17. Complete genome sequences and comparative genome analysis of Lactobacillus plantarum strain 5-2 isolated from fermented soybean.

    PubMed

    Liu, Chen-Jian; Wang, Rui; Gong, Fu-Ming; Liu, Xiao-Feng; Zheng, Hua-Jun; Luo, Yi-Yong; Li, Xiao-Ran

    2015-12-01

    Lactobacillus plantarum is an important probiotic and is mostly isolated from fermented foods. We sequenced the genome of L. plantarum strain 5-2, which was derived from fermented soybean isolated from Yunnan province, China. The strain was determined to contain 3114 genes. Fourteen complete insertion sequence (IS) elements were found in 5-2 chromosome. There were 24 DNA replication proteins and 76 DNA repair proteins in the 5-2 genome. Consistent with the classification of L. plantarum as a facultative heterofermentative lactobacillus, the 5-2 genome encodes key enzymes required for the EMP (Embden-Meyerhof-Parnas) and phosphoketolase (PK) pathways. Several components of the secretion machinery are found in the 5-2 genome, which was compared with L. plantarum ST-III, JDM1 and WCFS1. Most of the specific proteins in the four genomes appeared to be related to their prophage elements. PMID:26212213

  18. Deciphering the conserved genetic loci implicated in plant disease control through comparative genomics of Bacillus amyloliquefaciens subsp. plantarum.

    PubMed

    Hossain, Mohammad J; Ran, Chao; Liu, Ke; Ryu, Choong-Min; Rasmussen-Ivey, Cody R; Williams, Malachi A; Hassan, Mohammad K; Choi, Soo-Keun; Jeong, Haeyoung; Newman, Molli; Kloepper, Joseph W; Liles, Mark R

    2015-01-01

    To understand the growth-promoting and disease-inhibiting activities of plant growth-promoting rhizobacteria (PGPR) strains, the genomes of 12 Bacillus subtilis group strains with PGPR activity were sequenced and analyzed. These B. subtilis strains exhibited high genomic diversity, whereas the genomes of B. amyloliquefaciens strains (a member of the B. subtilis group) are highly conserved. A pairwise BLASTp matrix revealed that gene family similarity among Bacillus genomes ranges from 32 to 90%, with 2839 genes within the core genome of B. amyloliquefaciens subsp. plantarum. Comparative genomic analyses of B. amyloliquefaciens strains identified genes that are linked with biological control and colonization of roots and/or leaves, including 73 genes uniquely associated with subsp. plantarum strains that have predicted functions related to signaling, transportation, secondary metabolite production, and carbon source utilization. Although B. amyloliquefaciens subsp. plantarum strains contain gene clusters that encode many different secondary metabolites, only polyketide biosynthetic clusters that encode difficidin and macrolactin are conserved within this subspecies. To evaluate their role in plant pathogen biocontrol, genes involved in secondary metabolite biosynthesis were deleted in a B. amyloliquefaciens subsp. plantarum strain, revealing that difficidin expression is critical in reducing the severity of disease, caused by Xanthomonas axonopodis pv. vesicatoria in tomato plants. This study defines genomic features of PGPR strains and links them with biocontrol activity and with host colonization. PMID:26347755

  19. Deciphering the conserved genetic loci implicated in plant disease control through comparative genomics of Bacillus amyloliquefaciens subsp. plantarum

    PubMed Central

    Hossain, Mohammad J.; Ran, Chao; Liu, Ke; Ryu, Choong-Min; Rasmussen-Ivey, Cody R.; Williams, Malachi A.; Hassan, Mohammad K.; Choi, Soo-Keun; Jeong, Haeyoung; Newman, Molli; Kloepper, Joseph W.; Liles, Mark R.

    2015-01-01

    To understand the growth-promoting and disease-inhibiting activities of plant growth-promoting rhizobacteria (PGPR) strains, the genomes of 12 Bacillus subtilis group strains with PGPR activity were sequenced and analyzed. These B. subtilis strains exhibited high genomic diversity, whereas the genomes of B. amyloliquefaciens strains (a member of the B. subtilis group) are highly conserved. A pairwise BLASTp matrix revealed that gene family similarity among Bacillus genomes ranges from 32 to 90%, with 2839 genes within the core genome of B. amyloliquefaciens subsp. plantarum. Comparative genomic analyses of B. amyloliquefaciens strains identified genes that are linked with biological control and colonization of roots and/or leaves, including 73 genes uniquely associated with subsp. plantarum strains that have predicted functions related to signaling, transportation, secondary metabolite production, and carbon source utilization. Although B. amyloliquefaciens subsp. plantarum strains contain gene clusters that encode many different secondary metabolites, only polyketide biosynthetic clusters that encode difficidin and macrolactin are conserved within this subspecies. To evaluate their role in plant pathogen biocontrol, genes involved in secondary metabolite biosynthesis were deleted in a B. amyloliquefaciens subsp. plantarum strain, revealing that difficidin expression is critical in reducing the severity of disease, caused by Xanthomonas axonopodis pv. vesicatoria in tomato plants. This study defines genomic features of PGPR strains and links them with biocontrol activity and with host colonization. PMID:26347755

  20. Comparative genomics and evolution of the tailed-bacteriophages.

    PubMed

    Casjens, Sherwood R

    2005-08-01

    The number of completely sequenced tailed-bacteriophage genomes that have been published increased to more than 125 last year. The comparison of these genomes has brought their highly mosaic nature into much sharper focus. Furthermore, reports of the complete sequences of about 150 bacterial genomes have shown that the many prophage and parts thereof that reside in these bacterial genomes must comprise a significant fraction of Earth's phage gene pool. These phage and prophage genomes are fertile ground for attempts to deduce the nature of viral evolutionary processes, and such analyses have made it clear that these phage have enjoyed a significant level of horizontal exchange of genetic information throughout their long histories. The strength of these evolutionary deductions rests largely on the extensive knowledge that has accumulated during intensive study into the molecular nature of the life cycles of a few 'model system' phages over the past half century. Recent molecular studies of phages other than these model system phages have made it clear that much remains to be learnt about the variety of lifestyle strategies utilized by the tailed-phage. PMID:16019256

  1. Genomic and Expression Analyses of Cold-Adapted Microorganisms.

    SciTech Connect

    Bakermans, Corien; Bergholz, Peter W.; Rodrigues, Debora F.; Vishnivetskaya, T.; Ayala-del-Río, Hector L.; Tiedje, James M.

    2011-01-01

    Contents 7.1 Introduction 7.2 Ecological evidence of bacterial adaptation to cold 7.2.1 Characteristics of cold environments and implications for microbial ecology 7.2.2 Ecological adaptation in Exiguobacterium spp. and Psychrobacter spp. 7.3 Gene Expression Responses to the Cold 7.3.1 Fundamentals of Gene Expression Responses to Cold 7.3.2 Acclimation for Life in Cold Habitats 7.3.2.1 Translation and Chaperone Proteins: Safeguarding the functional units of cellular physiology 7.3.2.2 Carbon and Energy metabolism: resource efficiency over long generation times 7.3.2.3 Amino Acid Biosynthesis: Species-specific responses to species-specific deficiencies 7.3.2.4 Compatible solutes: a concomitant response in cryoenvironments 7.3.2.5 Membrane fluidity: A major role in the overall metabolic rate at temperature 7.3.2.6 The cell wall at low temperature: A poorly understood growth rate determinant 7.3.2.7 Transporters: The balance between local nutrient uptake and depletion 7.3.2.8 Genome plasticity. The potential role of transposases and repeated sequences. 7.4 Protein adaptations to cold 7.4.1 The low temperature challenge 7.4.2 The stability activity relationship 7.4.3 Structural features of cold adapted enzymes. 7.4.4 Hydrophobic interactions 7.4.5 Electrostatic interactions 7.4.5.1 Arginine 7.4.5.2 Acidic residues 7.4.6 Structural elements 7.4.6.1 -helices and -sheets 7.4.6.2 Proline and Glycine 7.4.6.3 Disordered regions 7.5 Comparison of cold- and warm-adapted Exiguobacterium strains 7.5.1 Phylogeny reflects adaptations to environmental conditions 7.5.2 Genomic comparison of two strains 7.6 Summary and future directions

  2. Genome-wide gene expression perturbation induced by loss of C2 chromosome in allotetraploid Brassica napus L.

    PubMed Central

    Zhu, Bin; Shao, Yujiao; Pan, Qi; Ge, Xianhong; Li, Zaiyun

    2015-01-01

    Aneuploidy with loss of entire chromosomes from normal complement disrupts the balanced genome and is tolerable only by polyploidy plants. In this study, the monosomic and nullisomic plants losing one or two copies of C2 chromosome from allotetraploid Brassica napus L. (2n = 38, AACC) were produced and compared for their phenotype and transcriptome. The monosomics gave a plant phenotype very similar to the original donor, but the nullisomics had much smaller stature and also shorter growth period. By the comparative analyses on the global transcript profiles with the euploid donor, genome-wide alterations in gene expression were revealed in two aneuploids, and their majority of differentially expressed genes (DEGs) resulted from the trans-acting effects of the zero and one copy of C2 chromosome. The higher number of up-regulated genes than down-regulated genes on other chromosomes suggested that the genome responded to the C2 loss via enhancing the expression of certain genes. Particularly, more DEGs were detected in the monosomics than nullisomics, contrasting with their phenotypes. The gene expression of the other chromosomes was differently affected, and several dysregulated domains in which up- or downregulated genes obviously clustered were identifiable. But the mean gene expression (MGE) for homoeologous chromosome A2 reduced with the C2 loss. Some genes and their expressions on C2 were correlated with the phenotype deviations in the aneuploids. These results provided new insights into the transcriptomic perturbation of the allopolyploid genome elicited by the loss of individual chromosome. PMID:26442076

  3. INVESTIGATIONS INTO MOLECULAR PATHWAYS IN THE POST GENOME ERA: CROSS SPECIES COMPARATIVE GENOMICS APPROACH

    EPA Science Inventory


    Genome sequencing efforts in the past decade were aimed at generating draft sequences of many prokaryotic and eukaryotic model organisms. Successful completion of unicellular eukaryotes, worm, fly and human genome have opened up the new field of molecular biology and function...

  4. Mitochondrial genome sequences and comparative genomics of Phytophthora ramorum and P. sojae

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complete sequences of the mitochondrial genomes of the oomycetes Phytophthora ramorum and P. sojae were determined during the course of their complete nuclear genome sequencing (Tyler et al. 2006). Both are circular, with sizes of 39,314 bp for P. ramorum and 42,977 bp for P. sojae. Each contain...

  5. Comparative Genomics Identifies Epidermal Proteins Associated with the Evolution of the Turtle Shell.

    PubMed

    Holthaus, Karin Brigit; Strasser, Bettina; Sipos, Wolfgang; Schmidt, Heiko A; Mlitz, Veronika; Sukseree, Supawadee; Weissenbacher, Anton; Tschachler, Erwin; Alibardi, Lorenzo; Eckhart, Leopold

    2016-03-01

    The evolution of reptiles, birds, and mammals was associated with the origin of unique integumentary structures. Studies on lizards, chicken, and humans have suggested that the evolution of major structural proteins of the outermost, cornified layers of the epidermis was driven by the diversification of a gene cluster called Epidermal Differentiation Complex (EDC). Turtles have evolved unique defense mechanisms that depend on mechanically resilient modifications of the epidermis. To investigate whether the evolution of the integument in these reptiles was associated with specific adaptations of the sequences and expression patterns of EDC-related genes, we utilized newly available genome sequences to determine the epidermal differentiation gene complement of turtles. The EDC of the western painted turtle (Chrysemys picta bellii) comprises more than 100 genes, including at least 48 genes that encode proteins referred to as beta-keratins or corneous beta-proteins. Several EDC proteins have evolved cysteine/proline contents beyond 50% of total amino acid residues. Comparative genomics suggests that distinct subfamilies of EDC genes have been expanded and partly translocated to loci outside of the EDC in turtles. Gene expression analysis in the European pond turtle (Emys orbicularis) showed that EDC genes are differentially expressed in the skin of the various body sites and that a subset of beta-keratin genes within the EDC as well as those located outside of the EDC are expressed predominantly in the shell. Our findings give strong support to the hypothesis that the evolutionary innovation of the turtle shell involved specific molecular adaptations of epidermal differentiation. PMID:26601937

  6. Comparative Genomics Identifies Epidermal Proteins Associated with the Evolution of the Turtle Shell

    PubMed Central

    Holthaus, Karin Brigit; Strasser, Bettina; Sipos, Wolfgang; Schmidt, Heiko A.; Mlitz, Veronika; Sukseree, Supawadee; Weissenbacher, Anton; Tschachler, Erwin; Alibardi, Lorenzo; Eckhart, Leopold

    2016-01-01

    The evolution of reptiles, birds, and mammals was associated with the origin of unique integumentary structures. Studies on lizards, chicken, and humans have suggested that the evolution of major structural proteins of the outermost, cornified layers of the epidermis was driven by the diversification of a gene cluster called Epidermal Differentiation Complex (EDC). Turtles have evolved unique defense mechanisms that depend on mechanically resilient modifications of the epidermis. To investigate whether the evolution of the integument in these reptiles was associated with specific adaptations of the sequences and expression patterns of EDC-related genes, we utilized newly available genome sequences to determine the epidermal differentiation gene complement of turtles. The EDC of the western painted turtle (Chrysemys picta bellii) comprises more than 100 genes, including at least 48 genes that encode proteins referred to as beta-keratins or corneous beta-proteins. Several EDC proteins have evolved cysteine/proline contents beyond 50% of total amino acid residues. Comparative genomics suggests that distinct subfamilies of EDC genes have been expanded and partly translocated to loci outside of the EDC in turtles. Gene expression analysis in the European pond turtle (Emys orbicularis) showed that EDC genes are differentially expressed in the skin of the various body sites and that a subset of beta-keratin genes within the EDC as well as those located outside of the EDC are expressed predominantly in the shell. Our findings give strong support to the hypothesis that the evolutionary innovation of the turtle shell involved specific molecular adaptations of epidermal differentiation. PMID:26601937

  7. Whole genome expression profiling in chewing-tobacco-associated oral cancers: a pilot study.

    PubMed

    Chakrabarti, Sanjukta; Multani, Shaleen; Dabholkar, Jyoti; Saranath, Dhananjaya

    2015-03-01

    The current study was undertaken with a view to identify differential biomarkers in chewing-tobacco-associated oral cancer tissues in patients of Indian ethnicity. The gene expression profile was analyzed in oral cancer tissues as compared to clinically normal oral buccal mucosa. We examined 30 oral cancer tissues and 27 normal oral tissues with 16 paired samples from contralateral site of the patient and 14 unpaired samples from different oral cancer patients, for whole genome expression using high-throughput IlluminaSentrix Human Ref-8 v2 Expression BeadChip array. The cDNA microarray analysis identified 425 differentially expressed genes with >1.5-fold expression in the oral cancer tissues as compared to normal tissues in the oral cancer patients. Overexpression of 255 genes and downregulation of 170 genes (p < 0.01) were observed. Further, a minimum twofold overexpression was observed in 32 genes and downregulation in 12 genes, in 30-83% of oral cancer patients. Biological pathway analysis using Kyoto Encyclopedia of Genes and Genome Pathway database revealed that the differentially regulated genes were associated with critical biological functions. The biological functions and representative deregulated genes include cell proliferation (AIM2, FAP, TNFSF13B, TMPRSS11A); signal transduction (FOLR2, MME, HTR3B); invasion and metastasis (SPP1, TNFAIP6, EPHB6); differentiation (CLEC4A, ELF5); angiogenesis (CXCL1); apoptosis (GLIPR1, WISP1, DAPL1); and immune responses (CD300A, IFIT2, TREM2); and metabolism (NNMT; ALDH3A1). Besides, several of the genes have been differentially expressed in human cancers including oral cancer. Our data indicated differentially expressed genes in oral cancer tissues and may identify prognostic and therapeutic biomarkers in oral cancers, postvalidation in larger numbers and varied population samples. PMID:25663065

  8. Correction: Comparative analysis of fungal genomes reveals different plant cell wall degrading capacity in fungi

    PubMed Central

    2014-01-01

    Abstract The version of this article published in BMC Genomics 2013, 14: 274, contains 9 unpublished genomes (Botryobasidium botryosum, Gymnopus luxurians, Hypholoma sublateritium, Jaapia argillacea, Hebeloma cylindrosporum, Conidiobolus coronatus, Laccaria amethystina, Paxillus involutus, and P. rubicundulus) downloaded from JGI website. In this correction, we removed these genomes after discussion with editors and data producers whom we should have contacted before downloading these genomes. Removing these data did not alter the principle results and conclusions of our original work. The relevant Figures 1, 2, 3, 4 and 6; and Table 1 have been revised. Additional files 1, 3, 4, and 5 were also revised. We would like to apologize for any confusion or inconvenience this may have caused. Background Fungi produce a variety of carbohydrate activity enzymes (CAZymes) for the degradation of plant polysaccharide materials to facilitate infection and/or gain nutrition. Identifying and comparing CAZymes from fungi with different nutritional modes or infection mechanisms may provide information for better understanding of their life styles and infection models. To date, over hundreds of fungal genomes are publicly available. However, a systematic comparative analysis of fungal CAZymes across the entire fungal kingdom has not been reported. Results In this study, we systemically identified glycoside hydrolases (GHs), polysaccharide lyases (PLs), carbohydrate esterases (CEs), and glycosyltransferases (GTs) as well as carbohydrate-binding modules (CBMs) in the predicted proteomes of 94 representative fungi from Ascomycota, Basidiomycota, Chytridiomycota, and Zygomycota. Comparative analysis of these CAZymes that play major roles in plant polysaccharide degradation revealed that fungi exhibit tremendous diversity in the number and variety of CAZymes. Among them, some families of GHs and CEs are the most prevalent CAZymes that are distributed in all of the fungi analyzed

  9. Comparative genomics reveals conserved positioning of essential genomic clusters in highly rearranged Thermococcales chromosomes

    PubMed Central

    Cossu, Matteo; Da Cunha, Violette; Toffano-Nioche, Claire; Forterre, Patrick; Oberto, Jacques

    2015-01-01

    The genomes of the 21 completely sequenced Thermococcales display a characteristic high level of rearrangements. As a result, the prediction of their origin and termination of replication on the sole basis of chromosomal DNA composition or skew is inoperative. Using a different approach based on biologically relevant sequences, we were able to determine oriC position in all 21 genomes. The position of dif, the site where chromosome dimers are resolved before DNA segregation could be predicted in 19 genomes. Computation of the core genome uncovered a number of essential gene clusters with a remarkably stable chromosomal position across species, in sharp contrast with the scrambled nature of their genomes. The active chromosomal reorganization of numerous genes acquired by horizontal transfer, mainly from mobile elements, could explain this phenomenon. PMID:26166067

  10. Comparative genomics of the Flavobacterium psychrophilum Frp locus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Flavobacterium psychrophilum is a Gram-negative, bacterium responsible for cold water disease and rainbow trout fry syndrome. These diseases are a source of significant economic loss in cool water aquaculture. Recently, the genome sequences of two strains, JIP02/86 and CSF259-93, have been determine...

  11. Comparative population genomics of maize domestication and improvement

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Domestication and modern breeding represent exemplary case studies of evolution in action. Maize is an outcrossing species with a complex genome, and an understanding of maize evolution is thus relevant for both plant and animal systems. This study is the largest plant resequencing effort to date, ...

  12. Comparative Analysis of Alu Repeats in Primate Genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: Alu repeats are SINEs (Short intersperse repetitive elements) which enjoy a successful application in genome evolution, population biology, phylogenetics and forensics. Human Alu consensus sequences were widely used as surrogates in nonhuman primate studies with an assumption that all p...

  13. Comparative Analysis of Genome Diversity in Bullmastiff Dogs

    PubMed Central

    Mortlock, Sally-Anne; Khatkar, Mehar S.; Williamson, Peter

    2016-01-01

    Management and preservation of genomic diversity in dog breeds is a major objective for maintaining health. The present study was undertaken to characterise genomic diversity in Bullmastiff dogs using both genealogical and molecular analysis. Genealogical analysis of diversity was conducted using a database consisting of 16,378 Bullmastiff pedigrees from year 1980 to 2013. Additionally, a total of 188 Bullmastiff dogs were genotyped using the 170,000 SNP Illumina CanineHD Beadchip. Genealogical parameters revealed a mean inbreeding coefficient of 0.047; 142 total founders (f); an effective number of founders (fe) of 79; an effective number of ancestors (fa) of 62; and an effective population size of the reference population of 41. Genetic diversity and the degree of genome-wide homogeneity within the breed were also investigated using molecular data. Multiple-locus heterozygosity (MLH) was equal to 0.206; runs of homozygosity (ROH) as proportion of the genome, averaged 16.44%; effective population size was 29.1, with an average inbreeding coefficient of 0.035, all estimated using SNP Data. Fine-scale population structure was analysed using NETVIEW, a population analysis pipeline. Visualisation of the high definition network captured relationships among individuals within and between subpopulations. Effects of unequal founder use, and ancestral inbreeding and selection, were evident. While current levels of Bullmastiff heterozygosity, inbreeding and homozygosity are not unusual, a relatively small effective population size indicates that a breeding strategy to reduce the inbreeding rate may be beneficial. PMID:26824579

  14. Genomic Comparative Study of Bovine Mastitis Escherichia coli

    PubMed Central

    Kempf, Florent; Slugocki, Cindy; Blum, Shlomo E.; Leitner, Gabriel; Germon, Pierre

    2016-01-01

    Escherichia coli, one of the main causative agents of bovine mastitis, is responsible for significant losses on dairy farms. In order to better understand the pathogenicity of E. coli mastitis, an accurate characterization of E. coli strains isolated from mastitis cases is required. By using phylogenetic analyses and whole genome comparison of 5 currently available mastitis E. coli genome sequences, we searched for genotypic traits specific for mastitis isolates. Our data confirm that there is a bias in the distribution of mastitis isolates in the different phylogenetic groups of the E. coli species, with the majority of strains belonging to phylogenetic groups A and B1. An interesting feature is that clustering of strains based on their accessory genome is very similar to that obtained using the core genome. This finding illustrates the fact that phenotypic properties of strains from different phylogroups are likely to be different. As a consequence, it is possible that different strategies could be used by mastitis isolates of different phylogroups to trigger mastitis. Our results indicate that mastitis E. coli isolates analyzed in this study carry very few of the virulence genes described in other pathogenic E. coli strains. A more detailed analysis of the presence/absence of genes involved in LPS synthesis, iron acquisition and type 6 secretion systems did not uncover specific properties of mastitis isolates. Altogether, these results indicate that mastitis E. coli isolates are rather characterized by a lack of bona fide currently described virulence genes. PMID:26809117

  15. Cloud Computing for Comparative Genomics with Windows Azure Platform

    PubMed Central

    Kim, Insik; Jung, Jae-Yoon; DeLuca, Todd F.; Nelson, Tristan H.; Wall, Dennis P.

    2012-01-01

    Cloud computing services have emerged as a cost-effective alternative for cluster systems as the number of genomes and required computation power to analyze them increased in recent years. Here we introduce the Microsoft Azure platform with detailed execution steps and a cost comparison with Amazon Web Services. PMID:23032609

  16. Comparative genomics of mutualistic viruses of Glyptapanteles parasitic wasps

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Polydnaviruses, a family of double-stranded DNA viruses with segmented genomes, have evolved as obligate endosymbionts of endoparasitoid wasps, and are some of the few viruses known to share mutualistic relationships with eukaryotic hosts. Virus particles are replication deficient and are produced o...

  17. Comparative Analysis of Genome Diversity in Bullmastiff Dogs.

    PubMed

    Mortlock, Sally-Anne; Khatkar, Mehar S; Williamson, Peter

    2016-01-01

    Management and preservation of genomic diversity in dog breeds is a major objective for maintaining health. The present study was undertaken to characterise genomic diversity in Bullmastiff dogs using both genealogical and molecular analysis. Genealogical analysis of diversity was conducted using a database consisting of 16,378 Bullmastiff pedigrees from year 1980 to 2013. Additionally, a total of 188 Bullmastiff dogs were genotyped using the 170,000 SNP Illumina CanineHD Beadchip. Genealogical parameters revealed a mean inbreeding coefficient of 0.047; 142 total founders (f); an effective number of founders (fe) of 79; an effective number of ancestors (fa) of 62; and an effective population size of the reference population of 41. Genetic diversity and the degree of genome-wide homogeneity within the breed were also investigated using molecular data. Multiple-locus heterozygosity (MLH) was equal to 0.206; runs of homozygosity (ROH) as proportion of the genome, averaged 16.44%; effective population size was 29.1, with an average inbreeding coefficient of 0.035, all estimated using SNP Data. Fine-scale population structure was analysed using NETVIEW, a population analysis pipeline. Visualisation of the high definition network captured relationships among individuals within and between subpopulations. Effects of unequal founder use, and ancestral inbreeding and selection, were evident. While current levels of Bullmastiff heterozygosity, inbreeding and homozygosity are not unusual, a relatively small effective population size indicates that a breeding strategy to reduce the inbreeding rate may be beneficial. PMID:26824579

  18. Evaluating Theobroma grandiflorum for comparative genomic studies with Theobroma cacao

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The seeds of Theobroma cacao (cacao) are the source of cocoa, the raw material for the multi-billion dollar chocolate industry. Cacao’s two most important traits are its unique seed storage triglyceride (cocoa butter) and the flavor of its fermented beans (chocolate). The genome of T. cacao is bei...

  19. Fertility and genomics: comparison of gene expression in contrasting reproductive tissues of female cattle.

    PubMed

    McGettigan, P A; Browne, J A; Carrington, S D; Crowe, M A; Fair, T; Forde, N; Loftus, B J; Lohan, A; Lonergan, P; Pluta, K; Mamo, S; Murphy, A; Roche, J; Walsh, S W; Creevey, C J; Earley, B; Keady, S; Kenny, D A; Matthews, D; McCabe, M; Morris, D; O'Loughlin, A; Waters, S; Diskin, M G; Evans, A C O

    2016-01-01

    To compare gene expression among bovine tissues, large bovine RNA-seq datasets were used, comprising 280 samples from 10 different bovine tissues (uterine endometrium, granulosa cells, theca cells, cervix, embryos, leucocytes, liver, hypothalamus, pituitary, muscle) and generating 260 Gbases of data. Twin approaches were used: an information-theoretic analysis of the existing annotated transcriptome to identify the most tissue-specific genes and a de-novo transcriptome annotation to evaluate general features of the transcription landscape. Expression was detected for 97% of the Ensembl transcriptome with at least one read in one sample and between 28% and 66% at a level of 10 tags per million (TPM) or greater in individual tissues. Over 95% of genes exhibited some level of tissue-specific gene expression. This was mostly due to different levels of expression in different tissues rather than exclusive expression in a single tissue. Less than 1% of annotated genes exhibited a highly restricted tissue-specific expression profile and approximately 2% exhibited classic housekeeping profiles. In conclusion, it is the combined effects of the variable expression of large numbers of genes (73%-93% of the genome) and the specific expression of a small number of genes (<1% of the transcriptome) that contribute to determining the outcome of the function of individual tissues. PMID:27062871

  20. Modeling Genome-Scale mRNA Expression Datasets: From Matrix Algebra to Genetic Networks

    NASA Astrophysics Data System (ADS)

    Alter, Orly

    2003-03-01

    DNA microarray genome-wide expression data promise to enhance fundamental understanding of life on the molecular level, and may prove useful in medical diagnosis, treatment and drug design. Analysis of these new data requires mathematical tools that use large quantities of data and reduce the complexity of the data to make them comprehensible. These tools should provide predictive models, i.e., mathematical frameworks for the description of the data, in which the mathematical variables and operations may be assigned biological meaning. Such models will facilitate the unraveling of the cellular machineries that generate, sense and react to the expression signal. I will start with a description of the use of singular value decomposition (SVD) to construct the first model for genome-wide expression data. SVD is a unique data-driven linear transformation of the expression data from the genes × arrays space to the reduced ``eigengenes'' × ``eigenarrays'' space, where the eigengenes (eigenarrays) are unique orthonormal superpositions of the genes (arrays). Normalizing the data, by detecting and filtering out additive and multiplicative experimental artifacts and irrelevant biological processes, enables meaningful comparison of the expression of different genes across different arrays in different experiments. Sorting the data, according to a chosen subset of eigengenes (and eigenarrays), rather than by overall expression, gives a global picture of gene expression in which individual genes and arrays appear to be classified into groups of similar regulation and function, or similar cellular state and biological phenotype, respectively. In some experiments, the significant eigengenes and eigenarrays can be associated with genome-wide effects of regulators, or with measured samples in which these regulators are overactive or underactive, respectively. I will then describe the use of generalized singular value decomposition (GSVD) to construct the first comparative model

  1. Comparative genomics of Ceriporiopsis subvermispora and Phanerochaete chrysosporium provide insight into selective ligninolysis

    SciTech Connect

    Fernandez-Fueyo, Elena; Ruiz-Duenas, Francisco J.; Ferreira, Patrica; Floudas, Dimitrios; HIbbett, David S.; Canessa, Paulo; Larrondo, Luis F.; James, Tim Y.; Seelenfreund, Daniela; Lobos, Sergio; Polanco, Ruben; Tello, Mario; Honda, Yoichi; Watanabe, Takahito; Watanabe, Takashi; Ryu, Jae San; Kubicek, Christian P.; Schmoll, Monika; Gaskell, Jill; Hammel, Kenneth E.; John, Franz J.; Vanden Wymelenberg, Amber; Sabat, Grzegorz; Splinter BonDurant, Sandra; Syed, Khajamohiddin; Yadav, Jagjit S.; Doddapaneni, Harshavardhan; Subramanian, Venkataramanan; Lavin, Jose L.; Oguiza, Jose A.; Perez, Gumer; Pisabarro, Antonio G.; Ramirez, Lucia; Santoyo, Francisco; Master, Emma; Coutinho, Pedro M.; Henrissat, Bernard; Lombard, Vincent; Magnuson, Jon Karl; Kues, Ursula; Hori, Chiaki; Igarashi, Kiyohiko; Samejima, Masahiro; Held, Benjamin W.; Barry, Kerrie W.; LaButti, Kurt M.; Lapidus, Alla; Lindquist, Erika A.; Lucas, Susan M.; Riley, Robert; Salamov, Asaf A.; Hoffmeister, Dirk; Schwenk, Daniel; Hadar, Yitzhak; Yarden, Oded; de Vries, Ronald P.; Wiebenga, Ad; Stenlid, Jan; Eastwood, Daniel; Grigoriev, Igor V.; Berka, Randy M.; Blanchette, Robert A.; Kersten, Phil; Martinez, Angel T.; Vicuna, Rafael; Cullen, Dan

    2011-12-06

    Efficient lignin depolymerization is unique to the wood decay basidiomycetes, collectively referred to as white rot fungi. Phanerochaete chrysosporium simultaneously degrades lignin and cellulose, whereas the closely related species, Ceriporiopsis subvermispora, also depolymerizes lignin but may do so with relatively little cellulose degradation. To investigate the basis for selective ligninolysis, we conducted comparative genome analysis of C. subvermispora and P. chrysosporium. Genes encoding manganese peroxidase numbered 13 and five in C. subvermispora and P. chrysosporium, respectively. In addition, the C. subvermispora genome contains at least seven genes predicted to encode laccases, whereas the P. chrysosporium genome contains none. We also observed expansion of the number of C. subvermispora desaturase-encoding genes putatively involved in lipid metabolism. Microarray-based transcriptome analysis showed substantial up-regulation of several desaturase and MnP genes in wood-containing medium. MS identified MnP proteins in C. subvermispora culture filtrates, but none in P. chrysosporium cultures. These results support the importance of MnP and a lignin degradation mechanism whereby cleavage of the dominant nonphenolic structures is mediated by lipid peroxidation products. Two C. subvermispora genes were predicted to encode peroxidases structurally similar to P. chrysosporium lignin peroxidase and, following heterologous expression in Escherichia coli, the enzymes were shown to oxidize high redox potential substrates, but not Mn2. Apart from oxidative lignin degradation, we also examined cellulolytic and hemicellulolytic systems in both fungi. In summary, the C. subvermispora genetic inventory and expression patterns exhibit increased oxidoreductase potential and diminished cellulolytic capability relative to P. chrysosporium.

  2. Comparative genomics of clinical and environmental Vibrio mimicus.

    PubMed

    Hasan, Nur A; Grim, Christopher J; Haley, Bradd J; Chun, Jongsik; Alam, Munirul; Taviani, Elisa; Hoq, Mozammel; Munk, A Christine; Saunders, Elizabeth; Brettin, Thomas S; Bruce, David C; Challacombe, Jean F; Detter, J Chris; Han, Cliff S; Xie, Gary; Nair, G Balakrish; Huq, Anwar; Colwell, Rita R

    2010-12-01

    Whether Vibrio mimicus is a variant of Vibrio cholerae or a separate species has been the subject of taxonomic controversy. A genomic analysis was undertaken to resolve the issue. The genomes of V. mimicus MB451, a clinical isolate, and VM223, an environmental isolate, comprise ca. 4,347,971 and 4,313,453 bp and encode 3,802 and 3,290 ORFs, respectively. As in other vibrios, chromosome I (C-I) predominantly contains genes necessary for growth and viability, whereas chromosome II (C-II) bears genes for adaptation to environmental change. C-I harbors many virulence genes, including some not previously reported in V. mimicus, such as mannose-sensitive hemagglutinin (MSHA), and enterotoxigenic hemolysin (HlyA); C-II encodes a variant of Vibrio pathogenicity island 2 (VPI-2), and Vibrio seventh pandemic island II (VSP-II) cluster of genes. Extensive genomic rearrangement in C-II indicates it is a hot spot for evolution and genesis of speciation for the genus Vibrio. The number of virulence regions discovered in this study (VSP-II, MSHA, HlyA, type IV pilin, PilE, and integron integrase, IntI4) with no notable difference in potential virulence genes between clinical and environmental strains suggests these genes also may play a role in the environment and that pathogenic strains may arise in the environment. Significant genome synteny with prototypic pre-seventh pandemic strains of V. cholerae was observed, and the results of phylogenetic analysis support the hypothesis that, in the course of evolution, V. mimicus and V. cholerae diverged from a common ancestor with a prototypic sixth pandemic genomic backbone. PMID:21078967

  3. Recombinant genomes which express chloramphenicol acetyltransferase in mammalian cells

    SciTech Connect

    Gorman, C.M.; Moffat, L.F.; Howard, B.H.

    1982-09-01

    The authors constructed a series of recombinant genomes which directed expression of the enzyme chloramphenicol acetyltransferase (CAT) in mammalian cells. The prototype recombinant in this series, pSV2-cat, consisted of the beta-lactamase gene and origin of replication from pBR322 coupled to a simian virus 40 (SV40) early transcription region into which CAT coding sequences were inserted. Readily measured levels of CAT accumulated within 48 h after the introduction of pSV2-cat DNA into African green monkey kidney CV-1 cells. Because endogenous CAT activity is not present in CV-1 or other mammalian cells, and because rapid, sensitive assays for CAT activity are available, these recombinants provided a uniquely convenient system for monitoring the expression of foreign DNAs in tissue culture cells. To demonstrate the usefulness of this system, we constructed derivatives of pSV2-cat from which part or all of the SV 40 promoter region was removed. Deletion of one copy of the 72-base-pair repeat sequence in the SV40 promoter caused no significant decrease in CAT synthesis in monkey kidney CV-1 cells; however, an additional deletion of 50 base pairs from the second copy of the repeats reduced CAT synthesis to 11% of its level in the wild type. They also constructed a recombinant, pSVO-cat, in which the entire SV40 promoter region was removed and a unique HindIII site was substituted for the insertion of other promoter sequences.

  4. Comparative genomic analysis of novel Acinetobacter symbionts: A combined systems biology and genomics approach.

    PubMed

    Gupta, Vipin; Haider, Shazia; Sood, Utkarsh; Gilbert, Jack A; Ramjee, Meenakshi; Forbes, Ken; Singh, Yogendra; Lopes, Bruno S; Lal, Rup

    2016-01-01

    The increasing trend of antibiotic resistance in Acinetobacter drastically limits the range of therapeutic agents required to treat multidrug resistant (MDR) infections. This study focused on analysis of novel Acinetobacter strains using a genomics and systems biology approach. Here we used a network theory method for pathogenic and non-pathogenic Acinetobacter spp. to identify the key regulatory proteins (hubs) in each strain. We identified nine key regulatory proteins, guaA, guaB, rpsB, rpsI, rpsL, rpsE, rpsC, rplM and trmD, which have functional roles as hubs in a hierarchical scale-free fractal protein-protein interaction network. Two key hubs (guaA and guaB) were important for insect-associated strains, and comparative analysis identified guaA as more important than guaB due to its role in effective module regulation. rpsI played a significant role in all the novel strains, while rplM was unique to sheep-associated strains. rpsM, rpsB and rpsI were involved in the regulation of overall network topology across all Acinetobacter strains analyzed in this study. Future analysis will investigate whether these hubs are useful as drug targets for treating Acinetobacter infections. PMID:27378055

  5. Comparative genomic analysis of novel Acinetobacter symbionts: A combined systems biology and genomics approach

    PubMed Central

    Gupta, Vipin; Haider, Shazia; Sood, Utkarsh; Gilbert, Jack A.; Ramjee, Meenakshi; Forbes, Ken; Singh, Yogendra; Lopes, Bruno S.; Lal, Rup

    2016-01-01

    The increasing trend of antibiotic resistance in Acinetobacter drastically limits the range of therapeutic agents required to treat multidrug resistant (MDR) infections. This study focused on analysis of novel Acinetobacter strains using a genomics and systems biology approach. Here we used a network theory method for pathogenic and non-pathogenic Acinetobacter spp. to identify the key regulatory proteins (hubs) in each strain. We identified nine key regulatory proteins, guaA, guaB, rpsB, rpsI, rpsL, rpsE, rpsC, rplM and trmD, which have functional roles as hubs in a hierarchical scale-free fractal protein-protein interaction network. Two key hubs (guaA and guaB) were important for insect-associated strains, and comparative analysis identified guaA as more important than guaB due to its role in effective module regulation. rpsI played a significant role in all the novel strains, while rplM was unique to sheep-associated strains. rpsM, rpsB and rpsI were involved in the regulation of overall network topology across all Acinetobacter strains analyzed in this study. Future analysis will investigate whether these hubs are useful as drug targets for treating Acinetobacter infections. PMID:27378055

  6. Microarray-Based Comparative Genomic and Transcriptome Analysis of Borrelia burgdorferi

    PubMed Central

    Iyer, Radha; Schwartz, Ira

    2016-01-01

    Borrelia burgdorferi, the spirochetal agent of Lyme disease, is maintained in nature in a cycle involving a tick vector and a mammalian host. Adaptation to the diverse conditions of temperature, pH, oxygen tension and nutrient availability in these two environments requires the precise orchestration of gene expression. Over 25 microarray analyses relating to B. burgdorferi genomics and transcriptomics have been published. The majority of these studies has explored the global transcriptome under a variety of conditions and has contributed substantially to the current understanding of B. burgdorferi transcriptional regulation. In this review, we present a summary of these studies with particular focus on those that helped define the roles of transcriptional regulators in modulating gene expression in the tick and mammalian milieus. By performing comparative analysis of results derived from the published microarray expression profiling studies, we identified composite gene lists comprising differentially expressed genes in these two environments. Further, we explored the overlap between the regulatory circuits that function during the tick and mammalian phases of the enzootic cycle. Taken together, the data indicate that there is interplay among the distinct signaling pathways that function in feeding ticks and during adaptation to growth in the mammal. PMID:27600075

  7. Microarray-Based Comparative Genomic and Transcriptome Analysis of Borrelia burgdorferi.

    PubMed

    Iyer, Radha; Schwartz, Ira

    2016-01-01

    Borrelia burgdorferi, the spirochetal agent of Lyme disease, is maintained in nature in a cycle involving a tick vector and a mammalian host. Adaptation to the diverse conditions of temperature, pH, oxygen tension and nutrient availability in these two environments requires the precise orchestration of gene expression. Over 25 microarray analyses relating to B. burgdorferi genomics and transcriptomics have been published. The majority of these studies has explored the global transcriptome under a variety of conditions and has contributed substantially to the current understanding of B. burgdorferi transcriptional regulation. In this review, we present a summary of these studies with particular focus on those that helped define the roles of transcriptional regulators in modulating gene expression in the tick and mammalian milieus. By performing comparative analysis of results derived from the published microarray expression profiling studies, we identified composite gene lists comprising differentially expressed genes in these two environments. Further, we explored the overlap between the regulatory circuits that function during the tick and mammalian phases of the enzootic cycle. Taken together, the data indicate that there is interplay among the distinct signaling pathways that function in feeding ticks and during adaptation to growth in the mammal. PMID:27600075

  8. Exercise and gene expression: physiological regulation of the human genome through physical activity

    PubMed Central

    Booth, Frank W; Chakravarthy, Manu V; Spangenburg, Espen E

    2002-01-01

    The current human genome was moulded and refined through generations of time. We propose that the basic framework for physiologic gene regulation was selected during an era of obligatory physical activity, as the survival of our Late Palaeolithic (50 000–10 000 BC) ancestors depended on hunting and gathering. A sedentary lifestyle in such an environment probably meant elimination of that individual organism. The phenotype of the present day Homo sapiens genome is much different from that of our ancient ancestors, primarily as a consequence of expressing evolutionarily programmed Late Palaeolithic genes in an environment that is predominantly sedentary. In this sense, our current genome is maladapted, resulting in abnormal gene expression, which in turn frequently manifests itself as clinically overt disease. We speculate that some of these genes still play a role in survival by causing premature death from chronic diseases produced by physical inactivity. We also contend that the current scientific evidence supports the notion that disruptions in cellular homeostasis are diminished in magnitude in physically active individuals compared with sedentary individuals due to the natural selection of gene expression that supports the physically active lifestyle displayed by our ancestors. We speculate that genes evolved with the expectation of requiring a certain threshold of physical activity for normal physiologic gene expression, and thus habitual exercise in sedentary cultures restores perturbed homeostatic mechanisms towards the normal physiological range of the Palaeolithic Homo sapiens. This hypothesis allows us to ask the question of whether normal physiological values change as a result of becoming sedentary. In summary, in sedentary cultures, daily physical activity normalizes gene expression towards patterns established to maintain the survival in the Late Palaeolithic era. PMID:12205177

  9. Comparative genomics in chicken and Pekin duck using FISH mapping and microarray analysis

    PubMed Central

    2009-01-01

    Background The availability of the complete chicken (Gallus gallus) genome sequence as well as a large number of chicken probes for fluorescent in-situ hybridization (FISH) and microarray resources facilitate comparative genomic studies between chicken and other bird species. In a previous study, we provided a comprehensive cytogenetic map for the turkey (Meleagris gallopavo) and the first analysis of copy number variants (CNVs) in birds. Here, we extend this approach to the Pekin duck (Anas platyrhynchos), an obvious target for comparative genomic studies due to its agricultural importance and resistance to avian flu. Results We provide a detailed molecular cytogenetic map of the duck genome through FISH assignment of 155 chicken clones. We identified one inter- and six intrachromosomal rearrangements between chicken and duck macrochromosomes and demonstrated conserved synteny among all microchromosomes analysed. Array comparative genomic hybridisation revealed 32 CNVs, of which 5 overlap previously designated "hotspot" regions between chicken and turkey. Conclusion Our results suggest extensive conservation of avian genomes across 90 million years of evolution in both macro- and microchromosomes. The data on CNVs between chicken and duck extends previous analyses in chicken and turkey and supports the hypotheses that avian genomes contain fewer CNVs than mammalian genomes and that genomes of evolutionarily distant species share regions of copy number variation ("CNV hotspots"). Our results will expedite duck genomics, assist marker development and highlight areas of interest for future evolutionary and functional studies. PMID:19656363

  10. Genomic Sequencing and Comparative Analysis of Epstein-Barr Virus Genome Isolated from Primary Nasopharyngeal Carcinoma Biopsy

    PubMed Central

    Kwok, Hin; Tong, Amy H. Y.; Lin, Chi Ho; Lok, Si; Farrell, Paul J.; Kwong, Dora L. W.; Chiang, Alan K. S.

    2012-01-01

    Whether certain Epstein-Barr virus (EBV) strains are associated with pathogenesis of nasopharyngeal carcinoma (NPC) is still an unresolved question. In the present study, EBV genome contained in a primary NPC tumor biopsy was amplified by Polymerase Chain Reaction (PCR), and sequenced using next-generation (Illumina) and conventional dideoxy-DNA sequencing. The EBV genome, designated HKNPC1 (Genbank accession number JQ009376) is a type 1 EBV of approximately 171.5 kb. The virus appears to be a uniform strain in line with accepted monoclonal nature of EBV in NPC but is heterogeneous at 172 nucleotide positions. Phylogenetic analysis with the four published EBV strains, B95-8, AG876, GD1, and GD2, indicated HKNPC1 was more closely related to the Chinese NPC patient-derived strains, GD1 and GD2. HKNPC1 contains 1,589 single nucleotide variations (SNVs) and 132 insertions or deletions (indels) in comparison to the reference EBV sequence (accession number NC007605). When compared to AG876, a strain derived from Ghanaian Burkitt's lymphoma, we found 322 SNVs, of which 76 were non-synonymous SNVs and were shared amongst the Chinese GD1, GD2 and HKNPC1 isolates. We observed 88 non-synonymous SNVs shared only by HKNPC1 and GD2, the only other NPC tumor-derived strain reported thus far. Non-synonymous SNVs were mainly found in the latent, tegument and glycoprotein genes. The same point mutations were found in glycoprotein (BLLF1 and BALF4) genes of GD1, GD2 and HKNPC1 strains and might affect cell type specific binding. Variations in LMP1 and EBNA3B epitopes and mutations in Cp (11404 C>T) and Qp (50134 G>C) found in GD1, GD2 and HKNPC1 could potentially affect CD8+ T cell recognition and latent gene expression pattern in NPC, respectively. In conclusion, we showed that whole genome sequencing of EBV in NPC may facilitate discovery of previously unknown variations of pathogenic significance. PMID:22590638

  11. Abundant and broad expression of transcription-induced chimeras and protein products in mammalian genomes.

    PubMed

    Lu, Guanting; Wu, Jin; Zhao, Gangbin; Wang, Zhiqiang; Chen, Weihua; Mu, Shijie

    2016-02-12

    The expression of transcription-induced chimeras (TICs) was underestimated due to strategic and logical reasons. In order to thoroughly examine TICs, systematic survey of TIC events was conducted in mammalian genomes using ESTs, followed by experimental validation using RT-PCR and real-time quantitative PCR (qPCR). The expression of ∼98% TIC events in at least one tissue or cell line from both mouse and human was verified. Besides, ∼40% TICs were broadly expressed, and ∼33% of TICs showed expression levels comparable to or higher than their upstream parental genes. We further identified putative chimeric proteins in public databases and validated two using Western blotting. GO analysis revealed that proteins resided in one multi-protein complex or functioning in metabolic or signaling pathway tended to produce fused products. Taken together, we have shown substantial evidence for the underestimated TIC events; and TICs could be a novel regulated way to further increases the proteome complexity in mammalian genomes. Possible regulation mechanisms and evolution of TICs were also discussed. PMID:26718406

  12. Comparative Genomics of the Staphylococcus intermedius Group of Animal Pathogens

    PubMed Central

    Ben Zakour, Nouri L.; Beatson, Scott A.; van den Broek, Adri H. M.; Thoday, Keith L.; Fitzgerald, J. Ross

    2012-01-01

    The Staphylococcus intermedius group consists of three closely related coagulase-positive bacterial species including S. intermedius, Staphylococcus pseudintermedius, and Staphylococcus delphini. S. pseudintermedius is a major skin pathogen of dogs, which occasionally causes severe zoonotic infections of humans. S. delphini has been isolated from an array of different animals including horses, mink, and pigeons, whereas S. intermedius has been isolated only from pigeons to date. Here we provide a detailed analysis of the S. pseudintermedius whole genome sequence in comparison to high quality draft S. intermedius and S. delphini genomes, and to other sequenced staphylococcal species. The core genome of the SIG was highly conserved with average nucleotide identity (ANI) between the three species of 93.61%, which is very close to the threshold of species delineation (95% ANI), highlighting the close-relatedness of the SIG species. However, considerable variation was identified in the content of mobile genetic elements, cell wall-associated proteins, and iron and sugar transporters, reflecting the distinct ecological niches inhabited. Of note, S. pseudintermedius ED99 contained a clustered regularly interspaced short palindromic repeat locus of the Nmeni subtype and S. intermedius contained both Nmeni and Mtube subtypes. In contrast to S. intermedius and S. delphini and most other staphylococci examined to date, S. pseudintermedius contained at least nine predicted reverse transcriptase Group II introns. Furthermore, S. pseudintermedius ED99 encoded several transposons which were largely responsible for its multi-resistant phenotype. Overall, the study highlights extensive differences in accessory genome content between closely related staphylococcal species inhabiting distinct host niches, providing new avenues for research into pathogenesis and bacterial host-adaptation. PMID:22919635

  13. Comparative genomic analysis of integral membrane transport proteins in ciliates.

    PubMed

    Kumar, Ujjwal; Saier, Milton H

    2015-01-01

    Integral membrane transport proteins homologous to those found in the Transporter Classification Database (TCDB; www.tcdb.org) were identified and bioinformatically characterized by transporter class, family, and substrate specificity in three ciliates, Paramecium tetraurelia (Para), Tetrahymena thermophila (Tetra), and Ichthyophthirius multifiliis (Ich). In these three organisms, 1,326 of 39,600 proteins (3.4%), 1,017 of 24,800 proteins (4.2%), and 504 out of 8,100 proteins (6.2%) integral membrane transport proteins were identified, respectively. Thus, an inverse relationship was observed between the % transporters identified and the number of total proteins per genome reported. This surprising observation provides insight into the evolutionary process, giving rise to genome reduction following whole genome duplication (as in the case of Para) or during pathogenic association with a host organism (Ich). Of these transport proteins in Para and Tetra, about 41% were channels (more than any other type of organism studied), 31% were secondary carriers (fewer than most eukaryotes) and 26% were primary active transporters, mostly ATP-hydrolysis driven (more than most other eukaryotes). In Ich, the number of channels was selectively reduced by 66%, relative to Para and Tetra. Para has four times more inorganic anion transporters than Tetra, and Ich has nonselectively lost most of these. Tetra and Ich preferentially transport sugars and monocarboxylates while Para prefers di- and tricarboxylates. These observations serve to characterize the transport proteins of these related ciliates, providing insight into their nutrition and metabolism. PMID:25099884

  14. Comparative genomic and phylogeographic analysis of Mycobacterium leprae.

    PubMed

    Monot, Marc; Honoré, Nadine; Garnier, Thierry; Zidane, Nora; Sherafi, Diana; Paniz-Mondolfi, Alberto; Matsuoka, Masanori; Taylor, G Michael; Donoghue, Helen D; Bouwman, Abi; Mays, Simon; Watson, Claire; Lockwood, Diana; Khamesipour, Ali; Khamispour, Ali; Dowlati, Yahya; Jianping, Shen; Rea, Thomas H; Vera-Cabrera, Lucio; Stefani, Mariane M; Banu, Sayera; Macdonald, Murdo; Sapkota, Bishwa Raj; Spencer, John S; Thomas, Jérôme; Harshman, Keith; Singh, Pushpendra; Busso, Philippe; Gattiker, Alexandre; Rougemont, Jacques; Brennan, Patrick J; Cole, Stewart T

    2009-12-01

    Reductive evolution and massive pseudogene formation have shaped the 3.31-Mb genome of Mycobacterium leprae, an unculturable obligate pathogen that causes leprosy in humans. The complete genome sequence of M. leprae strain Br4923 from Brazil was obtained by conventional methods (6x coverage), and Illumina resequencing technology was used to obtain the sequences of strains Thai53 (38x coverage) and NHDP63 (46x coverage) from Thailand and the United States, respectively. Whole-genome comparisons with the previously sequenced TN strain from India revealed that the four strains share 99.995% sequence identity and differ only in 215 polymorphic sites, mainly SNPs, and by 5 pseudogenes. Sixteen interrelated SNP subtypes were defined by genotyping both extant and extinct strains of M. leprae from around the world. The 16 SNP subtypes showed a strong geographical association that reflects the migration patterns of early humans and trade routes, with the Silk Road linking Europe to China having contributed to the spread of leprosy. PMID:19881526

  15. The Integrated Microbial Genomes (IMG) System: An Expanding Comparative Analysis Resource

    SciTech Connect

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Grechkin, Yuri; Ratner, Anna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2009-09-13

    The integrated microbial genomes (IMG) system serves as a community resource for comparative analysis of publicly available genomes in a comprehensive integrated context. IMG contains both draft and complete microbial genomes integrated with other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through regular releases. Several companion IMG systems have been set up in order to serve domain specific needs, such as expert review of genome annotations. IMG is available at .

  16. Mitochondrial genome sequences and comparative genomics ofPhytophthora ramorum and P. sojae

    SciTech Connect

    Martin, Frank N.; Douda, Bensasson; Tyler, Brett M.; Boore,Jeffrey L.

    2007-01-01

    The complete sequences of the mitochondrial genomes of theoomycetes of Phytophthora ramorum and P. sojae were determined during thecourse of their complete nuclear genome sequencing (Tyler, et al. 2006).Both are circular, with sizes of 39,314 bp for P. ramorum and 42,975 bpfor P. sojae. Each contains a total of 37 identifiable protein-encodinggenes, 25 or 26 tRNAs (P. sojae and P. ramorum, respectively)specifying19 amino acids, and a variable number of ORFs (7 for P. ramorum and 12for P. sojae) which are potentially additional functional genes.Non-coding regions comprise approximately 11.5 percent and 18.4 percentof the genomes of P. ramorum and P. sojae, respectively. Relative to P.sojae, there is an inverted repeat of 1,150 bp in P. ramorum thatincludes an unassigned unique ORF, a tRNA gene, and adjacent non-codingsequences, but otherwise the gene order in both species is identical.Comparisons of these genomes with published sequences of the P. infestansmitochondrial genome reveals a number of similarities, but the gene orderin P. infestans differs in two adjacent locations due to inversions.Sequence alignments of the three genomes indicated sequence conservationranging from 75 to 85 percent and that specific regions were morevariable than others.

  17. Comparative Genomics Analysis of Streptomyces Species Reveals Their Adaptation to the Marine Environment and Their Diversity at the Genomic Level

    PubMed Central

    Tian, Xinpeng; Zhang, Zhewen; Yang, Tingting; Chen, Meili; Li, Jie; Chen, Fei; Yang, Jin; Li, Wenjie; Zhang, Bing; Zhang, Zhang; Wu, Jiayan; Zhang, Changsheng; Long, Lijuan; Xiao, Jingfa

    2016-01-01

    Over 200 genomes of streptomycete strains that were isolated from various environments are available from the NCBI. However, little is known about the characteristics that are linked to marine adaptation in marine-derived streptomycetes. The particularity and complexity of the marine environment suggest that marine streptomycetes are genetically diverse. Here, we sequenced nine strains from the Streptomyces genus that were isolated from different longitudes, latitudes, and depths of the South China Sea. Then we compared these strains to 22 NCBI downloaded streptomycete strains. Thirty-one streptomycete strains are clearly grouped into a marine-derived subgroup and multiple source subgroup-based phylogenetic tree. The phylogenetic analyses have revealed the dynamic process underlying streptomycete genome evolution, and lateral gene transfer is an important driving force during the process. Pan-genomics analyses have revealed that streptomycetes have an open pan-genome, which reflects the diversity of these streptomycetes and guarantees the species a quick and economical response to diverse environments. Functional and comparative genomics analyses indicate that the marine-derived streptomycetes subgroup possesses some common characteristics of marine adaptation. Our findings have expanded our knowledge of how ocean isolates of streptomycete strains adapt to marine environments. The availability of streptomycete genomes from the South China Sea will be beneficial for further analysis on marine streptomycetes and will enrich the South China Sea’s genetic data sources. PMID:27446038

  18. Comparative Genomics Analysis of Streptomyces Species Reveals Their Adaptation to the Marine Environment and Their Diversity at the Genomic Level.

    PubMed

    Tian, Xinpeng; Zhang, Zhewen; Yang, Tingting; Chen, Meili; Li, Jie; Chen, Fei; Yang, Jin; Li, Wenjie; Zhang, Bing; Zhang, Zhang; Wu, Jiayan; Zhang, Changsheng; Long, Lijuan; Xiao, Jingfa

    2016-01-01

    Over 200 genomes of streptomycete strains that were isolated from various environments are available from the NCBI. However, little is known about the characteristics that are linked to marine adaptation in marine-derived streptomycetes. The particularity and complexity of the marine environment suggest that marine streptomycetes are genetically diverse. Here, we sequenced nine strains from the Streptomyces genus that were isolated from different longitudes, latitudes, and depths of the South China Sea. Then we compared these strains to 22 NCBI downloaded streptomycete strains. Thirty-one streptomycete strains are clearly grouped into a marine-derived subgroup and multiple source subgroup-based phylogenetic tree. The phylogenetic analyses have revealed the dynamic process underlying streptomycete genome evolution, and lateral gene transfer is an important driving force during the process. Pan-genomics analyses have revealed that streptomycetes have an open pan-genome, which reflects the diversity of these streptomycetes and guarantees the species a quick and economical response to diverse environments. Functional and comparative genomics analyses indicate that the marine-derived streptomycetes subgroup possesses some common characteristics of marine adaptation. Our findings have expanded our knowledge of how ocean isolates of streptomycete strains adapt to marine environments. The availability of streptomycete genomes from the South China Sea will be beneficial for further analysis on marine streptomycetes and will enrich the South China Sea's genetic data sources. PMID:27446038

  19. Xp11 translocation renal cell carcinoma in adults: a clinicopathological and comparative genomic hybridization study.

    PubMed

    Zou, Hong; Kang, Xueling; Pang, Li-Juan; Hu, Wenhao; Zhao, Jin; Qi, Yan; Hu, Jianming; Liu, Chunxia; Li, Hongan; Liang, Weihua; Yuan, Xianglin; Li, Feng

    2014-01-01

    To study the clinicopathological and genomic characteristics of Xp11.2 translocation renal cell carcinoma (Xp11.2 RCC) in adults, we analyzed 9 Xp11.2 RCCs, confirmed by transcription factor E3 (TFE3) immunohistochemistry, in patients aged ≥20 years. TFE3 expression was also determined in 12 cases of alveolar soft part sarcoma (ASPS) served as a positive control. Comparative genomic hybridization (CGH) was used to investigate genomic imbalances in all Xp11.2 RCC cases. Most of our Xp11.2 RCC patients (5/9) presented with TNM stages 3-4, and 6 patients died 10 months to 7 years after their operation. Histologically, Xp11.2 RCC was composed of a mixed papillary nested/alveolar growth pattern (8/9). Immunostaining showed that all Xp11.2 RCC and ASPS cases had strong TFE3 expression and high positive ratios for p53 and vimentin. However, there were significant differences in the expression of AMACR (p<0.001), AE1/AE3 (p=0.002), and CD10 (p=0.024) between the 2 diseases. CGH profiles showed chromosomal imbalances in all 9 Xp11.2 RCC cases; gains were observed in chromosomes Xp11 (6/9), 7q20-25, 12q25-31 (5/9), 7p16-24 (4/9), 8p12-13, 8q20-21, 16q20-22, 17q25-26, 20q22-23 (4/9), and losses occurred frequently on chromosomes 3p12-16, 9q31-32, 14q22-24 (4/9). Our Conclusions show Xp11.2 RCC that occur in adults may be aggressive cancers, the expressions of AMACR, CD10, AE1/AE3 are helpful in the differential diagnosis between Xp11.2 RCC and ASPS, and CGH assay is a useful complementary method for confirming the diagnosis of Xp11.2 RCC. PMID:24427344

  20. Comparative Genomic Profiling of Synovium Versus Skin Lesions in Psoriatic Arthritis

    PubMed Central

    Belasco, Jennifer; Louie, James S; Gulati, Nicholas; Wei, Nathan; Nograles, Kristine; Fuentes-Duculan, Judilyn; Mitsui, Hiroshi; Suárez-Fariñas, Mayte; Krueger, James G

    2015-01-01

    Objective To our knowledge, there is no broad genomic analysis comparing skin and synovium in psoriatic arthritis (PsA). Also, there is little understanding of the relative levels of cytokines and chemokines in skin and synovium. The purpose of this study was to better define inflammatory pathways in paired lesional skin and affected synovial tissue in patients with PsA. Methods We conducted a comprehensive analysis of cytokine and chemokine activation and genes representative of the inflammatory processes in PsA. Paired PsA synovial tissue and skin samples were obtained from 12 patients on the same day. Gene expression studies were performed using Affymetrix HGU133 Plus 2.0 arrays. Confirmatory quantitative real-time polymerase chain reaction (PCR) was performed on selected transcripts. Cell populations were assessed by immunohistochemistry and immunofluorescence. Results Globally, gene expression in PsA synovium was more closely related to gene expression in PsA skin than to gene expression in synovium in other forms of arthritis. However, PsA gene expression patterns in skin and synovium were clearly distinct, showing a stronger interleukin-17 (IL-17) gene signature in skin than in synovium and more equivalent tumor necrosis factor (TNF) and interferon-γ gene signatures in both tissues. These results were confirmed with real-time PCR. Conclusion This is the first comprehensive molecular comparison of paired lesional skin and affected synovial tissue samples in PsA. Our results support clinical trial data showing that PsA skin and joint disease are similarly responsive to TNF antagonists, while IL-17 antagonists have better results in PsA skin than in PsA joints. Genes selectively expressed in PsA synovium might direct future therapies for PsA. PMID:25512250

  1. Identification of chromosomal alpha-proteobacterial small RNAs by comparative genome analysis and detection in Sinorhizobium meliloti strain 1021

    PubMed Central

    Ulvé, Vincent M; Sevin, Emeric W; Chéron, Angélique; Barloy-Hubler, Frédérique

    2007-01-01

    Background Small untranslated RNAs (sRNAs) seem to be far more abundant than previously believed. The number of sRNAs confirmed in E. coli through various approaches is above 70, with several hundred more sRNA candidate genes under biological validation. Although the total number of sRNAs in any one species is still unclear, their importance in cellular processes has been established. However, unlike protein genes, no simple feature enables the prediction of the location of the corresponding sequences in genomes. Several approaches, of variable usefulness, to identify genomic sequences encoding sRNA have been described in recent years. Results We used a combination of in silico comparative genomics and microarray-based transcriptional profiling. This approach to screening identified ~60 intergenic regions conserved between Sinorhizobium meliloti and related members of the alpha-proteobacteria sub-group 2. Of these, 14 appear to correspond to novel non-coding sRNAs and three are putative peptide-coding or 5' UTR RNAs (ORF smaller than 100 aa). The expression of each of these new small RNA genes was confirmed by Northern blot hybridization. Conclusion Small non coding RNA (sra) genes can be found in the intergenic regions of alpha-proteobacteria genomes. Some of these sra genes are only present in S. meliloti, sometimes in genomic islands; homologues of others are present in related genomes including those of the pathogens Brucella and Agrobacterium. PMID:18093320

  2. Genome evolution in diploid and tetraploid Coffea species as revealed by comparative analysis of orthologous genome segments.

    PubMed

    Cenci, Alberto; Combes, Marie-Christine; Lashermes, Philippe

    2012-01-01

    Sequence comparison of orthologous regions enables estimation of the divergence between genomes, analysis of their evolution and detection of particular features of the genomes, such as sequence rearrangements and transposable elements. Despite the economic importance of Coffea species, little genomic information is currently available. Coffea is a relatively young genus that includes more than one hundred diploid species and a single tetraploid species. Three Coffea orthologous regions of 470-900 kb were analyzed and compared: both subgenomes of allotetraploid Coffea arabica (contributed by the diploid species Coffea eugenioides and Coffea canephora) and the genome of diploid C. canephora. Sequence divergence was calculated on global alignments or on coding and non-coding sequences separately. A search for transposable elements detected 43 retrotransposons and 198 transposons in the sequences analyzed. Comparative insertion analysis made it possible to locate 165 TE insertions in the phylogenetic tree of the three genomes/subgenomes. In the tetraploid C. arabica, a homoeologous non-reciprocal transposition (HNRT) was detected and characterized: a 50 kb region of the C. eugenioides derived subgenome replaced the C. canephora derived counterpart. Comparative sequence analysis on three Coffea genomes/subgenomes revealed almost perfect gene synteny, low sequence divergence and a high number of shared transposable elements. Compared to the results of similar analysis in other genera (Aegilops/Triticum and Oryza), Coffea genomes/subgenomes appeared to be dramatically less diverged, which is consistent with the relatively recent radiation of the Coffea genus. Based on nucleotide substitution frequency, the HNRT was dated at 10,000-50,000 years BP, which is also the most recent estimation of the origin of C. arabica. PMID:22086332

  3. Evolution of a Cellular Immune Response in Drosophila: A Phenotypic and Genomic Comparative Analysis

    PubMed Central

    Salazar-Jaramillo, Laura; Paspati, Angeliki; van de Zande, Louis; Vermeulen, Cornelis Joseph; Schwander, Tanja; Wertheim, Bregje

    2014-01-01

    Understanding the genomic basis of evolutionary adaptation requires insight into the molecular basis underlying phenotypic variation. However, even changes in molecular pathways associated with extreme variation, gains and losses of specific phenotypes, remain largely uncharacterized. Here, we investigate the large interspecific differences in the ability to survive infection by parasitoids across 11 Drosophila species and identify genomic changes associated with gains and losses of parasitoid resistance. We show that a cellular immune defense, encapsulation, and the production of a specialized blood cell, lamellocytes, are restricted to a sublineage of Drosophila, but that encapsulation is absent in one species of this sublineage, Drosophila sechellia. Our comparative analyses of hemopoiesis pathway genes and of genes differentially expressed during the encapsulation response revealed that hemopoiesis-associated genes are highly conserved and present in all species independently of their resistance. In contrast, 11 genes that are differentially expressed during the response to parasitoids are novel genes, specific to the Drosophila sublineage capable of lamellocyte-mediated encapsulation. These novel genes, which are predominantly expressed in hemocytes, arose via duplications, whereby five of them also showed signatures of positive selection, as expected if they were recruited for new functions. Three of these novel genes further showed large-scale and presumably loss-of-function sequence changes in D. sechellia, consistent with the loss of resistance in this species. In combination, these convergent lines of evidence suggest that co-option of duplicated genes in existing pathways and subsequent neofunctionalization are likely to have contributed to the evolution of the lamellocyte-mediated encapsulation in Drosophila. PMID:24443439

  4. Genome sequencing and comparative genomics of honey bee microsporidia, Nosema apis reveal novel insights into host-parasite interactions

    PubMed Central

    2013-01-01

    Background The microsporidia parasite Nosema contributes to the steep global decline of honey bees that are critical pollinators of food crops. There are two species of Nosema that have been found to infect honey bees, Nosema apis and N. ceranae. Genome sequencing of N. apis and comparative genome analysis with N. ceranae, a fully sequenced microsporidia species, reveal novel insights into host-parasite interactions underlying the parasite infections. Results We applied the whole-genome shotgun sequencing approach to sequence and assemble the genome of N. apis which has an estimated size of 8.5 Mbp. We predicted 2,771 protein- coding genes and predicted the function of each putative protein using the Gene Ontology. The comparative genomic analysis led to identification of 1,356 orthologs that are conserved between the two Nosema species and genes that are unique characteristics of the individual species, thereby providing a list of virulence factors and new genetic tools for studying host-parasite interactions. We also identified a highly abundant motif in the upstream promoter regions of N. apis genes. This motif is also conserved in N. ceranae and other microsporidia species and likely plays a role in gene regulation across the microsporidia. Conclusions The availability of the N. apis genome sequence is a significant addition to the rapidly expanding body of microsprodian genomic data which has been improving our understanding of eukaryotic genome diversity and evolution in a broad sense. The predicted virulent genes and transcriptional regulatory elements are potential targets for innovative therapeutics to break down the life cycle of the parasite. PMID:23829473

  5. Complete Genome Sequence of Borrelia afzelii K78 and Comparative Genome Analysis

    PubMed Central

    Schüler, Wolfgang; Bunikis, Ignas; Weber-Lehman, Jacqueline; Comstedt, Pär; Kutschan-Bunikis, Sabrina; Stanek, Gerold; Huber, Jutta; Meinke, Andreas; Bergström, Sven; Lundberg, Urban

    2015-01-01

    The main Borrelia species causing Lyme borreliosis in Europe and Asia are Borrelia afzelii, B. garinii, B. burgdorferi and B. bavariensis. This is in contrast to the United States, where infections are exclusively caused by B. burgdorferi. Until to date the genome sequences of four B. afzelii strains, of which only two include the numerous plasmids, are available. In order to further assess the genetic diversity of B. afzelii, the most common species in Europe, responsible for the large variety of clinical manifestations of Lyme borreliosis, we have determined the full genome sequence of the B. afzelii strain K78, a clinical isolate from Austria. The K78 genome contains a linear chromosome (905,949 bp) and 13 plasmids (8 linear and 5 circular) together presenting 1,309 open reading frames of which 496 are located on plasmids. With the exception of lp28-8, all linear replicons in their full length including their telomeres have been sequenced. The comparison with the genomes of the four other B. afzelii strains, ACA-1, PKo, HLJ01 and Tom3107, as well as the one of B. burgdorferi strain B31, confirmed a high degree of conservation within the linear chromosome of B. afzelii, whereas plasmid encoded genes showed a much larger diversity. Since some plasmids present in B. burgdorferi are missing in the B. afzelii genomes, the corresponding virulence factors of B. burgdorferi are found in B. afzelii on other unrelated plasmids. In addition, we have identified a species specific region in the circular plasmid, cp26, which could be used for species determination. Different non-coding RNAs have been located on the B. afzelii K78 genome, which have not previously been annotated in any of the published Borrelia genomes. PMID:25798594

  6. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis

    PubMed Central

    Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

    2016-01-01

    Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros ‘Jinzaoshi’ were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. ‘Jinzaoshi’, support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales. PMID:27442423

  7. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

    PubMed

    Fu, Jianmin; Liu, Huimin; Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

    2016-01-01

    Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales. PMID:27442423

  8. Comparative Genomic Analyses of the Bacterial Phosphotransferase System

    PubMed Central

    Barabote, Ravi D.; Saier, Milton H.

    2005-01-01

    We report analyses of 202 fully sequenced genomes for homologues of known protein constituents of the bacterial phosphoenolpyruvate-dependent phosphotransferase system (PTS). These included 174 bacterial, 19 archaeal, and 9 eukaryotic genomes. Homologues of PTS proteins were not identified in archaea or eukaryotes, showing that the horizontal transfer of genes encoding PTS proteins has not occurred between the three domains of life. Of the 174 bacterial genomes (136 bacterial species) analyzed, 30 diverse species have no PTS homologues, and 29 species have cytoplasmic PTS phosphoryl transfer protein homologues but lack recognizable PTS permeases. These soluble homologues presumably function in regulation. The remaining 77 species possess all PTS proteins required for the transport and phosphorylation of at least one sugar via the PTS. Up to 3.2% of the genes in a bacterium encode PTS proteins. These homologues were analyzed for family association, range of protein types, domain organization, and organismal distribution. Different strains of a single bacterial species often possess strikingly different complements of PTS proteins. Types of PTS protein domain fusions were analyzed, showing that certain types of domain fusions are common, while others are rare or prohibited. Select PTS proteins were analyzed from different phylogenetic standpoints, showing that PTS protein phylogeny often differs from organismal phylogeny. The results document the frequent gain and loss of PTS protein-encoding genes and suggest that the lateral transfer of these genes within the bacterial domain has played an important role in bacterial evolution. Our studies provide insight into the development of complex multicomponent enzyme systems and lead to predictions regarding the types of protein-protein interactions that promote efficient PTS-mediated phosphoryl transfer. PMID:16339738

  9. EDGAR: A software framework for the comparative analysis of prokaryotic genomes

    PubMed Central

    Blom, Jochen; Albaum, Stefan P; Doppmeier, Daniel; Pühler, Alfred; Vorhölter, Frank-Jörg; Zakrzewski, Martha; Goesmann, Alexander

    2009-01-01

    Background The introduction of next generation sequencing approaches has caused a rapid increase in the number of completely sequenced genomes. As one result of this development, it is now feasible to analyze large groups of related genomes in a comparative approach. A main task in comparative genomics is the identification of orthologous genes in different genomes and the classification of genes as core genes or singletons. Results To support these studies EDGAR – "Efficient Database framework for comparative Genome Analyses using BLAST score Ratios" – was developed. EDGAR is designed to automatically perform genome comparisons in a high throughput approach. Comparative analyses for 582 genomes across 75 genus groups taken from the NCBI genomes database were conducted with the software and the results were integrated into an underlying database. To demonstrate a specific application case, we analyzed ten genomes of the bacterial genus Xanthomonas, for which phylogenetic studies were awkward due to divergent taxonomic systems. The resultant phylogeny EDGAR provided was consistent with outcomes from traditional approaches performed recently and moreover, it was possible to root each strain with unprecedented accuracy. Conclusion EDGAR provides novel analysis features and significantly simplifies the comparative analysis of related genomes. The software supports a quick survey of evolutionary relationships and simplifies the process of obtaining new biological insights into the differential gene content of kindred genomes. Visualization features, like synteny plots or Venn diagrams, are offered to the scientific community through a web-based and therefore platform independent user interface , where the precomputed data sets can be browsed. PMID:19457249

  10. Genome wide expression profiling of two accession of G. herbaceum L. in response to drought

    PubMed Central

    2012-01-01

    Background Genome-wide gene expression profiling and detailed physiological investigation were used for understanding the molecular mechanism and physiological response of Gossypium herbaceum, which governs the adaptability of plants in drought conditions. Recently, microarray-based gene expression analysis is commonly used to decipher genes and genetic networks controlling the traits of interest. However, the results of such an analysis are often plagued due to a limited number of genes (probe sets) on microarrays. On the other hand, pyrosequencing of a transcriptome has the potential to detect rare as well as a large number of transcripts in the samples quantitatively. We used Affymetrix microarray as well as Roche's GS-FLX transcriptome sequencing for a comparative analysis of cotton transcriptome in leaf tissues under drought conditions. Results Fourteen accessions of Gossypium herbaceum were subjected to mannitol stress for preliminary screening; two accessions, namely Vagad and RAHS-14, were selected as being the most tolerant and most sensitive to osmotic stress, respectively. Affymetrix cotton arrays containing 24,045 probe sets and Roche's GS-FLX transcriptome sequencing of leaf tissue were used to analyze the gene expression profiling of Vagad and RAHS-14 under drought conditions. The analysis of physiological measurements and gene expression profiling showed that Vagad has the inherent ability to sense drought at a much earlier stage and to respond to it in a much more efficient manner than does RAHS-14. Gene Ontology (GO) studies showed that the phenyl propanoid pathway, pigment biosynthesis, polyketide biosynthesis, and other secondary metabolite pathways were enriched in Vagad under control and drought conditions as compared with RAHS-14. Similarly, GO analysis of transcriptome sequencing showed that the GO terms responses to various abiotic stresses were significantly higher in Vagad. Among the classes of transcription factors (TFs) uniquely

  11. Mechanisms of genomic rearrangements and gene expression changes in plant polyploids

    PubMed Central

    Jeffrey Chen, Z.; Ni, Zhongfu

    2007-01-01

    Summary Polyploidy is produced by multiplication of a single genome (autopolyploid) or combination of two or more divergent genomes (allopolyploid). The available data obtained from the study of synthetic (newly created or human-made) plant allopolyploids have documented dynamic and stochastic changes in genomic organization and gene expression, including sequence elimination, inter-chromosomal exchanges, cytosine methylation, gene repression, novel activation, genetic dominance, subfunctionalization and transposon activation. The underlying mechanisms for these alterations are poorly understood. To promote a better understanding of genomic and gene expression changes in polyploidy, we briefly review origins and forms of polyploidy and summarize what has been learned from genome-wide gene expression analyses in newly synthesized auto- and allopolyploids. We show transcriptome divergence between the progenitors and in the newly formed allopolyploids. We propose models for transcriptional regulation, chromatin modification and RNA-mediated pathways in establishing locus-specific expression of orthologous and homoeologous genes during allopolyploid formation and evolution. PMID:16479580

  12. Genome-Wide Comparative Analysis of Flowering-Related Genes in Arabidopsis, Wheat, and Barley

    PubMed Central

    Peng, Fred Y.; Hu, Zhiqiu; Yang, Rong-Cai

    2015-01-01

    Early flowering is an important trait influencing grain yield and quality in wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.) in short-season cropping regions. However, due to large and complex genomes of these species, direct identification of flowering genes and their molecular characterization remain challenging. Here, we used a bioinformatic approach to predict flowering-related genes in wheat and barley from 190 known Arabidopsis (Arabidopsis thaliana (L.) Heynh.) flowering genes. We identified 900 and 275 putative orthologs in wheat and barley, respectively. The annotated flowering-related genes were clustered into 144 orthologous groups with one-to-one, one-to-many, many-to-one, and many-to-many orthology relationships. Our approach was further validated by domain and phylogenetic analyses of flowering-related proteins and comparative analysis of publicly available microarray data sets for in silico expression profiling of flowering-related genes in 13 different developmental stages of wheat and barley. These further analyses showed that orthologous gene pairs in three critical flowering gene families (PEBP, MADS, and BBX) exhibited similar expression patterns among 13 developmental stages in wheat and barley, suggesting similar functions among the orthologous genes with sequence and expression similarities. The predicted candidate flowering genes can be confirmed and incorporated into molecular breeding for early flowering wheat and barley in short-season cropping regions. PMID:26435710

  13. Genome-Wide Comparative Analysis of Flowering-Related Genes in Arabidopsis, Wheat, and Barley.

    PubMed

    Peng, Fred Y; Hu, Zhiqiu; Yang, Rong-Cai

    2015-01-01

    Early flowering is an important trait influencing grain yield and quality in wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.) in short-season cropping regions. However, due to large and complex genomes of these species, direct identification of flowering genes and their molecular characterization remain challenging. Here, we used a bioinformatic approach to predict flowering-related genes in wheat and barley from 190 known Arabidopsis (Arabidopsis thaliana (L.) Heynh.) flowering genes. We identified 900 and 275 putative orthologs in wheat and barley, respectively. The annotated flowering-related genes were clustered into 144 orthologous groups with one-to-one, one-to-many, many-to-one, and many-to-many orthology relationships. Our approach was further validated by domain and phylogenetic analyses of flowering-related proteins and comparative analysis of publicly available microarray data sets for in silico expression profiling of flowering-related genes in 13 different developmental stages of wheat and barley. These further analyses showed that orthologous gene pairs in three critical flowering gene families (PEBP, MADS, and BBX) exhibited similar expression patterns among 13 developmental stages in wheat and barley, suggesting similar functions among the orthologous genes with sequence and expression similarities. The predicted candidate flowering genes can be confirmed and incorporated into molecular breeding for early flowering wheat and barley in short-season cropping regions. PMID:26435710

  14. Integrating microarray gene expression object model and clinical document architecture for cancer genomics research.

    PubMed

    Park, Yu Rang; Lee, Hye Won; Kim, Ju Han

    2005-01-01

    Systematic integration of genomic-scale expression profiles with clinical information may facilitate cancer genomics research. MAGE-OM (Microarray Gene Expression Object Model) defines standard objects for genomic but not for clinical data. HL7 CDA (Clinical Document Architecture) is a document model for clinical information, describing syntax (generic structure) but not semantics. We designed a document template in XML Schema with additional constraints for CDA to define content semantics, enabling data model-level integration of MAGE-OM and CDA for cancer genomics research. PMID:16779360

  15. Comparative Analysis of Regulatory Elements between Escherichia coli and Klebsiella pneumoniae by Genome-Wide Transcription Start Site Profiling

    PubMed Central

    Qiu, Yu; Nagarajan, Harish; Seo, Joo-Hyun; Cho, Byung-Kwan; Tsai, Shih-Feng; Palsson, Bernhard Ø.

    2012-01-01

    Genome-wide transcription start site (TSS) profiles of the enterobacteria Escherichia coli and Klebsiella pneumoniae were experimentally determined through modified 5′ RACE followed by deep sequencing of intact primary mRNA. This identified 3,746 and 3,143 TSSs for E. coli and K. pneumoniae, respectively. Experimentally determined TSSs were then used to define promoter regions and 5′ UTRs upstream of coding genes. Comparative analysis of these regulatory elements revealed the use of multiple TSSs, identical sequence motifs of promoter and Shine-Dalgarno sequence, reflecting conserved gene expression apparatuses between the two species. In both species, over 70% of primary transcripts were expressed from operons having orthologous genes during exponential growth. However, expressed orthologous genes in E. coli and K. pneumoniae showed a strikingly different organization of upstream regulatory regions with only 20% identical promoters with TSSs in both species. Over 40% of promoters had TSSs identified in only one species, despite conserved promoter sequences existing in the other species. 662 conserved promoters having TSSs in both species resulted in the same number of comparable 5′ UTR pairs, and that regulatory element was found to be the most variant region in sequence among promoter, 5′ UTR, and ORF. In K. pneumoniae, 48 sRNAs were predicted and 36 of them were expressed during exponential growth. Among them, 34 orthologous sRNAs between two species were analyzed in depth, and the analysis showed that many sRNAs of K. pneumoniae, including pleiotropic sRNAs such as rprA, arcZ, and sgrS, may work in the same way as in E. coli. These results reveal a new dimension of comparative genomics such that a comparison of two genomes needs to be comprehensive over all levels of genome organization. PMID:22912590

  16. Comparative Genomics Provide Insights into Evolution of Trichoderma Nutrition Style

    PubMed Central

    Xie, Bin-Bin; Qin, Qi-Long; Shi, Mei; Chen, Lei-Lei; Shu, Yan-Li; Luo, Yan; Wang, Xiao-Wei; Rong, Jin-Cheng; Gong, Zhi-Ting; Li, Dan; Sun, Cai-Yun; Liu, Gui-Ming; Dong, Xiao-Wei; Pang, Xiu-Hua; Huang, Feng; Liu, Weifeng; Chen, Xiu-Lan; Zhou, Bai-Cheng; Zhang, Yu-Zhong; Song, Xiao-Yan

    2014-01-01

    Saprotrophy on plant biomass is a recently developed nutrition strategy for Trichoderma. However, the physiology and evolution of this new nutrition strategy is still elusive. We report the deep sequencing and analysis of the genome of Trichoderma longibrachiatum, an efficient cellulase producer. The 31.7-Mb genome, smallest among the sequenced Trichoderma species, encodes fewer nutrition-related genes than saprotrophic T. reesei (Tr), including glycoside hydrolases and nonribosomal peptide synthetase–polyketide synthase. Homology and phylogenetic analyses suggest that a large number of nutrition-related genes, including GH18 chitinases, β-1,3/1,6-glucanases, cellulolytic enzymes, and hemicellulolytic enzymes, were lost in the common ancestor of T. longibrachiatum (Tl) and Tr. dN/dS (ω) calculation indicates that all the nutrition-related genes analyzed are under purifying selection. Cellulolytic enzymes, the key enzymes for saprotrophy on plant biomass, are under stronger purifying selection pressure in Tl and Tr than in mycoparasitic species, suggesting that development of the nutrition strategy of saprotrophy on plant biomass has increased the selection pressure. In addition, aspartic proteases, serine proteases, and metalloproteases are subject to stronger purifying selection pressure in Tl and Tr, suggesting that these enzymes may also play important roles in the nutrition. This study provides insights into the physiology and evolution of the nutrition strategy of Trichoderma. PMID:24482532

  17. Evolution of Prdm Genes in Animals: Insights from Comparative Genomics.

    PubMed

    Vervoort, Michel; Meulemeester, David; Béhague, Julien; Kerner, Pierre

    2016-03-01

    Prdm genes encode transcription factors with a subtype of SET domain known as the PRDF1-RIZ (PR) homology domain and a variable number of zinc finger motifs. These genes are involved in a wide variety of functions during animal development. As most Prdm genes have been studied in vertebrates, especially in mice, little is known about the evolution of this gene family. We searched for Prdm genes in the fully sequenced genomes of 93 different species representative of all the main metazoan lineages. A total of 976 Prdm genes were identified in these species. The number of Prdm genes per species ranges from 2 to 19. To better understand how the Prdm gene family has evolved in metazoans, we performed phylogenetic analyses using this large set of identified Prdm genes. These analyses allowed us to define 14 different subfamilies of Prdm genes and to establish, through ancestral state reconstruction, that 11 of them are ancestral to bilaterian animals. Three additional subfamilies were acquired during early vertebrate evolution (Prdm5, Prdm11, and Prdm17). Several gene duplication and gene loss events were identified and mapped onto the metazoan phylogenetic tree. By studying a large number of nonmetazoan genomes, we confirmed that Prdm genes likely constitute a metazoan-specific gene family. Our data also suggest that Prdm genes originated before the diversification of animals through the association of a single ancestral SET domain encoding gene with one or several zinc finger encoding genes. PMID:26560352

  18. Evolution of Prdm Genes in Animals: Insights from Comparative Genomics