genes sharing sequence: Topics by Science.gov

Sample records for genes sharing sequence

Vibrio chromosomes share common history.

PubMed

Kirkup, Benjamin C; Chang, LeeAnn; Chang, Sarah; Gevers, Dirk; Polz, Martin F

2010-05-10

While most gamma proteobacteria have a single circular chromosome, Vibrionales have two circular chromosomes. Horizontal gene transfer is common among Vibrios, and in light of this genetic mobility, it is an open question to what extent the two chromosomes themselves share a common history since their formation. Single copy genes from each chromosome (142 genes from chromosome I and 42 genes from chromosome II) were identified from 19 sequenced Vibrionales genomes and their phylogenetic comparison suggests consistent phylogenies for each chromosome. Additionally, study of the gene organization and phylogeny of the respective origins of replication confirmed the shared history. Thus, while elements within the chromosomes may have experienced significant genetic mobility, the backbones share a common history. This allows conclusions based on multilocus sequence analysis (MLSA) for one chromosome to be applied equally to both chromosomes.
Comparison of reduced metagenome and 16S rRNA gene sequencing for determination of genetic diversity and mother-child overlap of the gut associated microbiota.

PubMed

Ravi, Anuradha; Avershina, Ekaterina; Angell, Inga Leena; Ludvigsen, Jane; Manohar, Prasanth; Padmanaban, Sumathi; Nachimuthu, Ramesh; Snipen, Lars; Rudi, Knut

2018-06-01

Use of the 16S rRNA gene in microbiota studies is limited by the lack of taxonomic and functional resolution. High resolution analyses are particularly important for understanding transmission and persistence of bacteria. The aim of our work was therefore to compare a novel reduced metagenome sequencing (RMS) approach with 16S rRNA gene sequencing to determine both the metagenome genetic diversity and the mother-to-child sharing of the microbiota in a cohort of 17 mother-child pairs. We found that although both approaches gave comparable results with respect to sample separation and taxonomy, RMS gave higher resolution and the potential for genomic-/functional assignment. Using RMS we estimated that the metagenome size increased from about 60 Mbp for 4-day-old children to about 225 Mbp for mothers. The 4-day-old children shared 7% of the metagenome sequences with the mothers, while the metagenome sequence sharing was >30% among the mothers. We found 15 genomes shared across >50% of the mothers, of which 10 belonged to Clostridia. Only Bacteroides showed a direct mother-child association, with B. vulgatus being abundant in both 4-day-old children and mothers. For the functional assignments, we identified a significant association between antibiotic usage during labor, and quantity of Fosfomycin resistance genes. In conclusion, our results show a higher functional and taxonomic resolution for RMS compared to 16S rRNA gene sequencing, where RMS enabled a detailed description of mother to child gut microbiota transmission - supporting a late recruitment of most gut bacteria and an effect of antibiotic treatment during labor on infant antibiotic resistance gene patterns. Copyright © 2018. Published by Elsevier B.V.
Individual sequences in large sets of gene sequences may be distinguished efficiently by combinations of shared sub-sequences

PubMed Central

Gibbs, Mark J; Armstrong, John S; Gibbs, Adrian J

2005-01-01

Background Most current DNA diagnostic tests for identifying organisms use specific oligonucleotide probes that are complementary in sequence to, and hence only hybridise with the DNA of one target species. By contrast, in traditional taxonomy, specimens are usually identified by 'dichotomous keys' that use combinations of characters shared by different members of the target set. Using one specific character for each target is the least efficient strategy for identification. Using combinations of shared bisectionally-distributed characters is much more efficient, and this strategy is most efficient when they separate the targets in a progressively binary way. Results We have developed a practical method for finding minimal sets of sub-sequences that identify individual sequences, and could be targeted by combinations of probes, so that the efficient strategy of traditional taxonomic identification could be used in DNA diagnosis. The sizes of minimal sub-sequence sets depended mostly on sequence diversity and sub-sequence length and interactions between these parameters. We found that 201 distinct cytochrome oxidase subunit-1 (CO1) genes from moths (Lepidoptera) were distinguished using only 15 sub-sequences 20 nucleotides long, whereas only 8–10 sub-sequences 6–10 nucleotides long were required to distinguish the CO1 genes of 92 species from the 9 largest orders of insects. Conclusion The presence/absence of sub-sequences in a set of gene sequences can be used like the questions in a traditional dichotomous taxonomic key; hybridisation probes complementary to such sub-sequences should provide a very efficient means for identifying individual species, subtypes or genotypes. Sequence diversity and sub-sequence length are the major factors that determine the numbers of distinguishing sub-sequences in any set of sequences. PMID:15817134
Sequence determination and analysis of the NSs genes of two tospoviruses.

PubMed

Hallwass, Mariana; Leastro, Mikhail O; Lima, Mirtes F; Inoue-Nagata, Alice K; Resende, Renato O

2012-03-01

The tospoviruses groundnut ringspot virus (GRSV) and zucchini lethal chlorosis virus (ZLCV) cause severe losses in many crops, especially in solanaceous and cucurbit species. In this study, the non-structural NSs gene and the 5'UTRs of these two biologically distinct tospoviruses were cloned and sequenced. The NSs sequence of GRSV and ZLCV were both 1,404 nucleotides long. Pairwise comparison showed that the NSs amino acid sequence of GRSV shared 69.6% identity with that of ZLCV and 75.9% identity with that of TSWV, while the NSs sequence of ZLCV and TSWV shared 67.9% identity. Phylogenetic analysis based on NSs sequences confirmed that these viruses cluster in the American clade.
Simple Shared Motifs (SSM) in conserved region of promoters: a new approach to identify co-regulation patterns.

PubMed

Gruel, Jérémy; LeBorgne, Michel; LeMeur, Nolwenn; Théret, Nathalie

2011-09-12

Regulation of gene expression plays a pivotal role in cellular functions. However, understanding the dynamics of transcription remains a challenging task. A host of computational approaches have been developed to identify regulatory motifs, mainly based on the recognition of DNA sequences for transcription factor binding sites. Recent integration of additional data from genomic analyses or phylogenetic footprinting has significantly improved these methods. Here, we propose a different approach based on the compilation of Simple Shared Motifs (SSM), groups of sequences defined by their length and similarity and present in conserved sequences of gene promoters. We developed an original algorithm to search and count SSM in pairs of genes. An exceptional number of SSM is considered as a common regulatory pattern. The SSM approach is applied to a sample set of genes and validated using functional gene-set enrichment analyses. We demonstrate that the SSM approach selects genes that are over-represented in specific biological categories (Ontology and Pathways) and are enriched in co-expressed genes. Finally we show that genes co-expressed in the same tissue or involved in the same biological pathway have increased SSM values. Using unbiased clustering of genes, Simple Shared Motifs analysis constitutes an original contribution to provide a clearer definition of expression networks.
Simple Shared Motifs (SSM) in conserved region of promoters: a new approach to identify co-regulation patterns

PubMed Central

2011-01-01

Background Regulation of gene expression plays a pivotal role in cellular functions. However, understanding the dynamics of transcription remains a challenging task. A host of computational approaches have been developed to identify regulatory motifs, mainly based on the recognition of DNA sequences for transcription factor binding sites. Recent integration of additional data from genomic analyses or phylogenetic footprinting has significantly improved these methods. Results Here, we propose a different approach based on the compilation of Simple Shared Motifs (SSM), groups of sequences defined by their length and similarity and present in conserved sequences of gene promoters. We developed an original algorithm to search and count SSM in pairs of genes. An exceptional number of SSM is considered as a common regulatory pattern. The SSM approach is applied to a sample set of genes and validated using functional gene-set enrichment analyses. We demonstrate that the SSM approach selects genes that are over-represented in specific biological categories (Ontology and Pathways) and are enriched in co-expressed genes. Finally we show that genes co-expressed in the same tissue or involved in the same biological pathway have increased SSM values. Conclusions Using unbiased clustering of genes, Simple Shared Motifs analysis constitutes an original contribution to provide a clearer definition of expression networks. PMID:21910886
Porcine insulin receptor substrate 4 (IRS4) gene: cloning, polymorphism and association study

USDA-ARS?s Scientific Manuscript database

Using PCR and IPCR techniques we obtained a 4498 bp nucleotide sequence FN424076 encompassing the complete coding sequence of the porcine IRS4 gene and its proximal promoter. The 1269-amino acid porcine protein deduced from the nucleotide sequence shares 92% identity with the human IRS4 and possesse...
Avian acute leukemia viruses MC29 and MH2 share specific RNA sequences: Evidence for a second class of transforming genes

PubMed Central

Duesberg, Peter H.; Vogt, Peter K.

1979-01-01

The genome of the defective avian tumor virus MH2 was identified as a RNA of 5.7 kilobases by its presence in different MH2-helper virus complexes and its absence from pure helper virus, by its unique fingerprint pattern of RNase T1-resistant (T1) oligonucleotides that differed from those of two helper virus RNAs, and by its structural analogy to the RNA of MC29, another avian acute leukemia virus. Two sets of sequences were distinguished in MH2 RNA: 66% hybridized with DNA complementary to helper-independent avian tumor viruses, termed group-specific, and 34% were specific. The percentage of specific sequences is considered a minimal estimate because the MH2 RNA used was about 30% contaminated by helper virus RNA. No sequences related to the transforming src gene of avian sarcoma viruses were found in MH2. MH2 shared three large T1 oligonucleotides with MC29, two of which could also be isolated from a RNase A- and T1-resistant hybrid formed between MH2 RNA and MC29 specific cDNA. These oligonucleotides belong to a group of six that define the specific segment of MC29 RNA described previously. The group-specific sequences of MH2 and MC29 RNA shared only the two smallest out of about 20 T1 oligonucleotides associated with MH2 RNA. It is concluded that the specific sequences of MH2 and MC29 are related, and it is proposed that they are necessary for, or identical with, the onc genes of these viruses. These sequences would define a related class of transforming genes in avian tumor viruses that differs from the src genes of avian sarcoma viruses. Images PMID:221900
Mosaic Graphs and Comparative Genomics in Phage Communities

PubMed Central

Belcaid, Mahdi; Bergeron, Anne

2010-01-01

Abstract Comparing the genomes of two closely related viruses often produces mosaics where nearly identical sequences alternate with sequences that are unique to each genome. When several closely related genomes are compared, the unique sequences are likely to be shared with third genomes, leading to virus mosaic communities. Here we present comparative analysis of sets of Staphylococcus aureus phages that share large identical sequences with up to three other genomes, and with different partners along their genomes. We introduce mosaic graphs to represent these complex recombination events, and use them to illustrate the breath and depth of sequence sharing: some genomes are almost completely made up of shared sequences, while genomes that share very large identical sequences can adopt alternate functional modules. Mosaic graphs also allow us to identify breakpoints that could eventually be used for the construction of recombination networks. These findings have several implications on phage metagenomics assembly, on the horizontal gene transfer paradigm, and more generally on the understanding of the composition and evolutionary dynamics of virus communities. PMID:20874413
Democratization of genetic data: connecting government approval of clinical tests with data sharing

PubMed Central

Ross, Theodora S.

2015-01-01

Abstract When a doctor orders a genetic test, patients assume that the test will yield a useful result to guide how their physicians take care of them. That assumption is frequently correct, but not always. Until recently, a genetic test only interrogated the sequence of one or two genes. Now, DNA-sequencing technologies are so fast and cheap that they have enabled clinicians to sequence panels of genes that may or may not be relevant to the patient's condition. The technology has outpaced our ability to interpret the results. Connecting approval of clinical tests to data sharing could help close this gap. PMID:27148568
Democratization of genetic data: connecting government approval of clinical tests with data sharing.

PubMed

Ross, Theodora S

2015-10-01

When a doctor orders a genetic test, patients assume that the test will yield a useful result to guide how their physicians take care of them. That assumption is frequently correct, but not always. Until recently, a genetic test only interrogated the sequence of one or two genes. Now, DNA-sequencing technologies are so fast and cheap that they have enabled clinicians to sequence panels of genes that may or may not be relevant to the patient's condition. The technology has outpaced our ability to interpret the results. Connecting approval of clinical tests to data sharing could help close this gap.
The Complete Sequence of the First Spodoptera frugiperda Betabaculovirus Genome: A Natural Multiple Recombinant Virus

PubMed Central

Cuartas, Paola E.; Barrera, Gloria P.; Belaich, Mariano N.; Barreto, Emiliano; Ghiringhelli, Pablo D.; Villamizar, Laura F.

2015-01-01

Spodoptera frugiperda (Lepidoptera: Noctuidae) is a major pest in maize crops in Colombia, and affects several regions in America. A granulovirus isolated from S. frugiperda (SfGV VG008) has potential as an enhancer of insecticidal activity of previously described nucleopolyhedrovirus from the same insect species (SfMNPV). The SfGV VG008 genome was sequenced and analyzed showing circular double stranded DNA of 140,913 bp encoding 146 putative ORFs that include 37 Baculoviridae core genes, 88 shared with betabaculoviruses, two shared only with betabaculoviruses from Noctuide insects, two shared with alphabaculoviruses, three copies of own genes (paralogs) and the other 14 corresponding to unique genes without representation in the other baculovirus species. Particularly, the genome encodes for important virulence factors such as 4 chitinases and 2 enhancins. The sequence analysis revealed the existence of eight homologous regions (hrs) and also suggests processes of gene acquisition by horizontal transfer including the SfGV VG008 ORFs 046/047 (paralogs), 059, 089 and 099. The bioinformatics evidence indicates that the genome donors of mentioned genes could be alpha- and/or betabaculovirus species. The previous reported ability of SfGV VG008 to naturally co-infect the same host with other virus show a possible mechanism to capture genes and thus improve its fitness. PMID:25609309
The complete sequence of the first Spodoptera frugiperda Betabaculovirus genome: a natural multiple recombinant virus.

PubMed

Cuartas, Paola E; Barrera, Gloria P; Belaich, Mariano N; Barreto, Emiliano; Ghiringhelli, Pablo D; Villamizar, Laura F

2015-01-20

Spodoptera frugiperda (Lepidoptera: Noctuidae) is a major pest in maize crops in Colombia, and affects several regions in America. A granulovirus isolated from S. frugiperda (SfGV VG008) has potential as an enhancer of insecticidal activity of previously described nucleopolyhedrovirus from the same insect species (SfMNPV). The SfGV VG008 genome was sequenced and analyzed showing circular double stranded DNA of 140,913 bp encoding 146 putative ORFs that include 37 Baculoviridae core genes, 88 shared with betabaculoviruses, two shared only with betabaculoviruses from Noctuide insects, two shared with alphabaculoviruses, three copies of own genes (paralogs) and the other 14 corresponding to unique genes without representation in the other baculovirus species. Particularly, the genome encodes for important virulence factors such as 4 chitinases and 2 enhancins. The sequence analysis revealed the existence of eight homologous regions (hrs) and also suggests processes of gene acquisition by horizontal transfer including the SfGV VG008 ORFs 046/047 (paralogs), 059, 089 and 099. The bioinformatics evidence indicates that the genome donors of mentioned genes could be alpha- and/or betabaculovirus species. The previous reported ability of SfGV VG008 to naturally co-infect the same host with other virus show a possible mechanism to capture genes and thus improve its fitness.
FunGene: the functional gene pipeline and repository.

PubMed

Fish, Jordan A; Chai, Benli; Wang, Qiong; Sun, Yanni; Brown, C Titus; Tiedje, James M; Cole, James R

2013-01-01

Ribosomal RNA genes have become the standard molecular markers for microbial community analysis for good reasons, including universal occurrence in cellular organisms, availability of large databases, and ease of rRNA gene region amplification and analysis. As markers, however, rRNA genes have some significant limitations. The rRNA genes are often present in multiple copies, unlike most protein-coding genes. The slow rate of change in rRNA genes means that multiple species sometimes share identical 16S rRNA gene sequences, while many more species share identical sequences in the short 16S rRNA regions commonly analyzed. In addition, the genes involved in many important processes are not distributed in a phylogenetically coherent manner, potentially due to gene loss or horizontal gene transfer. While rRNA genes remain the most commonly used markers, key genes in ecologically important pathways, e.g., those involved in carbon and nitrogen cycling, can provide important insights into community composition and function not obtainable through rRNA analysis. However, working with ecofunctional gene data requires some tools beyond those required for rRNA analysis. To address this, our Functional Gene Pipeline and Repository (FunGene; http://fungene.cme.msu.edu/) offers databases of many common ecofunctional genes and proteins, as well as integrated tools that allow researchers to browse these collections and choose subsets for further analysis, build phylogenetic trees, test primers and probes for coverage, and download aligned sequences. Additional FunGene tools are specialized to process coding gene amplicon data. For example, FrameBot produces frameshift-corrected protein and DNA sequences from raw reads while finding the most closely related protein reference sequence. These tools can help provide better insight into microbial communities by directly studying key genes involved in important ecological processes.
Genomic sequence for the aflatoxigenic filamentous fungus Aspergillus nomius

USDA-ARS?s Scientific Manuscript database

The genome of the A. nomius type strain was sequenced using a personal genome machine. Annotation of the genes was undertaken, followed by gene ontology and an investigation into the number of secondary metabolite clusters. Comparative studies with other Aspergillus species involved shared/unique ge...
Rat prostatic steroid binding protein: DNA sequence and transcript maps of the two C3 genes.

PubMed Central

Hurst, H C; Parker, M G

1983-01-01

In the rat there are two non-allelic genes C3(1) and C3(2) for the C3 polypeptide of prostatic steroid binding protein. We have cloned and sequenced both genes and show that only C3(1) is responsible for the production of authentic C3. Although there is a marked difference in their transcriptional activity, the two genes share extensive DNA sequence homology there being only one base difference from nucleotide - 235 to within the first intron. Transcript mapping has shown that there are two distinct C3 transcripts which share a unique 3' terminus but have 5' termini 38 bases apart each preceded by a 'TATA' box homology. Interestingly, an identical repetitive element is present just upstream of both genes. Both families of transcripts, which are produced in a ratio of 18:1, are coordinately regulated by testosterone. Images Fig. 3. Fig. 4. Fig. 5. PMID:6685625
Identification and analysis of pig chimeric mRNAs using RNA sequencing data

PubMed Central

2012-01-01

Background Gene fusion is ubiquitous over the course of evolution. It is expected to increase the diversity and complexity of transcriptomes and proteomes through chimeric sequence segments or altered regulation. However, chimeric mRNAs in pigs remain unclear. Here we identified some chimeric mRNAs in pigs and analyzed the expression of them across individuals and breeds using RNA-sequencing data. Results The present study identified 669 putative chimeric mRNAs in pigs, of which 251 chimeric candidates were detected in a set of RNA-sequencing data. The 618 candidates had clear trans-splicing sites, 537 of which obeyed the canonical GU-AG splice rule. Only two putative pig chimera variants whose fusion junction was overlapped with that of a known human chimeric mRNA were found. A set of unique chimeric events were considered middle variances in the expression across individuals and breeds, and revealed non-significant variance between sexes. Furthermore, the genomic region of the 5′ partner gene shares a similar DNA sequence with that of the 3′ partner gene for 458 putative chimeric mRNAs. The 81 of those shared DNA sequences significantly matched the known DNA-binding motifs in the JASPAR CORE database. Four DNA motifs shared in parental genomic regions had significant similarity with known human CTCF binding sites. Conclusions The present study provided detailed information on some pig chimeric mRNAs. We proposed a model that trans-acting factors, such as CTCF, induced the spatial organisation of parental genes to the same transcriptional factory so that parental genes were coordinatively transcribed to give birth to chimeric mRNAs. PMID:22925561
Molecular variation and horizontal gene transfer of the homocysteine methyltransferase gene mmuM and its distribution in clinical pathogens.

PubMed

Ying, Jianchao; Wang, Huifeng; Bao, Bokan; Zhang, Ying; Zhang, Jinfang; Zhang, Cheng; Li, Aifang; Lu, Junwan; Li, Peizhen; Ying, Jun; Liu, Qi; Xu, Teng; Yi, Huiguang; Li, Jinsong; Zhou, Li; Zhou, Tieli; Xu, Zuyuan; Ni, Liyan; Bao, Qiyu

2015-01-01

The homocysteine methyltransferase encoded by mmuM is widely distributed among microbial organisms. It is the key enzyme that catalyzes the last step in methionine biosynthesis and plays an important role in the metabolism process. It also enables the microbial organisms to tolerate high concentrations of selenium in the environment. In this research, 533 mmuM gene sequences covering 70 genera of the bacteria were selected from GenBank database. The distribution frequency of mmuM is different in the investigated genera of bacteria. The mapping results of 160 mmuM reference sequences showed that the mmuM genes were found in 7 species of pathogen genomes sequenced in this work. The polymerase chain reaction products of one mmuM genotype (NC_013951 as the reference) were sequenced and the sequencing results confirmed the mapping results. Furthermore, 144 representative sequences were chosen for phylogenetic analysis and some mmuM genes from totally different genera (such as the genes between Escherichia and Klebsiella and between Enterobacter and Kosakonia) shared closer phylogenetic relationship than those from the same genus. Comparative genomic analysis of the mmuM encoding regions on plasmids and bacterial chromosomes showed that pKF3-140 and pIP1206 plasmids shared a 21 kb homology region and a 4.9 kb fragment in this region was in fact originated from the Escherichia coli chromosome. These results further suggested that mmuM gene did go through the gene horizontal transfer among different species or genera of bacteria. High-throughput sequencing combined with comparative genomics analysis would explore distribution and dissemination of the mmuM gene among bacteria and its evolution at a molecular level.
Fine-scale mergers of chloroplast and mitochondrial genes create functional, transcompartmentally chimeric mitochondrial genes.

PubMed

Hao, Weilong; Palmer, Jeffrey D

2009-09-29

The mitochondrial genomes of flowering plants possess a promiscuous proclivity for taking up sequences from the chloroplast genome. All characterized chloroplast integrants exist apart from native mitochondrial genes, and only a few, involving chloroplast tRNA genes that have functionally supplanted their mitochondrial counterparts, appear to be of functional consequence. We developed a novel computational approach to search for homologous recombination (gene conversion) in a large number of sequences and applied it to 22 mitochondrial and chloroplast gene pairs, which last shared common ancestry some 2 billion years ago. We found evidence of recurrent conversion of short patches of mitochondrial genes by chloroplast homologs during angiosperm evolution, but no evidence of gene conversion in the opposite direction. All 9 putative conversion events involve the atp1/atpA gene encoding the alpha subunit of ATP synthase, which is unusually well conserved between the 2 organelles and the only shared gene that is widely sequenced across plant mitochondria. Moreover, all conversions were limited to the 2 regions of greatest nucleotide and amino acid conservation of atp1/atpA. These observations probably reflect constraints operating on both the occurrence and fixation of recombination between ancient homologs. These findings indicate that recombination between anciently related sequences is more frequent than previously appreciated and creates functional mitochondrial genes of chimeric origin. These results also have implications for the widespread use of mitochondrial atp1 in phylogeny reconstruction.
Genetic diversity in Trypanosoma theileri from Sri Lankan cattle and water buffaloes.

PubMed

Yokoyama, Naoaki; Sivakumar, Thillaiampalam; Fukushi, Shintaro; Tattiyapong, Muncharee; Tuvshintulga, Bumduuren; Kothalawala, Hemal; Silva, Seekkuge Susil Priyantha; Igarashi, Ikuo; Inoue, Noboru

2015-01-30

Trypanosoma theileri is a hemoprotozoan parasite that infects various ruminant species. We investigated the epidemiology of this parasite among cattle and water buffalo populations bred in Sri Lanka, using a diagnostic PCR assay based on the cathepsin L-like protein (CATL) gene. Blood DNA samples sourced from cattle (n=316) and water buffaloes (n=320) bred in different geographical areas of Sri Lanka were PCR screened for T. theileri. Parasite DNA was detected in cattle and water buffaloes alike in all the sampling locations. The overall T. theileri-positive rate was higher in water buffaloes (15.9%) than in cattle (7.6%). Subsequently, PCR amplicons were sequenced and the partial CATL sequences were phylogenetically analyzed. The identity values for the CATL gene were 89.6-99.7% among the cattle-derived sequences, compared with values of 90.7-100% for the buffalo-derived sequences. However, the cattle-derived sequences shared 88.2-100% identity values with those from buffaloes. In the phylogenetic tree, the Sri Lankan CATL gene sequences fell into two major clades (TthI and TthII), both of which contain CATL sequences from several other countries. Although most of the CATL sequences from Sri Lankan cattle and buffaloes clustered independently, two buffalo-derived sequences were observed to be closely related to those of the Sri Lankan cattle. Furthermore, a Sri Lankan buffalo sequence clustered with CATL gene sequences from Brazilian buffalo and Thai cattle. In addition to reporting the first PCR-based survey of T. theileri among Sri Lankan-bred cattle and water buffaloes, the present study found that some of the CATL gene fragments sourced from water buffaloes shared similarity with those determined from cattle in this country. Copyright © 2014 Elsevier B.V. All rights reserved.

Simultaneous knockdown of six non-family genes using a single synthetic RNAi fragment in Arabidopsis thaliana

DOE Office of Scientific and Technical Information (OSTI.GOV)

Czarnecki, Olaf; Bryan, Anthony C.; Jawdy, Sara S.

Genetic engineering of plants that results in successful establishment of new biochemical or regulatory pathways requires stable introduction of one or more genes into the plant genome. It might also be necessary to down-regulate or turn off expression of endogenous genes in order to reduce activity of competing pathways. An established way to knockdown gene expression in plants is expressing a hairpin-RNAi construct, eventually leading to degradation of a specifically targeted mRNA. Knockdown of multiple genes that do not share homologous sequences is still challenging and involves either sophisticated cloning strategies to create vectors with different serial expression constructs ormore » multiple transformation events that is often restricted by a lack of available transformation markers. Synthetic RNAi fragments were assembled in yeast carrying homologous sequences to six or seven non-family genes and introduced into pAGRIKOLA. Transformation of Arabidopsis thaliana and subsequent expression analysis of targeted genes proved efficient knockdown of all target genes. In conclusion, we present a simple and cost-effective method to create constructs to simultaneously knockdown multiple non-family genes or genes that do not share sequence homology. The presented method can be applied in plant and animal synthetic biology as well as traditional plant and animal genetic engineering.« less
Simultaneous knockdown of six non-family genes using a single synthetic RNAi fragment in Arabidopsis thaliana

DOE PAGES

Czarnecki, Olaf; Bryan, Anthony C.; Jawdy, Sara S.; ...

2016-02-17

Genetic engineering of plants that results in successful establishment of new biochemical or regulatory pathways requires stable introduction of one or more genes into the plant genome. It might also be necessary to down-regulate or turn off expression of endogenous genes in order to reduce activity of competing pathways. An established way to knockdown gene expression in plants is expressing a hairpin-RNAi construct, eventually leading to degradation of a specifically targeted mRNA. Knockdown of multiple genes that do not share homologous sequences is still challenging and involves either sophisticated cloning strategies to create vectors with different serial expression constructs ormore » multiple transformation events that is often restricted by a lack of available transformation markers. Synthetic RNAi fragments were assembled in yeast carrying homologous sequences to six or seven non-family genes and introduced into pAGRIKOLA. Transformation of Arabidopsis thaliana and subsequent expression analysis of targeted genes proved efficient knockdown of all target genes. In conclusion, we present a simple and cost-effective method to create constructs to simultaneously knockdown multiple non-family genes or genes that do not share sequence homology. The presented method can be applied in plant and animal synthetic biology as well as traditional plant and animal genetic engineering.« less
Chloroplast genomes of Arabidopsis halleri ssp. gemmifera and Arabidopsis lyrata ssp. petraea: Structures and comparative analysis.

PubMed

Asaf, Sajjad; Khan, Abdul Latif; Khan, Muhammad Aaqil; Waqas, Muhammad; Kang, Sang-Mo; Yun, Byung-Wook; Lee, In-Jung

2017-08-08

We investigated the complete chloroplast (cp) genomes of non-model Arabidopsis halleri ssp. gemmifera and Arabidopsis lyrata ssp. petraea using Illumina paired-end sequencing to understand their genetic organization and structure. Detailed bioinformatics analysis revealed genome sizes of both subspecies ranging between 154.4~154.5 kbp, with a large single-copy region (84,197~84,158 bp), a small single-copy region (17,738~17,813 bp) and pair of inverted repeats (IRa/IRb; 26,264~26,259 bp). Both cp genomes encode 130 genes, including 85 protein-coding genes, eight ribosomal RNA genes and 37 transfer RNA genes. Whole cp genome comparison of A. halleri ssp. gemmifera and A. lyrata ssp. petraea, along with ten other Arabidopsis species, showed an overall high degree of sequence similarity, with divergence among some intergenic spacers. The location and distribution of repeat sequences were determined, and sequence divergences of shared genes were calculated among related species. Comparative phylogenetic analysis of the entire genomic data set and 70 shared genes between both cp genomes confirmed the previous phylogeny and generated phylogenetic trees with the same topologies. The sister species of A. halleri ssp. gemmifera is A. umezawana, whereas the closest relative of A. lyrata spp. petraea is A. arenicola.
'Candidatus Phytoplasma palmicola', associated with a lethal yellowing-type disease of coconut (Cocos nucifera L.) in Mozambique.

PubMed

Harrison, Nigel A; Davis, Robert E; Oropeza, Carlos; Helmick, Ericka E; Narváez, María; Eden-Green, Simon; Dollet, Michel; Dickinson, Matthew

2014-06-01

In this study, the taxonomic position and group classification of the phytoplasma associated with a lethal yellowing-type disease (LYD) of coconut (Cocos nucifera L.) in Mozambique were addressed. Pairwise similarity values based on alignment of nearly full-length 16S rRNA gene sequences (1530 bp) revealed that the Mozambique coconut phytoplasma (LYDM) shared 100% identity with a comparable sequence derived from a phytoplasma strain (LDN) responsible for Awka wilt disease of coconut in Nigeria, and shared 99.0-99.6% identity with 16S rRNA gene sequences from strains associated with Cape St Paul wilt (CSPW) disease of coconut in Ghana and Côte d'Ivoire. Similarity scores further determined that the 16S rRNA gene of the LYDM phytoplasma shared <97.5% sequence identity with all previously described members of 'Candidatus Phytoplasma'. The presence of unique regions in the 16S rRNA gene sequence distinguished the LYDM phytoplasma from all currently described members of 'Candidatus Phytoplasma', justifying its recognition as the reference strain of a novel taxon, 'Candidatus Phytoplasma palmicola'. Virtual RFLP profiles of the F2n/R2 portion (1251 bp) of the 16S rRNA gene and pattern similarity coefficients delineated coconut LYDM phytoplasma strains from Mozambique as novel members of established group 16SrXXII, subgroup A (16SrXXII-A). Similarity coefficients of 0.97 were obtained for comparisons between subgroup 16SrXXII-A strains and CSPW phytoplasmas from Ghana and Côte d'Ivoire. On this basis, the CSPW phytoplasma strains were designated members of a novel subgroup, 16SrXXII-B.
Phylogenetic analysis of phenotypically characterized Cryptococcus laurentii isolates reveals high frequency of cryptic species.

PubMed

Ferreira-Paim, Kennio; Ferreira, Thatiana Bragine; Andrade-Silva, Leonardo; Mora, Delio Jose; Springer, Deborah J; Heitman, Joseph; Fonseca, Fernanda Machado; Matos, Dulcilena; Melhem, Márcia Souza Carvalho; Silva-Vergara, Mario León

2014-01-01

Although Cryptococcus laurentii has been considered saprophytic and its taxonomy is still being described, several cases of human infections have already reported. This study aimed to evaluate molecular aspects of C. laurentii isolates from Brazil, Botswana, Canada, and the United States. In this study, 100 phenotypically identified C. laurentii isolates were evaluated by sequencing the 18S nuclear ribosomal small subunit rRNA gene (18S-SSU), D1/D2 region of 28S nuclear ribosomal large subunit rRNA gene (28S-LSU), and the internal transcribed spacer (ITS) of the ribosomal region. BLAST searches using 550-bp, 650-bp, and 550-bp sequenced amplicons obtained from the 18S-SSU, 28S-LSU, and the ITS region led to the identification of 75 C. laurentii strains that shared 99-100% identity with C. laurentii CBS 139. A total of nine isolates shared 99% identity with both Bullera sp. VY-68 and C. laurentii RY1. One isolate shared 99% identity with Cryptococcus rajasthanensis CBS 10406, and eight isolates shared 100% identity with Cryptococcus sp. APSS 862 according to the 28S-LSU and ITS regions and designated as Cryptococcus aspenensis sp. nov. (CBS 13867). While 16 isolates shared 99% identity with Cryptococcus flavescens CBS 942 according to the 18S-SSU sequence, only six were confirmed using the 28S-LSU and ITS region sequences. The remaining 10 shared 99% identity with Cryptococcus terrestris CBS 10810, which was recently described in Brazil. Through concatenated sequence analyses, seven sequence types in C. laurentii, three in C. flavescens, one in C. terrestris, and one in the C. aspenensis sp. nov. were identified. Sequencing permitted the characterization of 75% of the environmental C. laurentii isolates from different geographical areas and the identification of seven haplotypes of this species. Among sequenced regions, the increased variability of the ITS region in comparison to the 18S-SSU and 28S-LSU regions reinforces its applicability as a DNA barcode.
Gene Discovery in the Apicomplexa as Revealed by EST Sequencing and Assembly of a Comparative Gene Database

PubMed Central

Li, Li; Brunk, Brian P.; Kissinger, Jessica C.; Pape, Deana; Tang, Keliang; Cole, Robert H.; Martin, John; Wylie, Todd; Dante, Mike; Fogarty, Steven J.; Howe, Daniel K.; Liberator, Paul; Diaz, Carmen; Anderson, Jennifer; White, Michael; Jerome, Maria E.; Johnson, Emily A.; Radke, Jay A.; Stoeckert, Christian J.; Waterston, Robert H.; Clifton, Sandra W.; Roos, David S.; Sibley, L. David

2003-01-01

Large-scale EST sequencing projects for several important parasites within the phylum Apicomplexa were undertaken for the purpose of gene discovery. Included were several parasites of medical importance (Plasmodium falciparum, Toxoplasma gondii) and others of veterinary importance (Eimeria tenella, Sarcocystis neurona, and Neospora caninum). A total of 55,192 ESTs, deposited into dbEST/GenBank, were included in the analyses. The resulting sequences have been clustered into nonredundant gene assemblies and deposited into a relational database that supports a variety of sequence and text searches. This database has been used to compare the gene assemblies using BLAST similarity comparisons to the public protein databases to identify putative genes. Of these new entries, ∼15%–20% represent putative homologs with a conservative cutoff of p < 10−9, thus identifying many conserved genes that are likely to share common functions with other well-studied organisms. Gene assemblies were also used to identify strain polymorphisms, examine stage-specific expression, and identify gene families. An interesting class of genes that are confined to members of this phylum and not shared by plants, animals, or fungi, was identified. These genes likely mediate the novel biological features of members of the Apicomplexa and hence offer great potential for biological investigation and as possible therapeutic targets. [The sequence data from this study have been submitted to dbEST division of GenBank under accession nos.: Toxoplasma gondii: –, –, –, –, – , –, –, –, –. Plasmodium falciparum: –, –, –, –. Sarcocystis neurona: , , , , , , , , , , , , , –, –, –, –, –. Eimeria tenella: –, –, –, –, –, –, –, –, – , –, –, –, –, –, –, –, –, –, –, –. Neospora caninum: –, –, , – , –, –.] PMID:12618375
Gene Deletion in Barley Mediated by LTR-retrotransposon BARE

PubMed Central

Shang, Yi; Yang, Fei; Schulman, Alan H.; Zhu, Jinghuan; Jia, Yong; Wang, Junmei; Zhang, Xiao-Qi; Jia, Qiaojun; Hua, Wei; Yang, Jianming; Li, Chengdao

2017-01-01

A poly-row branched spike (prbs) barley mutant was obtained from soaking a two-rowed barley inflorescence in a solution of maize genomic DNA. Positional cloning and sequencing demonstrated that the prbs mutant resulted from a 28 kb deletion including the inflorescence architecture gene HvRA2. Sequence annotation revealed that the HvRA2 gene is flanked by two LTR (long terminal repeat) retrotransposons (BARE) sharing 89% sequence identity. A recombination between the integrase (IN) gene regions of the two BARE copies resulted in the formation of an intact BARE and loss of HvRA2. No maize DNA was detected in the recombination region although the flanking sequences of HvRA2 gene showed over 73% of sequence identity with repetitive sequences on 10 maize chromosomes. It is still unknown whether the interaction of retrotransposons between barley and maize has resulted in the recombination observed in the present study. PMID:28252053
Cloning, sequencing, and expression of the Zymomonas mobilis phosphoglycerate mutase gene (pgm) in Escherichia coli.

PubMed Central

Yomano, L P; Scopes, R K; Ingram, L O

1993-01-01

Phosphoglycerate mutase is an essential glycolytic enzyme for Zymomonas mobilis, catalyzing the reversible interconversion of 3-phosphoglycerate and 2-phosphoglycerate. The pgm gene encoding this enzyme was cloned on a 5.2-kbp DNA fragment and expressed in Escherichia coli. Recombinants were identified by using antibodies directed against purified Z. mobilis phosphoglycerate mutase. The pgm gene contains a canonical ribosome-binding site, a biased pattern of codon usage, a long upstream untranslated region, and four promoters which share sequence homology. Interestingly, adhA and a D-specific 2-hydroxyacid dehydrogenase were found on the same DNA fragment and appear to form a cluster of genes which function in central metabolism. The translated sequence for Z. mobilis pgm was in full agreement with the 40 N-terminal amino acid residues determined by protein sequencing. The primary structure of the translated sequence is highly conserved (52 to 60% identity with other phosphoglycerate mutases) and also shares extensive homology with bisphosphoglycerate mutases (51 to 59% identity). Since Southern blots indicated the presence of only a single copy of pgm in the Z. mobilis chromosome, it is likely that the cloned pgm gene functions to provide both activities. Z. mobilis phosphoglycerate mutase is unusual in that it lacks the flexible tail and lysines at the carboxy terminus which are present in the enzyme isolated from all other organisms examined. Images PMID:8320209
Small tandemly repeated DNA sequences of higher plants likely originate from a tRNA gene ancestor.

PubMed Central

Benslimane, A A; Dron, M; Hartmann, C; Rode, A

1986-01-01

Several monomers (177 bp) of a tandemly arranged repetitive nuclear DNA sequence of Brassica oleracea have been cloned and sequenced. They share up to 95% homology between one another and up to 80% with other satellite DNA sequences of Cruciferae, suggesting a common ancestor. Both strands of these monomers show more than 50% homology with many tRNA genes; the best homologies have been obtained with Lys and His yeast mitochondrial tRNA genes (respectively 64% and 60%). These results suggest that small tandemly repeated DNA sequences of plants may have evolved from a tRNA gene ancestor. These tandem repeats have probably arisen via a process involving reverse transcription of polymerase III RNA intermediates, as is the case for interspersed DNA sequences of mammalians. A model is proposed to explain the formation of such small tandemly repeated DNA sequences. Images PMID:3774553
Polymorphism and selection in the major histocompatibility complex DRA and DQA genes in the family Equidae.

PubMed

Janova, Eva; Matiasovic, Jan; Vahala, Jiri; Vodicka, Roman; Van Dyk, Enette; Horin, Petr

2009-07-01

The major histocompatibility complex genes coding for antigen binding and presenting molecules are the most polymorphic genes in the vertebrate genome. We studied the DRA and DQA gene polymorphism of the family Equidae. In addition to 11 previously reported DRA and 24 DQA alleles, six new DRA sequences and 13 new DQA alleles were identified in the genus Equus. Phylogenetic analysis of both DRA and DQA sequences provided evidence for trans-species polymorphism in the family Equidae. The phylogenetic trees differed from species relationships defined by standard taxonomy of Equidae and from trees based on mitochondrial or neutral gene sequence data. Analysis of selection showed differences between the less variable DRA and more variable DQA genes. DRA alleles were more often shared by more species. The DQA sequences analysed showed strong amongst-species positive selection; the selected amino acid positions mostly corresponded to selected positions in rodent and human DQA genes.
Comparative genomic sequence analysis of novel Helicoverpa armigera nucleopolyhedrovirus (NPV) isolated from Kenya and three other previously sequenced Helicoverpa spp. NPVs.

PubMed

Ogembo, Javier Gordon; Caoili, Barbara L; Shikata, Masamitsu; Chaeychomsri, Sudawan; Kobayashi, Michihiro; Ikeda, Motoko

2009-10-01

A newly cloned Helicoverpa armigera nucleopolyhedrovirus (HearNPV) from Kenya, HearNPV-NNg1, has a higher insecticidal activity than HearNPV-G4, which also exhibits lower insecticidal activity than HearNPV-C1. In the search for genes and/or nucleotide sequences that might be involved in the observed virulence differences among Helicoverpa spp. NPVs, the entire genome of NNg1 was sequenced and compared with previously sequenced genomes of G4, C1 and Helicoverpa zea single-nucleocapsid NPV (Hz). The NNg1 genome was 132,425 bp in length, with a total of 143 putative open reading frames (ORFs), and shared high levels of overall amino acid and nucleotide sequence identities with G4, C1 and Hz. Three NNg1 ORFs, ORF5, ORF100 and ORF124, which were shared with C1, were absent in G4 and Hz, while NNg1 and C1 were missing a homologue of G4/Hz ORF5. Another three ORFs, ORF60 (bro-b), ORF119 and ORF120, and one direct repeat sequence (dr) were unique to NNg1. Relative to the overall nucleotide sequence identity, lower sequence identities were observed between NNg1 hrs and the homologous hrs in the other three Helicoverpa spp. NPVs, despite containing the same number of hrs located at essentially the same positions on the genomes. Differences were also observed between NNg1 and each of the other three Helicoverpa spp. NPVs in the diversity of bro genes encoded on the genomes. These results indicate several putative genes and nucleotide sequences that may be responsible for the virulence differences observed among Helicoverpa spp., yet the specific genes and/or nucleotide sequences responsible have not been identified.
Meta4: a web application for sharing and annotating metagenomic gene predictions using web services.

PubMed

Richardson, Emily J; Escalettes, Franck; Fotheringham, Ian; Wallace, Robert J; Watson, Mick

2013-01-01

Whole-genome shotgun metagenomics experiments produce DNA sequence data from entire ecosystems, and provide a huge amount of novel information. Gene discovery projects require up-to-date information about sequence homology and domain structure for millions of predicted proteins to be presented in a simple, easy-to-use system. There is a lack of simple, open, flexible tools that allow the rapid sharing of metagenomics datasets with collaborators in a format they can easily interrogate. We present Meta4, a flexible and extensible web application that can be used to share and annotate metagenomic gene predictions. Proteins and predicted domains are stored in a simple relational database, with a dynamic front-end which displays the results in an internet browser. Web services are used to provide up-to-date information about the proteins from homology searches against public databases. Information about Meta4 can be found on the project website, code is available on Github, a cloud image is available, and an example implementation can be seen at.
Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: Implications for the microbial “pan-genome”

PubMed Central

Tettelin, Hervé; Masignani, Vega; Cieslewicz, Michael J.; Donati, Claudio; Medini, Duccio; Ward, Naomi L.; Angiuoli, Samuel V.; Crabtree, Jonathan; Jones, Amanda L.; Durkin, A. Scott; DeBoy, Robert T.; Davidsen, Tanja M.; Mora, Marirosa; Scarselli, Maria; Margarit y Ros, Immaculada; Peterson, Jeremy D.; Hauser, Christopher R.; Sundaram, Jaideep P.; Nelson, William C.; Madupu, Ramana; Brinkac, Lauren M.; Dodson, Robert J.; Rosovitz, Mary J.; Sullivan, Steven A.; Daugherty, Sean C.; Haft, Daniel H.; Selengut, Jeremy; Gwinn, Michelle L.; Zhou, Liwei; Zafar, Nikhat; Khouri, Hoda; Radune, Diana; Dimitrov, George; Watkins, Kisha; O'Connor, Kevin J. B.; Smith, Shannon; Utterback, Teresa R.; White, Owen; Rubens, Craig E.; Grandi, Guido; Madoff, Lawrence C.; Kasper, Dennis L.; Telford, John L.; Wessels, Michael R.; Rappuoli, Rino; Fraser, Claire M.

2005-01-01

The development of efficient and inexpensive genome sequencing methods has revolutionized the study of human bacterial pathogens and improved vaccine design. Unfortunately, the sequence of a single genome does not reflect how genetic variability drives pathogenesis within a bacterial species and also limits genome-wide screens for vaccine candidates or for antimicrobial targets. We have generated the genomic sequence of six strains representing the five major disease-causing serotypes of Streptococcus agalactiae, the main cause of neonatal infection in humans. Analysis of these genomes and those available in databases showed that the S. agalactiae species can be described by a pan-genome consisting of a core genome shared by all isolates, accounting for ≈80% of any single genome, plus a dispensable genome consisting of partially shared and strain-specific genes. Mathematical extrapolation of the data suggests that the gene reservoir available for inclusion in the S. agalactiae pan-genome is vast and that unique genes will continue to be identified even after sequencing hundreds of genomes. PMID:16172379
Genetic diversity and antigenicity variation of Babesia bovis merozoite surface antigen-1 (MSA-1) in Thailand.

PubMed

Tattiyapong, Muncharee; Sivakumar, Thillaiampalam; Takemae, Hitoshi; Simking, Pacharathon; Jittapalapong, Sathaporn; Igarashi, Ikuo; Yokoyama, Naoaki

2016-07-01

Babesia bovis, an intraerythrocytic protozoan parasite, causes severe clinical disease in cattle worldwide. The genetic diversity of parasite antigens often results in different immune profiles in infected animals, hindering efforts to develop immune control methodologies against the B. bovis infection. In this study, we analyzed the genetic diversity of the merozoite surface antigen-1 (msa-1) gene using 162 B. bovis-positive blood DNA samples sourced from cattle populations reared in different geographical regions of Thailand. The identity scores shared among 93 msa-1 gene sequences isolated by PCR amplification were 43.5-100%, and the similarity values among the translated amino acid sequences were 42.8-100%. Of 23 total clades detected in our phylogenetic analysis, Thai msa-1 gene sequences occurred in 18 clades; seven among them were composed of sequences exclusively from Thailand. To investigate differential antigenicity of isolated MSA-1 proteins, we expressed and purified eight recombinant MSA-1 (rMSA-1) proteins, including an rMSA-1 from B. bovis Texas (T2Bo) strain and seven rMSA-1 proteins based on the Thai msa-1 sequences. When these antigens were analyzed in a western blot assay, anti-T2Bo cattle serum strongly reacted with the rMSA-1 from T2Bo, as well as with three other rMSA-1 proteins that shared 54.9-68.4% sequence similarity with T2Bo MSA-1. In contrast, no or weak reactivity was observed for the remaining rMSA-1 proteins, which shared low sequence similarity (35.0-39.7%) with T2Bo MSA-1. While demonstrating the high genetic diversity of the B. bovis msa-1 gene in Thailand, the present findings suggest that the genetic diversity results in antigenicity variations among the MSA-1 antigens of B. bovis in Thailand. Copyright © 2016 Elsevier B.V. All rights reserved.
Low Maternal Microbiota Sharing across Gut, Breast Milk and Vagina, as Revealed by 16S rRNA Gene and Reduced Metagenomic Sequencing.

PubMed

Avershina, Ekaterina; Angell, Inga Leena; Simpson, Melanie; Storrø, Ola; Øien, Torbjørn; Johnsen, Roar; Rudi, Knut

2018-05-01

The maternal microbiota plays an important role in infant gut colonization. In this work we have investigated which bacterial species are shared across the breast milk, vaginal and stool microbiotas of 109 women shortly before and after giving birth using 16S rRNA gene sequencing and a novel reduced metagenomic sequencing (RMS) approach in a subgroup of 16 women. All the species predicted by the 16S rRNA gene sequencing were also detected by RMS analysis and there was good correspondence between their relative abundances estimated by both approaches. Both approaches also demonstrate a low level of maternal microbiota sharing across the population and RMS analysis identified only two species common to most women and in all sample types ( Bifidobacterium longum and Enterococcus faecalis ). Breast milk was the only sample type that had significantly higher intra- than inter- individual similarity towards both vaginal and stool samples. We also searched our RMS dataset against an in silico generated reference database derived from bacterial isolates in the Human Microbiome Project. The use of this reference-based search enabled further separation of Bifidobacterium longum into Bifidobacterium longum ssp. longum and Bifidobacterium longum ssp. infantis . We also detected the Lactobacillus rhamnosus GG strain, which was used as a probiotic supplement by some women, demonstrating the potential of RMS approach for deeper taxonomic delineation and estimation.
Low Maternal Microbiota Sharing across Gut, Breast Milk and Vagina, as Revealed by 16S rRNA Gene and Reduced Metagenomic Sequencing

PubMed Central

Angell, Inga Leena; Storrø, Ola; Øien, Torbjørn; Johnsen, Roar; Rudi, Knut

2018-01-01

The maternal microbiota plays an important role in infant gut colonization. In this work we have investigated which bacterial species are shared across the breast milk, vaginal and stool microbiotas of 109 women shortly before and after giving birth using 16S rRNA gene sequencing and a novel reduced metagenomic sequencing (RMS) approach in a subgroup of 16 women. All the species predicted by the 16S rRNA gene sequencing were also detected by RMS analysis and there was good correspondence between their relative abundances estimated by both approaches. Both approaches also demonstrate a low level of maternal microbiota sharing across the population and RMS analysis identified only two species common to most women and in all sample types (Bifidobacterium longum and Enterococcus faecalis). Breast milk was the only sample type that had significantly higher intra- than inter- individual similarity towards both vaginal and stool samples. We also searched our RMS dataset against an in silico generated reference database derived from bacterial isolates in the Human Microbiome Project. The use of this reference-based search enabled further separation of Bifidobacterium longum into Bifidobacterium longum ssp. longum and Bifidobacterium longum ssp. infantis. We also detected the Lactobacillus rhamnosus GG strain, which was used as a probiotic supplement by some women, demonstrating the potential of RMS approach for deeper taxonomic delineation and estimation. PMID:29724017
Evolution of glutamine amidotransferase genes. Nucleotide sequences of the pabA genes from Salmonella typhimurium, Klebsiella aerogenes and Serratia marcescens.

PubMed

Kaplan, J B; Merkel, W K; Nichols, B P

1985-06-05

The amide group of glutamine is a source of nitrogen in the biosynthesis of a variety of compounds. These reactions are catalyzed by a group of enzymes known as glutamine amidotransferases; two of these, the glutamine amidotransferase subunits of p-aminobenzoate synthase and anthranilate synthase have been studied in detail and have been shown to be structurally and functionally related. In some micro-organisms, p-aminobenzoate synthase and anthranilate synthase share a common glutamine amidotransferase subunit. We report here the primary DNA and deduced amino acid sequences of the p-aminobenzoate synthase glutamine amidotransferase subunits from Salmonella typhimurium, Klebsiella aerogenes and Serratia marcescens. A comparison of these glutamine amidotransferase sequences to the sequences of ten others, including some that function specifically in either the p-aminobenzoate synthase or anthranilate synthase complexes and some that are shared by both synthase complexes, has revealed several interesting features of the structure and organization of these genes, and has allowed us to speculate as to the evolutionary history of this family of enzymes. We propose a model for the evolution of the p-aminobenzoate synthase and anthranilate synthase glutamine amidotransferase subunits in which the duplication and subsequent divergence of the genetic information encoding a shared glutamine amidotransferase subunit led to the evolution of two new pathway-specific enzymes.
Origin and spread of photosynthesis based upon conserved sequence features in key bacteriochlorophyll biosynthesis proteins.

PubMed

Gupta, Radhey S

2012-11-01

The origin of photosynthesis and how this capability has spread to other bacterial phyla remain important unresolved questions. I describe here a number of conserved signature indels (CSIs) in key proteins involved in bacteriochlorophyll (Bchl) biosynthesis that provide important insights in these regards. The proteins BchL and BchX, which are essential for Bchl biosynthesis, are derived by gene duplication in a common ancestor of all phototrophs. More ancient gene duplication gave rise to the BchX-BchL proteins and the NifH protein of the nitrogenase complex. The sequence alignment of NifH-BchX-BchL proteins contain two CSIs that are uniquely shared by all NifH and BchX homologs, but not by any BchL homologs. These CSIs and phylogenetic analysis of NifH-BchX-BchL protein sequences strongly suggest that the BchX homologs are ancestral to BchL and that the Bchl-based anoxygenic photosynthesis originated prior to the chlorophyll (Chl)-based photosynthesis in cyanobacteria. Another CSI in the BchX-BchL sequence alignment that is uniquely shared by all BchX homologs and the BchL sequences from Heliobacteriaceae, but absent in all other BchL homologs, suggests that the BchL homologs from Heliobacteriaceae are primitive in comparison to all other photosynthetic lineages. Several other identified CSIs in the BchN homologs are commonly shared by all proteobacterial homologs and a clade consisting of the marine unicellular Cyanobacteria (Clade C). These CSIs in conjunction with the results of phylogenetic analyses and pair-wise sequence similarity on the BchL, BchN, and BchB proteins, where the homologs from Clade C Cyanobacteria and Proteobacteria exhibited close relationship, provide strong evidence that these two groups have incurred lateral gene transfers. Additionally, phylogenetic analyses and several CSIs in the BchL-N-B proteins that are uniquely shared by all Chlorobi and Chloroflexi homologs provide evidence that the genes for these proteins have also been laterally transferred between these groups. Other results and observations reported here indicate that the genes for the BchL-N-B proteins in Proteobacteria are derived from the Clade C Cyanobacteria, whereas those in Chlorobi were acquired from Chloroflexus or related bacteria by means of LGTs. Some implications of these observations regarding the origin and spread of photosynthesis are discussed.
Phylogenetic Analysis of Shewanella Strains by DNA Relatedness Derived from Whole Genome Microarray DNA-DNA Hybridization and Comparison with Other Methods

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wu, Liyou; Yi, T. Y.; Van Nostrand, Joy

Phylogenetic analyses were done for the Shewanella strains isolated from Baltic Sea (38 strains), US DOE Hanford Uranium bioremediation site [Hanford Reach of the Columbia River (HRCR), 11 strains], Pacific Ocean and Hawaiian sediments (8 strains), and strains from other resources (16 strains) with three out group strains, Rhodopseudomonas palustris, Clostridium cellulolyticum, and Thermoanaerobacter ethanolicus X514, using DNA relatedness derived from WCGA-based DNA-DNA hybridizations, sequence similarities of 16S rRNA gene and gyrB gene, and sequence similarities of 6 loci of Shewanella genome selected from a shared gene list of the Shewanella strains with whole genome sequenced based on the averagemore » nucleotide identity of them (ANI). The phylogenetic trees based on 16S rRNA and gyrB gene sequences, and DNA relatedness derived from WCGA hybridizations of the tested Shewanella strains share exactly the same sub-clusters with very few exceptions, in which the strains were basically grouped by species. However, the phylogenetic analysis based on DNA relatedness derived from WCGA hybridizations dramatically increased the differentiation resolution at species and strains level within Shewanella genus. When the tree based on DNA relatedness derived from WCGA hybridizations was compared to the tree based on the combined sequences of the selected functional genes (6 loci), we found that the resolutions of both methods are similar, but the clustering of the tree based on DNA relatedness derived from WMGA hybridizations was clearer. These results indicate that WCGA-based DNA-DNA hybridization is an idea alternative of conventional DNA-DNA hybridization methods and it is superior to the phylogenetics methods based on sequence similarities of single genes. Detailed analysis is being performed for the re-classification of the strains examined.« less
Characterization of cDNAs and genomic DNAs for human threonyl- and cysteinyl-tRNA synthetases

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cruzen, M.E.

1993-01-01

Techniques of molecular biology were used to clone, sequence and map two human aminoacyl-tRNA synthetase (aaRS) cDNAs: threonyl-tRNA synthetase (ThrRS) a class II enzyme and cysteinyl-tRNA synthetase (CysRS) a class I enzyme. The predicted protein sequence of human ThrRS is highly homologous to that of lower eukaryotic and prokaryotic ThRSs, particularly in the regions containing the three structural motifs common to all class II synthetases. Signature regions 1 and 2, which characterize the class IIa subgroup (SerRS, ThrRS and HisRS) are highly conserved from bacteria to human. Structural predictions for human ThrRS based on the known structure of the closelymore » related SerRS from E.coli implicate strongly conserved residues in the signature sequences to be important in substrate binding. The amino terminal 100 residues of the deduced amino acid sequence of ThrRS shares structural similarity to SerRS consistent with forming an antiparallel helix implicated in tRNA binding. The 5' untranslated sequence of the human ThrRS gene shares short stretches of common sequence with the gene for hamster HisRS including a binding site for the promoter specific transcription factor sp-1. The deduced amino acid sequence of human CysRS has a high degree of sequence identify to E. coli CysRS. Human CysRS possesses the classic characteristics of a class I synthetase and is most closely related to the MetRS subgroup. The amino terminal half of human CysRS can be modeled as a nucleotide binding fold and shares significant sequence and structural similarity to the other enzymes in this subgroup. The CysRS structural gene (CARS) was mapped to human chromosome 11p15.5 by fluorescent in situ hybridization. CARS is the first aaRS gene to be mapped to chromosome 11. The steady state of both CysRS and ThrRs mRNA were quantitated in several human tissues. Message levels for these enzymes appear to be subjected to differential regulation in different cell types.« less

Phylogenetic Analysis of Phenotypically Characterized Cryptococcus laurentii Isolates Reveals High Frequency of Cryptic Species

PubMed Central

Ferreira-Paim, Kennio; Ferreira, Thatiana Bragine; Andrade-Silva, Leonardo; Mora, Delio Jose; Springer, Deborah J.; Heitman, Joseph; Fonseca, Fernanda Machado; Matos, Dulcilena; Melhem, Márcia Souza Carvalho; Silva-Vergara, Mario León

2014-01-01

Background Although Cryptococcus laurentii has been considered saprophytic and its taxonomy is still being described, several cases of human infections have already reported. This study aimed to evaluate molecular aspects of C. laurentii isolates from Brazil, Botswana, Canada, and the United States. Methods In this study, 100 phenotypically identified C. laurentii isolates were evaluated by sequencing the 18S nuclear ribosomal small subunit rRNA gene (18S-SSU), D1/D2 region of 28S nuclear ribosomal large subunit rRNA gene (28S-LSU), and the internal transcribed spacer (ITS) of the ribosomal region. Results BLAST searches using 550-bp, 650-bp, and 550-bp sequenced amplicons obtained from the 18S-SSU, 28S-LSU, and the ITS region led to the identification of 75 C. laurentii strains that shared 99–100% identity with C. laurentii CBS 139. A total of nine isolates shared 99% identity with both Bullera sp. VY-68 and C. laurentii RY1. One isolate shared 99% identity with Cryptococcus rajasthanensis CBS 10406, and eight isolates shared 100% identity with Cryptococcus sp. APSS 862 according to the 28S-LSU and ITS regions and designated as Cryptococcus aspenensis sp. nov. (CBS 13867). While 16 isolates shared 99% identity with Cryptococcus flavescens CBS 942 according to the 18S-SSU sequence, only six were confirmed using the 28S-LSU and ITS region sequences. The remaining 10 shared 99% identity with Cryptococcus terrestris CBS 10810, which was recently described in Brazil. Through concatenated sequence analyses, seven sequence types in C. laurentii, three in C. flavescens, one in C. terrestris, and one in the C. aspenensis sp. nov. were identified. Conclusions Sequencing permitted the characterization of 75% of the environmental C. laurentii isolates from different geographical areas and the identification of seven haplotypes of this species. Among sequenced regions, the increased variability of the ITS region in comparison to the 18S-SSU and 28S-LSU regions reinforces its applicability as a DNA barcode. PMID:25251413
Primer development to obtain complete coding sequence of HA and NA genes of influenza A/H3N2 virus.

PubMed

Agustiningsih, Agustiningsih; Trimarsanto, Hidayat; Setiawaty, Vivi; Artika, I Made; Muljono, David Handojo

2016-08-30

Influenza is an acute respiratory illness and has become a serious public health problem worldwide. The need to study the HA and NA genes in influenza A virus is essential since these genes frequently undergo mutations. This study describes the development of primer sets for RT-PCR to obtain complete coding sequence of Hemagglutinin (HA) and Neuraminidase (NA) genes of influenza A/H3N2 virus from Indonesia. The primers were developed based on influenza A/H3N2 sequence worldwide from Global Initiative on Sharing All Influenza Data (GISAID) and further tested using Indonesian influenza A/H3N2 archived samples of influenza-like illness (ILI) surveillance from 2008 to 2009. An optimum RT-PCR condition was acquired for all HA and NA fragments designed to cover complete coding sequence of HA and NA genes. A total of 71 samples were successfully sequenced for complete coding sequence both of HA and NA genes out of 145 samples of influenza A/H3N2 tested. The developed primer sets were suitable for obtaining complete coding sequences of HA and NA genes of Indonesian samples from 2008 to 2009.
Genome structure drives patterns of gene family evolution in ciliates, a case study using Chilodonella uncinata (Protista, Ciliophora, Phyllopharyngea)

PubMed Central

Gao, Feng; Song, Weibo; Katz, Laura A.

2014-01-01

In most lineages, diversity among gene family members results from gene duplication followed by sequence divergence. Because of the genome rearrangements during the development of somatic nuclei, gene family evolution in ciliates involves more complex processes. Previous work on the ciliate Chilodonella uncinata revealed that macronuclear β-tubulin gene family members are generated by alternative processing, in which germline regions are alternatively used in multiple macronuclear chromosomes. To further study genome evolution in this ciliate, we analyzed its transcriptome and found that: 1) alternative processing is extensive among gene families; and 2) such gene families are likely to be C. uncinata-specific. We characterized additional macronuclear and micronuclear copies of one candidate alternatively processed gene family -- a protein kinase domain containing protein (PKc) -- from two C. uncinata strains. Analysis of the PKc sequences reveals: 1) multiple PKc gene family members in the macronucleus share some identical regions flanked by divergent regions; and 2) the shared identical regions are processed from a single micronuclear chromosome. We discuss analogous processes in lineages across the eukaryotic tree of life to provide further insights on the impact of genome structure on gene family evolution in eukaryotes. PMID:24749903
Microarray analysis of gene expression profiles in ripening pineapple fruits.

PubMed

Koia, Jonni H; Moyle, Richard L; Botella, Jose R

2012-12-18

Pineapple (Ananas comosus) is a tropical fruit crop of significant commercial importance. Although the physiological changes that occur during pineapple fruit development have been well characterized, little is known about the molecular events that occur during the fruit ripening process. Understanding the molecular basis of pineapple fruit ripening will aid the development of new varieties via molecular breeding or genetic modification. In this study we developed a 9277 element pineapple microarray and used it to profile gene expression changes that occur during pineapple fruit ripening. Microarray analyses identified 271 unique cDNAs differentially expressed at least 1.5-fold between the mature green and mature yellow stages of pineapple fruit ripening. Among these 271 sequences, 184 share significant homology with genes encoding proteins of known function, 53 share homology with genes encoding proteins of unknown function and 34 share no significant homology with any database accession. Of the 237 pineapple sequences with homologs, 160 were up-regulated and 77 were down-regulated during pineapple fruit ripening. DAVID Functional Annotation Cluster (FAC) analysis of all 237 sequences with homologs revealed confident enrichment scores for redox activity, organic acid metabolism, metalloenzyme activity, glycolysis, vitamin C biosynthesis, antioxidant activity and cysteine peptidase activity, indicating the functional significance and importance of these processes and pathways during pineapple fruit development. Quantitative real-time PCR analysis validated the microarray expression results for nine out of ten genes tested. This is the first report of a microarray based gene expression study undertaken in pineapple. Our bioinformatic analyses of the transcript profiles have identified a number of genes, processes and pathways with putative involvement in the pineapple fruit ripening process. This study extends our knowledge of the molecular basis of pineapple fruit ripening and non-climacteric fruit ripening in general.
Microarray analysis of gene expression profiles in ripening pineapple fruits

PubMed Central

2012-01-01

Background Pineapple (Ananas comosus) is a tropical fruit crop of significant commercial importance. Although the physiological changes that occur during pineapple fruit development have been well characterized, little is known about the molecular events that occur during the fruit ripening process. Understanding the molecular basis of pineapple fruit ripening will aid the development of new varieties via molecular breeding or genetic modification. In this study we developed a 9277 element pineapple microarray and used it to profile gene expression changes that occur during pineapple fruit ripening. Results Microarray analyses identified 271 unique cDNAs differentially expressed at least 1.5-fold between the mature green and mature yellow stages of pineapple fruit ripening. Among these 271 sequences, 184 share significant homology with genes encoding proteins of known function, 53 share homology with genes encoding proteins of unknown function and 34 share no significant homology with any database accession. Of the 237 pineapple sequences with homologs, 160 were up-regulated and 77 were down-regulated during pineapple fruit ripening. DAVID Functional Annotation Cluster (FAC) analysis of all 237 sequences with homologs revealed confident enrichment scores for redox activity, organic acid metabolism, metalloenzyme activity, glycolysis, vitamin C biosynthesis, antioxidant activity and cysteine peptidase activity, indicating the functional significance and importance of these processes and pathways during pineapple fruit development. Quantitative real-time PCR analysis validated the microarray expression results for nine out of ten genes tested. Conclusions This is the first report of a microarray based gene expression study undertaken in pineapple. Our bioinformatic analyses of the transcript profiles have identified a number of genes, processes and pathways with putative involvement in the pineapple fruit ripening process. This study extends our knowledge of the molecular basis of pineapple fruit ripening and non-climacteric fruit ripening in general. PMID:23245313
Computing prokaryotic gene ubiquity: rescuing the core from extinction.

PubMed

Charlebois, Robert L; Doolittle, W Ford

2004-12-01

The genomic core concept has found several uses in comparative and evolutionary genomics. Defined as the set of all genes common to (ubiquitous among) all genomes in a phylogenetically coherent group, core size decreases as the number and phylogenetic diversity of the relevant group increases. Here, we focus on methods for defining the size and composition of the core of all genes shared by sequenced genomes of prokaryotes (Bacteria and Archaea). There are few (almost certainly less than 50) genes shared by all of the 147 genomes compared, surely insufficient to conduct all essential functions. Sequencing and annotation errors are responsible for the apparent absence of some genes, while very limited but genuine disappearances (from just one or a few genomes) can account for several others. Core size will continue to decrease as more genome sequences appear, unless the requirement for ubiquity is relaxed. Such relaxation seems consistent with any reasonable biological purpose for seeking a core, but it renders the problem of definition more problematic. We propose an alternative approach (the phylogenetically balanced core), which preserves some of the biological utility of the core concept. Cores, however delimited, preferentially contain informational rather than operational genes; we present a new hypothesis for why this might be so.
Exploring the Midgut Transcriptome and Brush Border Membrane Vesicle Proteome of the Rice Stem Borer, Chilo suppressalis (Walker)

PubMed Central

Peng, Chuanhua; Wang, Xiaoping; Li, Fei; Lin, Yongjun

2012-01-01

The rice stem borer, Chilo suppressalis (Walker) (Lepidoptera: Pyralidae), is one of the most detrimental pests affecting rice crops. The use of Bacillus thuringiensis (Bt) toxins has been explored as a means to control this pest, but the potential for C. suppressalis to develop resistance to Bt toxins makes this approach problematic. Few C. suppressalis gene sequences are known, which makes in-depth study of gene function difficult. Herein, we sequenced the midgut transcriptome of the rice stem borer. In total, 37,040 contigs were obtained, with a mean size of 497 bp. As expected, the transcripts of C. suppressalis shared high similarity with arthropod genes. Gene ontology and KEGG analysis were used to classify the gene functions in C. suppressalis. Using the midgut transcriptome data, we conducted a proteome analysis to identify proteins expressed abundantly in the brush border membrane vesicles (BBMV). Of the 100 top abundant proteins that were excised and subjected to mass spectrometry analysis, 74 share high similarity with known proteins. Among these proteins, Western blot analysis showed that Aminopeptidase N and EH domain-containing protein have the binding activities with Bt-toxin Cry1Ac. These data provide invaluable information about the gene sequences of C. suppressalis and the proteins that bind with Cry1Ac. PMID:22666467
Structural requirements for recognition of the HLA-Dw14 class II epitope: A key HLA determinant associated with rheumatoid arthritis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hiraiwa, Akikazu; Yamanaka, Katsuo; Kwok, W.W.

Although HLA genes have been shown to be associated with certain diseases, the basis for this association is unknown. Recent studies, however, have documented patterns of nucleotide sequence variation among some HLA genes associated with a particular disease. For rheumatoid arthritis, HLA genes in most patients have a shared nucleotide sequence encoding a key structural element of an HLA class II polypeptide; this sequence element is critical for the interaction of the HLA molecule with antigenic peptides and with responding T cells, suggestive of a direct role for this sequence element in disease susceptibility. The authors describe the serological andmore » cellular immunologic characteristics encoded by this rheumatoid arthritis-associated sequence element. Site-directed mutagenesis of the DRB1 gene was used to define amino acids critical for antibody and T-cell recognition of this structural element, focusing on residues that distinguish the rheumatoid arthritis-associated alleles Dw4 and Dw14 from a closely related allele, Dw10, not associated with disease. Both the gain and loss of rheumatoid arthritis-associated epitopes were highly dependent on three residues within a discrete domain of the HLA-DR molecule. Recognition was most strongly influenced by the following amino acids (in order): 70 > 71 > 67. Some alloreactive T-cell clones were also influenced by amino acid variation in portions of the DR molecule lying outside the shared sequence element.« less
Sampling Daphnia's expressed genes: preservation, expansion and invention of crustacean genes with reference to insect genomes

PubMed Central

Colbourne, John K; Eads, Brian D; Shaw, Joseph; Bohuski, Elizabeth; Bauer, Darren J; Andrews, Justen

2007-01-01

Background Functional and comparative studies of insect genomes have shed light on the complement of genes, which in part, account for shared morphologies, developmental programs and life-histories. Contrasting the gene inventories of insects to those of the nematodes provides insight into the genomic changes responsible for their diversification. However, nematodes have weak relationships to insects, as each belongs to separate animal phyla. A better outgroup to distinguish lineage specific novelties would include other members of Arthropoda. For example, crustaceans are close allies to the insects (together forming Pancrustacea) and their fascinating aquatic lifestyle provides an important comparison for understanding the genetic basis of adaptations to life on land versus life in water. Results This study reports on the first characterization of cDNA libraries and sequences for the model crustacean Daphnia pulex. We analyzed 1,546 ESTs of which 1,414 represent approximately 787 nuclear genes, by measuring their sequence similarities with insect and nematode proteomes. The provisional annotation of genes is supported by expression data from microarray studies described in companion papers. Loci expected to be shared between crustaceans and insects because of their mutual biological features are identified, including genes for reproduction, regulation and cellular processes. We identify genes that are likely derived within Pancrustacea or lost within the nematodes. Moreover, lineage specific gene family expansions are identified, which suggest certain biological demands associated with their ecological setting. In particular, up to seven distinct ferritin loci are found in Daphnia compared to three in most insects. Finally, a substantial fraction of the sampled gene transcripts shares no sequence similarity with those from other arthropods. Genes functioning during development and reproduction are comparatively well conserved between crustaceans and insects. By contrast, genes that were responsive to environmental conditions (metal stress) and not sex-biased included the greatest proportion of genes with no matches to insect proteomes. Conclusion This study along with associated microarray experiments are the initial steps in a coordinated effort by the Daphnia Genomics Consortium to build the necessary genomic platform needed to discover genes that account for the phenotypic diversity within the genus and to gain new insights into crustacean biology. This effort will soon include the first crustacean genome sequence. PMID:17612412
Discrimination of germline V genes at different sequencing lengths and mutational burdens: A new tool for identifying and evaluating the reliability of V gene assignment.

PubMed

Zhang, Bochao; Meng, Wenzhao; Prak, Eline T Luning; Hershberg, Uri

2015-12-01

Immune repertoires are collections of lymphocytes that express diverse antigen receptor gene rearrangements consisting of Variable (V), (Diversity (D) in the case of heavy chains) and Joining (J) gene segments. Clonally related cells typically share the same germline gene segments and have highly similar junctional sequences within their third complementarity determining regions. Identifying clonal relatedness of sequences is a key step in the analysis of immune repertoires. The V gene is the most important for clone identification because it has the longest sequence and the greatest number of sequence variants. However, accurate identification of a clone's germline V gene source is challenging because there is a high degree of similarity between different germline V genes. This difficulty is compounded in antibodies, which can undergo somatic hypermutation. Furthermore, high-throughput sequencing experiments often generate partial sequences and have significant error rates. To address these issues, we describe a novel method to estimate which germline V genes (or alleles) cannot be discriminated under different conditions (read lengths, sequencing errors or somatic hypermutation frequencies). Starting with any set of germline V genes, this method measures their similarity using different sequencing lengths and calculates their likelihood of unambiguous assignment under different levels of mutation. Hence, one can identify, under different experimental and biological conditions, the germline V genes (or alleles) that cannot be uniquely identified and bundle them together into groups of specific V genes with highly similar sequences. Copyright © 2015 Elsevier B.V. All rights reserved.
Nucleotide sequence of the ribosomal RNA gene of Physarum polycephalum: intron 2 and its flanking regions of the 26S rRNA gene.

PubMed Central

Nomiyama, H; Kuhara, S; Kukita, T; Otsuka, T; Sakaki, Y

1981-01-01

The 26S ribosomal RNA gene of Physarum polycephalum is interrupted by two introns, and we have previously determined the sequence of one of them (intron 1) (Nomiyama et al. Proc.Natl.Acad.Sci.USA 78, 1376-1380, 1981). In this study we sequenced the second intron (intron 2) of about 0.5 kb length and its flanking regions, and found that one nucleotide at each junction is identical in intron 1 and intron 2, though the junction regions share no other sequence homology. Comparison of the flanking exon sequences to E. coli 23S rRNA sequences shows that conserved sequences are interspersed with tracts having little homology. In particular, the region encompassing the intron 2 interruption site is highly conserved. The E. coli ribosomal protein L1 binding region is also conserved. Images PMID:6171776
A proposal to rename the hyperthermophile Pyrococcus woesei as Pyrococcus furiosus subsp. woesei.

PubMed

Kanoksilapatham, Wirojne; González, Juan M; Maeder, Dennis L; DiRuggiero, Jocelyne; Robb, Frank T

2004-10-01

Pyrococcus species are hyperthermophilic members of the order Thermococcales, with optimal growth temperatures approaching 100 degrees C. All species grow heterotrophically and produce H2 or, in the presence of elemental sulfur (S(o)), H2S. Pyrococcus woesei and P. furiosus were isolated from marine sediments at the same Vulcano Island beach site and share many morphological and physiological characteristics. We report here that the rDNA operons of these strains have identical sequences, including their intergenic spacer regions and part of the 23S rRNA. Both species grow rapidly and produce H2 in the presence of 0.1% maltose and 10-100 microM sodium tungstate in S(o)-free medium. However, P. woesei shows more extensive autolysis than P. furiosus in the stationary phase. Pyrococcus furiosus and P. woesei share three closely related families of insertion sequences (ISs). A Southern blot performed with IS probes showed extensive colinearity between the genomes of P. woesei and P. furiosus. Cloning and sequencing of ISs that were in different contexts in P. woesei and P. furiosus revealed that the napA gene in P. woesei is disrupted by a type III IS element, whereas in P. furiosus, this gene is intact. A type I IS element, closely linked to the napA gene, was observed in the same context in both P. furiosus and P. woesei genomes. Our results suggest that the IS elements are implicated in genomic rearrangements and reshuffling in these closely related strains. We propose to rename P. woesei a subspecies of P. furiosus based on their identical rDNA operon sequences, many common IS elements that are shared genomic markers, and the observation that all P. woesei nucleotide sequences deposited in GenBank to date are > 99% identical to P. furiosus sequences.
Extraordinary Sequence Divergence at Tsga8, an X-linked Gene Involved in Mouse Spermiogenesis

PubMed Central

Good, Jeffrey M.; Vanderpool, Dan; Smith, Kimberly L.; Nachman, Michael W.

2011-01-01

The X chromosome plays an important role in both adaptive evolution and speciation. We used a molecular evolutionary screen of X-linked genes potentially involved in reproductive isolation in mice to identify putative targets of recurrent positive selection. We then sequenced five very rapidly evolving genes within and between several closely related species of mice in the genus Mus. All five genes were involved in male reproduction and four of the genes showed evidence of recurrent positive selection. The most remarkable evolutionary patterns were found at Testis-specific gene a8 (Tsga8), a spermatogenesis-specific gene expressed during postmeiotic chromatin condensation and nuclear transformation. Tsga8 was characterized by extremely high levels of insertion–deletion variation of an alanine-rich repetitive motif in natural populations of Mus domesticus and M. musculus, differing in length from the reference mouse genome by up to 89 amino acids (27% of the total protein length). This population-level variation was coupled with striking divergence in protein sequence and length between closely related mouse species. Although no clear orthologs had previously been described for Tsga8 in other mammalian species, we have identified a highly divergent hypothetical gene on the rat X chromosome that shares clear orthology with the 5′ and 3′ ends of Tsga8. Further inspection of this ortholog verified that it is expressed in rat testis and shares remarkable similarity with mouse Tsga8 across several general features of the protein sequence despite no conservation of nucleotide sequence across over 60% of the rat-coding domain. Overall, Tsga8 appears to be one of the most rapidly evolving genes to have been described in rodents. We discuss the potential evolutionary causes and functional implications of this extraordinary divergence and the possible contribution of Tsga8 and the other four genes we examined to reproductive isolation in mice. PMID:21186189
Genes involved in convergent evolution of eusociality in bees

PubMed Central

Woodard, S. Hollis; Fischman, Brielle J.; Venkat, Aarti; Hudson, Matt E.; Varala, Kranthi; Cameron, Sydney A.; Clark, Andrew G.; Robinson, Gene E.

2011-01-01

Eusociality has arisen independently at least 11 times in insects. Despite this convergence, there are striking differences among eusocial lifestyles, ranging from species living in small colonies with overt conflict over reproduction to species in which colonies contain hundreds of thousands of highly specialized sterile workers produced by one or a few queens. Although the evolution of eusociality has been intensively studied, the genetic changes involved in the evolution of eusociality are relatively unknown. We examined patterns of molecular evolution across three independent origins of eusociality by sequencing transcriptomes of nine socially diverse bee species and combining these data with genome sequence from the honey bee Apis mellifera to generate orthologous sequence alignments for 3,647 genes. We found a shared set of 212 genes with a molecular signature of accelerated evolution across all eusocial lineages studied, as well as unique sets of 173 and 218 genes with a signature of accelerated evolution specific to either highly or primitively eusocial lineages, respectively. These results demonstrate that convergent evolution can involve a mosaic pattern of molecular changes in both shared and lineage-specific sets of genes. Genes involved in signal transduction, gland development, and carbohydrate metabolism are among the most prominent rapidly evolving genes in eusocial lineages. These findings provide a starting point for linking specific genetic changes to the evolution of eusociality. PMID:21482769
[Cloning and characterization of Caveolin-1 gene in pigeon, Columba livia domestica].

PubMed

Zhang, Ying; Yu, Jian-Feng; Yang, Li; Wang, Xing-Guo; Gu, Zhi-Liang

2010-10-01

Caveolins, a class of principal proteins forming the structure of caveolae in plasmalemma, were encoded by caveolins gene family. Caveolin-1 gene is a member of caveolins gene family. In the present study, a full-length of 2605 bp caveolin-1 cDNA sequence in Columba livia domestica, which included a 537 bp complete ORF encoding a 178 amino acids long putative peptide, were obtained by using RT-PCR and RACE technique. The Columba livia domestica caveolin-1 CDS shared 80.1% - 93.4% homology with Bos taurus, Canis lupus familiaris, Gallus gallus and Rattus norvegicus. Meanwhile, the putative amino acid sequence of Columba livia domestica caveolin-1 shared 85.4% - 97.2% homology with the above species. The semi-quantity RT-PCR revealed that Caveolin-1 expressions were detectable in all the Columba livia domestica tissues and the expressional level of caveolin-1 gene was high in adipose, medium in various muscles, low in liver. These results demonstrated that Caveolin-1 gene was potentially involved in some metabolic pathways in adipose and muscle.
Sarcocystis spp. in domestic sheep in Kunming City, China: prevalence, morphology, and molecular characteristics.

PubMed

Hu, Jun-Jie; Huang, Si; Wen, Tao; Esch, Gerald W; Liang, Yu; Li, Hong-Liang

2017-01-01

Sheep (Ovis aries) are intermediate hosts for at least six named species of Sarcocystis: S. tenella, S. arieticanis, S. gigantea, S. medusiformis, S. mihoensis, and S. microps. Here, only two species, S. tenella and S. arieticanis, were found in 79 of 86 sheep (91.9%) in Kunming, China, based on their morphological characteristics. Four genetic markers, i.e., 18S rRNA gene, 28S rRNA gene, mitochondrial cox1 gene, and ITS-1 region, were sequenced and characterized for the two species of Sarcocystis. Sequences of the three former markers for S. tenella shared high identities with those of S. capracanis in goats, i.e., 99.0%, 98.3%, and 93.6%, respectively; the same three marker sequences of S. arieticanis shared high identities with those of S. hircicanis in goats, i.e., 98.5%, 96.5%, and 92.5%, respectively. No sequences in GenBank were found to significantly resemble the ITS-1 regions of S. tenella and S. arieticanis. Identities of the four genetic markers for S. tenella and S. arieticanis were 96.3%, 95.4%, 82.5%, and 66.2%, respectively. © J.-J. Hu et al., published by EDP Sciences, 2017.
Transcript Profiling of Common Bean (Phaseolus vulgaris L.) Using the GeneChip(R) Soybean Genome Array: Optimizing Analysis by Masking Biased Probes

USDA-ARS?s Scientific Manuscript database

Common bean (Phaseolus vulgaris) and soybean (Glycine max) both belong to the Phaseoleae tribe and share significant coding sequence homology. This suggests that the GeneChip(R) Soybean Genome Array (soybean GeneChip) may be used for gene expression studies using common bean. To evaluate the utility...
Isolation of a full-length CC-NBS-LRR resistance gene analog candidate from sugar pine showing low nucleotide diversity.

Treesearch

K.D. Jermstad; L.A. Sheppard; B.B. Kinloch; A. Delfino-Mix; E.S. Ersoz; K.V. Krutovsky; D.B Neale

2006-01-01

The nucleotide-binding-site and leucine-rich-repeat (NBSâLRR) class of R proteins is abundant and widely distributed in plants. By using degenerate primers designed on the NBS domain in lettuce, we amplified sequences in sugar pine that shared sequence identity with many of the NBSâLRR class resistance genes catalogued in GenBank. The polymerase chain reaction products...
Chloroplast DNA sequence of the green alga Oedogonium cardiacum (Chlorophyceae): Unique genome architecture, derived characters shared with the Chaetophorales and novel genes acquired through horizontal transfer

PubMed Central

Brouard, Jean-Simon; Otis, Christian; Lemieux, Claude; Turmel, Monique

2008-01-01

Background To gain insight into the branching order of the five main lineages currently recognized in the green algal class Chlorophyceae and to expand our understanding of chloroplast genome evolution, we have undertaken the sequencing of chloroplast DNA (cpDNA) from representative taxa. The complete cpDNA sequences previously reported for Chlamydomonas (Chlamydomonadales), Scenedesmus (Sphaeropleales), and Stigeoclonium (Chaetophorales) revealed tremendous variability in their architecture, the retention of only few ancestral gene clusters, and derived clusters shared by Chlamydomonas and Scenedesmus. Unexpectedly, our recent phylogenies inferred from these cpDNAs and the partial sequences of three other chlorophycean cpDNAs disclosed two major clades, one uniting the Chlamydomonadales and Sphaeropleales (CS clade) and the other uniting the Oedogoniales, Chaetophorales and Chaetopeltidales (OCC clade). Although molecular signatures provided strong support for this dichotomy and for the branching of the Oedogoniales as the earliest-diverging lineage of the OCC clade, more data are required to validate these phylogenies. We describe here the complete cpDNA sequence of Oedogonium cardiacum (Oedogoniales). Results Like its three chlorophycean homologues, the 196,547-bp Oedogonium chloroplast genome displays a distinctive architecture. This genome is one of the most compact among photosynthetic chlorophytes. It has an atypical quadripartite structure, is intron-rich (17 group I and 4 group II introns), and displays 99 different conserved genes and four long open reading frames (ORFs), three of which are clustered in the spacious inverted repeat of 35,493 bp. Intriguingly, two of these ORFs (int and dpoB) revealed high similarities to genes not usually found in cpDNA. At the gene content and gene order levels, the Oedogonium genome most closely resembles its Stigeoclonium counterpart. Characters shared by these chlorophyceans but missing in members of the CS clade include the retention of psaM, rpl32 and trnL(caa), the loss of petA, the disruption of three ancestral clusters and the presence of five derived gene clusters. Conclusion The Oedogonium chloroplast genome disclosed additional characters that bolster the evidence for a close alliance between the Oedogoniales and Chaetophorales. Our unprecedented finding of int and dpoB in this cpDNA provides a clear example that novel genes were acquired by the chloroplast genome through horizontal transfers, possibly from a mitochondrial genome donor. PMID:18558012
Cloning and characterization of a mouse gene with homology to the human von Hippel-Lindau disease tumor suppressor gene: implications for the potential organization of the human von Hippel-Lindau disease gene.

PubMed

Gao, J; Naglich, J G; Laidlaw, J; Whaley, J M; Seizinger, B R; Kley, N

1995-02-15

The human von Hippel-Lindau disease (VHL) gene has recently been identified and, based on the nucleotide sequence of a partial cDNA clone, has been predicted to encode a novel protein with as yet unknown functions [F. Latif et al., Science (Washington DC), 260: 1317-1320, 1993]. The length of the encoded protein and the characteristics of the cellular expressed protein are as yet unclear. Here we report the cloning and characterization of a mouse gene (mVHLh1) that is widely expressed in different mouse tissues and shares high homology with the human VHL gene. It predicts a protein 181 residues long (and/or 162 amino acids, considering a potential alternative start codon), which across a core region of approximately 140 residues displays a high degree of sequence identity (98%) to the predicted human VHL protein. High stringency DNA and RNA hybridization experiments and protein expression analyses indicate that this gene is the most highly VHL-related mouse gene, suggesting that it represents the mouse VHL gene homologue rather than a related gene sharing a conserved functional domain. These findings provide new insights into the potential organization of the VHL gene and nature of its encoded protein.

Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene, and cis-element evolution

PubMed Central

Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Hubisz, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Zhang, Peili; Liu, Jing; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catharine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenée; Verduzco, Daniel; Clerc-Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

2005-01-01

We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each arm gene order has been extensively reshuffled, leading to a minimum of 921 syntenic blocks shared between the species. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 25–55 million years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences between the species—but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila. PMID:15632085
Gene sequences present in Citrullus sp. having been lost during domestication of watermelon

USDA-ARS?s Scientific Manuscript database

A wide genetic diversity exists among Citrullus species, while watermelon cultivars (Citrullus lanatus var. lanatus) share a narrow genetic base as a result of many years of domestication and selection for desirable fruit qualities. The recent international watermelon genome sequencing project reve...
New Carbenicillin-Hydrolyzing β-Lactamase (CARB-7) from Vibrio cholerae Non-O1, Non-O139 Strains Encoded by the VCR Region of the V. cholerae Genome†

PubMed Central

Melano, Roberto; Petroni, Alejandro; Garutti, Alicia; Saka, Héctor Alex; Mange, Laura; Pasterán, Fernando; Rapoport, Melina; Rossi, Alicia; Galas, Marcelo

2002-01-01

In a previous study, an analysis of 77 ampicillin-nonsusceptible (resistant plus intermediate categories) strains of Vibrio cholerae non-O1, non-O139, isolated from aquatic environment and diarrheal stool, showed that all of them produced a β-lactamase with a pI of 5.4. Hybridization or amplification by PCR with a probe for blaTEM or primers for blaCARB gene families was negative. In this work, an environmental ampicillin-resistant strain from this sample, ME11762, isolated from a waterway in the west region of Argentina, was studied. The nucleotide sequence of the structural gene of the β-lactamase was determined by bidirectional sequencing of a Sau3AI fragment belonging to this isolate. The gene encodes a new 288-amino-acid protein, designated CARB-7, that shares 88.5% homology with the CARB-6 enzyme; an overall 83.2% homology with PSE-4, PSE-1, CARB-3, and the Proteus mirabilis N29 enzymes; and 79% homology with CARB-4 enzyme. The gene for this β-lactamase could not be transferred to Escherichia coli by conjugation. The nucleotide sequence of the flanking regions of the blaCARB-7 gene showed the occurrence of three 123-bp V. cholerae repeated sequences, all of which were found outside the predicted open reading frame. The upstream fragment of the blaCARB-7 gene shared 93% identity with a locus situated inside V. cholerae's chromosome 2. These results strongly suggest the chromosomal location of the blaCARB-7 gene, making this the first communication of a β-lactamase gene located on the VCR island of the V. cholerae genome. PMID:12069969
Analysis of 10,000 ESTs from lymphocytes of the cynomolgus monkey to improve our understanding of its immune system

PubMed Central

Chen, Wei-Hua; Wang, Xue-Xia; Lin, Wei; He, Xiao-Wei; Wu, Zhen-Qiang; Lin, Ying; Hu, Song-Nian; Wang, Xiao-Ning

2006-01-01

Background The cynomolgus monkey (Macaca fascicularis) is one of the most widely used surrogate animal models for an increasing number of human diseases and vaccines, especially immune-system-related ones. Towards a better understanding of the gene expression background upon its immunogenetics, we constructed a cDNA library from Epstein-Barr virus (EBV)-transformed B lymphocytes of a cynomolgus monkey and sequenced 10,000 randomly picked clones. Results After processing, 8,312 high-quality expressed sequence tags (ESTs) were generated and assembled into 3,728 unigenes. Annotations of these uniquely expressed transcripts demonstrated that out of the 2,524 open reading frame (ORF) positive unigenes (mitochondrial and ribosomal sequences were not included), 98.8% shared significant similarities (E-value less than 1e-10) with the NCBI nucleotide (nt) database, while only 67.7% (E-value less than 1e-5) did so with the NCBI non-redundant protein (nr) database. Further analysis revealed that 90.0% of the unigenes that shared no similarities to the nr database could be assigned to human chromosomes, in which 75 did not match significantly to any cynomolgus monkey and human ESTs. The mapping regions to known human genes on the human genome were described in detail. The protein family and domain analysis revealed that the first, second and fourth of the most abundantly expressed protein families were all assigned to immunoglobulin and major histocompatibility complex (MHC)-related proteins. The expression profiles of these genes were compared with that of homologous genes in human blood, lymph nodes and a RAMOS cell line, which demonstrated expression changes after transformation with EBV. The degree of sequence similarity of the MHC class I and II genes to the human reference sequences was evaluated. The results indicated that class I molecules showed weak amino acid identities (<90%), while class II showed slightly higher ones. Conclusion These results indicated that the genes expressed in the cynomolgus monkey could be used to identify novel protein-coding genes and revise those incomplete or incorrect annotations in the human genome by comparative methods, since the old world monkeys and humans share high similarities at the molecular level, especially within coding regions. The identification of multiple genes involved in the immune response, their sequence variations to the human homologues, and their responses to EBV infection could provide useful information to improve our understanding of the cynomolgus monkey immune system. PMID:16618371
Recombinatorial biases and convergent recombination determine interindividual TCRβ sharing in murine thymocytes.

PubMed

Li, Hanjie; Ye, Congting; Ji, Guoli; Wu, Xiaohui; Xiang, Zhe; Li, Yuanyue; Cao, Yonghao; Liu, Xiaolong; Douek, Daniel C; Price, David A; Han, Jiahuai

2012-09-01

Overlap of TCR repertoires among individuals provides the molecular basis for public T cell responses. By deep-sequencing the TCRβ repertoires of CD4+CD8+ thymocytes from three individual mice, we observed that a substantial degree of TCRβ overlap, comprising ∼10-15% of all unique amino acid sequences and ∼5-10% of all unique nucleotide sequences across any two individuals, is already present at this early stage of T cell development. The majority of TCRβ sharing between individual thymocyte repertoires could be attributed to the process of convergent recombination, with additional contributions likely arising from recombinatorial biases; the role of selection during intrathymic development was negligible. These results indicate that the process of TCR gene recombination is the major determinant of clonotype sharing between individuals.
Genome structure drives patterns of gene family evolution in ciliates, a case study using Chilodonella uncinata (Protista, Ciliophora, Phyllopharyngea).

PubMed

Gao, Feng; Song, Weibo; Katz, Laura A

2014-08-01

In most lineages, diversity among gene family members results from gene duplication followed by sequence divergence. Because of the genome rearrangements during the development of somatic nuclei, gene family evolution in ciliates involves more complex processes. Previous work on the ciliate Chilodonella uncinata revealed that macronuclear β-tubulin gene family members are generated by alternative processing, in which germline regions are alternatively used in multiple macronuclear chromosomes. To further study genome evolution in this ciliate, we analyzed its transcriptome and found that (1) alternative processing is extensive among gene families; and (2) such gene families are likely to be C. uncinata specific. We characterized additional macronuclear and micronuclear copies of one candidate alternatively processed gene family-a protein kinase domain containing protein (PKc)-from two C. uncinata strains. Analysis of the PKc sequences reveals that (1) multiple PKc gene family members in the macronucleus share some identical regions flanked by divergent regions; and (2) the shared identical regions are processed from a single micronuclear chromosome. We discuss analogous processes in lineages across the eukaryotic tree of life to provide further insights on the impact of genome structure on gene family evolution in eukaryotes. © 2014 The Author(s). Evolution © 2014 The Society for the Study of Evolution.
The Presence of Phage Orthologous Genes in Helicobacter pylori Correlates with the Presence of the Virulence Factors CagA and VacA.

PubMed

Kyrillos, Alexandra; Arora, Gaurav; Murray, Bradley; Rosenwald, Anne G

2016-06-01

The bacterium Helicobacter pylori is associated with ulcers and the development of gastric cancer. Several genes, including cytotoxin-associated gene A (CagA) and vacuolating cytotoxin A (VacA), are associated with increased gastric cancer risk. Some strains of H. pylori also contain sequences related to bacteriophage phiHP33; however, the significance of these phage-related sequences remains unknown. We assessed the extent to which phiHP33-related sequences are present in 335 H. pylori strains using homology searches then mapped shared genes between phiHP33 and H. pylori strains onto an existing phylogeny. One hundred and twenty-one H. pylori strains contain phage orthologous sequences, and the presence of the phage-related sequences correlates with the presence of CagA and VacA. Mapping of the phage orthologs onto a phylogeny of H. pylori is consistent with the hypothesis that these genes were acquired by horizontal gene transfer. phiHP33 phage orthologous sequences might be of significance in understanding virulence of different H. pylori strains. © 2015 John Wiley & Sons Ltd.
Nucleotide sequences of the tet(M) genes from the American and Dutch type tetracycline resistance plasmids of Neisseria gonorrhoeae.

PubMed

Gascoyne-Binzi, D M; Heritage, J; Hawkey, P M

1993-11-01

High-level tetracycline-resistant Neisseria gonorrhoeae (TRNG) has been associated with the presence of a plasmid approximately 25.2 MDa in size which carries a Tet M tetracycline resistance determinant. Two different plasmid types, American and Dutch, have previously been described, based on the restriction endonuclease digestion pattern. In this study, the tet(M) genes from the two plasmid types have been amplified by the polymerase chain reaction (PCR) and then sequenced. The gene sequences from the two plasmids shared 96.8% identity, and showed similarities with different segments of the tet(M) gene sequences from Tn1545, Tn916 and Ureaplasma urealyticum. The data suggest that it is highly likely that the Tet M determinant found in the American type plasmid has a different origin from that present in the Dutch plasmid.
Aerobic and Anaerobic Transformation of cis-Dichloroethene (cis-DCE) and Vinyl Chloride (VC): Steps for Reliable Remediation

DTIC Science & Technology

2003-12-01

populations. (ii) Characterization of Dehalococeoides sp . strain FL2. The isolate, designate d Dehalococcoides sp . strain FL2, reductively...Pinellas group of the Dehalococcoides cluster, and demonstrated that strain FL2 shared an identical 165 rRNA gene sequence with Dehalococcoides sp ...strain CBDBI, a chlorobenzene-dechlorinating strain. The 165 rRNA gene sequence of Dehalococcoides sp . strain FL2 was submitted to GenBank (AF357918.2
Conserved features of eukaryotic hsp70 genes revealed by comparison with the nucleotide sequence of human hsp70.

PubMed Central

Hunt, C; Morimoto, R I

1985-01-01

We have determined the nucleotide sequence of the human hsp70 gene and 5' flanking region. The hsp70 gene is transcribed as an uninterrupted primary transcript of 2440 nucleotides composed of a 5' noncoding leader sequence of 212 nucleotides, a 3' noncoding region of 242 nucleotides, and a continuous open reading frame of 1986 nucleotides that encodes a protein with predicted molecular mass of 69,800 daltons. Upstream of the 5' terminus are the canonical TATAAA box, the sequence ATTGG that corresponds in the inverted orientation to the CCAAT motif, and the dyad sequence CTGGAAT/ATTCCCG that shares homology in 12 of 14 positions with the consensus transcription regulatory sequence common to Drosophila heat shock genes. Comparison of the predicted amino acid sequences of human hsp70 with the published sequences of Drosophila hsp70 and Escherichia coli dnaK reveals that human hsp70 is 73% identical to Drosophila hsp70 and 47% identical to E. coli dnaK. Surprisingly, the nucleotide sequences of the human and Drosophila genes are 72% identical and human and E. coli genes are 50% identical, which is more highly conserved than necessary given the degeneracy of the genetic code. The lack of accumulated silent nucleotide substitutions leads us to propose that there may be additional information in the nucleotide sequence of the hsp70 gene or the corresponding mRNA that precludes the maximum divergence allowed in the silent codon positions. PMID:3931075
The genome sequence of taurine cattle: a window to ruminant biology and evolution.

PubMed

Elsik, Christine G; Tellam, Ross L; Worley, Kim C; Gibbs, Richard A; Muzny, Donna M; Weinstock, George M; Adelson, David L; Eichler, Evan E; Elnitski, Laura; Guigó, Roderic; Hamernik, Debora L; Kappes, Steve M; Lewin, Harris A; Lynn, David J; Nicholas, Frank W; Reymond, Alexandre; Rijnkels, Monique; Skow, Loren C; Zdobnov, Evgeny M; Schook, Lawrence; Womack, James; Alioto, Tyler; Antonarakis, Stylianos E; Astashyn, Alex; Chapple, Charles E; Chen, Hsiu-Chuan; Chrast, Jacqueline; Câmara, Francisco; Ermolaeva, Olga; Henrichsen, Charlotte N; Hlavina, Wratko; Kapustin, Yuri; Kiryutin, Boris; Kitts, Paul; Kokocinski, Felix; Landrum, Melissa; Maglott, Donna; Pruitt, Kim; Sapojnikov, Victor; Searle, Stephen M; Solovyev, Victor; Souvorov, Alexandre; Ucla, Catherine; Wyss, Carine; Anzola, Juan M; Gerlach, Daniel; Elhaik, Eran; Graur, Dan; Reese, Justin T; Edgar, Robert C; McEwan, John C; Payne, Gemma M; Raison, Joy M; Junier, Thomas; Kriventseva, Evgenia V; Eyras, Eduardo; Plass, Mireya; Donthu, Ravikiran; Larkin, Denis M; Reecy, James; Yang, Mary Q; Chen, Lin; Cheng, Ze; Chitko-McKown, Carol G; Liu, George E; Matukumalli, Lakshmi K; Song, Jiuzhou; Zhu, Bin; Bradley, Daniel G; Brinkman, Fiona S L; Lau, Lilian P L; Whiteside, Matthew D; Walker, Angela; Wheeler, Thomas T; Casey, Theresa; German, J Bruce; Lemay, Danielle G; Maqbool, Nauman J; Molenaar, Adrian J; Seo, Seongwon; Stothard, Paul; Baldwin, Cynthia L; Baxter, Rebecca; Brinkmeyer-Langford, Candice L; Brown, Wendy C; Childers, Christopher P; Connelley, Timothy; Ellis, Shirley A; Fritz, Krista; Glass, Elizabeth J; Herzig, Carolyn T A; Iivanainen, Antti; Lahmers, Kevin K; Bennett, Anna K; Dickens, C Michael; Gilbert, James G R; Hagen, Darren E; Salih, Hanni; Aerts, Jan; Caetano, Alexandre R; Dalrymple, Brian; Garcia, Jose Fernando; Gill, Clare A; Hiendleder, Stefan G; Memili, Erdogan; Spurlock, Diane; Williams, John L; Alexander, Lee; Brownstein, Michael J; Guan, Leluo; Holt, Robert A; Jones, Steven J M; Marra, Marco A; Moore, Richard; Moore, Stephen S; Roberts, Andy; Taniguchi, Masaaki; Waterman, Richard C; Chacko, Joseph; Chandrabose, Mimi M; Cree, Andy; Dao, Marvin Diep; Dinh, Huyen H; Gabisi, Ramatu Ayiesha; Hines, Sandra; Hume, Jennifer; Jhangiani, Shalini N; Joshi, Vandita; Kovar, Christie L; Lewis, Lora R; Liu, Yih-Shin; Lopez, John; Morgan, Margaret B; Nguyen, Ngoc Bich; Okwuonu, Geoffrey O; Ruiz, San Juana; Santibanez, Jireh; Wright, Rita A; Buhay, Christian; Ding, Yan; Dugan-Rocha, Shannon; Herdandez, Judith; Holder, Michael; Sabo, Aniko; Egan, Amy; Goodell, Jason; Wilczek-Boney, Katarzyna; Fowler, Gerald R; Hitchens, Matthew Edward; Lozado, Ryan J; Moen, Charles; Steffen, David; Warren, James T; Zhang, Jingkun; Chiu, Readman; Schein, Jacqueline E; Durbin, K James; Havlak, Paul; Jiang, Huaiyang; Liu, Yue; Qin, Xiang; Ren, Yanru; Shen, Yufeng; Song, Henry; Bell, Stephanie Nicole; Davis, Clay; Johnson, Angela Jolivet; Lee, Sandra; Nazareth, Lynne V; Patel, Bella Mayurkumar; Pu, Ling-Ling; Vattathil, Selina; Williams, Rex Lee; Curry, Stacey; Hamilton, Cerissa; Sodergren, Erica; Wheeler, David A; Barris, Wes; Bennett, Gary L; Eggen, André; Green, Ronnie D; Harhay, Gregory P; Hobbs, Matthew; Jann, Oliver; Keele, John W; Kent, Matthew P; Lien, Sigbjørn; McKay, Stephanie D; McWilliam, Sean; Ratnakumar, Abhirami; Schnabel, Robert D; Smith, Timothy; Snelling, Warren M; Sonstegard, Tad S; Stone, Roger T; Sugimoto, Yoshikazu; Takasuga, Akiko; Taylor, Jeremy F; Van Tassell, Curtis P; Macneil, Michael D; Abatepaulo, Antonio R R; Abbey, Colette A; Ahola, Virpi; Almeida, Iassudara G; Amadio, Ariel F; Anatriello, Elen; Bahadue, Suria M; Biase, Fernando H; Boldt, Clayton R; Carroll, Jeffery A; Carvalho, Wanessa A; Cervelatti, Eliane P; Chacko, Elsa; Chapin, Jennifer E; Cheng, Ye; Choi, Jungwoo; Colley, Adam J; de Campos, Tatiana A; De Donato, Marcos; Santos, Isabel K F de Miranda; de Oliveira, Carlo J F; Deobald, Heather; Devinoy, Eve; Donohue, Kaitlin E; Dovc, Peter; Eberlein, Annett; Fitzsimmons, Carolyn J; Franzin, Alessandra M; Garcia, Gustavo R; Genini, Sem; Gladney, Cody J; Grant, Jason R; Greaser, Marion L; Green, Jonathan A; Hadsell, Darryl L; Hakimov, Hatam A; Halgren, Rob; Harrow, Jennifer L; Hart, Elizabeth A; Hastings, Nicola; Hernandez, Marta; Hu, Zhi-Liang; Ingham, Aaron; Iso-Touru, Terhi; Jamis, Catherine; Jensen, Kirsty; Kapetis, Dimos; Kerr, Tovah; Khalil, Sari S; Khatib, Hasan; Kolbehdari, Davood; Kumar, Charu G; Kumar, Dinesh; Leach, Richard; Lee, Justin C-M; Li, Changxi; Logan, Krystin M; Malinverni, Roberto; Marques, Elisa; Martin, William F; Martins, Natalia F; Maruyama, Sandra R; Mazza, Raffaele; McLean, Kim L; Medrano, Juan F; Moreno, Barbara T; Moré, Daniela D; Muntean, Carl T; Nandakumar, Hari P; Nogueira, Marcelo F G; Olsaker, Ingrid; Pant, Sameer D; Panzitta, Francesca; Pastor, Rosemeire C P; Poli, Mario A; Poslusny, Nathan; Rachagani, Satyanarayana; Ranganathan, Shoba; Razpet, Andrej; Riggs, Penny K; Rincon, Gonzalo; Rodriguez-Osorio, Nelida; Rodriguez-Zas, Sandra L; Romero, Natasha E; Rosenwald, Anne; Sando, Lillian; Schmutz, Sheila M; Shen, Libing; Sherman, Laura; Southey, Bruce R; Lutzow, Ylva Strandberg; Sweedler, Jonathan V; Tammen, Imke; Telugu, Bhanu Prakash V L; Urbanski, Jennifer M; Utsunomiya, Yuri T; Verschoor, Chris P; Waardenberg, Ashley J; Wang, Zhiquan; Ward, Robert; Weikard, Rosemarie; Welsh, Thomas H; White, Stephen N; Wilming, Laurens G; Wunderlich, Kris R; Yang, Jianqi; Zhao, Feng-Qi

2009-04-24

To understand the biology and evolution of ruminants, the cattle genome was sequenced to about sevenfold coverage. The cattle genome contains a minimum of 22,000 genes, with a core set of 14,345 orthologs shared among seven mammalian species of which 1217 are absent or undetected in noneutherian (marsupial or monotreme) genomes. Cattle-specific evolutionary breakpoint regions in chromosomes have a higher density of segmental duplications, enrichment of repetitive elements, and species-specific variations in genes associated with lactation and immune responsiveness. Genes involved in metabolism are generally highly conserved, although five metabolic genes are deleted or extensively diverged from their human orthologs. The cattle genome sequence thus provides a resource for understanding mammalian evolution and accelerating livestock genetic improvement for milk and meat production.
Complete genome analysis of jasmine virus T from Jasminum sambac in China.

PubMed

Tang, Yajun; Gao, Fangluan; Yang, Zhen; Wu, Zujian; Yang, Liang

2016-07-01

The genome of a potyvirus (isolate JaVT_FZ) recovered from jasmine (Jasminum sambac L.) showing yellow ringspot symptoms in Fuzhou, China, was sequenced. JaVT_FZ is closely related to seven other potyviruses with completely sequenced genomes, with which it shares 66-70 % nucleotide and 52-56 % amino acid sequence identity. However, the coat protein (CP) gene shares 82-92 % nucleotide and 90-97 % amino acid sequence identity with those of two partially sequenced potyviruses, named jasmine potyvirus T (JaVT-jasmine) and jasmine yellow mosaic potyvirus (JaYMV-India), respectively. This suggests that JaVT_FZ, JaVT-jasmine and JaYMV-India should be regarded as members of a single potyvirus species, for which the name "Jasmine virus T" has priority.
Genomic structure of two ras family genes in the slime mold Physarum polycephalum.

PubMed

Trzcińska-Danielewicz, Joanna; Kozlowski, Piotr; Gierdal, Katarzyna; Wiejak, Jolanta; Jagielski, Adam; Toczko, Kazimierz; Fronk, Jan

2002-08-01

Genomic structure of two Physarum polycephalum ras family genes, Ppras2 and Pprap1, has been determined, including the upstream region of the latter. The genes are interrupted by three and four introns, respectively. The first intron of Ppras2 has the same location within the coding sequence as the first intron in another ras homolog from this organism, Ppras1 [Trzcińska-Danielewicz, J., Kozlowski, P., and Toczko, K. (1996). "Cloning and genomic sequence of the Physarum polycephalum Ppras1 gene, a homologue of the ras protooncogene", Gene 169, pp. 143-144]. All introns, ranging from 53 to ca. 460 base pairs, have the canonical 5' and 3' ends, are greatly enriched in pyrimidines in the coding strand and have frequent pyrimidines-only tracts. These latter features seem to be responsible for the difficulties in cloning and sequencing of parts of these genes. Short sequences shared with P. polycephalum transposon-like repeats are common in the introns, indicating a possible role of transposition in intron evolution. In all three ras family genes phase zero introns are located mostly between sequences coding for regular protein secondary structure elements.
Complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera, and comparative analyses with other grass genomes

PubMed Central

Saski, Christopher; Lee, Seung-Bum; Fjellheim, Siri; Guda, Chittibabu; Jansen, Robert K.; Luo, Hong; Tomkins, Jeffrey; Rognli, Odd Arne; Clarke, Jihong Liu

2009-01-01

Comparisons of complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera to six published grass chloroplast genomes reveal that gene content and order are similar but two microstructural changes have occurred. First, the expansion of the IR at the SSC/IRa boundary that duplicates a portion of the 5′ end of ndhH is restricted to the three genera of the subfamily Pooideae (Agrostis, Hordeum and Triticum). Second, a 6 bp deletion in ndhK is shared by Agrostis, Hordeum, Oryza and Triticum, and this event supports the sister relationship between the subfamilies Erhartoideae and Pooideae. Repeat analysis identified 19–37 direct and inverted repeats 30 bp or longer with a sequence identity of at least 90%. Seventeen of the 26 shared repeats are found in all the grass chloroplast genomes examined and are located in the same genes or intergenic spacer (IGS) regions. Examination of simple sequence repeats (SSRs) identified 16–21 potential polymorphic SSRs. Five IGS regions have 100% sequence identity among Zea mays, Saccharum officinarum and Sorghum bicolor, whereas no spacer regions were identical among Oryza sativa, Triticum aestivum, H. vulgare and A. stolonifera despite their close phylogenetic relationship. Alignment of EST sequences and DNA coding sequences identified six C–U conversions in both Sorghum bicolor and H. vulgare but only one in A. stolonifera. Phylogenetic trees based on DNA sequences of 61 protein-coding genes of 38 taxa using both maximum parsimony and likelihood methods provide moderate support for a sister relationship between the subfamilies Erhartoideae and Pooideae. PMID:17534593
The cyc1-11 mutation in yeast reverts by recombination with a nonallelic gene: composite genes determining the iso-cytochromes c.

PubMed Central

Ernst, J F; Stewart, J W; Sherman, F

1981-01-01

DNA sequence analysis of a cloned fragment directly established that the cyc1-11 mutation of iso-1-cytochrome c in the yeast Saccharomyces cerevisiae is a two-base-pair substitution that changes the CCA proline codon at amino acid position 76 to a UAA nonsense codon. Analysis of 11 revertant proteins and one cloned revertant gene showed that reversion of the cyc1-11 mutation can occur in three ways: a single base-pair substitution, which produces a serine replacement at position 76; recombination with the nonallelic CYC7 gene of iso-2-cytochrome c, which causes replacement of a segment in the cyc1-11 gene by the corresponding segment of the CYC7 gene; and either a two-base-pair substitution or recombination with the CYC7 gene, which causes the formation of the normal iso-1-cytochrome c sequence. These results demonstrate the occurrence of low frequencies of recombination between nonallelic genes having extensive but not complete homology. The formation of composite genes that share sequences from nonallelic genes may be an evolutionary mechanism for producing protein diversities and for maintaining identical sequences at different loci. Images PMID:6273865
'Candidatus Phytoplasma solani', a novel taxon associated with stolbur- and bois noir-related diseases of plants.

PubMed

Quaglino, Fabio; Zhao, Yan; Casati, Paola; Bulgari, Daniela; Bianco, Piero Attilio; Wei, Wei; Davis, Robert Edward

2013-08-01

Phytoplasmas classified in group 16SrXII infect a wide range of plants and are transmitted by polyphagous planthoppers of the family Cixiidae. Based on 16S rRNA gene sequence identity and biological properties, group 16SrXII encompasses several species, including 'Candidatus Phytoplasma australiense', 'Candidatus Phytoplasma japonicum' and 'Candidatus Phytoplasma fragariae'. Other group 16SrXII phytoplasma strains are associated with stolbur disease in wild and cultivated herbaceous and woody plants and with bois noir disease in grapevines (Vitis vinifera L.). Such latter strains have been informally proposed to represent a separate species, 'Candidatus Phytoplasma solani', but a formal description of this taxon has not previously been published. In the present work, stolbur disease strain STOL11 (STOL) was distinguished from reference strains of previously described species of the 'Candidatus Phytoplasma' genus based on 16S rRNA gene sequence similarity and a unique signature sequence in the 16S rRNA gene. Other stolbur- and bois noir-associated ('Ca. Phytoplasma solani') strains shared >99 % 16S rRNA gene sequence similarity with strain STOL11 and contained the signature sequence. 'Ca. Phytoplasma solani' is the only phytoplasma known to be transmitted by Hyalesthes obsoletus. Insect vectorship and molecular characteristics are consistent with the concept that diverse 'Ca. Phytoplasma solani' strains share common properties and represent an ecologically distinct gene pool. Phylogenetic analyses of 16S rRNA, tuf, secY and rplV-rpsC gene sequences supported this view and yielded congruent trees in which 'Ca. Phytoplasma solani' strains formed, within the group 16SrXII clade, a monophyletic subclade that was most closely related to, but distinct from, that of 'Ca. Phytoplasma australiense'-related strains. Based on distinct molecular and biological properties, stolbur- and bois noir-associated strains are proposed to represent a novel species level taxon, 'Ca. Phytoplasma solani'; STOL11 is designated the reference strain.
Prospecting Metagenomic Enzyme Subfamily Genes for DNA Family Shuffling by a Novel PCR-based Approach*

PubMed Central

Wang, Qiuyan; Wu, Huili; Wang, Anming; Du, Pengfei; Pei, Xiaolin; Li, Haifeng; Yin, Xiaopu; Huang, Lifeng; Xiong, Xiaolong

2010-01-01

DNA family shuffling is a powerful method for enzyme engineering, which utilizes recombination of naturally occurring functional diversity to accelerate laboratory-directed evolution. However, the use of this technique has been hindered by the scarcity of family genes with the required level of sequence identity in the genome database. We describe here a strategy for collecting metagenomic homologous genes for DNA shuffling from environmental samples by truncated metagenomic gene-specific PCR (TMGS-PCR). Using identified metagenomic gene-specific primers, twenty-three 921-bp truncated lipase gene fragments, which shared 64–99% identity with each other and formed a distinct subfamily of lipases, were retrieved from 60 metagenomic samples. These lipase genes were shuffled, and selected active clones were characterized. The chimeric clones show extensive functional and genetic diversity, as demonstrated by functional characterization and sequence analysis. Our results indicate that homologous sequences of genes captured by TMGS-PCR can be used as suitable genetic material for DNA family shuffling with broad applications in enzyme engineering. PMID:20962349
Comparative genomics analyses revealed two virulent Listeria monocytogenes strains isolated from ready-to-eat food.

PubMed

Lim, Shu Yong; Yap, Kien-Pong; Thong, Kwai Lin

2016-01-01

Listeria monocytogenes is an important foodborne pathogen that causes considerable morbidity in humans with high mortality rates. In this study, we have sequenced the genomes and performed comparative genomics analyses on two strains, LM115 and LM41, isolated from ready-to-eat food in Malaysia. The genome size of LM115 and LM41 was 2,959,041 and 2,963,111 bp, respectively. These two strains shared approximately 90% homologous genes. Comparative genomics and phylogenomic analyses revealed that LM115 and LM41 were more closely related to the reference strains F2365 and EGD-e, respectively. Our virulence profiling indicated a total of 31 virulence genes shared by both analysed strains. These shared genes included those that encode for internalins and L. monocytogenes pathogenicity island 1 (LIPI-1). Both the Malaysian L. monocytogenes strains also harboured several genes associated with stress tolerance to counter the adverse conditions. Seven antibiotic and efflux pump related genes which may confer resistance against lincomycin, erythromycin, fosfomycin, quinolone, tetracycline, and penicillin, and macrolides were identified in the genomes of both strains. Whole genome sequencing and comparative genomics analyses revealed two virulent L. monocytogenes strains isolated from ready-to-eat foods in Malaysia. The identification of strains with pathogenic, persistent, and antibiotic resistant potentials from minimally processed food warrant close attention from both healthcare and food industry.
Diversity of nonribosomal peptide synthetase and polyketide synthase gene clusters among taxonomically close Streptomyces strains.

PubMed

Komaki, Hisayuki; Sakurai, Kenta; Hosoyama, Akira; Kimura, Akane; Igarashi, Yasuhiro; Tamura, Tomohiko

2018-05-02

To identify the species of butyrolactol-producing Streptomyces strain TP-A0882, whole genome-sequencing of three type strains in a close taxonomic relationship was performed. In silico DNA-DNA hybridization using the genome sequences suggested that Streptomyces sp. TP-A0882 is classified as Streptomyces diastaticus subsp. ardesiacus. Strain TP-A0882, S. diastaticus subsp. ardesiacus NBRC 15402 T , Streptomyces coelicoflavus NBRC 15399 T , and Streptomyces rubrogriseus NBRC 15455 T harbor at least 14, 14, 10, and 12 biosynthetic gene clusters (BGCs), respectively, coding for nonribosomal peptide synthetases (NRPSs) and polyketide synthases (PKSs). All 14 gene clusters were shared by S. diastaticus subsp. ardesiacus strains TP-A0882 and NBRC 15402 T , while only four gene clusters were shared by the three distinct species. Although BGCs for bacteriocin, ectoine, indole, melanine, siderophores such as deferrioxamine, terpenes such as albaflavenone, hopene, carotenoid and geosmin are shared by the three species, many BGCs for secondary metabolites such as butyrolactone, lantipeptides, oligosaccharide, some terpenes are species-specific. These results indicate the possibility that strains belonging to the same species possess the same set of secondary metabolite-biosynthetic pathways, whereas strains belonging to distinct species have species-specific pathways, in addition to some common pathways, even if the strains are taxonomically close.
Initial Detection and Molecular Characterization of Namaycush Herpesvirus (Salmonid Herpesvirus 5) in Lake Trout.

PubMed

Glenney, Gavin W; Barbash, Patricia A; Coll, John A

2016-03-01

A novel herpesvirus was found by molecular methods in samples of Lake Trout Salvelinus namaycush from Lake Erie, Pennsylvania, and Lake Ontario, Keuka Lake, and Lake Otsego, New York. Based on PCR amplification and partial sequencing of polymerase, terminase, and glycoprotein genes, a number of isolates were identified as a novel virus, which we have named Namaycush herpesvirus (NamHV) salmonid herpesvirus 5 (SalHV5). Phylogenetic analyses of three NamHV genes indicated strong clustering with other members of the genus Salmonivirus, placing these isolates into family Alloherpesviridae. The NamHV isolates were identical in the three partially sequenced genes; however, they varied from other salmonid herpesviruses in nucleotide sequence identity. In all three of the genes sequenced, NamHV shared the highest sequence identity with Atlantic Salmon papillomatosis virus (ASPV; SalHV4) isolated from Atlantic Salmon Salmo salar in northern Europe, including northwestern Russia. These results lead one to believe that NamHV and ASPV have a common ancestor that may have made a relatively recent host jump from Atlantic Salmon to Lake Trout or vice versa. Partial nucleotide sequence comparisons between NamHV and ASPV for the polymerase and glycoprotein genes differ by >5% and >10%, respectively. Additional nucleotide sequence comparisons between NamHV and epizootic epitheliotropic disease virus (EEDV/SalHV3) in the terminase, glycoprotein, and polymerase genes differ by >5%, >20%, and >10%, respectively. Thus, NamHV and EEDV may be occupying discrete ecological niches in Lake Trout. Even though NamHV shared the highest genetic identity with ASPV, each of these viruses has a separate host species, which also implies speciation. Additionally, NamHV has been detected over the last 4 years in four separate water bodies across two states, which suggests that NamHV is a distinct, naturally replicating lineage. This, in combination with a divergence in nucleotide sequence from EEDV, indicates that NamHV is a new species in the genus Salmonivirus. Received April 20, 2015; accepted October 11, 2015.

Draft genome sequence of Thermoanaerobacterium sp. strain PSU-2 isolated from thermophilic hydrogen producing reactor.

PubMed

O-Thong, Sompong; Khongkliang, Peerawat; Mamimin, Chonticha; Singkhala, Apinya; Prasertsan, Poonsuk; Birkeland, Nils-Kåre

2017-06-01

Thermoanaerobacterium sp. strain PSU-2 was isolated from thermophilic hydrogen producing reactor and subjected to draft genome sequencing on 454 pyrosequencing and annotated on RAST. The draft genome sequence of strain PSU-2 contains 2,552,497 bases with an estimated G + C content of 35.2%, 2555 CDS, 8 rRNAs and 57 tRNAs. The strain had a number of genes responsible for carbohydrates metabolic, amino acids and derivatives, and protein metabolism of 17.7%, 14.39% and 9.81%, respectively. Strain PSU-2 also had gene responsible for hydrogen biosynthesis as well as the genes related to Ni-Fe hydrogenase. Comparative genomic analysis indicates strain PSU-2 shares about 94% genome sequence similarity with Thermoanaerobacterium xylanolyticum LX-11. The nucleotide sequence of this draft genome was deposited into DDBJ/ENA/GenBank under the accession MSQD00000000.
Shifts in phylogenetic diversity of archaeal communities in mangrove sediments at different sites and depths in southeastern Brazil.

PubMed

Mendes, Lucas William; Taketani, Rodrigo Gouvêa; Navarrete, Acácio Aparecido; Tsai, Siu Mui

2012-06-01

This study focused on the structure and composition of archaeal communities in sediments of tropical mangroves in order to obtain sufficient insight into two Brazilian sites from different locations (one pristine and another located in an urban area) and at different depth levels from the surface. Terminal restriction fragment length polymorphism (T-RFLP) of PCR-amplified 16S rRNA gene fragments was used to scan the archaeal community structure, and 16S rRNA gene clone libraries were used to determine the community composition. Redundancy analysis of T-RFLP patterns revealed differences in archaeal community structure according to location, depth and soil attributes. Parameters such as pH, organic matter, potassium and magnesium presented significant correlation with general community structure. Furthermore, phylogenetic analysis revealed a community composition distributed differently according to depth where, in shallow samples, 74.3% of sequences were affiliated with Euryarchaeota and 25.7% were shared between Crenarchaeota and Thaumarchaeota, while for the deeper samples, 24.3% of the sequences were affiliated with Euryarchaeota and 75.7% with Crenarchaeota and Thaumarchaeota. Archaeal diversity measurements based on 16S rRNA gene clone libraries decreased with increasing depth and there was a greater difference between depths (<18% of sequences shared) than sites (>25% of sequences shared). Taken together, our findings indicate that mangrove ecosystems support a diverse archaeal community; it might possibly be involved in nutrient cycles and are affected by sediment properties, depth and distinct locations. Copyright © 2012 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.
Mutational Landscape of Candidate Genes in Familial Prostate Cancer

PubMed Central

Johnson, Anna M.; Zuhlke, Kimberly A.; Plotts, Chris; McDonnell, Shannon K.; Middha, Sumit; Riska, Shaun M.; Thibodeau, Stephen N.; Douglas, Julie A.; Cooney, Kathleen A.

2014-01-01

Background Family history is a major risk factor for prostate cancer (PCa), suggesting a genetic component to the disease. However, traditional linkage and association studies have failed to fully elucidate the underlying genetic basis of familial PCa. Methods Here we use a candidate gene approach to identify potential PCa susceptibility variants in whole exome sequencing data from familial PCa cases. Six hundred ninety-seven candidate genes were identified based on function, location near a known chromosome 17 linkage signal, and/or previous association with prostate or other cancers. Single nucleotide variants (SNVs) in these candidate genes were identified in whole exome sequence data from 33 PCa cases from 11 multiplex PCa families (3 cases/family). Results Overall, 4856 candidate gene SNVs were identified, including 1052 missense and 10 nonsense variants. Twenty missense variants were shared by all 3 family members in each family in which they were observed. Additionally, 15 missense variants were shared by 2 of 3 family members and predicted to be deleterious by 5 different algorithms. Four missense variants, BLM Gln123Arg, PARP2 Arg283Gln, LRCC46 Ala295Thr and KIF2B Pro91Leu, and 1 nonsense variant, CYP3A43 Arg441Ter, showed complete co-segregation with PCa status. Twelve additional variants displayed partial co-segregation with PCa. Conclusions Forty-three nonsense and shared, missense variants were identified in our candidate genes. Further research is needed to determine the contribution of these variants to PCa susceptibility. PMID:25111073
Insights into three whole-genome duplications gleaned from the Paramecium caudatum genome sequence.

PubMed

McGrath, Casey L; Gout, Jean-Francois; Doak, Thomas G; Yanagi, Akira; Lynch, Michael

2014-08-01

Paramecium has long been a model eukaryote. The sequence of the Paramecium tetraurelia genome reveals a history of three successive whole-genome duplications (WGDs), and the sequences of P. biaurelia and P. sexaurelia suggest that these WGDs are shared by all members of the aurelia species complex. Here, we present the genome sequence of P. caudatum, a species closely related to the P. aurelia species group. P. caudatum shares only the most ancient of the three WGDs with the aurelia complex. We found that P. caudatum maintains twice as many paralogs from this early event as the P. aurelia species, suggesting that post-WGD gene retention is influenced by subsequent WGDs and supporting the importance of selection for dosage in gene retention. The availability of P. caudatum as an outgroup allows an expanded analysis of the aurelia intermediate and recent WGD events. Both the Guanine+Cytosine (GC) content and the expression level of preduplication genes are significant predictors of duplicate retention. We find widespread asymmetrical evolution among aurelia paralogs, which is likely caused by gradual pseudogenization rather than by neofunctionalization. Finally, cases of divergent resolution of intermediate WGD duplicates between aurelia species implicate this process acts as an ongoing reinforcement mechanism of reproductive isolation long after a WGD event. Copyright © 2014 by the Genetics Society of America.
Unique core genomes of the bacterial family vibrionaceae: insights into niche adaptation and speciation.

PubMed

Kahlke, Tim; Goesmann, Alexander; Hjerde, Erik; Willassen, Nils Peder; Haugen, Peik

2012-05-10

The criteria for defining bacterial species and even the concept of bacterial species itself are under debate, and the discussion is apparently intensifying as more genome sequence data is becoming available. However, it is still unclear how the new advances in genomics should be used most efficiently to address this question. In this study we identify genes that are common to any group of genomes in our dataset, to determine whether genes specific to a particular taxon exist and to investigate their potential role in adaptation of bacteria to their specific niche. These genes were named unique core genes. Additionally, we investigate the existence and importance of unique core genes that are found in isolates of phylogenetically non-coherent groups. These groups of isolates, that share a genetic feature without sharing a closest common ancestor, are termed genophyletic groups. The bacterial family Vibrionaceae was used as the model, and we compiled and compared genome sequences of 64 different isolates. Using the software orthoMCL we determined clusters of homologous genes among the investigated genome sequences. We used multilocus sequence analysis to build a host phylogeny and mapped the numbers of unique core genes of all distinct groups of isolates onto the tree. The results show that unique core genes are more likely to be found in monophyletic groups of isolates. Genophyletic groups of isolates, in contrast, are less common especially for large groups of isolate. The subsequent annotation of unique core genes that are present in genophyletic groups indicate a high degree of horizontally transferred genes. Finally, the annotation of the unique core genes of Vibrio cholerae revealed genes involved in aerotaxis and biosynthesis of the iron-chelator vibriobactin. The presented work indicates that genes specific for any taxon inside the bacterial family Vibrionaceae exist. These unique core genes encode conserved metabolic functions that can shed light on the adaptation of a species to its ecological niche. Additionally, our study suggests that unique core genes can be used to aid classification of bacteria and contribute to a bacterial species definition on a genomic level. Furthermore, these genes may be of importance in clinical diagnostics and drug development.
Gene conversion as a mechanism for divergence of a chloroplast tRNA gene inserted in the mitochondrial genome of Brassica oleracea.

PubMed Central

Dron, M; Hartmann, C; Rode, A; Sevignac, M

1985-01-01

We have characterized a 1.7 kb sequence, containing a tRNA Leu2 gene shared by the ct and mt genomes of Brassica oleracea. The two sequences are completely homologous except in two short regions where two distinct gene conversion events have occurred between two sets of direct repeats leading to the insertion of 5 bp in the T loop of the mt copy of the ct gene. This is the first evidence that gene conversion represents the initial evolutionary step in inactivation of transferred ct genes in the mt genome. We also indicate that organelle DNA transfer by organelle fusion is an ongoing process which could be useful in genetic engineering. PMID:4080548
Homologues of a single resistance-gene cluster in potato confer resistance to distinct pathogens: a virus and a nematode.

PubMed

van der Vossen, E A; van der Voort, J N; Kanyuka, K; Bendahmane, A; Sandbrink, H; Baulcombe, D C; Bakker, J; Stiekema, W J; Klein-Lankhorst, R M

2000-09-01

The isolation of the nematode-resistance gene Gpa2 in potato is described, and it is demonstrated that highly homologous resistance genes of a single resistance-gene cluster can confer resistance to distinct pathogen species. Molecular analysis of the Gpa2 locus resulted in the identification of an R-gene cluster of four highly homologous genes in a region of approximately 115 kb. At least two of these genes are active: one corresponds to the previously isolated Rx1 gene that confers resistance to potato virus X, while the other corresponds to the Gpa2 gene that confers resistance to the potato cyst nematode Globodera pallida. The proteins encoded by the Gpa2 and the Rx1 genes share an overall homology of over 88% (amino-acid identity) and belong to the leucine-zipper, nucleotide-binding site, leucine-rich repeat (LZ-NBS-LRR)-containing class of plant resistance genes. From the sequence conservation between Gpa2 and Rx1 it is clear that there is a direct evolutionary relationship between the two proteins. Sequence diversity is concentrated in the LRR region and in the C-terminus. The putative effector domains are more conserved suggesting that, at least in this case, nematode and virus resistance cascades could share common components. These findings underline the potential of protein breeding for engineering new resistance specificities against plant pathogens in vitro.
Complete cDNA sequence of SAP-like pentraxin from Limulus polyphemus: implications for pentraxin evolution.

PubMed

Tharia, Hazel A; Shrive, Annette K; Mills, John D; Arme, Chris; Williams, Gwyn T; Greenhough, Trevor J

2002-02-22

The serum amyloid P component (SAP)-like pentraxin Limulus polyphemus SAP is a recently discovered, distinct pentraxin species, of known structure, which does not bind phosphocholine and whose N-terminal sequence has been shown to differ markedly from the highly conserved N terminus of all other known horseshoe crab pentraxins. The complete cDNA sequence of Limulus SAP, and the derived amino acid sequence, the first invertebrate SAP-like pentraxin sequence, have been determined. Two sequences were identified that differed only in the length of the 3' untranslated region. Limulus SAP is synthesised as a precursor protein of 234 amino acid residues, the first 17 residues encoding a signal peptide that is absent from the mature protein. Phylogenetic analysis clusters Limulus SAP pentraxin with the horseshoe crab C-reactive proteins (CRPs) rather than the mammalian SAPs, which are clustered with mammalian CRPs. The deduced amino acid sequence shares 22% identity with both human SAP and CRP, which are 51% identical, and 31-35% with horseshoe crab CRPs. These analyses indicate that gene duplication of CRP (or SAP), followed by sequence divergence and the evolution of CRP and/or SAP function, occurred independently along the chordate and arthropod evolutionary lines rather than in a common ancestor. They further indicate that the CRP/SAP gene duplication event in Limulus occurred before both the emergence of the Limulus CRP variants and the mammalian CRP/SAP gene duplication. Limulus SAP, which does not exhibit the CRP characteristic of calcium-dependent binding to phosphocholine, is established as a pentraxin species distinct from all other known horseshoe crab pentraxins that exist in many variant forms sharing a high level of sequence homology. Copyright 2002 Elsevier Science Ltd.
Using an online genome resource to identify myostatin variation in U.S. sheep

USDA-ARS?s Scientific Manuscript database

We created a public, searchable DNA sequence resource for sheep that contained approximately 14x whole genome sequence of 96 rams. The animals represent 10 popular U.S. breeds and share minimal pedigree relationships, making the resource suitable for viewing gene variants in the user-friendly Integ...
Cloning and sequence analysis of chitin synthase gene fragments of Demodex mites.

PubMed

Zhao, Ya-e; Wang, Zheng-hang; Xu, Yang; Xu, Ji-ru; Liu, Wen-yan; Wei, Meng; Wang, Chu-ying

2012-10-01

To our knowledge, few reports on Demodex studied at the molecular level are available at present. In this study our group, for the first time, cloned, sequenced and analyzed the chitin synthase (CHS) gene fragments of Demodex folliculorum, Demodex brevis, and Demodex canis (three isolates from each species) from Xi'an China, by designing specific primers based on the only partial sequence of the CHS gene of D. canis from Japan, retrieved from GenBank. Results show that amplification was successful only in three D. canis isolates and one D. brevis isolate out of the nine Demodex isolates. The obtained fragments were sequenced to be 339 bp for D. canis and 338 bp for D. brevis. The CHS gene sequence similarities between the three Xi'an D. canis isolates and one Japanese D. canis isolate ranged from 99.7% to 100.0%, and those between four D. canis isolates and one D. brevis isolate were 99.1%-99.4%. Phylogenetic trees based on maximum parsimony (MP) and maximum likelihood (ML) methods shared the same clusters, according with the traditional classification. Two open reading frames (ORFs) were identified in each CHS gene sequenced, and their corresponding amino acid sequences were located at the catalytic domain. The relatively conserved sequences could be deduced to be a CHS class A gene, which is associated with chitin synthesis in the integument of Demodex mites.
Cloning and sequence analysis of chitin synthase gene fragments of Demodex mites*

PubMed Central

Zhao, Ya-e; Wang, Zheng-hang; Xu, Yang; Xu, Ji-ru; Liu, Wen-yan; Wei, Meng; Wang, Chu-ying

2012-01-01

To our knowledge, few reports on Demodex studied at the molecular level are available at present. In this study our group, for the first time, cloned, sequenced and analyzed the chitin synthase (CHS) gene fragments of Demodex folliculorum, Demodex brevis, and Demodex canis (three isolates from each species) from Xi’an China, by designing specific primers based on the only partial sequence of the CHS gene of D. canis from Japan, retrieved from GenBank. Results show that amplification was successful only in three D. canis isolates and one D. brevis isolate out of the nine Demodex isolates. The obtained fragments were sequenced to be 339 bp for D. canis and 338 bp for D. brevis. The CHS gene sequence similarities between the three Xi’an D. canis isolates and one Japanese D. canis isolate ranged from 99.7% to 100.0%, and those between four D. canis isolates and one D. brevis isolate were 99.1%–99.4%. Phylogenetic trees based on maximum parsimony (MP) and maximum likelihood (ML) methods shared the same clusters, according with the traditional classification. Two open reading frames (ORFs) were identified in each CHS gene sequenced, and their corresponding amino acid sequences were located at the catalytic domain. The relatively conserved sequences could be deduced to be a CHS class A gene, which is associated with chitin synthesis in the integument of Demodex mites. PMID:23024043
Characterization of Urtica dioica agglutinin isolectins and the encoding gene family.

PubMed

Does, M P; Ng, D K; Dekker, H L; Peumans, W J; Houterman, P M; Van Damme, E J; Cornelissen, B J

1999-01-01

Urtica dioica agglutinin (UDA) has previously been found in roots and rhizomes of stinging nettles as a mixture of UDA-isolectins. Protein and cDNA sequencing have shown that mature UDA is composed of two hevein domains and is processed from a precursor protein. The precursor contains a signal peptide, two in-tandem hevein domains, a hinge region and a carboxyl-terminal chitinase domain. Genomic fragments encoding precursors for UDA-isolectins have been amplified by five independent polymerase chain reactions on genomic DNA from stinging nettle ecotype Weerselo. One amplified gene was completely sequenced. As compared to the published cDNA sequence, the genomic sequence contains, besides two basepair substitutions, two introns located at the same positions as in other plant chitinases. By partial sequence analysis of 40 amplified genes, 16 different genes were identified which encode seven putative UDA-isolectins. The deduced amino acid sequences share 78.9-98.9% identity. In extracts of roots and rhizomes of stinging nettle ecotype Weerselo six out of these seven isolectins were detected by mass spectrometry. One of them is an acidic form, which has not been identified before. Our results demonstrate that UDA is encoded by a large gene family.
Phylogenetic relationships among superfamilies of Neritimorpha (Mollusca: Gastropoda).

PubMed

Uribe, Juan E; Colgan, Don; Castro, Lyda R; Kano, Yasunori; Zardoya, Rafael

2016-11-01

Despite the extraordinary morphological and ecological diversity of Neritimorpha, few studies have focused on the phylogenetic relationships of this lineage of gastropods, which includes four extant superfamilies: Neritopsoidea, Hydrocenoidea, Helicinoidea, and Neritoidea. Here, the nucleotide sequences of the complete mitochondrial genomes of Georissa bangueyensis (Hydrocenoidea), Neritina usnea (Neritoidea), and Pleuropoma jana (Helicinoidea) and the nearly complete mt genomes of Titiscania sp. (Neritopsoidea) and Theodoxus fluviatilis (Neritoidea) were determined. Phylogenetic reconstructions using probabilistic methods were based on mitochondrial (13 protein coding genes and two ribosomal rRNA genes), nuclear (partial 28S rRNA, 18S rRNA, actin, and histone H3 genes) and combined sequence data sets. All phylogenetic analyses except one converged on a single, highly supported tree in which Neritopsoidea was recovered as the sister group of a clade including Helicinoidea as the sister group of Hydrocenoidea and Neritoidea. This topology agrees with the fossil record and supports at least three independent invasions of land by neritimorph snails. The mitochondrial genomes of Titiscania sp., G. bangueyensis, N. usnea, and T. fluviatilis share the same gene organization previously described for Nerita mt genomes whereas that of P. jana has undergone major rearrangements. We sequenced about half of the mitochondrial genome of another species of Helicinoidea, Viana regina, and confirmed that this species shares the highly derived gene order of P. jana. Copyright © 2016 Elsevier Inc. All rights reserved.
Genome Analysis Reveals Interplay between 5′UTR Introns and Nuclear mRNA Export for Secretory and Mitochondrial Genes

PubMed Central

Cenik, Can; Chua, Hon Nian; Zhang, Hui; Tarnawsky, Stefan P.; Akef, Abdalla; Derti, Adnan; Tasan, Murat; Moore, Melissa J.; Palazzo, Alexander F.; Roth, Frederick P.

2011-01-01

In higher eukaryotes, messenger RNAs (mRNAs) are exported from the nucleus to the cytoplasm via factors deposited near the 5′ end of the transcript during splicing. The signal sequence coding region (SSCR) can support an alternative mRNA export (ALREX) pathway that does not require splicing. However, most SSCR–containing genes also have introns, so the interplay between these export mechanisms remains unclear. Here we support a model in which the furthest upstream element in a given transcript, be it an intron or an ALREX–promoting SSCR, dictates the mRNA export pathway used. We also experimentally demonstrate that nuclear-encoded mitochondrial genes can use the ALREX pathway. Thus, ALREX can also be supported by nucleotide signals within mitochondrial-targeting sequence coding regions (MSCRs). Finally, we identified and experimentally verified novel motifs associated with the ALREX pathway that are shared by both SSCRs and MSCRs. Our results show strong correlation between 5′ untranslated region (5′UTR) intron presence/absence and sequence features at the beginning of the coding region. They also suggest that genes encoding secretory and mitochondrial proteins share a common regulatory mechanism at the level of mRNA export. PMID:21533221
Complete Genome Sequence and Comparative Analysis of the Fish Pathogen Lactococcus garvieae

PubMed Central

Oshima, Kenshiro; Yoshizaki, Mariko; Kawanishi, Michiko; Nakaya, Kohei; Suzuki, Takehito; Miyauchi, Eiji; Ishii, Yasuo; Tanabe, Soichi; Murakami, Masaru; Hattori, Masahira

2011-01-01

Lactococcus garvieae causes fatal haemorrhagic septicaemia in fish such as yellowtail. The comparative analysis of genomes of a virulent strain Lg2 and a non-virulent strain ATCC 49156 of L. garvieae revealed that the two strains shared a high degree of sequence identity, but Lg2 had a 16.5-kb capsule gene cluster that is absent in ATCC 49156. The capsule gene cluster was composed of 15 genes, of which eight genes are highly conserved with those in exopolysaccharide biosynthesis gene cluster often found in Lactococcus lactis strains. Sequence analysis of the capsule gene cluster in the less virulent strain L. garvieae Lg2-S, Lg2-derived strain, showed that two conserved genes were disrupted by a single base pair deletion, respectively. These results strongly suggest that the capsule is crucial for virulence of Lg2. The capsule gene cluster of Lg2 may be a genomic island from several features such as the presence of insertion sequences flanked on both ends, different GC content from the chromosomal average, integration into the locus syntenic to other lactococcal genome sequences, and distribution in human gut microbiomes. The analysis also predicted other potential virulence factors such as haemolysin. The present study provides new insights into understanding of the virulence mechanisms of L. garvieae in fish. PMID:21829716
Selection and Trans-Species Polymorphism of Major Histocompatibility Complex Class II Genes in the Order Crocodylia

PubMed Central

Jaratlerdsiri, Weerachai; Isberg, Sally R.; Higgins, Damien P.; Miles, Lee G.; Gongora, Jaime

2014-01-01

Major Histocompatibility Complex (MHC) class II genes encode for molecules that aid in the presentation of antigens to helper T cells. MHC characterisation within and between major vertebrate taxa has shed light on the evolutionary mechanisms shaping the diversity within this genomic region, though little characterisation has been performed within the Order Crocodylia. Here we investigate the extent and effect of selective pressures and trans-species polymorphism on MHC class II α and β evolution among 20 extant species of Crocodylia. Selection detection analyses showed that diversifying selection influenced MHC class II β diversity, whilst diversity within MHC class II α is the result of strong purifying selection. Comparison of translated sequences between species revealed the presence of twelve trans-species polymorphisms, some of which appear to be specific to the genera Crocodylus and Caiman. Phylogenetic reconstruction clustered MHC class II α sequences into two major clades representing the families Crocodilidae and Alligatoridae. However, no further subdivision within these clades was evident and, based on the observation that most MHC class II α sequences shared the same trans-species polymorphisms, it is possible that they correspond to the same gene lineage across species. In contrast, phylogenetic analyses of MHC class II β sequences showed a mixture of subclades containing sequences from Crocodilidae and/or Alligatoridae, illustrating orthologous relationships among those genes. Interestingly, two of the subclades containing sequences from both Crocodilidae and Alligatoridae shared specific trans-species polymorphisms, suggesting that they may belong to ancient lineages pre-dating the divergence of these two families from the common ancestor 85–90 million years ago. The results presented herein provide an immunogenetic resource that may be used to further assess MHC diversity and functionality in Crocodylia. PMID:24503938
The Genome Sequence of Taurine Cattle: A window to ruminant biology and evolution

PubMed Central

Elsik, Christine G.; Tellam, Ross L.; Worley, Kim C.

2010-01-01

To understand the biology and evolution of ruminants, the cattle genome was sequenced to ∼7× coverage. The cattle genome contains a minimum of 22,000 genes, with a core set of 14,345 orthologs shared among seven mammalian species of which 1,217 are absent or undetected in non-eutherian (marsupial or monotreme) genomes. Cattle-specific evolutionary breakpoint regions in chromosomes have a higher density of segmental duplications, enrichment of repetitive elements, and species-specific variations in genes associated with lactation and immune responsiveness. Genes involved in metabolism are generally highly conserved, although five metabolic genes are deleted or extensively diverged from their human orthologs. The cattle genome sequence thus provides an enabling resource for understanding mammalian evolution and accelerating livestock genetic improvement for milk and meat production. PMID:19390049
Evolutionary conservation analysis increases the colocalization of predicted exonic splicing enhancers in the BRCA1 gene with missense sequence changes and in-frame deletions, but not polymorphisms

PubMed Central

Pettigrew, Christopher; Wayte, Nicola; Lovelock, Paul K; Tavtigian, Sean V; Chenevix-Trench, Georgia; Spurdle, Amanda B; Brown, Melissa A

2005-01-01

Introduction Aberrant pre-mRNA splicing can be more detrimental to the function of a gene than changes in the length or nature of the encoded amino acid sequence. Although predicting the effects of changes in consensus 5' and 3' splice sites near intron:exon boundaries is relatively straightforward, predicting the possible effects of changes in exonic splicing enhancers (ESEs) remains a challenge. Methods As an initial step toward determining which ESEs predicted by the web-based tool ESEfinder in the breast cancer susceptibility gene BRCA1 are likely to be functional, we have determined their evolutionary conservation and compared their location with known BRCA1 sequence variants. Results Using the default settings of ESEfinder, we initially detected 669 potential ESEs in the coding region of the BRCA1 gene. Increasing the threshold score reduced the total number to 464, while taking into consideration the proximity to splice donor and acceptor sites reduced the number to 211. Approximately 11% of these ESEs (23/211) either are identical at the nucleotide level in human, primates, mouse, cow, dog and opossum Brca1 (conserved) or are detectable by ESEfinder in the same position in the Brca1 sequence (shared). The frequency of conserved and shared predicted ESEs between human and mouse is higher in BRCA1 exons (2.8 per 100 nucleotides) than in introns (0.6 per 100 nucleotides). Of conserved or shared putative ESEs, 61% (14/23) were predicted to be affected by sequence variants reported in the Breast Cancer Information Core database. Applying the filters described above increased the colocalization of predicted ESEs with missense changes, in-frame deletions and unclassified variants predicted to be deleterious to protein function, whereas they decreased the colocalization with known polymorphisms or unclassified variants predicted to be neutral. Conclusion In this report we show that evolutionary conservation analysis may be used to improve the specificity of an ESE prediction tool. This is the first report on the prediction of the frequency and distribution of ESEs in the BRCA1 gene, and it is the first reported attempt to predict which ESEs are most likely to be functional and therefore which sequence variants in ESEs are most likely to be pathogenic. PMID:16280041
Evolution of the alternative AQP2 gene: Acquisition of a novel protein-coding sequence in dolphins.

PubMed

Kishida, Takushi; Suzuki, Miwa; Takayama, Asuka

2018-01-01

Taxon-specific de novo protein-coding sequences are thought to be important for taxon-specific environmental adaptation. A recent study revealed that bottlenose dolphins acquired a novel isoform of aquaporin 2 generated by alternative splicing (alternative AQP2), which helps dolphins to live in hyperosmotic seawater. The AQP2 gene consists of four exons, but the alternative AQP2 gene lacks the fourth exon and instead has a longer third exon that includes the original third exon and a part of the original third intron. Here, we show that the latter half of the third exon of the alternative AQP2 arose from a non-protein-coding sequence. Intact ORF of this de novo sequence is shared not by all cetaceans, but only by delphinoids. However, this sequence is conservative in all modern cetaceans, implying that this de novo sequence potentially plays important roles for marine adaptation in cetaceans. Copyright © 2017 Elsevier Inc. All rights reserved.
Subgroup 4 R2R3-MYBs in conifer trees: gene family expansion and contribution to the isoprenoid- and flavonoid-oriented responses

PubMed Central

Bedon, Frank; Bomal, Claude; Caron, Sébastien; Levasseur, Caroline; Boyle, Brian; Mansfield, Shawn D.; Schmidt, Axel; Gershenzon, Jonathan; Grima-Pettenati, Jacqueline; Séguin, Armand; MacKay, John

2010-01-01

Transcription factors play a fundamental role in plants by orchestrating temporal and spatial gene expression in response to environmental stimuli. Several R2R3-MYB genes of the Arabidopsis subgroup 4 (Sg4) share a C-terminal EAR motif signature recently linked to stress response in angiosperm plants. It is reported here that nearly all Sg4 MYB genes in the conifer trees Picea glauca (white spruce) and Pinus taeda (loblolly pine) form a monophyletic clade (Sg4C) that expanded following the split of gymnosperm and angiosperm lineages. Deeper sequencing in P. glauca identified 10 distinct Sg4C sequences, indicating over-represention of Sg4 sequences compared with angiosperms such as Arabidopsis, Oryza, Vitis, and Populus. The Sg4C MYBs share the EAR motif core. Many of them had stress-responsive transcript profiles after wounding, jasmonic acid (JA) treatment, or exposure to cold in P. glauca and P. taeda, with MYB14 transcripts accumulating most strongly and rapidly. Functional characterization was initiated by expressing the P. taeda MYB14 (PtMYB14) gene in transgenic P. glauca plantlets with a tissue-preferential promoter (cinnamyl alcohol dehydrogenase) and a ubiquitous gene promoter (ubiquitin). Histological, metabolite, and transcript (microarray and targeted quantitiative real-time PCR) analyses of PtMYB14 transgenics, coupled with mechanical wounding and JA application experiments on wild-type plantlets, allowed identification of PtMYB14 as a putative regulator of an isoprenoid-oriented response that leads to the accumulation of sesquiterpene in conifers. Data further suggested that PtMYB14 may contribute to a broad defence response implicating flavonoids. This study also addresses the potential involvement of closely related Sg4C sequences in stress responses and plant evolution. PMID:20732878

Comparative genomics of four closely related Clostridium perfringens bacteriophages reveals variable evolution among core genes with therapeutic potential

PubMed Central

2011-01-01

Background Because biotechnological uses of bacteriophage gene products as alternatives to conventional antibiotics will require a thorough understanding of their genomic context, we sequenced and analyzed the genomes of four closely related phages isolated from Clostridium perfringens, an important agricultural and human pathogen. Results Phage whole-genome tetra-nucleotide signatures and proteomic tree topologies correlated closely with host phylogeny. Comparisons of our phage genomes to 26 others revealed three shared COGs; of particular interest within this core genome was an endolysin (PF01520, an N-acetylmuramoyl-L-alanine amidase) and a holin (PF04531). Comparative analyses of the evolutionary history and genomic context of these common phage proteins revealed two important results: 1) strongly significant host-specific sequence variation within the endolysin, and 2) a protein domain architecture apparently unique to our phage genomes in which the endolysin is located upstream of its associated holin. Endolysin sequences from our phages were one of two very distinct genotypes distinguished by variability within the putative enzymatically-active domain. The shared or core genome was comprised of genes with multiple sequence types belonging to five pfam families, and genes belonging to 12 pfam families, including the holin genes, which were nearly identical. Conclusions Significant genomic diversity exists even among closely-related bacteriophages. Holins and endolysins represent conserved functions across divergent phage genomes and, as we demonstrate here, endolysins can have significant variability and host-specificity even among closely-related genomes. Endolysins in our phage genomes may be subject to different selective pressures than the rest of the genome. These findings may have important implications for potential biotechnological applications of phage gene products. PMID:21631945
Genome sequence and comparative analysis of a putative entomopathogenic Serratia isolated from Caenorhabditis briggsae.

PubMed

Abebe-Akele, Feseha; Tisa, Louis S; Cooper, Vaughn S; Hatcher, Philip J; Abebe, Eyualem; Thomas, W Kelley

2015-07-18

Entomopathogenic associations between nematodes in the genera Steinernema and Heterorhabdus with their cognate bacteria from the bacterial genera Xenorhabdus and Photorhabdus, respectively, are extensively studied for their potential as biological control agents against invasive insect species. These two highly coevolved associations were results of convergent evolution. Given the natural abundance of bacteria, nematodes and insects, it is surprising that only these two associations with no intermediate forms are widely studied in the entomopathogenic context. Discovering analogous systems involving novel bacterial and nematode species would shed light on the evolutionary processes involved in the transition from free living organisms to obligatory partners in entomopathogenicity. We report the complete genome sequence of a new member of the enterobacterial genus Serratia that forms a putative entomopathogenic complex with Caenorhabditis briggsae. Analysis of the 5.04 MB chromosomal genome predicts 4599 protein coding genes, seven sets of ribosomal RNA genes, 84 tRNA genes and a 64.8 KB plasmid encoding 74 genes. Comparative genomic analysis with three of the previously sequenced Serratia species, S. marcescens DB11 and S. proteamaculans 568, and Serratia sp. AS12, revealed that these four representatives of the genus share a core set of ~3100 genes and extensive structural conservation. The newly identified species shares a more recent common ancestor with S. marcescens with 99% sequence identity in rDNA sequence and orthology across 85.6% of predicted genes. Of the 39 genes/operons implicated in the virulence, symbiosis, recolonization, immune evasion and bioconversion, 21 (53.8%) were present in Serratia while 33 (84.6%) and 35 (89%) were present in Xenorhabdus and Photorhabdus EPN bacteria respectively. The majority of unique sequences in Serratia sp. SCBI (South African Caenorhabditis briggsae Isolate) are found in ~29 genomic islands of 5 to 65 genes and are enriched in putative functions that are biologically relevant to an entomopathogenic lifestyle, including non-ribosomal peptide synthetases, bacteriocins, fimbrial biogenesis, ushering proteins, toxins, secondary metabolite secretion and multiple drug resistance/efflux systems. By revealing the early stages of adaptation to this lifestyle, the Serratia sp. SCBI genome underscores the fact that in EPN formation the composite end result - killing, bioconversion, cadaver protection and recolonization- can be achieved by dissimilar mechanisms. This genome sequence will enable further study of the evolution of entomopathogenic nematode-bacteria complexes.
Resistance gene candidates identified by PCR with degenerate oligonucleotide primers map to clusters of resistance genes in lettuce.

PubMed

Shen, K A; Meyers, B C; Islam-Faridi, M N; Chin, D B; Stelly, D M; Michelmore, R W

1998-08-01

The recent cloning of genes for resistance against diverse pathogens from a variety of plants has revealed that many share conserved sequence motifs. This provides the possibility of isolating numerous additional resistance genes by polymerase chain reaction (PCR) with degenerate oligonucleotide primers. We amplified resistance gene candidates (RGCs) from lettuce with multiple combinations of primers with low degeneracy designed from motifs in the nucleotide binding sites (NBSs) of RPS2 of Arabidopsis thaliana and N of tobacco. Genomic DNA, cDNA, and bacterial artificial chromosome (BAC) clones were successfully used as templates. Four families of sequences were identified that had the same similarity to each other as to resistance genes from other species. The relationship of the amplified products to resistance genes was evaluated by several sequence and genetic criteria. The amplified products contained open reading frames with additional sequences characteristic of NBSs. Hybridization of RGCs to genomic DNA and to BAC clones revealed large numbers of related sequences. Genetic analysis demonstrated the existence of clustered multigene families for each of the four RGC sequences. This parallels classical genetic data on clustering of disease resistance genes. Two of the four families mapped to known clusters of resistance genes; these two families were therefore studied in greater detail. Additional evidence that these RGCs could be resistance genes was gained by the identification of leucine-rich repeat (LRR) regions in sequences adjoining the NBS similar to those in RPM1 and RPS2 of A. thaliana. Fluorescent in situ hybridization confirmed the clustered genomic distribution of these sequences. The use of PCR with degenerate oligonucleotide primers is therefore an efficient method to identify numerous RGCs in plants.
Metagenomics of urban sewage identifies an extensively shared antibiotic resistome in China.

PubMed

Su, Jian-Qiang; An, Xin-Li; Li, Bing; Chen, Qing-Lin; Gillings, Michael R; Chen, Hong; Zhang, Tong; Zhu, Yong-Guan

2017-07-19

Antibiotic-resistant pathogens are challenging treatment of infections worldwide. Urban sewage is potentially a major conduit for dissemination of antibiotic resistance genes into various environmental compartments. However, the diversity and abundance of such genes in wastewater are not well known. Here, seasonal and geographical distributions of antibiotic resistance genes and their host bacterial communities from Chinese urban sewage were characterized, using metagenomic analyses and 16S rRNA gene-based Illumina sequencing, respectively. In total, 381 different resistance genes were detected, and these genes were extensively shared across China, with no geographical clustering. Seasonal variation in abundance of resistance genes was observed, with average concentrations of 3.27 × 10 11 and 1.79 × 10 12 copies/L in summer and winter, respectively. Bacterial communities did not exhibit geographical clusters, but did show a significant distance-decay relationship (P < 0.01). The core, shared resistome accounted for 57.7% of the total resistance genes, and was significantly associated with the core microbial community (P < 0.01). The core human gut microbiota was also strongly associated with the shared resistome, demonstrating the potential contribution of human gut microbiota to the dissemination of resistance elements via sewage disposal. This study provides a baseline for investigating environmental dissemination of resistance elements and raises the possibility of using the abundance of resistance genes in sewage as a tool for antibiotic stewardship.
Genome sequencing of the lizard parasite Leishmania tarentolae reveals loss of genes associated to the intracellular stage of human pathogenic species

PubMed Central

Raymond, Frédéric; Boisvert, Sébastien; Roy, Gaétan; Ritt, Jean-François; Légaré, Danielle; Isnard, Amandine; Stanke, Mario; Olivier, Martin; Tremblay, Michel J.; Papadopoulou, Barbara; Ouellette, Marc; Corbeil, Jacques

2012-01-01

The Leishmania tarentolae Parrot-TarII strain genome sequence was resolved to an average 16-fold mean coverage by next-generation DNA sequencing technologies. This is the first non-pathogenic to humans kinetoplastid protozoan genome to be described thus providing an opportunity for comparison with the completed genomes of pathogenic Leishmania species. A high synteny was observed between all sequenced Leishmania species. A limited number of chromosomal regions diverged between L. tarentolae and L. infantum, while remaining syntenic to L. major. Globally, >90% of the L. tarentolae gene content was shared with the other Leishmania species. We identified 95 predicted coding sequences unique to L. tarentolae and 250 genes that were absent from L. tarentolae. Interestingly, many of the latter genes were expressed in the intracellular amastigote stage of pathogenic species. In addition, genes coding for products involved in antioxidant defence or participating in vesicular-mediated protein transport were underrepresented in L. tarentolae. In contrast to other Leishmania genomes, two gene families were expanded in L. tarentolae, namely the zinc metallo-peptidase surface glycoprotein GP63 and the promastigote surface antigen PSA31C. Overall, L. tarentolae's gene content appears better adapted to the promastigote insect stage rather than the amastigote mammalian stage. PMID:21998295
Isolation and characterization of the chicken trypsinogen gene family.

PubMed Central

Wang, K; Gan, L; Lee, I; Hood, L

1995-01-01

Based on genomic Southern hybridizations and cDNA sequence analyses, the chicken trypsinogen gene family can be divided into two multi-member subfamilies, a six-member trypsinogen I subfamily which encodes the cationic trypsin isoenzymes and a three-member trypsinogen II subfamily which encodes the anionic trypsin isoenzymes. The chicken cDNA and genomic clones containing these two subfamilies were isolated and characterized by DNA sequence analysis. The results indicated that the chicken trypsinogen genes encoded a signal peptide of 15 to 16 amino acid residues, an activation peptide of 9 to 10 residues and a trypsin of 223 amino acid residues. The chicken trypsinogens contain all the common catalytic and structural features for trypsins, including the catalytic triad His, Asp and Ser and the six disulphide bonds. The trypsinogen I and II subfamilies share approximately 70% sequence identity at the nucleotide and amino acid level. The sequence comparison among chicken trypsinogen subfamily members and trypsin sequences from other species suggested that the chicken trypsinogen genes may have evolved in coincidental or concerted fashion. Images Figure 6 Figure 7 PMID:7733885
Cloning and sequence analysis demonstrate the chromate reduction ability of a novel chromate reductase gene from Serratia sp.

PubMed

Deng, Peng; Tan, Xiaoqing; Wu, Ying; Bai, Qunhua; Jia, Yan; Xiao, Hong

2015-03-01

The ChrT gene encodes a chromate reductase enzyme which catalyzes the reduction of Cr(VI). The chromate reductase is also known as flavin mononucleotide (FMN) reductase (FMN_red). The aim of the present study was to clone the full-length ChrT DNA from Serratia sp. CQMUS2 and analyze the deduced amino acid sequence and three-dimensional structure. The putative ChrT gene fragment of Serratia sp. CQMUS2 was isolated by polymerase chain reaction (PCR), according to the known FMN_red gene sequence from Serratia sp. AS13. The flanking sequences of the ChrT gene were obtained by high efficiency TAIL-PCR, while the full-length gene of ChrT was cloned in Escherichia coli for subsequent sequencing. The nucleotide sequence of ChrT was submitted onto GenBank under the accession number, KF211434. Sequence analysis of the gene and amino acids was conducted using the Basic Local Alignment Search Tool, and open reading frame (ORF) analysis was performed using ORF Finder software. The ChrT gene was found to be an ORF of 567 bp that encodes a 188-amino acid enzyme with a calculated molecular weight of 20.4 kDa. In addition, the ChrT protein was hypothesized to be an NADPH-dependent FMN_red and a member of the flavodoxin-2 superfamily. The amino acid sequence of ChrT showed high sequence similarity to the FMN reductase genes of Klebsiella pneumonia and Raoultella ornithinolytica , which belong to the flavodoxin-2 superfamily. Furthermore, ChrT was shown to have a 85.6% similarity to the three-dimensional structure of Escherichia coli ChrR, sharing four common enzyme active sites for chromate reduction. Therefore, ChrT gene cloning and protein structure determination demonstrated the ability of the gene for chromate reduction. The results of the present study provide a basis for further studies on ChrT gene expression and protein function.
Cloning and sequence analysis demonstrate the chromate reduction ability of a novel chromate reductase gene from Serratia sp

PubMed Central

DENG, PENG; TAN, XIAOQING; WU, YING; BAI, QUNHUA; JIA, YAN; XIAO, HONG

2015-01-01

The ChrT gene encodes a chromate reductase enzyme which catalyzes the reduction of Cr(VI). The chromate reductase is also known as flavin mononucleotide (FMN) reductase (FMN_red). The aim of the present study was to clone the full-length ChrT DNA from Serratia sp. CQMUS2 and analyze the deduced amino acid sequence and three-dimensional structure. The putative ChrT gene fragment of Serratia sp. CQMUS2 was isolated by polymerase chain reaction (PCR), according to the known FMN_red gene sequence from Serratia sp. AS13. The flanking sequences of the ChrT gene were obtained by high efficiency TAIL-PCR, while the full-length gene of ChrT was cloned in Escherichia coli for subsequent sequencing. The nucleotide sequence of ChrT was submitted onto GenBank under the accession number, KF211434. Sequence analysis of the gene and amino acids was conducted using the Basic Local Alignment Search Tool, and open reading frame (ORF) analysis was performed using ORF Finder software. The ChrT gene was found to be an ORF of 567 bp that encodes a 188-amino acid enzyme with a calculated molecular weight of 20.4 kDa. In addition, the ChrT protein was hypothesized to be an NADPH-dependent FMN_red and a member of the flavodoxin-2 superfamily. The amino acid sequence of ChrT showed high sequence similarity to the FMN reductase genes of Klebsiella pneumonia and Raoultella ornithinolytica, which belong to the flavodoxin-2 superfamily. Furthermore, ChrT was shown to have a 85.6% similarity to the three-dimensional structure of Escherichia coli ChrR, sharing four common enzyme active sites for chromate reduction. Therefore, ChrT gene cloning and protein structure determination demonstrated the ability of the gene for chromate reduction. The results of the present study provide a basis for further studies on ChrT gene expression and protein function. PMID:25667630
Phylogeographic reconstruction of a bacterial species with high levels of lateral gene transfer

USGS Publications Warehouse

Pearson, T.; Giffard, P.; Beckstrom-Sternberg, S.; Auerbach, R.; Hornstra, H.; Tuanyok, A.; Price, E.P.; Glass, M.B.; Leadem, B.; Beckstrom-Sternberg, J. S.; Allan, G.J.; Foster, J.T.; Wagner, D.M.; Okinaka, R.T.; Sim, S.H.; Pearson, O.; Wu, Z.; Chang, J.; Kaul, R.; Hoffmaster, A.R.; Brettin, T.S.; Robison, R.A.; Mayo, M.; Gee, J.E.; Tan, P.; Currie, B.J.; Keim, P.

2009-01-01

Background: Phylogeographic reconstruction of some bacterial populations is hindered by low diversity coupled with high levels of lateral gene transfer. A comparison of recombination levels and diversity at seven housekeeping genes for eleven bacterial species, most of which are commonly cited as having high levels of lateral gene transfer shows that the relative contributions of homologous recombination versus mutation for Burkholderia pseudomallei is over two times higher than for Streptococcus pneumoniae and is thus the highest value yet reported in bacteria. Despite the potential for homologous recombination to increase diversity, B. pseudomallei exhibits a relative lack of diversity at these loci. In these situations, whole genome genotyping of orthologous shared single nucleotide polymorphism loci, discovered using next generation sequencing technologies, can provide very large data sets capable of estimating core phylogenetic relationships. We compared and searched 43 whole genome sequences of B. pseudomallei and its closest relatives for single nucleotide polymorphisms in orthologous shared regions to use in phylogenetic reconstruction. Results: Bayesian phylogenetic analyses of >14,000 single nucleotide polymorphisms yielded completely resolved trees for these 43 strains with high levels of statistical support. These results enable a better understanding of a separate analysis of population differentiation among >1,700 B. pseudomallei isolates as defined by sequence data from seven housekeeping genes. We analyzed this larger data set for population structure and allele sharing that can be attributed to lateral gene transfer. Our results suggest that despite an almost panmictic population, we can detect two distinct populations of B. pseudomallei that conform to biogeographic patterns found in many plant and animal species. That is, separation along Wallace's Line, a biogeographic boundary between Southeast Asia and Australia. Conclusion: We describe an Australian origin for B. pseudomallei, characterized by a single introduction event into Southeast Asia during a recent glacial period, and variable levels of lateral gene transfer within populations. These patterns provide insights into mechanisms of genetic diversification in B. pseudomallei and its closest relatives, and provide a framework for integrating the traditionally separate fields of population genetics and phylogenetics for other bacterial species with high levels of lateral gene transfer. ?? 2009 Pearson et al; licensee BioMed Central Ltd.
In silico comparison of genomic regions containing genes coding for enzymes and transcription factors for the phenylpropanoid pathway in Phaseolus vulgaris L. and Glycine max L. Merr

PubMed Central

Reinprecht, Yarmilla; Yadegari, Zeinab; Perry, Gregory E.; Siddiqua, Mahbuba; Wright, Lori C.; McClean, Phillip E.; Pauls, K. Peter

2013-01-01

Legumes contain a variety of phytochemicals derived from the phenylpropanoid pathway that have important effects on human health as well as seed coat color, plant disease resistance and nodulation. However, the information about the genes involved in this important pathway is fragmentary in common bean (Phaseolus vulgaris L.). The objectives of this research were to isolate genes that function in and control the phenylpropanoid pathway in common bean, determine their genomic locations in silico in common bean and soybean, and analyze sequences of the 4CL gene family in two common bean genotypes. Sequences of phenylpropanoid pathway genes available for common bean or other plant species were aligned, and the conserved regions were used to design sequence-specific primers. The PCR products were cloned and sequenced and the gene sequences along with common bean gene-based (g) markers were BLASTed against the Glycine max v.1.0 genome and the P. vulgaris v.1.0 (Andean) early release genome. In addition, gene sequences were BLASTed against the OAC Rex (Mesoamerican) genome sequence assembly. In total, fragments of 46 structural and regulatory phenylpropanoid pathway genes were characterized in this way and placed in silico on common bean and soybean sequence maps. The maps contain over 250 common bean g and SSR (simple sequence repeat) markers and identify the positions of more than 60 additional phenylpropanoid pathway gene sequences, plus the putative locations of seed coat color genes. The majority of cloned phenylpropanoid pathway gene sequences were mapped to one location in the common bean genome but had two positions in soybean. The comparison of the genomic maps confirmed previous studies, which show that common bean and soybean share genomic regions, including those containing phenylpropanoid pathway gene sequences, with conserved synteny. Indels identified in the comparison of Andean and Mesoamerican common bean 4CL gene sequences might be used to develop inter-pool phenylpropanoid pathway gene-based markers. We anticipate that the information obtained by this study will simplify and accelerate selections of common bean with specific phenylpropanoid pathway alleles to increase the contents of beneficial phenylpropanoids in common bean and other legumes. PMID:24046770
PNMA family: Protein interaction network and cell signalling pathways implicated in cancer and apoptosis.

PubMed

Pang, Siew Wai; Lahiri, Chandrajit; Poh, Chit Laa; Tan, Kuan Onn

2018-05-01

Paraneoplastic Ma Family (PNMA) comprises a growing number of family members which share relatively conserved protein sequences encoded by the human genome and is localized to several human chromosomes, including the X-chromosome. Based on sequence analysis, PNMA family members share sequence homology to the Gag protein of LTR retrotransposon, and several family members with aberrant protein expressions have been reported to be closely associated with the human Paraneoplastic Disorder (PND). In addition, gene mutations of specific members of PNMA family are known to be associated with human mental retardation or 3-M syndrome consisting of restrictive post-natal growth or dwarfism, and development of skeletal abnormalities. Other than sequence homology, the physiological function of many members in this family remains unclear. However, several members of this family have been characterized, including cell signalling events mediated by these proteins that are associated with apoptosis, and cancer in different cell types. Furthermore, while certain PNMA family members show restricted gene expression in the human brain and testis, other PNMA family members exhibit broader gene expression or preferential and selective protein interaction profiles, suggesting functional divergence within the family. Functional analysis of some members of this family have identified protein domains that are required for subcellular localization, protein-protein interactions, and cell signalling events which are the focus of this review paper. Copyright © 2018 Elsevier Inc. All rights reserved.
Sequencing and comparison of the Rickettsia genomes from the whitefly Bemisia tabaci Middle East Asia Minor I.

PubMed

Zhu, Dan-Tong; Xia, Wen-Qiang; Rao, Qiong; Liu, Shu-Sheng; Ghanim, Murad; Wang, Xiao-Wei

2016-08-01

The whitefly, Bemisia tabaci, harbors the primary symbiont 'Candidatus Portiera aleyrodidarum' and a variety of secondary symbionts. Among these secondary symbionts, Rickettsia is the only one that can be detected both inside and outside the bacteriomes. Infection with Rickettsia has been reported to influence several aspects of the whitefly biology, such as fitness, sex ratio, virus transmission and resistance to pesticides. However, mechanisms underlying these differences remain unclear, largely due to the lack of genomic information of Rickettsia. In this study, we sequenced the genome of two Rickettsia strains isolated from the Middle East Asia Minor 1 (MEAM1) species of the B. tabaci complex in China and Israel. Both Rickettsia genomes were of high coding density and AT-rich, containing more than 1000 coding sequences, much larger than that of the coexisted primary symbiont, Portiera. Moreover, the two Rickettsia strains isolated from China and Israel shared most of the genes with 100% identity and only nine genes showed sequence differences. The phylogenetic analysis using orthologs shared in the genus, inferred the proximity of Rickettsia in MEAM1 and Rickettsia bellii. Functional analysis revealed that Rickettsia was unable to synthesize amino acids required for complementing the whitefly nutrition. Besides, a type IV secretion system and a number of virulence-related genes were detected in the Rickettsia genome. The presence of virulence-related genes might benefit the symbiotic life of the bacteria, and hint on potential effects of Rickettsia on whiteflies. The genome sequences of Rickettsia provided a basis for further understanding the function of Rickettsia in whiteflies. © 2016 Institute of Zoology, Chinese Academy of Sciences.
Characterization of the temperate phage vB_RleM_PPF1 and its site-specific integration into the Rhizobium leguminosarum F1 genome.

PubMed

Halmillawewa, Anupama P; Restrepo-Córdoba, Marcela; Perry, Benjamin J; Yost, Christopher K; Hynes, Michael F

2016-02-01

Bacteriophages may play an important role in regulating population size and diversity of the root nodule symbiont Rhizobium leguminosarum, as well as participating in horizontal gene transfer. Although phages that infect this species have been isolated in the past, our knowledge of their molecular biology, and especially of genome composition, is extremely limited, and this lack of information impacts on the ability to assess phage population dynamics and limits potential agricultural applications of rhizobiophages. To help address this deficit in available sequence and biological information, the complete genome sequence of the Myoviridae temperate phage PPF1 that infects R. leguminosarum biovar viciae strain F1 was determined. The genome is 54,506 bp in length with an average G+C content of 61.9 %. The genome contains 94 putative open reading frames (ORFs) and 74.5 % of these predicted ORFs share homology at the protein level with previously reported sequences in the database. However, putative functions could only be assigned to 25.5 % (24 ORFs) of the predicted genes. PPF1 was capable of efficiently lysogenizing its rhizobial host R. leguminosarum F1. The site-specific recombination system of the phage targets an integration site that lies within a putative tRNA-Pro (CGG) gene in R. leguminosarum F1. Upon integration, the phage is capable of restoring the disrupted tRNA gene, owing to the 50 bp homologous sequence (att core region) it shares with its rhizobial host genome. Phage PPF1 is the first temperate phage infecting members of the genus Rhizobium for which a complete genome sequence, as well as other biological data such as the integration site, is available.
Genetic characterization of low pathogenic H5N1 and co-circulating avian influenza viruses in wild mallards (Anas platyrhynchos) in Belgium, 2008.

PubMed

Van Borm, S; Vangeluwe, D; Steensels, M; Poncin, O; van den Berg, T; Lambrecht, B

2011-12-01

As part of a long-term wild bird monitoring programme, five different low pathogenic (LP) avian influenza viruses (AIVs) were isolated from wild mallards (subtypes H1N1, H4N6, H5N1, H5N3, and H10N7). A LP H5N1 and two co-circulating (same location, same time period) viruses were selected for full genome sequencing. An H1N1 (A/Anas platyrhynchos/Belgium/09-762/2008) and an H5N1 virus (A/Anas platyrhynchos/Belgium/09-762-P1/2008) were isolated on the same day in November 2008, then an H5N3 virus (A/Anas platyrhynchos/09-884/2008) 5 days later in December 2008. All genes of these co-circulating viruses shared common ancestors with recent (2001 to 2007) European wild waterfowl influenza viruses. The H5N1 virus shares genome segments with both the H1N1 (PB1, NA, M) and the H5N3 (PB2, HA) viruses, and all three viruses share the same NS sequence. A double infection with two different PA segments from H5N1 and from H5N3 could be observed for the H1N1 sample. The observed gene constellations resulted from multiple reassortment events between viruses circulating in wild birds in Eurasia. Several internal gene segments from these 2008 viruses and the N3 sequence from the H5N3 show homology with sequences from 2003 H7 outbreaks in Italy (LP) and the Netherlands (highly pathogenic). These data contribute to the growing sequence evidence of the dynamic nature of the avian influenza natural reservoir in Eurasia, and underline the importance of monitoring AIV in wild birds. Genetic information of potential hazard to commercial poultry continues to circulate in this reservoir, including H5 and H7 subtype viruses and genes related to previous AIV outbreaks.
Small gene family encoding an eggshell (chorion) protein of the human parasite Schistosoma mansoni

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bobek, L.A.; Rekosh, D.M.; Lo Verde, P.T.

1988-08-01

The authors isolated six independent genomic clones encoding schistosome chorion or eggshell proteins from a Schistosoma mansoni genomic library. A linkage map of five of the clones spanning 35 kilobase pairs (kbp) of the S. mansoni genome was constructed. The region contained two eggshell protein genes closely linked, separated by 7.5 kbp of intergenic DNA. The two genes of the cluster were arranged in the same orientation, that is, they were transcribed from the same strand. The sixth clone probably represents a third copy of the eggshell gene that is not contained within the 35-kbp region. The 5- end ofmore » the mRNA transcribed from these genes was defined by primer extension directly off the RNA. The ATCAT cap site sequence was homologous to a silkmoth chorion PuTCATT cap site sequence, where Pu indicates any purine. DNA sequence analysis showed that there were no introns in these genes. The DNA sequences of the three genes were very homologous to each other and to a cDNA clone, pSMf61-46, differing only in three or four nucleotices. A multiple TATA box was located at positions -23 to -31, and a CAAAT sequence was located at -52 upstream of the eggshell transcription unit. Comparison of sequences in regions further upstream with silkmoth and Drosophila sequences revealed very short elements that were shared. One such element, TCACGT, recently shown to be an essential cis-regulatory element for silkmoth chorion gene promoter function, was found at a similar position in all three organisms.« less
Hypermutation in shark immunoglobulin light chain genes results in contiguous substitutions.

PubMed

Lee, Susan S; Tranchina, Daniel; Ohta, Yuko; Flajnik, Martin F; Hsu, Ellen

2002-04-01

Among 631 substitutions present in 90 nurse shark immunoglobulin light chain somatic mutants, 338 constitute 2-4 bp stretches of adjacent changes. An absence of mutations in perinatal sequences and the bias for one mutating V gene in adults suggest that the diversification is antigen dependent. The substitutions shared no patterns, and the absence of donor sequences, including from family members, supports the idea that most changes arose from nontemplated mutation. The tandem mutations as a group are distinguished by consistently fewer transition changes and an A bias. We suggest this is one of several pathways of hypermutation diversifying shark antigen-receptor genes--point mutations, tandem mutations, and mutations with a G-C preference--that coevolved with or preceded gene rearrangement.
The sequence of camelpox virus shows it is most closely related to variola virus, the cause of smallpox.

PubMed

Gubser, Caroline; Smith, Geoffrey L

2002-04-01

Camelpox virus (CMPV) and variola virus (VAR) are orthopoxviruses (OPVs) that share several biological features and cause high mortality and morbidity in their single host species. The sequence of a virulent CMPV strain was determined; it is 202182 bp long, with inverted terminal repeats (ITRs) of 6045 bp and has 206 predicted open reading frames (ORFs). As for other poxviruses, the genes are tightly packed with little non-coding sequence. Most genes within 25 kb of each terminus are transcribed outwards towards the terminus, whereas genes within the centre of the genome are transcribed from either DNA strand. The central region of the genome contains genes that are highly conserved in other OPVs and 87 of these are conserved in all sequenced chordopoxviruses. In contrast, genes towards either terminus are more variable and encode proteins involved in host range, virulence or immunomodulation. In some cases, these are broken versions of genes found in other OPVs. The relationship of CMPV to other OPVs was analysed by comparisons of DNA and predicted protein sequences, repeats within the ITRs and arrangement of ORFs within the terminal regions. Each comparison gave the same conclusion: CMPV is the closest known virus to variola virus, the cause of smallpox.
A candidate gene for choanal atresia in alpaca.

PubMed

Reed, Kent M; Bauer, Miranda M; Mendoza, Kristelle M; Armién, Aníbal G

2010-03-01

Choanal atresia (CA) is a common nasal craniofacial malformation in New World domestic camelids (alpaca and llama). CA results from abnormal development of the nasal passages and is especially debilitating to newborn crias. CA in camelids shares many of the clinical manifestations of a similar condition in humans (CHARGE syndrome). Herein we report on the regulatory gene CHD7 of alpaca, whose homologue in humans is most frequently associated with CHARGE. Sequence of the CHD7 coding region was obtained from a non-affected cria. The complete coding region was 9003 bp, corresponding to a translated amino acid sequence of 3000 aa. Additional genomic sequences corresponding to a significant portion of the CHD7 gene were identified and assembled from the 2x alpaca whole genome sequence, providing confirmatory sequence for much of the CHD7 coding region. The alpaca CHD7 mRNA sequence was 97.9% similar to the human sequence, with the greatest sequence difference being an insertion in exon 38 that results in a polyalanine repeat (A12). Polymorphism in this repeat was tested for association with CA in alpaca by cloning and sequencing the repeat from both affected and non-affected individuals. Variation in length of the poly-A repeat was not associated with CA. Complete sequencing of the CHD7 gene will be necessary to determine whether other mutations in CHD7 are the cause of CA in camelids.
Transposon mutagenesis of type III group B Streptococcus: correlation of capsule expression with virulence.

PubMed

Rubens, C E; Wessels, M R; Heggen, L M; Kasper, D L

1987-10-01

The capsular polysaccharide of type III group B Streptococcus (GBS) is thought to be a major factor in the virulence of this organism. Transposon mutagenesis was used to obtain isogenic strains of a GBS serotype III clinical isolate (COH 31r/s) with site-specific mutations in the gene(s) responsible for capsule production. The self-conjugative transposon Tn916 was transferred to strain COH 31r/s during incubation with Streptococcus faecalis strain CG110 on membrane filters. Eleven transconjugant clones did not bind type III GBS antiserum by immunoblot. Immunofluorescence, competitive ELISA, and electron microscopy confirmed the absence of detectable GBS type III capsular polysaccharide in one of the transconjugants, COH 31-15. Southern hybridization analysis with a Tn916 probe confirmed the presence of the transposon sequence within each mutant. A 3.0-kilobase EcoRI fragment that flanked the Tn916 sequence was subcloned from mutant COH 31-15. This fragment shared homology with DNA from the other GBS serotypes, suggesting a common sequence for capsulation shared by organisms of different capsular types. Loss of capsule expression resulted in loss of virulence in a neonatal rat model. We conclude that a gene common to all capsular types of GBS is required for surface expression of the type III capsule and that inactivation of this gene by Tn916 results in the loss of virulence.
Reptiles and mammals have differentially retained long conserved noncoding sequences from the amniote ancestor.

PubMed

Janes, D E; Chapus, C; Gondo, Y; Clayton, D F; Sinha, S; Blatti, C A; Organ, C L; Fujita, M K; Balakrishnan, C N; Edwards, S V

2011-01-01

Many noncoding regions of genomes appear to be essential to genome function. Conservation of large numbers of noncoding sequences has been reported repeatedly among mammals but not thus far among birds and reptiles. By searching genomes of chicken (Gallus gallus), zebra finch (Taeniopygia guttata), and green anole (Anolis carolinensis), we quantified the conservation among birds and reptiles and across amniotes of long, conserved noncoding sequences (LCNS), which we define as sequences ≥500 bp in length and exhibiting ≥95% similarity between species. We found 4,294 LCNS shared between chicken and zebra finch and 574 LCNS shared by the two birds and Anolis. The percent of genomes comprised by LCNS in the two birds (0.0024%) is notably higher than the percent in mammals (<0.0003% to <0.001%), differences that we show may be explained in part by differences in genome-wide substitution rates. We reconstruct a large number of LCNS for the amniote ancestor (ca. 8,630) and hypothesize differential loss and substantial turnover of these sites in descendent lineages. By contrast, we estimated a small role for recruitment of LCNS via acquisition of novel functions over time. Across amniotes, LCNS are significantly enriched with transcription factor binding sites for many developmental genes, and 2.9% of LCNS shared between the two birds show evidence of expression in brain expressed sequence tag databases. These results show that the rate of retention of LCNS from the amniote ancestor differs between mammals and Reptilia (including birds) and that this may reflect differing roles and constraints in gene regulation.

Reptiles and Mammals Have Differentially Retained Long Conserved Noncoding Sequences from the Amniote Ancestor

PubMed Central

Janes, D.E.; Chapus, C.; Gondo, Y.; Clayton, D.F.; Sinha, S.; Blatti, C.A.; Organ, C.L.; Fujita, M.K.; Balakrishnan, C.N.; Edwards, S.V.

2010-01-01

Many noncoding regions of genomes appear to be essential to genome function. Conservation of large numbers of noncoding sequences has been reported repeatedly among mammals but not thus far among birds and reptiles. By searching genomes of chicken (Gallus gallus), zebra finch (Taeniopygia guttata), and green anole (Anolis carolinensis), we quantified the conservation among birds and reptiles and across amniotes of long, conserved noncoding sequences (LCNS), which we define as sequences ≥500 bp in length and exhibiting ≥95% similarity between species. We found 4,294 LCNS shared between chicken and zebra finch and 574 LCNS shared by the two birds and Anolis. The percent of genomes comprised by LCNS in the two birds (0.0024%) is notably higher than the percent in mammals (<0.0003% to <0.001%), differences that we show may be explained in part by differences in genome-wide substitution rates. We reconstruct a large number of LCNS for the amniote ancestor (ca. 8,630) and hypothesize differential loss and substantial turnover of these sites in descendent lineages. By contrast, we estimated a small role for recruitment of LCNS via acquisition of novel functions over time. Across amniotes, LCNS are significantly enriched with transcription factor binding sites for many developmental genes, and 2.9% of LCNS shared between the two birds show evidence of expression in brain expressed sequence tag databases. These results show that the rate of retention of LCNS from the amniote ancestor differs between mammals and Reptilia (including birds) and that this may reflect differing roles and constraints in gene regulation. PMID:21183607
Using the shared genetics of dystonia and ataxia to unravel their pathogenesis

PubMed Central

Nibbeling, Esther A.R.; Delnooz, Cathérine C.S.; de Koning, Tom J.; Sinke, Richard J.; Jinnah, Hyder A.; Tijssen, Marina A.J.; Verbeek, Dineke S.

2018-01-01

In this review we explore the similarities between spinocerebellar ataxias and dystonias, and suggest potentially shared molecular pathways using a gene co-expression network approach. The spinocerebellar ataxias are a group of neurodegenerative disorders characterized by coordination problems caused mainly by atrophy of the cerebellum. The dystonias are another group of neurological movement disorders linked to basal ganglia dysfunction, although evidence is now pointing to cerebellar involvement as well. Our gene co-expression network approach identified 99 shared genes and showed the involvement of two major pathways: synaptic transmission and neurodevelopment. These pathways overlapped in the two disorders, with a large role for GABAergic signaling in both. The overlapping pathways may provide novel targets for disease therapies. We need to prioritize variants obtained by whole exome sequencing in the genes associated with these pathways in the search for new pathogenic variants, which can than be used to help in the genetic counseling of patients and their families. PMID:28143763
NABIC: A New Access Portal to Search, Visualize, and Share Agricultural Genomics Data.

PubMed

Seol, Young-Joo; Lee, Tae-Ho; Park, Dong-Suk; Kim, Chang-Kug

2016-01-01

The National Agricultural Biotechnology Information Center developed an access portal to search, visualize, and share agricultural genomics data with a focus on South Korean information and resources. The portal features an agricultural biotechnology database containing a wide range of omics data from public and proprietary sources. We collected 28.4 TB of data from 162 agricultural organisms, with 10 types of omics data comprising next-generation sequencing sequence read archive, genome, gene, nucleotide, DNA chip, expressed sequence tag, interactome, protein structure, molecular marker, and single-nucleotide polymorphism datasets. Our genomic resources contain information on five animals, seven plants, and one fungus, which is accessed through a genome browser. We also developed a data submission and analysis system as a web service, with easy-to-use functions and cutting-edge algorithms, including those for handling next-generation sequencing data.
Variation of partial transferrin sequences and phylogenetic relationships among hares (Lepus capensis, Lagomorpha) from Tunisia.

PubMed

Awadi, Asma; Suchentrunk, Franz; Makni, Mohamed; Ben Slimen, Hichem

2016-10-01

North African hares are currently included in cape hares, Lepus capensis sensu lato, a taxon that may be considered a superspecies or a complex of closely related species. The existing molecular data, however, are not unequivocal, with mtDNA control region sequences suggesting a separate species status and nuclear loci (allozymes, microsatellites) revealing conspecificity of L. capensis and L. europaeus. Here, we study sequence variation in the intron 6 (468 bp) of the transferrin nuclear gene, of 105 hares with different coat colour from different regions in Tunisia with respect to genetic diversity and differentiation, as well as their phylogenetic status. Forty-six haplotypes (alleles) were revealed and compared phylogenetically to all available TF haplotypes of various Lepus species retrieved from GenBank. Maximum Likelihood, neighbor joining and median joining network analyses concordantly grouped all currently obtained haplotypes together with haplotypes belonging to six different Chinese hare species and the African scrub hare L. saxatilis. Moreover, two Tunisian haploypes were shared with L. capensis, L timidus, L. sinensis, L. yarkandensis, and L. hainanus from China. These results indicated the evolutionary complexity of the genus Lepus with the mixing of nuclear gene haplotypes resulting from introgressive hybridization or/and shared ancestral polymorphism. We report the presence of shared ancestral polymorphism between North African and Chinese hares. This has not been detected earlier in the mtDNA sequences of the same individuals. Genetic diversity of the TF sequences from the Tunisian populations was relatively high compared to other hare populations. However, genetic differentiation and gene flow analyses (AMOVA, F ST , Nm) indicated little divergence with the absence of geographically meaningful phylogroups and lack of clustering with coat colour types. These results confirm the presence of a single hare species in Tunisia, but a sound inference on its phylogenetic position would require additional nuclear markers and numerous geographically meaningful samples from Africa and Eurasia.
Genetic Analysis of Norovirus Strains that Caused Gastroenteritis Outbreaks Among River Rafters in the Grand Canyon, Arizona.

PubMed

Kitajima, Masaaki; Iker, Brandon C; Magill-Collins, Anne; Gaither, Marlene; Stoehr, James D; Gerba, Charles P

2017-06-01

Toilet solid waste samples collected from five outbreaks among rafters in the Grand Canyon were subjected to sequencing analysis of norovirus partial capsid gene. The results revealed that a GI.3 strain was associated with one outbreak, whereas the other outbreaks were caused by GII.5 whose sequences shared >98.9% homology.
DNA Sequence Analysis of a Complementary DNA for Cold-Regulated Arabidopsis Gene cor15 and Characterization of the COR 15 Polypeptide 1

PubMed Central

Lin, Chentao; Thomashow, Michael F.

1992-01-01

Previous studies have indicated that changes in gene expression occur in Arabidopsis thaliana L. (Heyn) during cold acclimation and that certain of the cor (cold-regulated) genes encode polypeptides that share the unusual property of remaining soluble upon boiling in aqueous solution. Here, we identify a cDNA clone for a cold-regulated gene encoding one of the “boiling-stable” polypeptides, COR15. DNA sequence analysis indicated that the gene, designated cor15, encodes a 14.7-kilodalton hydrophilic polypeptide having an N-terminal amino acid sequence that closely resembles transit peptides that target proteins to the stromal compartment of chloroplasts. Immunological studies indicated that COR15 is processed in vivo and that the mature polypeptide, COR 15m, is present in the soluble fraction of chloroplasts. Possible functions of COR 15m are discussed. ImagesFigure 1Figure 4Figure 5Figure 6Figure 7 PMID:16668917
Molecular phylogeny of Babesia poelea from brown boobies (Sula leucogaster) from Johnston Atoll, Central Pacific

USGS Publications Warehouse

Yabsley, Michael J.; Work, Thierry M.; Rameyer, Robert A.

2006-01-01

The phylogenetic relationship of avian Babesia with other piroplasms remains unclear, mainly because of a lack of objective criteria such as molecular phylogenetics. In this study, our objective was to sequence the entire 18S, ITS-1, 5.8S, and ITS-2 regions of the rRNA gene and partial ß-tubulin gene of B. poelea, first described from brown boobies (Sula leucogaster) from the central Pacific, and compare them to those of other piroplasms. Phylogenetic analyses of the entire 18S rRNA gene sequence revealed that B. poelea belonged to the clade of piroplasms previously detected in humans, domestic dogs, and wild ungulates in the western United States. The entire ITS-1, 5.8S, ITS-2, and partial ß-tubulin gene sequence shared conserved regions with previously described Babesia and Theileria species. The intron of the ß-tubulin gene was 45 bp. This is the first molecular characterization of an avian piroplasm.
Molecular phylogeny of Babesia poelea from brown boobies (Sula leucogaster) from Johnston Atoll, central Pacific.

PubMed

Yabsley, Michael J; Work, Thierry M; Rameyer, Robert A

2006-04-01

The phylogenetic relationship of avian Babesia with other piroplasms remains unclear, mainly because of a lack of objective criteria such as molecular phylogenetics. In this study, our objective was to sequence the entire 18S, ITS-1, 5.8S, and ITS-2 regions of the rRNA gene and partial beta-tubulin gene of B. poelea, first described from brown boobies (Sula leucogaster) from the central Pacific, and compare them to those of other piroplasms. Phylogenetic analyses of the entire 18S rRNA gene sequence revealed that B. poelea belonged to the clade of piroplasms previously detected in humans, domestic dogs, and wild ungulates in the western United States. The entire ITS-1, 5.8S, ITS-2, and partial beta-tubulin gene sequence shared conserved regions with previously described Babesia and Theileria species. The intron of the beta-tubulin gene was 45 bp. This is the first molecular characterization of an avian piroplasm.
Identification of interleukin genes in Pogona vitticeps using a de novo transcriptome assembly from RNA-seq data.

PubMed

Livernois, Alexandra; Hardy, Kristine; Domaschenz, Renae; Papanicolaou, Alexie; Georges, Arthur; Sarre, Stephen D; Rao, Sudha; Ezaz, Tariq; Deakin, Janine E

2016-10-01

Interleukins are a group of cytokines with complex immunomodulatory functions that are important for regulating immunity in vertebrate species. Reptiles and mammals last shared a common ancestor more than 350 million years ago, so it is not surprising that low sequence identity has prevented divergent interleukin genes from being identified in the central bearded dragon lizard, Pogona vitticeps, in its genome assembly. To determine the complete nucleotide sequences of key interleukin genes, we constructed full-length transcripts, using the Trinity platform, from short paired-end read RNA sequences from stimulated spleen cells. De novo transcript reconstruction and analysis allowed us to identify interleukin genes that are missing from the published P. vitticeps assembly. Identification of key cytokines in P. vitticeps will provide insight into the essential molecular mechanisms and evolution of interleukin gene families and allow for characterization of the immune response in a lizard for comparison with mammals.
The Complete Mitochondrial Genome of Gossypium hirsutum and Evolutionary Analysis of Higher Plant Mitochondrial Genomes

PubMed Central

Su, Aiguo; Geng, Jianing; Grover, Corrinne E.; Hu, Songnian; Hua, Jinping

2013-01-01

Background Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains large number of foreign DNA and repeated sequences undergone frequently intramolecular recombination. Upland Cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant in the world. Sequencing of the cotton mitochondrial (mt) genome could be helpful for the evolution research of plant mt genomes. Methodology/Principal Findings We utilized 454 technology for sequencing and combined with Fosmid library of the Gossypium hirsutum mt genome screening and positive clones sequencing and conducted a series of evolutionary analysis on Cycas taitungensis and 24 angiosperms mt genomes. After data assembling and contigs joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G.hirsutum mt genome is 621,884 bp in length, and contained 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are found conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes and species closely related share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes. Conclusion The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes suggest that evolution of mt genomes is consistent with plant taxonomy but independent among different species. PMID:23940520
The complete mitochondrial genome of Gossypium hirsutum and evolutionary analysis of higher plant mitochondrial genomes.

PubMed

Liu, Guozheng; Cao, Dandan; Li, Shuangshuang; Su, Aiguo; Geng, Jianing; Grover, Corrinne E; Hu, Songnian; Hua, Jinping

2013-01-01

Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains large number of foreign DNA and repeated sequences undergone frequently intramolecular recombination. Upland Cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant in the world. Sequencing of the cotton mitochondrial (mt) genome could be helpful for the evolution research of plant mt genomes. We utilized 454 technology for sequencing and combined with Fosmid library of the Gossypium hirsutum mt genome screening and positive clones sequencing and conducted a series of evolutionary analysis on Cycas taitungensis and 24 angiosperms mt genomes. After data assembling and contigs joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G.hirsutum mt genome is 621,884 bp in length, and contained 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are found conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes and species closely related share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes. The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes suggest that evolution of mt genomes is consistent with plant taxonomy but independent among different species.
Structure and evolution of cereal genomes.

PubMed

Paterson, Andrew H; Bowers, John E; Peterson, Daniel G; Estill, James C; Chapman, Brad A

2003-12-01

The cereal species, of central importance to our diet, began to diverge 50-70 million years ago. For the past few thousand years, these species have undergone largely parallel selection regimes associated with domestication and improvement. The rice genome sequence provides a platform for organizing information about diverse cereals, and together with genetic maps and sequence samples from other cereals is yielding new insights into both the shared and the independent dimensions of cereal evolution. New data and population-based approaches are identifying genes that have been involved in cereal improvement. Reduced-representation sequencing promises to accelerate gene discovery in many large-genome cereals, and to better link the under-explored genomes of 'orphan' cereals with state-of-the-art knowledge.
Analysis of eight out genes in a cluster required for pectic enzyme secretion by Erwinia chrysanthemi: sequence comparison with secretion genes from other gram-negative bacteria.

PubMed Central

Lindeberg, M; Collmer, A

1992-01-01

Many extracellular proteins produced by Erwinia chrysanthemi require the out gene products for transport across the outer membrane. In a previous report (S. Y. He, M. Lindeberg, A. K. Chatterjee, and A. Collmer, Proc. Natl. Acad. Sci. USA 88:1079-1083, 1991) cosmid pCPP2006, sufficient for secretion of Erwinia chrysanthemi extracellular proteins by Escherichia coli, was partially sequenced, revealing four out genes sharing high homology with pulH through pulK from Klebsiella oxytoca. The nucleotide sequence of eight additional out genes reveals homology with pulC through pulG, pulL, pulM, pulO, and other genes involved in secretion by various gram-negative bacteria. Although signal sequences and hydrophobic regions are generally conserved between Pul and Out proteins, four out genes contain unique inserts, a pulN homolog is not present, and outO appears to be transcribed separately from outC through outM. The sequenced region was subcloned, and an additional 7.6-kb region upstream was identified as being required for secretion in E. coli. out gene homologs were found on Erwinia carotovora cosmid clone pAKC651 but were not detected in E. coli. The outC-through-outM operon is weakly induced by polygalacturonic acid and strongly expressed in the early stationary phase. The out and pul genes are highly similar in sequence, hydropathic properties, and overall arrangement but differ in both transcriptional organization and the nature of their induction. Images PMID:1429461
Silencing Effect of Hominoid Highly Conserved Noncoding Sequences on Embryonic Brain Development

PubMed Central

Mahmoudi Saber, Morteza

2017-01-01

Abstract Superfamily Hominoidea, which consists of Hominidae (humans and great apes) and Hylobatidae (gibbons), is well-known for sharing human-like characteristics, however, the genomic origins of these shared unique phenotypes have mainly remained elusive. To decipher the underlying genomic basis of Hominoidea-restricted phenotypes, we identified and characterized Hominoidea-restricted highly conserved noncoding sequences (HCNSs) that are a class of potential regulatory elements which may be involved in evolution of lineage-specific phenotypes. We discovered 679 such HCNSs from human, chimpanzee, gorilla, orangutan and gibbon genomes. These HCNSs were demonstrated to be under purifying selection but with lineage-restricted characteristics different from old CNSs. A significant proportion of their ancestral sequences had accelerated rates of nucleotide substitutions, insertions and deletions during the evolution of common ancestor of Hominoidea, suggesting the intervention of positive Darwinian selection for creating those HCNSs. In contrary to enhancer elements and similar to silencer sequences, these Hominoidea-restricted HCNSs are located in close proximity of transcription start sites. Their target genes are enriched in the nervous system, development and transcription, and they tend to be remotely located from the nearest coding gene. Chip-seq signals and gene expression patterns suggest that Hominoidea-restricted HCNSs are likely to be functional regulatory elements by imposing silencing effects on their target genes in a tissue-restricted manner during fetal brain development. These HCNSs, emerged through adaptive evolution and conserved through purifying selection, represent a set of promising targets for future functional studies of the evolution of Hominoidea-restricted phenotypes. PMID:28633494
A Comparative Encyclopedia of DNA Elements in the Mouse Genome

PubMed Central

Yue, Feng; Cheng, Yong; Breschi, Alessandra; Vierstra, Jeff; Wu, Weisheng; Ryba, Tyrone; Sandstrom, Richard; Ma, Zhihai; Davis, Carrie; Pope, Benjamin D.; Shen, Yin; Pervouchine, Dmitri D.; Djebali, Sarah; Thurman, Bob; Kaul, Rajinder; Rynes, Eric; Kirilusha, Anthony; Marinov, Georgi K.; Williams, Brian A.; Trout, Diane; Amrhein, Henry; Fisher-Aylor, Katherine; Antoshechkin, Igor; DeSalvo, Gilberto; See, Lei-Hoon; Fastuca, Meagan; Drenkow, Jorg; Zaleski, Chris; Dobin, Alex; Prieto, Pablo; Lagarde, Julien; Bussotti, Giovanni; Tanzer, Andrea; Denas, Olgert; Li, Kanwei; Bender, M. A.; Zhang, Miaohua; Byron, Rachel; Groudine, Mark T.; McCleary, David; Pham, Long; Ye, Zhen; Kuan, Samantha; Edsall, Lee; Wu, Yi-Chieh; Rasmussen, Matthew D.; Bansal, Mukul S.; Keller, Cheryl A.; Morrissey, Christapher S.; Mishra, Tejaswini; Jain, Deepti; Dogan, Nergiz; Harris, Robert S.; Cayting, Philip; Kawli, Trupti; Boyle, Alan P.; Euskirchen, Ghia; Kundaje, Anshul; Lin, Shin; Lin, Yiing; Jansen, Camden; Malladi, Venkat S.; Cline, Melissa S.; Erickson, Drew T.; Kirkup, Vanessa M; Learned, Katrina; Sloan, Cricket A.; Rosenbloom, Kate R.; de Sousa, Beatriz Lacerda; Beal, Kathryn; Pignatelli, Miguel; Flicek, Paul; Lian, Jin; Kahveci, Tamer; Lee, Dongwon; Kent, W. James; Santos, Miguel Ramalho; Herrero, Javier; Notredame, Cedric; Johnson, Audra; Vong, Shinny; Lee, Kristen; Bates, Daniel; Neri, Fidencio; Diegel, Morgan; Canfield, Theresa; Sabo, Peter J.; Wilken, Matthew S.; Reh, Thomas A.; Giste, Erika; Shafer, Anthony; Kutyavin, Tanya; Haugen, Eric; Dunn, Douglas; Reynolds, Alex P.; Neph, Shane; Humbert, Richard; Hansen, R. Scott; De Bruijn, Marella; Selleri, Licia; Rudensky, Alexander; Josefowicz, Steven; Samstein, Robert; Eichler, Evan E.; Orkin, Stuart H.; Levasseur, Dana; Papayannopoulou, Thalia; Chang, Kai-Hsin; Skoultchi, Arthur; Gosh, Srikanta; Disteche, Christine; Treuting, Piper; Wang, Yanli; Weiss, Mitchell J.; Blobel, Gerd A.; Good, Peter J.; Lowdon, Rebecca F.; Adams, Leslie B.; Zhou, Xiao-Qiao; Pazin, Michael J.; Feingold, Elise A.; Wold, Barbara; Taylor, James; Kellis, Manolis; Mortazavi, Ali; Weissman, Sherman M.; Stamatoyannopoulos, John; Snyder, Michael P.; Guigo, Roderic; Gingeras, Thomas R.; Gilbert, David M.; Hardison, Ross C.; Beer, Michael A.; Ren, Bing

2014-01-01

Summary As the premier model organism in biomedical research, the laboratory mouse shares the majority of protein-coding genes with humans, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications, and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of other sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases. PMID:25409824
A comparative encyclopedia of DNA elements in the mouse genome.

PubMed

Yue, Feng; Cheng, Yong; Breschi, Alessandra; Vierstra, Jeff; Wu, Weisheng; Ryba, Tyrone; Sandstrom, Richard; Ma, Zhihai; Davis, Carrie; Pope, Benjamin D; Shen, Yin; Pervouchine, Dmitri D; Djebali, Sarah; Thurman, Robert E; Kaul, Rajinder; Rynes, Eric; Kirilusha, Anthony; Marinov, Georgi K; Williams, Brian A; Trout, Diane; Amrhein, Henry; Fisher-Aylor, Katherine; Antoshechkin, Igor; DeSalvo, Gilberto; See, Lei-Hoon; Fastuca, Meagan; Drenkow, Jorg; Zaleski, Chris; Dobin, Alex; Prieto, Pablo; Lagarde, Julien; Bussotti, Giovanni; Tanzer, Andrea; Denas, Olgert; Li, Kanwei; Bender, M A; Zhang, Miaohua; Byron, Rachel; Groudine, Mark T; McCleary, David; Pham, Long; Ye, Zhen; Kuan, Samantha; Edsall, Lee; Wu, Yi-Chieh; Rasmussen, Matthew D; Bansal, Mukul S; Kellis, Manolis; Keller, Cheryl A; Morrissey, Christapher S; Mishra, Tejaswini; Jain, Deepti; Dogan, Nergiz; Harris, Robert S; Cayting, Philip; Kawli, Trupti; Boyle, Alan P; Euskirchen, Ghia; Kundaje, Anshul; Lin, Shin; Lin, Yiing; Jansen, Camden; Malladi, Venkat S; Cline, Melissa S; Erickson, Drew T; Kirkup, Vanessa M; Learned, Katrina; Sloan, Cricket A; Rosenbloom, Kate R; Lacerda de Sousa, Beatriz; Beal, Kathryn; Pignatelli, Miguel; Flicek, Paul; Lian, Jin; Kahveci, Tamer; Lee, Dongwon; Kent, W James; Ramalho Santos, Miguel; Herrero, Javier; Notredame, Cedric; Johnson, Audra; Vong, Shinny; Lee, Kristen; Bates, Daniel; Neri, Fidencio; Diegel, Morgan; Canfield, Theresa; Sabo, Peter J; Wilken, Matthew S; Reh, Thomas A; Giste, Erika; Shafer, Anthony; Kutyavin, Tanya; Haugen, Eric; Dunn, Douglas; Reynolds, Alex P; Neph, Shane; Humbert, Richard; Hansen, R Scott; De Bruijn, Marella; Selleri, Licia; Rudensky, Alexander; Josefowicz, Steven; Samstein, Robert; Eichler, Evan E; Orkin, Stuart H; Levasseur, Dana; Papayannopoulou, Thalia; Chang, Kai-Hsin; Skoultchi, Arthur; Gosh, Srikanta; Disteche, Christine; Treuting, Piper; Wang, Yanli; Weiss, Mitchell J; Blobel, Gerd A; Cao, Xiaoyi; Zhong, Sheng; Wang, Ting; Good, Peter J; Lowdon, Rebecca F; Adams, Leslie B; Zhou, Xiao-Qiao; Pazin, Michael J; Feingold, Elise A; Wold, Barbara; Taylor, James; Mortazavi, Ali; Weissman, Sherman M; Stamatoyannopoulos, John A; Snyder, Michael P; Guigo, Roderic; Gingeras, Thomas R; Gilbert, David M; Hardison, Ross C; Beer, Michael A; Ren, Bing

2014-11-20

The laboratory mouse shares the majority of its protein-coding genes with humans, making it the premier model organism in biomedical research, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases.
Identification of a DNA sequence motif required for expression of iron-regulated genes in pseudomonads.

PubMed

Rombel, I T; McMorran, B J; Lamont, I L

1995-02-20

Many bacteria respond to a lack of iron in the environment by synthesizing siderophores, which act as iron-scavenging compounds. Fluorescent pseudomonads synthesize strain-specific but chemically related siderophores called pyoverdines or pseudobactins. We have investigated the mechanisms by which iron controls expression of genes involved in pyoverdine metabolism in Pseudomonas aeruginosa. Transcription of these genes is repressed by the presence of iron in the growth medium. Three promoters from these genes were cloned and the activities of the promoters were dependent on the amounts of iron in the growth media. Two of the promoters were sequenced and the transcriptional start site were identified by S1 nuclease analysis. Sequences similar to the consensus binding site for the Fur repressor protein, which controls expression of iron-repressible genes in several gram-negative species, were not present in the promoters, suggesting that they are unlikely to have a high affinity for Fur. However, comparison of the promoter sequences with those of iron-regulated genes from other Pseudomonas species and also the iron-regulated exotoxin gene of P. aeruginosa allowed identification of a shared sequence element, with the consensus sequence (G/C)CTAAAT-CCC, which is likely to act as a binding site for a transcriptional activator protein. Mutations in this sequence greatly reduced the activities of the promoters characterized here as well as those of other iron-regulated promoters. The requirement for this motif in the promoters of iron-regulated genes of different Pseudomonas species indicates that similar mechanisms are likely to be involved in controlling expression of a range of iron-regulated genes in pseudomonads.
The RNase P RNA from cyanobacteria: short tandemly repeated repetitive (STRR) sequences are present within the RNase P RNA gene in heterocyst-forming cyanobacteria.

PubMed Central

Vioque, A

1997-01-01

The RNase P RNA gene (rnpB) from 10 cyanobacteria has been characterized. These new RNAs, together with the previously available ones, provide a comprehensive data set of RNase P RNA from diverse cyanobacterial lineages. All heterocystous cyanobacteria, but none of the non-heterocystous strains analyzed, contain short tandemly repeated repetitive (STRR) sequences that increase the length of helix P12. Site-directed mutagenesis experiments indicate that the STRR sequences are not required for catalytic activity in vitro. STRR sequences seem to have recently and independently invaded the RNase P RNA genes in heterocyst-forming cyanobacteria because closely related strains contain unrelated STRR sequences. Most cyanobacteria RNase P RNAs lack the sequence GGU in the loop connecting helices P15 and P16 that has been established to interact with the 3'-end CCA in precursor tRNA substrates in other bacteria. This character is shared with plastid RNase P RNA. Helix P6 is longer than usual in most cyanobacteria as well as in plastid RNase P RNA. PMID:9254706
Definition of the Cattle Killer Cell Ig–like Receptor Gene Family: Comparison with Aurochs and Human Counterparts

PubMed Central

Sanderson, Nicholas D.; Norman, Paul J.; Guethlein, Lisbeth A.; Ellis, Shirley A.; Williams, Christina; Breen, Matthew; Park, Steven D. E.; Magee, David A.; Babrzadeh, Farbod; Warry, Andrew; Watson, Mick; Bradley, Daniel G.; MacHugh, David E.; Parham, Peter

2014-01-01

Under selection pressure from pathogens, variable NK cell receptors that recognize polymorphic MHC class I evolved convergently in different species of placental mammal. Unexpectedly, diversified killer cell Ig–like receptors (KIRs) are shared by simian primates, including humans, and cattle, but not by other species. Whereas much is known of human KIR genetics and genomics, knowledge of cattle KIR is limited to nine cDNA sequences. To facilitate comparison of the cattle and human KIR gene families, we determined the genomic location, structure, and sequence of two cattle KIR haplotypes and defined KIR sequences of aurochs, the extinct wild ancestor of domestic cattle. Larger than its human counterpart, the cattle KIR locus evolved through successive duplications of a block containing ancestral KIR3DL and KIR3DX genes that existed before placental mammals. Comparison of two cattle KIR haplotypes and aurochs KIR show the KIR are polymorphic and the gene organization and content appear conserved. Of 18 genes, 8 are functional and 10 were inactivated by point mutation. Selective inactivation of KIR3DL and activating receptor genes leaves a functional cohort of one inhibitory KIR3DL, one activating KIR3DX, and six inhibitory KIR3DX. Functional KIR diversity evolved from KIR3DX in cattle and from KIR3DL in simian primates. Although independently evolved, cattle and human KIR gene families share important function-related properties, indicating that cattle KIR are NK cell receptors for cattle MHC class I. Combinations of KIR and MHC class I are the major genetic factors associated with human disease and merit investigation in cattle. PMID:25398326
Microbial and viral-like rhodopsins present in coastal marine sediments from four polar and subpolar regions

DOE Office of Scientific and Technical Information (OSTI.GOV)

López, José L.; Golemba, Marcelo; Hernández, Edgardo

Rhodopsins are broadly distributed. In this work, we analyzed 23 metagenomes corresponding to marine sediment samples from four regions that share cold climate conditions (Norway; Sweden; Argentina and Antarctica). In order to investigate the genes evolution of viral rhodopsins, an initial set of 6224 bacterial rhodopsin sequences according to COG5524 were retrieved from the 23 metagenomes. After selection by the presence of transmembrane domains and alignment, 123 viral (51) and non-viral (72) sequences (>50 amino acids) were finally included in further analysis. Viral rhodopsin genes were homologs of Phaeocystis globosa virus and Organic lake Phycodnavirus. Non-viral microbial rhodopsin genes weremore » ascribed to Bacteroidetes, Planctomycetes, Firmicutes, Actinobacteria, Cyanobacteria, Proteobacteria, Deinococcus-Thermus and Cryptophyta and Fungi. A rescreening using Blastp, using as queries the viral sequences previously described, retrieved 30 sequences (>100 amino acids). Phylogeographic analysis revealed a geographical clustering of the sequences affiliated to the viral group. This clustering was not observed for the microbial non-viral sequences. The phylogenetic reconstruction allowed us to propose the existence of a putative ancestor of viral rhodopsin genes related to Actinobacteria and Chloroflexi. This is the first report about the existence of a phylogeographic association of the viral rhodopsin sequences from marine sediments.« less

The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans

PubMed Central

Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

2015-01-01

Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. PMID:26199191
Pleurochrysome: A Web Database of Pleurochrysis Transcripts and Orthologs Among Heterogeneous Algae

PubMed Central

Fujiwara, Shoko; Takatsuka, Yukiko; Hirokawa, Yasutaka; Tsuzuki, Mikio; Takano, Tomoyuki; Kobayashi, Masaaki; Suda, Kunihiro; Asamizu, Erika; Yokoyama, Koji; Shibata, Daisuke; Tabata, Satoshi; Yano, Kentaro

2016-01-01

Pleurochrysis is a coccolithophorid genus, which belongs to the Coccolithales in the Haptophyta. The genus has been used extensively for biological research, together with Emiliania in the Isochrysidales, to understand distinctive features between the two coccolithophorid-including orders. However, molecular biological research on Pleurochrysis such as elucidation of the molecular mechanism behind coccolith formation has not made great progress at least in part because of lack of comprehensive gene information. To provide such information to the research community, we built an open web database, the Pleurochrysome (http://bioinf.mind.meiji.ac.jp/phapt/), which currently stores 9,023 unique gene sequences (designated as UNIGENEs) assembled from expressed sequence tag sequences of P. haptonemofera as core information. The UNIGENEs were annotated with gene sequences sharing significant homology, conserved domains, Gene Ontology, KEGG Orthology, predicted subcellular localization, open reading frames and orthologous relationship with genes of 10 other algal species, a cyanobacterium and the yeast Saccharomyces cerevisiae. This sequence and annotation information can be easily accessed via several search functions. Besides fundamental functions such as BLAST and keyword searches, this database also offers search functions to explore orthologous genes in the 12 organisms and to seek novel genes. The Pleurochrysome will promote molecular biological and phylogenetic research on coccolithophorids and other haptophytes by helping scientists mine data from the primary transcriptome of P. haptonemofera. PMID:26746174
Genetic Diversity among Clostridium botulinum Strains Harboring bont/A2 and bont/A3 Genes

PubMed Central

Raphael, Brian H.; Joseph, Lavin A.; Meno, Sarah R.; Fernández, Rafael A.; Maslanka, Susan E.

2012-01-01

Clostridium botulinum type A strains are known to be genetically diverse and widespread throughout the world. Genetic diversity studies have focused mainly on strains harboring one type A botulinum toxin gene, bont/A1, although all reported bont/A gene variants have been associated with botulism cases. Our study provides insight into the genetic diversity of C. botulinum type A strains, which contain bont/A2 (n = 42) and bont/A3 (n = 4) genes, isolated from diverse samples and geographic origins. Genetic diversity was assessed by using bont nucleotide sequencing, content analysis of the bont gene clusters, multilocus sequence typing (MLST), and pulsed-field gel electrophoresis (PFGE). Sequences of bont genes obtained in this study showed 99.9 to 100% identity with other bont/A2 or bont/A3 gene sequences available in public databases. The neurotoxin gene clusters of the subtype A2 and A3 strains analyzed in this study were similar in gene content. C. botulinum strains harboring bont/A2 and bont/A3 genes were divided into six and two MLST profiles, respectively. Four groups of strains shared a similarity of at least 95% by PFGE; the largest group included 21 out of 46 strains. The strains analyzed in this study showed relatively limited genetic diversity using either MLST or PFGE. PMID:23042179
Isolation and sequencing of the gene encoding Sp23, a structural protein of spermatophore of the mealworm beetle, Tenebrio molitor.

PubMed

Feng, X; Happ, G M

1996-11-14

The cDNA for Sp23, a structural protein of the spermatophore of Tenebrio molitor, had been previously cloned and characterized (Paesen, G.C., Schwartz, M.B., Peferoen, M., Weyda, F. and Happ, G.M. (1992a) Amino acid sequence of Sp23, a structure protein of the spermatophore of the mealworm beetle, Tenebrio molitor. J. Biol. Chem. 257, 18852-18857). Using the labeled cDNA for Sp23 as a probe to screen a library of genomic DNA from Tenebrio molitor, we isolated a genomic clone for Sp23. A 5373-base pair (bp) restriction fragment containing the Sp23 gene was sequenced. The coding region is separated by a 55-bp intron which is located close to the translation start site. Three putative ecdysone response elements (EcRE) are identified in the 5' flanking region of the Sp23 gene. Comparison of the flanking regions of the Sp23 gene with those of the D-protein gene expressed in the accessory glands of Tenebrio reveals similar sequences present in the flanking regions of the two genes. The genomic organization of the coding region of the Sp23 gene shares similarities with that of the D-protein gene, three Drosophila accessory gland genes and two Drosophila 20-OH ecdysone-responsive genes.
Cis-acting elements in the promoter region of the human aldolase C gene.

PubMed

Buono, P; de Conciliis, L; Olivetta, E; Izzo, P; Salvatore, F

1993-08-16

We investigated the cis-acting sequences involved in the expression of the human aldolase C gene by transient transfections into human neuroblastoma cells (SKNBE). We demonstrate that 420 bp of the 5'-flanking DNA direct at high efficiency the transcription of the CAT reporter gene. A deletion between -420 bp and -164 bp causes a 60% decrease of CAT activity. Gel shift and DNase I footprinting analyses revealed four protected elements: A, B, C and D. Competition analyses indicate that Sp1 or factors sharing a similar sequence specificity bind to elements A and B, but not to elements C and D. Sequence analysis shows a half palindromic ERE motif (GGTCA), in elements B and D. Region D binds a transactivating factor which appears also essential to stabilize the initiation complex.
Isolation and Expression Analysis of CYP9A11 and Cytochrome P450 Reductase Gene in the Beet Armyworm (Lepidoptera: Noctuidae)

PubMed Central

Zhao, Chunqing; Feng, Xiaoyun; Tang, Tao; Qiu, Lihong

2015-01-01

Cytochrome P450 monooxygenases (CYPs), as an enzyme superfamily, is widely distributed in organisms and plays a vital function in the metabolism of exogenous and endogenous compounds by interacting with its obligatory redox partner, CYP reductase (CPR). A novel CYP gene (CYP9A11) and CPR gene from the agricultural pest insect Spodoptera exigua were cloned and characterized. The complete cDNA sequences of SeCYP9A11 and SeCPR are 1,931 and 3,919 bp in length, respectively, and contain open reading frames of 1,593 and 2,070 nucleotides, respectively. Analysis of the putative protein sequences indicated that SeCYP9A11 contains a heme-binding domain and the unique characteristic sequence (SRFALCE) of the CYP9 family, in addition to a signal peptide and transmembrane segment at the N-terminal. Alignment analysis revealed that SeCYP9A11 shares the highest sequence similarity with CYP9A13 from Mamestra brassicae, which is 66.54%. The putative protein sequence of SeCPR has all of the classical CPR features, such as an N-terminal membrane anchor; three conserved domain flavin adenine dinucleotide (FAD), flavin mononucleotide (FMN), and nicotinamide adenine dinucleotide phosphate (NADPH) domain; and characteristic binding motifs. Phylogenetic analysis revealed that SeCPR shares the highest identity with HaCPR, which is 95.21%. The SeCYP9A11 and SeCPR genes were detected in the midgut, fat body, and cuticle tissues, and throughout all of the developmental stages of S. exigua. The mRNA levels of SeCYP9A11 and SeCPR decreased remarkably after exposure to plant secondary metabolites quercetin and tannin. The results regarding SeCYP9A11 and SeCPR genes in the current study provide foundation for the further study of S. exigua P450 system. PMID:26320261
Transcript Profiling of Common Bean (Phaseolus vulgaris) Using the GeneChip Soybean Genome Array: Optimizing Analysis by Masking Biased Probes

USDA-ARS?s Scientific Manuscript database

Common bean (Phaseolus vulgaris) and soybean (Glycine max) both belong to the Phaseoleae tribe and share significant coding sequence homology. To evaluate the utility of the soybean GeneChip for transcript profiling of common bean, we hybridized cRNAs purified from nodule, leaf, and root of common b...
Expression and Functional Characterization of two Pathogenesis-Related Protein 10 Genes from Zea mays

USDA-ARS?s Scientific Manuscript database

Pathogenesis-related protein 10 (PR10) is one of seventeen PR protein families and plays important roles in plant response to biotic and abiotic stresses. A novel PR10 gene (ZmPR10.1), which shares 89.8% and 85.7% identity to the previous ZmPR10 at the nucleotide and amino acid sequence level, respe...
ACLAME: a CLAssification of Mobile genetic Elements, update 2010.

PubMed

Leplae, Raphaël; Lima-Mendez, Gipsi; Toussaint, Ariane

2010-01-01

The ACLAME database is dedicated to the collection, analysis and classification of sequenced mobile genetic elements (MGEs, in particular phages and plasmids). In addition to providing information on the MGEs content, classifications are available at various levels of organization. At the gene/protein level, families group similar sequences that are expected to share the same function. Families of four or more proteins are manually assigned with a functional annotation using the GeneOntology and the locally developed ontology MeGO dedicated to MGEs. At the genome level, evolutionary cohesive modules group sets of protein families shared among MGEs. At the population level, networks display the reticulate evolutionary relationships among MGEs. To increase the coverage of the phage sequence space, ACLAME version 0.4 incorporates 760 high-quality predicted prophages selected from the Prophinder database. Most of the data can be downloaded from the freely accessible ACLAME web site (http://aclame.ulb.ac.be). The BLAST interface for querying the database has been extended and numerous tools for in-depth analysis of the results have been added.
Genetic diversity of the DBLalpha region in Plasmodium falciparum var genes among Asia-Pacific isolates.

PubMed

Fowler, Elizabeth V; Peters, Jennifer M; Gatton, Michelle L; Chen, Nanhua; Cheng, Qin

2002-03-01

In Plasmodium falciparum a highly polymorphic multi-copy gene family, var, encodes the variant surface antigen P. falciparum erythrocyte membrane protein 1 (PfEMP1), which has an important role in cytoadherence and immune evasion. Using previously described universal PCR primers for the first Duffy binding-like domain (DBLalpha) of var we analysed the DBLalpha repertoires of Dd2 (originally from Thailand) and eight isolates from the Solomon Islands (n=4), Philippines (n=2), Papua New Guinea (n=1) and Africa (n=1). We found 15-32 unique DBLalpha sequence types among these isolates and estimated detectable DBLalpha repertoire sizes ranging from 33-38 to 52-57 copies per genome. Our data suggest that var gene repertoires generally consist of 40-50 copies per genome. Eighteen DBLalpha sequences appeared in more than one Asia-Pacific isolate with the number of sequences shared between any two isolates ranging from 0 to 6 (mean=2.0 +/-1.6). At the amino acid level DBLalpha sequence similarity within isolates ranged from 45.2 +/- 7.1 to 50.2 +/- 6.9%, and was not significantly different from the DBLalpha amino acid sequence similarity among isolates (P>0.1). Comparisons with published sequences also revealed little overlap among DBLalpha sequences from different regions. High DBLalpha sequence diversity and minimal overlap among these isolates suggest that the global var gene repertoire is immense, and may potentially be selected for by the host's protective immune response to the var gene products, PfEMP1.
Dissemination of IMP-6 metallo-β-lactamase-producing Pseudomonas aeruginosa sequence type 235 in Korea.

PubMed

Seok, Yoonmi; Bae, Il Kwon; Jeong, Seok Hoon; Kim, Soo Hyun; Lee, Hyukmin; Lee, Kyungwon

2011-12-01

To investigate the epidemiological traits of Pseudomonas aeruginosa clinical isolates producing metallo-β-lactamases (MBLs) in Korea. A total of 386 non-duplicate P. aeruginosa clinical isolates were collected from Korea in 2009. Detection of MBL genes was performed by PCR. The genetic organization of class 1 integrons carrying the MBL gene cassette was investigated by PCR mapping and sequencing. The epidemiological relationships of the isolates were investigated by multilocus sequence typing and PFGE. Of 386 P. aeruginosa isolates, 30 (7.8%) isolates carried the bla(IMP-6) gene and 1 (0.3%) isolate carried the bla(VIM-2) gene. A probe specific for the bla(IMP-6) gene was hybridized to an ∼950 kbp I-CeuI-macrorestriction fragment from all 30 isolates and a probe specific for the bla(VIM-2) gene also hybridized to an ∼500 kbp I-CeuI-macrorestriction fragment from 1 isolate (BDC10). All 31 MBL-producing isolates shared an identical sequence type (ST), ST235, and they carried the same bla(OXA-50) allelic type, bla(OXA-50g). All MBL-producing isolates showed similar XbaI-macrorestriction patterns (similarity >85%), irrespective of MBL genotype. P. aeruginosa ST235 carrying the chromosomally located bla(IMP-6) gene is widely disseminated in Korea.
Biochemical and genetic characterization of enterocin A from Enterococcus faecium, a new antilisterial bacteriocin in the pediocin family of bacteriocins.

PubMed Central

Aymerich, T; Holo, H; Håvarstein, L S; Hugas, M; Garriga, M; Nes, I F

1996-01-01

A new bacteriocin has been isolated from an Enterococcus faecium strain. The bacteriocin, termed enterocin A, was purified to homogeneity as judged by sodium dodecyl sulfate-polyacrylamide gel electrophoresis, N-terminal amino acid sequencing, and mass spectrometry analysis. By combining the data obtained from amino acid and DNA sequencing, the primary structure of enterocin A was determined. It consists of 47 amino acid residues, and the molecular weight was calculated to be 4,829, assuming that the four cysteine residues form intramolecular disulfide bridges. This molecular weight was confirmed by mass spectrometry analysis. The amino acid sequence of enterocin A shared significant homology with a group of bacteriocins (now termed pediocin-like bacteriocins) isolated from a variety of lactic acid-producing bacteria, which include members of the genera Lactobacillus, Pediococcus, Leuconostoc, and Carnobacterium. Sequencing of the structural gene of enterocin A, which is located on the bacterial chromosome, revealed an N-terminal leader sequence of 18 amino acid residues, which was removed during the maturation process. The enterocin A leader belongs to the double-glycine leaders which are found among most other small nonlantibiotic bacteriocins, some lantibiotics, and colicin V. Downstream of the enterocin A gene was located a second open reading frame, encoding a putative protein of 103 amino acid residues. This gene may encode the immunity factor of enterocin A, and it shares 40% identity with a similar open reading frame in the operon of leucocin AUL 187, another pediocin-like bacteriocin. PMID:8633865
The Complete Mitochondrial DNA Sequence of Scenedesmus obliquus Reflects an Intermediate Stage in the Evolution of the Green Algal Mitochondrial Genome

PubMed Central

Nedelcu, Aurora M.; Lee, Robert W.; Lemieux, Claude; Gray, Michael W.; Burger, Gertraud

2000-01-01

Two distinct mitochondrial genome types have been described among the green algal lineages investigated to date: a reduced–derived, Chlamydomonas-like type and an ancestral, Prototheca-like type. To determine if this unexpected dichotomy is real or is due to insufficient or biased sampling and to define trends in the evolution of the green algal mitochondrial genome, we sequenced and analyzed the mitochondrial DNA (mtDNA) of Scenedesmus obliquus. This genome is 42,919 bp in size and encodes 42 conserved genes (i.e., large and small subunit rRNA genes, 27 tRNA and 13 respiratory protein-coding genes), four additional free-standing open reading frames with no known homologs, and an intronic reading frame with endonuclease/maturase similarity. No 5S rRNA or ribosomal protein-coding genes have been identified in Scenedesmus mtDNA. The standard protein-coding genes feature a deviant genetic code characterized by the use of UAG (normally a stop codon) to specify leucine, and the unprecedented use of UCA (normally a serine codon) as a signal for termination of translation. The mitochondrial genome of Scenedesmus combines features of both green algal mitochondrial genome types: the presence of a more complex set of protein-coding and tRNA genes is shared with the ancestral type, whereas the lack of 5S rRNA and ribosomal protein-coding genes as well as the presence of fragmented and scrambled rRNA genes are shared with the reduced–derived type of mitochondrial genome organization. Furthermore, the gene content and the fragmentation pattern of the rRNA genes suggest that this genome represents an intermediate stage in the evolutionary process of mitochondrial genome streamlining in green algae. [The sequence data described in this paper have been submitted to the GenBank data library under accession no. AF204057.] PMID:10854413
Presence of a Shared 5'-Leader Sequence in Ancestral Human and Mammalian Retroviruses and Its Transduction into Feline Leukemia Virus.

PubMed

Kawasaki, Junna; Kawamura, Maki; Ohsato, Yoshiharu; Ito, Jumpei; Nishigaki, Kazuo

2017-10-15

Recombination events induce significant genetic changes, and this process can result in virus genetic diversity or in the generation of novel pathogenicity. We discovered a new recombinant feline leukemia virus (FeLV) gag gene harboring an unrelated insertion, termed the X region, which was derived from Felis catus endogenous gammaretrovirus 4 (FcERV-gamma4). The identified FcERV-gamma4 proviruses have lost their coding capabilities, but some can express their viral RNA in feline tissues. Although the X-region-carrying recombinant FeLVs appeared to be replication-defective viruses, they were detected in 6.4% of tested FeLV-infected cats. All isolated recombinant FeLV clones commonly incorporated a middle part of the FcERV-gamma4 5'-leader region as an X region. Surprisingly, a sequence corresponding to the portion contained in all X regions is also present in at least 13 endogenous retroviruses (ERVs) observed in the cat, human, primate, and pig genomes. We termed this shared genetic feature the commonly shared (CS) sequence. Despite our phylogenetic analysis indicating that all CS-sequence-carrying ERVs are classified as gammaretroviruses, no obvious closeness was revealed among these ERVs. However, the Shannon entropy in the CS sequence was lower than that in other parts of the provirus genome. Notably, the CS sequence of human endogenous retrovirus T had 73.8% similarity with that of FcERV-gamma4, and specific signals were detected in the human genome by Southern blot analysis using a probe for the FcERV-gamma4 CS sequence. Our results provide an interesting evolutionary history for CS-sequence circulation among several distinct ancestral viruses and a novel recombined virus over a prolonged period. IMPORTANCE Recombination among ERVs or modern viral genomes causes a rapid evolution of retroviruses, and this phenomenon can result in the serious situation of viral disease reemergence. We identified a novel recombinant FeLV gag gene that contains an unrelated sequence, termed the X region. This region originated from the 5' leader of FcERV-gamma4, a replication-incompetent feline ERV. Surprisingly, a sequence corresponding to the X region is also present in the 5' portion of other ERVs, including human endogenous retroviruses. Scattered copies of the ERVs carrying the unique genetic feature, here named the commonly shared (CS) sequence, were found in each host genome, suggesting that ancestral viruses may have captured and maintained the CS sequence. More recently, a novel recombinant FeLV hijacked the CS sequence from inactivated FcERV-gamma4 as the X region. Therefore, tracing the CS sequences can provide unique models for not only the modern reservoir of new recombinant viruses but also the genetic features shared among ancient retroviruses. Copyright © 2017 American Society for Microbiology.
Generation and Analysis of Expressed Sequence Tags (ESTs) from Halophyte Atriplex canescens to Explore Salt-Responsive Related Genes

PubMed Central

Li, Jingtao; Sun, Xinhua; Yu, Gang; Jia, Chengguo; Liu, Jinliang; Pan, Hongyu

2014-01-01

Little information is available on gene expression profiling of halophyte A. canescens. To elucidate the molecular mechanism for stress tolerance in A. canescens, a full-length complementary DNA library was generated from A. canescens exposed to 400 mM NaCl, and provided 343 high-quality ESTs. In an evaluation of 343 valid EST sequences in the cDNA library, 197 unigenes were assembled, among which 190 unigenes (83.1% ESTs) were identified according to their significant similarities with proteins of known functions. All the 343 EST sequences have been deposited in the dbEST GenBank under accession numbers JZ535802 to JZ536144. According to Arabidopsis MIPS functional category and GO classifications, we identified 193 unigenes of the 311 annotations EST, representing 72 non-redundant unigenes sharing similarities with genes related to the defense response. The sets of ESTs obtained provide a rich genetic resource and 17 up-regulated genes related to salt stress resistance were identified by qRT-PCR. Six of these genes may contribute crucially to earlier and later stage salt stress resistance. Additionally, among the 343 unigenes sequences, 22 simple sequence repeats (SSRs) were also identified contributing to the study of A. canescens resources. PMID:24960361
Discovery, characterization and expression of a novel zebrafish gene, znfr, important for notochord formation.

PubMed

Xu, Yan; Zou, Peng; Liu, Yao; Deng, Fengjiao

2010-06-01

Genes specifically expressed in the notochord may be crucial for proper notochord development. Using the digital differential display program offered by the National Center for Biotechnology Information, we identified a novel EST sequence from a zebrafish ovary library (No. XM_701450). The full-length cDNA of this transcript was cloned by performing 3' and 5'-RACE and was further confirmed by PCR and sequencing. The resulting 614 bp gene was found to encode a novel 94 amino acid protein that did not share significant homology with any other known protein. Characterization of the genomic sequence revealed that the gene spanned 4.9 kb and was composed of four exons and three introns. RT-PCR gene expression analysis revealed that our gene of interest was expressed in ovary, kidney, brain, mature oocytes and during the early stages of embryogenesis. During embryonic development, znfr mRNA was found to be expressed in the embryonic shield, chordamesoderm and the vacuolated notochord cells by in situ hybridization. Based on this information, we hypothesize that this novel gene is an important maternal factor required for zebrafish notochord formation during early embryonic development. We have thus named this gene znfr (zebrafish notochord formation related).
Genome sequence of an enhancin gene-rich nucleopolyhedrovirus (NPV) from Agrotis segetum: collinearity with Spodoptera exigua multiple NPV.

PubMed

Jakubowska, Agata K; Peters, Sander A; Ziemnicka, Jadwiga; Vlak, Just M; van Oers, Monique M

2006-03-01

The genome sequence of a Polish isolate of Agrotis segetum nucleopolyhedrovirus (AgseNPV-A) was determined and analysed. The circular genome is composed of 147,544 bp and has a G+C content of 45.7 mol%. It contains 153 putative, non-overlapping open reading frames (ORFs) encoding predicted proteins of more than 50 aa, together making up 89.8 % of the genome. The remaining 10.2 % of the DNA constitutes non-coding regions and homologous-repeat regions. One hundred and forty-three AgseNPV-A ORFs are homologues of previously reported baculovirus gene sequences. There are ten unique ORFs and they account for 3 % of the genome in total. All 62 lepidopteran baculovirus genes, including the 29 core baculovirus genes, were found in the AgseNPV-A genome. The gene content and gene order of AgseNPV-A are most similar to those of Spodoptera exigua (Se) multiple NPV and their shared homologous genes are 100 % collinear. Three putative enhancin genes were identified in the AgseNPV-A genome. In phylogenetic analysis, the AgseNPV-A enhancins form a cluster separated from enhancins of the Mamestra species NPVs.
The bipartite mitochondrial genome of Ruizia karukerae (Rhigonematomorpha, Nematoda).

PubMed

Kim, Taeho; Kern, Elizabeth; Park, Chungoo; Nadler, Steven A; Bae, Yeon Jae; Park, Joong-Ki

2018-05-10

Mitochondrial genes and whole mitochondrial genome sequences are widely used as molecular markers in studying population genetics and resolving both deep and shallow nodes in phylogenetics. In animals the mitochondrial genome is generally composed of a single chromosome, but mystifying exceptions sometimes occur. We determined the complete mitochondrial genome of the millipede-parasitic nematode Ruizia karukerae and found its mitochondrial genome consists of two circular chromosomes, which is highly unusual in bilateral animals. Chromosome I is 7,659 bp and includes six protein-coding genes, two rRNA genes and nine tRNA genes. Chromosome II comprises 7,647 bp, with seven protein-coding genes and 16 tRNA genes. Interestingly, both chromosomes share a 1,010 bp sequence containing duplicate copies of cox2 and three tRNA genes (trnD, trnG and trnH), and the nucleotide sequences between the duplicated homologous gene copies are nearly identical, suggesting a possible recent genesis for this bipartite mitochondrial genome. Given that little is known about the formation, maintenance or evolution of abnormal mitochondrial genome structures, R. karukerae mtDNA may provide an important early glimpse into this process.
NABIC: A New Access Portal to Search, Visualize, and Share Agricultural Genomics Data

PubMed Central

Seol, Young-Joo; Lee, Tae-Ho; Park, Dong-Suk; Kim, Chang-Kug

2016-01-01

The National Agricultural Biotechnology Information Center developed an access portal to search, visualize, and share agricultural genomics data with a focus on South Korean information and resources. The portal features an agricultural biotechnology database containing a wide range of omics data from public and proprietary sources. We collected 28.4 TB of data from 162 agricultural organisms, with 10 types of omics data comprising next-generation sequencing sequence read archive, genome, gene, nucleotide, DNA chip, expressed sequence tag, interactome, protein structure, molecular marker, and single-nucleotide polymorphism datasets. Our genomic resources contain information on five animals, seven plants, and one fungus, which is accessed through a genome browser. We also developed a data submission and analysis system as a web service, with easy-to-use functions and cutting-edge algorithms, including those for handling next-generation sequencing data. PMID:26848255
Genome-wide analysis of the homeodomain-leucine zipper (HD-ZIP) gene family in peach (Prunus persica).

PubMed

Zhang, C H; Ma, R J; Shen, Z J; Sun, X; Korir, N K; Yu, M L

2014-04-08

In this study, 33 homeodomain-leucine zipper (HD-ZIP) genes were identified in peach using the HD-ZIP amino acid sequences of Arabidopsis thaliana as a probe. Based on the phylogenetic analysis and the individual gene or protein characteristics, the HD-ZIP gene family in peach can be classified into 4 subfamilies, HD-ZIP I, II, III, and IV, containing 14, 7, 4, and 8 members, respectively. The most closely related peach HD-ZIP members within the same subfamilies shared very similar gene structure in terms of either intron/exon numbers or lengths. Almost all members of the same subfamily shared common motif compositions, thereby implying that the HD-ZIP proteins within the same subfamily may have functional similarity. The 33 peach HD-ZIP genes were distributed across scaffolds 1 to 7. Although the primary structure varied among HD-ZIP family proteins, their tertiary structures were similar. The results from this study will be useful in selecting candidate genes from specific subfamilies for functional analysis.

FoxP2 in song-learning birds and vocal-learning mammals.

PubMed

Webb, D M; Zhang, J

2005-01-01

FoxP2 is the first identified gene that is specifically involved in speech and language development in humans. Population genetic studies of FoxP2 revealed a selective sweep in recent human history associated with two amino acid substitutions in exon 7. Avian song learning and human language acquisition share many behavioral and neurological similarities. To determine whether FoxP2 plays a similar role in song-learning birds, we sequenced exon 7 of FoxP2 in multiple song-learning and nonlearning birds. We show extreme conservation of FoxP2 sequences in birds, including unusually low rates of synonymous substitutions. However, no amino acid substitutions are shared between the song-learning birds and humans. Furthermore, sequences from vocal-learning whales, dolphins, and bats do not share the human-unique substitutions. While FoxP2 appears to be under strong functional constraints in mammals and birds, we find no evidence for its role during the evolution of vocal learning in nonhuman animals as in humans.
Sequence change in the HS2-LCR and Ggamma-globin gene promoter region of sickle cell anemia patients.

PubMed

Adorno, E V; Moura-Neto, J P; Lyra, I; Zanette, A; Santos, L F O; Seixas, M O; Reis, M G; Goncalves, M S

2008-02-01

The fetal hemoglobin (HbF) levels and betaS-globin gene haplotypes of 125 sickle cell anemia patients from Brazil were investigated. We sequenced the Ggamma- and Agamma-globin gene promoters and the DNase I-2 hypersensitive sites in the locus control regions (HS2-LCR) of patients with HbF level disparities as compared to their betaS haplotypes. Sixty-four (51.2%) patients had CAR/Ben genotype; 36 (28.8%) Ben/Ben; 18 (14.4%) CAR/CAR; 2 (1.6%) CAR/Atypical; 2 (1.6%) Ben/Cam; 1 (0.8%) CAR/Cam; 1 (0.8%) CAR/Arab-Indian, and 1 (0.8%) Sen/Atypical. The HS2-LCR sequence analyses demonstrated a c.-10.677G>A change in patients with the Ben haplotype and high HbF levels. The Gg gene promoter sequence analyses showed a c.-157T>C substitution shared by all patients, and a c.-222_-225del related to the Cam haplotype. These results identify new polymorphisms in the HS2-LCR and Gg-globin gene promoter. Further studies are required to determine the correlation between HbF synthesis and the clinical profile of sickle cell anemia patients.
Molecular cloning, characterization and expression analysis of TLR9, MyD88 and TRAF6 genes in common carp (Cyprinus carpio).

PubMed

Kongchum, Pawapol; Hallerman, Eric M; Hulata, Gideon; David, Lior; Palti, Yniv

2011-01-01

Induction of innate immune pathways is critical for early host defense, but there is limited understanding of how teleost fishes recognize pathogen molecules and activate these pathways. In mammals, cells of the innate immune system detect pathogenic molecular structures using pattern recognition receptors (PRRs). TLR9 functions as a PRR that recognizes CpG motifs in bacterial and viral DNA and requires adaptor molecules MyD88 and TRAF6 for signal transduction. Here we report full-length cDNA isolation, structural characterization and tissue mRNA expression analysis of the common carp (cc) TLR9, MyD88 and TRAF6 gene orthologs. The ccTLR9 open-reading frame (ORF) is predicted to encode a 1064-amino acid (aa) protein. We found that MyD88 and TRAF6 genes are duplicated in common carp. This is the first report of TRAF6 duplication in a vertebrate genome and stronger evidence in support of MyD88 duplication is provided. The ccMyD88a and b ORFs are predicted to encode 288-aa and 284-aa peptides, respectively. They share 91% aa sequence identity between paralogs. The ccTRAF6a and b ORFs are both predicted to encode 543-aa peptides sharing 95% aa sequence identity between paralogs. The ccTLR9 gene is contained in a single large exon. The ccMyD88a and ccMyD88b coding sequences span five exons. The TRAF6b gene spans six exons. PCR amplification to obtain the entire coding sequence of ccTRAF6a gene was not successful. The 2104-bp fragment amplified covers the 3' end of the gene and it contains a partial sequence of one exon and three complete exons. The predicated protein domains of the ccTLR9, ccMyD88 and ccTRAF6 are conserved and resemble orthologs from other vertebrates. Real-time quantitative PCR assays of the ccTLR9, MyD88a and b, and TRAF6a and b gene transcripts in healthy common carp indicated that mRNA expression varied between tissues. Differential expression of duplicate copies were found for ccMyD88 and ccTRAF6 in white and red muscle tissues, suggesting that paralogs may have evolved and attained a new function. The genomic information we describe in this paper provides evidence of sequence and structural conservation of immune response genes in common carp. Published by Elsevier Ltd.
Complete mitochondrial DNA sequence of oyster Crassostrea hongkongensis-a case of "Tandem duplication-random loss" for genome rearrangement in Crassostrea?

PubMed Central

Yu, Ziniu; Wei, Zhengpeng; Kong, Xiaoyu; Shi, Wei

2008-01-01

Background Mitochondrial DNA sequences are extensively used as genetic markers not only for studies of population or ecological genetics, but also for phylogenetic and evolutionary analyses. Complete mt-sequences can reveal information about gene order and its variation, as well as gene and genome evolution when sequences from multiple phyla are compared. Mitochondrial gene order is highly variable among mollusks, with bivalves exhibiting the most variability. Of the 41 complete mt genomes sequenced so far, 12 are from bivalves. We determined, in the current study, the complete mitochondrial DNA sequence of Crassostrea hongkongensis. We present here an analysis of features of its gene content and genome organization in comparison with two other Crassostrea species to assess the variation within bivalves and among main groups of mollusks. Results The complete mitochondrial genome of C. hongkongensis was determined using long PCR and a primer walking sequencing strategy with genus-specific primers. The genome is 16,475 bp in length and contains 12 protein-coding genes (the atp8 gene is missing, as in most bivalves), 22 transfer tRNA genes (including a suppressor tRNA gene), and 2 ribosomal RNA genes, all of which appear to be transcribed from the same strand. A striking finding of this study is that a DNA segment containing four tRNA genes (trnk1, trnC, trnQ1 and trnN) and two duplicated or split rRNA gene (rrnL5' and rrnS) are absent from the genome, when compared with that of two other extant Crassostrea species, which is very likely a consequence of loss of a single genomic region present in ancestor of C. hongkongensis. It indicates this region seem to be a "hot spot" of genomic rearrangements over the Crassostrea mt-genomes. The arrangement of protein-coding genes in C. hongkongensis is identical to that of Crassostrea gigas and Crassostrea virginica, but higher amino acid sequence identities are shared between C. hongkongensis and C. gigas than between other pairs. There exists significant codon bias, favoring codons ending in A or T and against those ending with C. Pair analysis of genome rearrangements showed that the rearrangement distance is great between C. gigas-C. hongkongensis and C. virginica, indicating a high degree of rearrangements within Crassostrea. The determination of complete mt-genome of C. hongkongensis has yielded useful insight into features of gene order, variation, and evolution of Crassostrea and bivalve mt-genomes. Conclusion The mt-genome of C. hongkongensis shares some similarity with, and interesting differences to, other Crassostrea species and bivalves. The absence of trnC and trnN genes and duplicated or split rRNA genes from the C. hongkongensis genome is a completely novel feature not previously reported in Crassostrea species. The phenomenon is likely due to the loss of a segment that is present in other Crassostrea species and was present in ancestor of C. hongkongensis, thus a case of "tandem duplication-random loss (TDRL)". The mt-genome and new feature presented here reveal and underline the high level variation of gene order and gene content in Crassostrea and bivalves, inspiring more research to gain understanding to mechanisms underlying gene and genome evolution in bivalves and mollusks. PMID:18847502
Gene response profiles for Daphnia pulex exposed to the environmental stressor cadmium reveals novel crustacean metallothioneins.

PubMed

Shaw, Joseph R; Colbourne, John K; Davey, Jennifer C; Glaholt, Stephen P; Hampton, Thomas H; Chen, Celia Y; Folt, Carol L; Hamilton, Joshua W

2007-12-21

Genomic research tools such as microarrays are proving to be important resources to study the complex regulation of genes that respond to environmental perturbations. A first generation cDNA microarray was developed for the environmental indicator species Daphnia pulex, to identify genes whose regulation is modulated following exposure to the metal stressor cadmium. Our experiments revealed interesting changes in gene transcription that suggest their biological roles and their potentially toxicological features in responding to this important environmental contaminant. Our microarray identified genes reported in the literature to be regulated in response to cadmium exposure, suggested functional attributes for genes that share no sequence similarity to proteins in the public databases, and pointed to genes that are likely members of expanded gene families in the Daphnia genome. Genes identified on the microarray also were associated with cadmium induced phenotypes and population-level outcomes that we experimentally determined. A subset of genes regulated in response to cadmium exposure was independently validated using quantitative-realtime (Q-RT)-PCR. These microarray studies led to the discovery of three genes coding for the metal detoxication protein metallothionein (MT). The gene structures and predicted translated sequences of D. pulex MTs clearly place them in this gene family. Yet, they share little homology with previously characterized MTs. The genomic information obtained from this study represents an important first step in characterizing microarray patterns that may be diagnostic to specific environmental contaminants and give insights into their toxicological mechanisms, while also providing a practical tool for evolutionary, ecological, and toxicological functional gene discovery studies. Advances in Daphnia genomics will enable the further development of this species as a model organism for the environmental sciences.
Gene response profiles for Daphnia pulex exposed to the environmental stressor cadmium reveals novel crustacean metallothioneins

PubMed Central

Shaw, Joseph R; Colbourne, John K; Davey, Jennifer C; Glaholt, Stephen P; Hampton, Thomas H; Chen, Celia Y; Folt, Carol L; Hamilton, Joshua W

2007-01-01

Background Genomic research tools such as microarrays are proving to be important resources to study the complex regulation of genes that respond to environmental perturbations. A first generation cDNA microarray was developed for the environmental indicator species Daphnia pulex, to identify genes whose regulation is modulated following exposure to the metal stressor cadmium. Our experiments revealed interesting changes in gene transcription that suggest their biological roles and their potentially toxicological features in responding to this important environmental contaminant. Results Our microarray identified genes reported in the literature to be regulated in response to cadmium exposure, suggested functional attributes for genes that share no sequence similarity to proteins in the public databases, and pointed to genes that are likely members of expanded gene families in the Daphnia genome. Genes identified on the microarray also were associated with cadmium induced phenotypes and population-level outcomes that we experimentally determined. A subset of genes regulated in response to cadmium exposure was independently validated using quantitative-realtime (Q-RT)-PCR. These microarray studies led to the discovery of three genes coding for the metal detoxication protein metallothionein (MT). The gene structures and predicted translated sequences of D. pulex MTs clearly place them in this gene family. Yet, they share little homology with previously characterized MTs. Conclusion The genomic information obtained from this study represents an important first step in characterizing microarray patterns that may be diagnostic to specific environmental contaminants and give insights into their toxicological mechanisms, while also providing a practical tool for evolutionary, ecological, and toxicological functional gene discovery studies. Advances in Daphnia genomics will enable the further development of this species as a model organism for the environmental sciences. PMID:18154678
Cloning, analysis and functional annotation of expressed sequence tags from the Earthworm Eisenia fetida

PubMed Central

Pirooznia, Mehdi; Gong, Ping; Guan, Xin; Inouye, Laura S; Yang, Kuan; Perkins, Edward J; Deng, Youping

2007-01-01

Background Eisenia fetida, commonly known as red wiggler or compost worm, belongs to the Lumbricidae family of the Annelida phylum. Little is known about its genome sequence although it has been extensively used as a test organism in terrestrial ecotoxicology. In order to understand its gene expression response to environmental contaminants, we cloned 4032 cDNAs or expressed sequence tags (ESTs) from two E. fetida libraries enriched with genes responsive to ten ordnance related compounds using suppressive subtractive hybridization-PCR. Results A total of 3144 good quality ESTs (GenBank dbEST accession number EH669363–EH672369 and EL515444–EL515580) were obtained from the raw clone sequences after cleaning. Clustering analysis yielded 2231 unique sequences including 448 contigs (from 1361 ESTs) and 1783 singletons. Comparative genomic analysis showed that 743 or 33% of the unique sequences shared high similarity with existing genes in the GenBank nr database. Provisional function annotation assigned 830 Gene Ontology terms to 517 unique sequences based on their homology with the annotated genomes of four model organisms Drosophila melanogaster, Mus musculus, Saccharomyces cerevisiae, and Caenorhabditis elegans. Seven percent of the unique sequences were further mapped to 99 Kyoto Encyclopedia of Genes and Genomes pathways based on their matching Enzyme Commission numbers. All the information is stored and retrievable at a highly performed, web-based and user-friendly relational database called EST model database or ESTMD version 2. Conclusion The ESTMD containing the sequence and annotation information of 4032 E. fetida ESTs is publicly accessible at . PMID:18047730
Ginkgo and Welwitschia Mitogenomes Reveal Extreme Contrasts in Gymnosperm Mitochondrial Evolution.

PubMed

Guo, Wenhu; Grewe, Felix; Fan, Weishu; Young, Gregory J; Knoop, Volker; Palmer, Jeffrey D; Mower, Jeffrey P

2016-06-01

Mitochondrial genomes (mitogenomes) of flowering plants are well known for their extreme diversity in size, structure, gene content, and rates of sequence evolution and recombination. In contrast, little is known about mitogenomic diversity and evolution within gymnosperms. Only a single complete genome sequence is available, from the cycad Cycas taitungensis, while limited information is available for the one draft sequence, from Norway spruce (Picea abies). To examine mitogenomic evolution in gymnosperms, we generated complete genome sequences for the ginkgo tree (Ginkgo biloba) and a gnetophyte (Welwitschia mirabilis). There is great disparity in size, sequence conservation, levels of shared DNA, and functional content among gymnosperm mitogenomes. The Cycas and Ginkgo mitogenomes are relatively small, have low substitution rates, and possess numerous genes, introns, and edit sites; we infer that these properties were present in the ancestral seed plant. By contrast, the Welwitschia mitogenome has an expanded size coupled with accelerated substitution rates and extensive loss of these functional features. The Picea genome has expanded further, to more than 4 Mb. With regard to structural evolution, the Cycas and Ginkgo mitogenomes share a remarkable amount of intergenic DNA, which may be related to the limited recombinational activity detected at repeats in Ginkgo Conversely, the Welwitschia mitogenome shares almost no intergenic DNA with any other seed plant. By conducting the first measurements of rates of DNA turnover in seed plant mitogenomes, we discovered that turnover rates vary by orders of magnitude among species. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Sequence Divergence and Conservation in Genomes of Helicobacter cetorum Strains from a Dolphin and a Whale

PubMed Central

Kersulyte, Dangeruta; Rossi, Mirko; Berg, Douglas E.

2013-01-01

Background and Objectives Strains of Helicobacter cetorum have been cultured from several marine mammals and have been found to be closely related in 16 S rDNA sequence to the human gastric pathogen H. pylori, but their genomes were not characterized further. Methods The genomes of H. cetorum strains from a dolphin and a whale were sequenced completely using 454 technology and PCR and capillary sequencing. Results These genomes are 1.8 and 1.95 mb in size, some 7–26% larger than H. pylori genomes, and differ markedly from one another in gene content, and sequences and arrangements of shared genes. However, each strain is more related overall to H. pylori and its descendant H. acinonychis than to other known species. These H. cetorum strains lack cag pathogenicity islands, but contain novel alleles of the virulence-associated vacuolating cytotoxin (vacA) gene. Of particular note are (i) an extra triplet of vacA genes with ≤50% protein-level identity to each other in the 5′ two-thirds of the gene needed for host factor interaction; (ii) divergent sets of outer membrane protein genes; (iii) several metabolic genes distinct from those of H. pylori; (iv) genes for an iron-cofactored urease related to those of Helicobacter species from terrestrial carnivores, in addition to genes for a nickel co-factored urease; and (v) members of the slr multigene family, some of which modulate host responses to infection and improve Helicobacter growth with mammalian cells. Conclusions Our genome sequence data provide a glimpse into the novelty and great genetic diversity of marine helicobacters. These data should aid further analyses of microbial genome diversity and evolution and infection and disease mechanisms in vast and often fragile ocean ecosystems. PMID:24358262
Age-related regulation of genes: slow homeostatic changes and age-dimension technology

NASA Astrophysics Data System (ADS)

Kurachi, Kotoku; Zhang, Kezhong; Huo, Jeffrey; Ameri, Afshin; Kuwahara, Mitsuhiro; Fontaine, Jean-Marc; Yamamoto, Kei; Kurachi, Sumiko

2002-11-01

Through systematic studies of pro- and anti-blood coagulation factors, we have determined molecular mechanisms involving two genetic elements, age-related stability element (ASE), GAGGAAG and age-related increase element (AIE), a unique stretch of dinucleotide repeats (AIE). ASE and AIE are essential for age-related patterns of stable and increased gene expression patterns, respectively. Such age-related gene regulatory mechanisms are also critical for explaining homeostasis in various physiological reactions as well as slow homeostatic changes in them. The age-related increase expression of the human factor IX (hFIX) gene requires the presence of both ASE and AIE, which apparently function additively. The anti-coagulant factor protein C (hPC) gene uses an ASE (CAGGAG) to produce age-related stable expression. Both ASE sequences (G/CAGAAG) share consensus sequence of the transcriptional factor PEA-3 element. No other similar sequences, including another PEA-3 consensus sequence, GAGGATG, function in conferring age-related gene regulation. The age-regulatory mechanisms involving ASE and AIE apparently function universally with different genes and across different animal species. These findings have led us to develop a new field of research and applications, which we named “age-dimension technology (ADT)”. ADT has exciting potential for modifying age-related expression of genes as well as associated physiological processes, and developing novel, more effective prophylaxis or treatments for age-related diseases.
The mitochondrial genome sequence of Enterobius vermicularis (Nematoda: Oxyurida)--an idiosyncratic gene order and phylogenetic information for chromadorean nematodes.

PubMed

Kang, Seokha; Sultana, Tahera; Eom, Keeseon S; Park, Yung Chul; Soonthornpong, Nathan; Nadler, Steven A; Park, Joong-Ki

2009-01-15

The complete mitochondrial genome sequence was determined for the human pinworm Enterobius vermicularis (Oxyurida: Nematoda) and used to infer its phylogenetic relationship to other major groups of chromadorean nematodes. The E. vermicularis genome is a 14,010-bp circular DNA molecule that encodes 36 genes (12 proteins, 22 tRNAs, and 2 rRNAs). This mtDNA genome lacks atp8, as reported for almost all other nematode species investigated. Phylogenetic analyses (maximum parsimony, maximum likelihood, neighbor joining, and Bayesian inference) of nucleotide sequences for the 12 protein-coding genes of 25 nematode species placed E. vermicularis, a representative of the order Oxyurida, as sister to the main Ascaridida+Rhabditida group. Tree topology comparisons using statistical tests rejected an alternative hypothesis favoring a closer relationship among Ascaridida, Spirurida, and Oxyurida, which has been supported from most studies based on nuclear ribosomal DNA sequences. Unlike the relatively conserved gene arrangement found for most chromadorean taxa, E. vermicularis mtDNA gene order is very unique, not sharing similarity to any other nematode species reported to date. This lack of gene order similarity may represent idiosyncratic gene rearrangements unique to this specific lineage of the oxyurids. To more fully understand the extent of gene rearrangement and its evolutionary significance within the nematode phylogenetic framework, additional mitochondrial genomes representing a greater evolutionary diversity of species must be characterized.
The genome of woodland strawberry (Fragaria vesca)

PubMed Central

Shulaev, Vladimir; Sargent, Daniel J; Crowhurst, Ross N; Mockler, Todd C; Folkerts, Otto; Delcher, Arthur L; Jaiswal, Pankaj; Mockaitis, Keithanne; Liston, Aaron; Mane, Shrinivasrao P; Burns, Paul; Davis, Thomas M; Slovin, Janet P; Bassil, Nahla; Hellens, Roger P; Evans, Clive; Harkins, Tim; Kodira, Chinnappa; Desany, Brian; Crasta, Oswald R; Jensen, Roderick V; Allan, Andrew C; Michael, Todd P; Setubal, Joao Carlos; Celton, Jean-Marc; Rees, D Jasper G; Williams, Kelly P; Holt, Sarah H; Ruiz Rojas, Juan Jairo; Chatterjee, Mithu; Liu, Bo; Silva, Herman; Meisel, Lee; Adato, Avital; Filichkin, Sergei A; Troggio, Michela; Viola, Roberto; Ashman, Tia-Lynn; Wang, Hao; Dharmawardhana, Palitha; Elser, Justin; Raja, Rajani; Priest, Henry D; Bryant, Douglas W; Fox, Samuel E; Givan, Scott A; Wilhelm, Larry J; Naithani, Sushma; Christoffels, Alan; Salama, David Y; Carter, Jade; Girona, Elena Lopez; Zdepski, Anna; Wang, Wenqin; Kerstetter, Randall A; Schwab, Wilfried; Korban, Schuyler S; Davik, Jahn; Monfort, Amparo; Denoyes-Rothan, Beatrice; Arus, Pere; Mittler, Ron; Flinn, Barry; Aharoni, Asaph; Bennetzen, Jeffrey L; Salzberg, Steven L; Dickerman, Allan W; Velasco, Riccardo; Borodovsky, Mark; Veilleux, Richard E; Folta, Kevin M

2012-01-01

The woodland strawberry, Fragaria vesca (2n = 2x = 14), is a versatile experimental plant system. This diminutive herbaceous perennial has a small genome (240 Mb), is amenable to genetic transformation and shares substantial sequence identity with the cultivated strawberry (Fragaria × ananassa) and other economically important rosaceous plants. Here we report the draft F. vesca genome, which was sequenced to ×39 coverage using second-generation technology, assembled de novo and then anchored to the genetic linkage map into seven pseudochromosomes. This diploid strawberry sequence lacks the large genome duplications seen in other rosids. Gene prediction modeling identified 34,809 genes, with most being supported by transcriptome mapping. Genes critical to valuable horticultural traits including flavor, nutritional value and flowering time were identified. Macrosyntenic relationships between Fragaria and Prunus predict a hypothetical ancestral Rosaceae genome that had nine chromosomes. New phylogenetic analysis of 154 protein-coding genes suggests that assignment of Populus to Malvidae, rather than Fabidae, is warranted. PMID:21186353
A complete mitochondrial genome sequence of Asian black bear Sichuan subspecies (Ursus thibetanus mupinensis)

PubMed Central

Hou, Wan-ru; Chen, Yu; Wu, Xia; Hu, Jin-chu; Peng, Zheng-song; Yang, Jung; Tang, Zong-xiang; Zhou, Cai-Quan; Li, Yu-ming; Yang, Shi-kui; Du, Yu-jie; Kong, Ling-lu; Ren, Zheng-long; Zhang, Huai-yu; Shuai, Su-rong

2007-01-01

We obtained the complete mitochondrial genome of U.thibetanus mupinensis by DNA sequencing based on the PCR fragments of 18 primers we designed. The results indicate that the mtDNA is 16 868 bp in size, encodes 13 protein genes, 22 tRNA genes, and 2 rRNA genes, with an overall H-strand base composition of 31.2% A, 25.4% C, 15.5% G and 27.9% T. The sequence of the control region (CR) located between tRNA-Pro and tRNA-Phe is 1422 bp in size, consists of 8.43% of the whole genome, GC content is 51.9% and has a 6bp tandem repeat and two 10bp tandem repeats identified by using the Tandem Repeats Finder. U. thibetanus mupinensis mitochondrial genome shares high similarity with those of three other Ursidae: U. americanus (91.46%), U. arctos (89.25%) and U. maritimus (87.66%). PMID:17205108
Exome Sequencing in Suspected Monogenic Dyslipidemias

PubMed Central

Stitziel, Nathan O.; Peloso, Gina M.; Abifadel, Marianne; Cefalu, Angelo B.; Fouchier, Sigrid; Motazacker, M. Mahdi; Tada, Hayato; Larach, Daniel B.; Awan, Zuhier; Haller, Jorge F.; Pullinger, Clive R.; Varret, Mathilde; Rabès, Jean-Pierre; Noto, Davide; Tarugi, Patrizia; Kawashiri, Masa-aki; Nohara, Atsushi; Yamagishi, Masakazu; Risman, Marjorie; Deo, Rahul; Ruel, Isabelle; Shendure, Jay; Nickerson, Deborah A.; Wilson, James G.; Rich, Stephen S.; Gupta, Namrata; Farlow, Deborah N.; Neale, Benjamin M.; Daly, Mark J.; Kane, John P.; Freeman, Mason W.; Genest, Jacques; Rader, Daniel J.; Mabuchi, Hiroshi; Kastelein, John J.P.; Hovingh, G. Kees; Averna, Maurizio R.; Gabriel, Stacey; Boileau, Catherine; Kathiresan, Sekar

2015-01-01

Background Exome sequencing is a promising tool for gene mapping in Mendelian disorders. We utilized this technique in an attempt to identify novel genes underlying monogenic dyslipidemias. Methods and Results We performed exome sequencing on 213 selected family members from 41 kindreds with suspected Mendelian inheritance of extreme levels of low-density lipoprotein (LDL) cholesterol (after candidate gene sequencing excluded known genetic causes for high LDL cholesterol families) or high-density lipoprotein (HDL) cholesterol. We used standard analytic approaches to identify candidate variants and also assigned a polygenic score to each individual in order to account for their burden of common genetic variants known to influence lipid levels. In nine families, we identified likely pathogenic variants in known lipid genes (ABCA1, APOB, APOE, LDLR, LIPA, and PCSK9); however, we were unable to identify obvious genetic etiologies in the remaining 32 families despite follow-up analyses. We identified three factors that limited novel gene discovery: (1) imperfect sequencing coverage across the exome hid potentially causal variants; (2) large numbers of shared rare alleles within families obfuscated causal variant identification; and (3) individuals from 15% of families carried a significant burden of common lipid-related alleles, suggesting complex inheritance can masquerade as monogenic disease. Conclusions We identified the genetic basis of disease in nine of 41 families; however, none of these represented novel gene discoveries. Our results highlight the promise and limitations of exome sequencing as a discovery technique in suspected monogenic dyslipidemias. Considering the confounders identified may inform the design of future exome sequencing studies. PMID:25632026
G-Boxes, Bigfoot Genes, and Environmental Response: Characterization of Intragenomic Conserved Noncoding Sequences in Arabidopsis[W

PubMed Central

Freeling, Michael; Rapaka, Lakshmi; Lyons, Eric; Pedersen, Brent; Thomas, Brian C.

2007-01-01

A tetraploidy left Arabidopsis thaliana with 6358 pairs of homoeologs that, when aligned, generated 14,944 intragenomic conserved noncoding sequences (CNSs). Our previous work assembled these phylogenetic footprints into a database. We show that known transcription factor (TF) binding motifs, including the G-box, are overrepresented in these CNSs. A total of 254 genes spanning long lengths of CNS-rich chromosomes (Bigfoot) dominate this database. Therefore, we made subdatabases: one containing Bigfoot genes and the other containing genes with three to five CNSs (Smallfoot). Bigfoot genes are generally TFs that respond to signals, with their modal CNS positioned 3.1 kb 5′ from the ATG. Smallfoot genes encode components of signal transduction machinery, the cytoskeleton, or involve transcription. We queried each subdatabase with each possible 7-nucleotide sequence. Among hundreds of hits, most were purified from CNSs, and almost all of those significantly enriched in CNSs had no experimental history. The 7-mers in CNSs are not 5′- to 3′-oriented in Bigfoot genes but are often oriented in Smallfoot genes. CNSs with one G-box tend to have two G-boxes. CNSs were shared with the homoeolog only and with no other gene, suggesting that binding site turnover impedes detection. Bigfoot genes may function in adaptation to environmental change. PMID:17496117
G-boxes, bigfoot genes, and environmental response: characterization of intragenomic conserved noncoding sequences in Arabidopsis.

PubMed

Freeling, Michael; Rapaka, Lakshmi; Lyons, Eric; Pedersen, Brent; Thomas, Brian C

2007-05-01

A tetraploidy left Arabidopsis thaliana with 6358 pairs of homoeologs that, when aligned, generated 14,944 intragenomic conserved noncoding sequences (CNSs). Our previous work assembled these phylogenetic footprints into a database. We show that known transcription factor (TF) binding motifs, including the G-box, are overrepresented in these CNSs. A total of 254 genes spanning long lengths of CNS-rich chromosomes (Bigfoot) dominate this database. Therefore, we made subdatabases: one containing Bigfoot genes and the other containing genes with three to five CNSs (Smallfoot). Bigfoot genes are generally TFs that respond to signals, with their modal CNS positioned 3.1 kb 5' from the ATG. Smallfoot genes encode components of signal transduction machinery, the cytoskeleton, or involve transcription. We queried each subdatabase with each possible 7-nucleotide sequence. Among hundreds of hits, most were purified from CNSs, and almost all of those significantly enriched in CNSs had no experimental history. The 7-mers in CNSs are not 5'- to 3'-oriented in Bigfoot genes but are often oriented in Smallfoot genes. CNSs with one G-box tend to have two G-boxes. CNSs were shared with the homoeolog only and with no other gene, suggesting that binding site turnover impedes detection. Bigfoot genes may function in adaptation to environmental change.
The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans.

PubMed

Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

2015-07-20

Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Corporate control and global governance of marine genetic resources

PubMed Central

Österblom, Henrik

2018-01-01

Who owns ocean biodiversity? This is an increasingly relevant question, given the legal uncertainties associated with the use of genetic resources from areas beyond national jurisdiction, which cover half of the Earth’s surface. We accessed 38 million records of genetic sequences associated with patents and created a database of 12,998 sequences extracted from 862 marine species. We identified >1600 sequences from 91 species associated with deep-sea and hydrothermal vent systems, reflecting commercial interest in organisms from remote ocean areas, as well as a capacity to collect and use the genes of such species. A single corporation registered 47% of all marine sequences included in gene patents, exceeding the combined share of 220 other companies (37%). Universities and their commercialization partners registered 12%. Actors located or headquartered in 10 countries registered 98% of all patent sequences, and 165 countries were unrepresented. Our findings highlight the importance of inclusive participation by all states in international negotiations and the urgency of clarifying the legal regime around access and benefit sharing of marine genetic resources. We identify a need for greater transparency regarding species provenance, transfer of patent ownership, and activities of corporations with a disproportionate influence over the patenting of marine biodiversity. We suggest that identifying these key actors is a critical step toward encouraging innovation, fostering greater equity, and promoting better ocean stewardship. PMID:29881777
Sequence and expression variation in SUPPRESSOR of OVEREXPRESSION of CONSTANS 1 (SOC1): homeolog evolution in Indian Brassicas.

PubMed

Sri, Tanu; Mayee, Pratiksha; Singh, Anandita

2015-09-01

Whole genome sequence analyses allow unravelling such evolutionary consequences of meso-triplication event in Brassicaceae (∼14-20 million years ago (MYA)) as differential gene fractionation and diversification in homeologous sub-genomes. This study presents a simple gene-centric approach involving microsynteny and natural genetic variation analysis for understanding SUPPRESSOR of OVEREXPRESSION of CONSTANS 1 (SOC1) homeolog evolution in Brassica. Analysis of microsynteny in Brassica rapa homeologous regions containing SOC1 revealed differential gene fractionation correlating to reported fractionation status of sub-genomes of origin, viz. least fractionated (LF), moderately fractionated 1 (MF1) and most fractionated (MF2), respectively. Screening 18 cultivars of 6 Brassica species led to the identification of 8 genomic and 27 transcript variants of SOC1, including splice-forms. Co-occurrence of both interrupted and intronless SOC1 genes was detected in few Brassica species. In silico analysis characterised Brassica SOC1 as MADS intervening, K-box, C-terminal (MIKC(C)) transcription factor, with highly conserved MADS and I domains relative to K-box and C-terminal domain. Phylogenetic analyses and multiple sequence alignments depicting shared pattern of silent/non-silent mutations assigned Brassica SOC1 homologs into groups based on shared diploid base genome. In addition, a sub-genome structure in uncharacterised Brassica genomes was inferred. Expression analysis of putative MF2 and LF (Brassica diploid base genome A (AA)) sub-genome-specific SOC1 homeologs of Brassica juncea revealed near identical expression pattern. However, MF2-specific homeolog exhibited significantly higher expression implying regulatory diversification. In conclusion, evidence for polyploidy-induced sequence and regulatory evolution in Brassica SOC1 is being presented wherein differential homeolog expression is implied in functional diversification.
Fungal Genes in Context: Genome Architecture Reflects Regulatory Complexity and Function

PubMed Central

Noble, Luke M.; Andrianopoulos, Alex

2013-01-01

Gene context determines gene expression, with local chromosomal environment most influential. Comparative genomic analysis is often limited in scope to conserved or divergent gene and protein families, and fungi are well suited to this approach with low functional redundancy and relatively streamlined genomes. We show here that one aspect of gene context, the amount of potential upstream regulatory sequence maintained through evolution, is highly predictive of both molecular function and biological process in diverse fungi. Orthologs with large upstream intergenic regions (UIRs) are strongly enriched in information processing functions, such as signal transduction and sequence-specific DNA binding, and, in the genus Aspergillus, include the majority of experimentally studied, high-level developmental and metabolic transcriptional regulators. Many uncharacterized genes are also present in this class and, by implication, may be of similar importance. Large intergenic regions also share two novel sequence characteristics, currently of unknown significance: they are enriched for plus-strand polypyrimidine tracts and an information-rich, putative regulatory motif that was present in the last common ancestor of the Pezizomycotina. Systematic consideration of gene UIR in comparative genomics, particularly for poorly characterized species, could help reveal organisms’ regulatory priorities. PMID:23699226

The transcriptome of Lutzomyia longipalpis (Diptera: Psychodidae) male reproductive organs.

PubMed

Azevedo, Renata V D M; Dias, Denise B S; Bretãs, Jorge A C; Mazzoni, Camila J; Souza, Nataly A; Albano, Rodolpho M; Wagner, Glauber; Davila, Alberto M R; Peixoto, Alexandre A

2012-01-01

It has been suggested that genes involved in the reproductive biology of insect disease vectors are potential targets for future alternative methods of control. Little is known about the molecular biology of reproduction in phlebotomine sand flies and there is no information available concerning genes that are expressed in male reproductive organs of Lutzomyia longipalpis, the main vector of American visceral leishmaniasis and a species complex. We generated 2678 high quality ESTs ("Expressed Sequence Tags") of L. longipalpis male reproductive organs that were grouped in 1391 non-redundant sequences (1136 singlets and 255 clusters). BLAST analysis revealed that only 57% of these sequences share similarity with a L. longipalpis female EST database. Although no more than 36% of the non-redundant sequences showed similarity to protein sequences deposited in databases, more than half of them presented the best-match hits with mosquito genes. Gene ontology analysis identified subsets of genes involved in biological processes such as protein biosynthesis and DNA replication, which are probably associated with spermatogenesis. A number of non-redundant sequences were also identified as putative male reproductive gland proteins (mRGPs), also known as male accessory gland protein genes (Acps). The transcriptome analysis of L. longipalpis male reproductive organs is one step further in the study of the molecular basis of the reproductive biology of this important species complex. It has allowed the identification of genes potentially involved in spermatogenesis as well as putative mRGPs sequences, which have been studied in many insect species because of their effects on female post-mating behavior and physiology and their potential role in sexual selection and speciation. These data open a number of new avenues for further research in the molecular and evolutionary reproductive biology of sand flies.
The Transcriptome of Lutzomyia longipalpis (Diptera: Psychodidae) Male Reproductive Organs

PubMed Central

Bretãs, Jorge A. C.; Mazzoni, Camila J.; Souza, Nataly A.; Albano, Rodolpho M.; Wagner, Glauber; Davila, Alberto M. R.; Peixoto, Alexandre A.

2012-01-01

Background It has been suggested that genes involved in the reproductive biology of insect disease vectors are potential targets for future alternative methods of control. Little is known about the molecular biology of reproduction in phlebotomine sand flies and there is no information available concerning genes that are expressed in male reproductive organs of Lutzomyia longipalpis, the main vector of American visceral leishmaniasis and a species complex. Methods/Principal Findings We generated 2678 high quality ESTs (“Expressed Sequence Tags”) of L. longipalpis male reproductive organs that were grouped in 1391 non-redundant sequences (1136 singlets and 255 clusters). BLAST analysis revealed that only 57% of these sequences share similarity with a L. longipalpis female EST database. Although no more than 36% of the non-redundant sequences showed similarity to protein sequences deposited in databases, more than half of them presented the best-match hits with mosquito genes. Gene ontology analysis identified subsets of genes involved in biological processes such as protein biosynthesis and DNA replication, which are probably associated with spermatogenesis. A number of non-redundant sequences were also identified as putative male reproductive gland proteins (mRGPs), also known as male accessory gland protein genes (Acps). Conclusions The transcriptome analysis of L. longipalpis male reproductive organs is one step further in the study of the molecular basis of the reproductive biology of this important species complex. It has allowed the identification of genes potentially involved in spermatogenesis as well as putative mRGPs sequences, which have been studied in many insect species because of their effects on female post-mating behavior and physiology and their potential role in sexual selection and speciation. These data open a number of new avenues for further research in the molecular and evolutionary reproductive biology of sand flies. PMID:22496818
DMRT gene cluster analysis in the platypus: new insights into genomic organization and regulatory regions.

PubMed

El-Mogharbel, Nisrine; Wakefield, Matthew; Deakin, Janine E; Tsend-Ayush, Enkhjargal; Grützner, Frank; Alsop, Amber; Ezaz, Tariq; Marshall Graves, Jennifer A

2007-01-01

We isolated and characterized a cluster of platypus DMRT genes and compared their arrangement, location, and sequence across vertebrates. The DMRT gene cluster on human 9p24.3 harbors, in order, DMRT1, DMRT3, and DMRT2, which share a DM domain. DMRT1 is highly conserved and involved in sexual development in vertebrates, and deletions in this region cause sex reversal in humans. Sequence comparisons of DMRT genes between species have been valuable in identifying exons, control regions, and conserved nongenic regions (CNGs). The addition of platypus sequences is expected to be particularly valuable, since monotremes fill a gap in the vertebrate genome coverage. We therefore isolated and fully sequenced platypus BAC clones containing DMRT3 and DMRT2 as well as DMRT1 and then generated multispecies alignments and ran prediction programs followed by experimental verification to annotate this gene cluster. We found that the three genes have 58-66% identity to their human orthologues, lie in the same order as in other vertebrates, and colocate on 1 of the 10 platypus sex chromosomes, X5. We also predict that optimal annotation of the newly sequenced platypus genome will be challenging. The analysis of platypus sequence revealed differences in structure and sequence of the DMRT gene cluster. Multispecies comparison was particularly effective for detecting CNGs, revealing several novel potential regulatory regions within DMRT3 and DMRT2 as well as DMRT1. RT-PCR indicated that platypus DMRT1 and DMRT3 are expressed specifically in the adult testis (and not ovary), but DMRT2 has a wider expression profile, as it does for other mammals. The platypus DMRT1 expression pattern, and its location on an X chromosome, suggests an involvement in monotreme sexual development.
Draft Genome Sequences of Endophytic Isolates of Klebsiella variicola and Klebsiella pneumoniae Obtained from the Same Sugarcane Plant

PubMed Central

2018-01-01

ABSTRACT Endophytic Klebsiella variicola KvMx2 and Klebsiella pneumoniae KpMx1 isolates obtained from the same sugarcane stem were used for whole-genome sequencing. The genomes revealed clear differences in essential genes for plant growth, development, and detoxification, as well as nitrogen fixation, catalases, cellulases, and shared virulence factors described in the K. pneumoniae pathogen. PMID:29567733
MHC class II genes in European wolves: a comparison with dogs.

PubMed

Seddon, Jennifer M; Ellegren, Hans

2002-10-01

The genome of the grey wolf, one of the most widely distributed land mammal species, has been subjected to both stochastic factors, including biogeographical subdivision and population fragmentation, and strong selection during the domestication of the dog. To explore the effects of drift and selection on the partitioning of MHC variation in the diversification of species, we present nine DQA, 10 DQB, and 17 DRB1 sequences of the second exon for European wolves and compare them with sequences of North American wolves and dogs. The relatively large number of class II alleles present in both European and North American wolves attests to their large historical population sizes, yet there are few alleles shared between these regions at DQB and DRB1. Similarly, the dog has an extensive array of class II MHC alleles, a consequence of a genetically diverse origin, but allelic overlap with wolves only at DQA. Although we might expect a progression from shared alleles to shared allelic lineages during differentiation, the partitioning of diversity between wolves and dogs at DQB and DRB1 differs from that at DQA. Furthermore, an extensive region of nucleotide sequence shared between DRB1 and DQB alleles and a shared motif suggests intergenic recombination may have contributed to MHC diversity in the Canidae.
The CanOE strategy: integrating genomic and metabolic contexts across multiple prokaryote genomes to find candidate genes for orphan enzymes.

PubMed

Smith, Adam Alexander Thil; Belda, Eugeni; Viari, Alain; Medigue, Claudine; Vallenet, David

2012-05-01

Of all biochemically characterized metabolic reactions formalized by the IUBMB, over one out of four have yet to be associated with a nucleic or protein sequence, i.e. are sequence-orphan enzymatic activities. Few bioinformatics annotation tools are able to propose candidate genes for such activities by exploiting context-dependent rather than sequence-dependent data, and none are readily accessible and propose result integration across multiple genomes. Here, we present CanOE (Candidate genes for Orphan Enzymes), a four-step bioinformatics strategy that proposes ranked candidate genes for sequence-orphan enzymatic activities (or orphan enzymes for short). The first step locates "genomic metabolons", i.e. groups of co-localized genes coding proteins catalyzing reactions linked by shared metabolites, in one genome at a time. These metabolons can be particularly helpful for aiding bioanalysts to visualize relevant metabolic data. In the second step, they are used to generate candidate associations between un-annotated genes and gene-less reactions. The third step integrates these gene-reaction associations over several genomes using gene families, and summarizes the strength of family-reaction associations by several scores. In the final step, these scores are used to rank members of gene families which are proposed for metabolic reactions. These associations are of particular interest when the metabolic reaction is a sequence-orphan enzymatic activity. Our strategy found over 60,000 genomic metabolons in more than 1,000 prokaryote organisms from the MicroScope platform, generating candidate genes for many metabolic reactions, of which more than 70 distinct orphan reactions. A computational validation of the approach is discussed. Finally, we present a case study on the anaerobic allantoin degradation pathway in Escherichia coli K-12.
A complete mitochondrial genome sequence of the wild two-humped camel (Camelus bactrianus ferus): an evolutionary history of camelidae

PubMed Central

Cui, Peng; Ji, Rimutu; Ding, Feng; Qi, Dan; Gao, Hongwei; Meng, He; Yu, Jun; Hu, Songnian; Zhang, Heping

2007-01-01

Background The family Camelidae that evolved in North America during the Eocene survived with two distinct tribes, Camelini and Lamini. To investigate the evolutionary relationship between them and to further understand the evolutionary history of this family, we determined the complete mitochondrial genome sequence of the wild two-humped camel (Camelus bactrianus ferus), the only wild survivor of the Old World camel. Results The mitochondrial genome sequence (16,680 bp) from C. bactrianus ferus contains 13 protein-coding, two rRNA, and 22 tRNA genes as well as a typical control region; this basic structure is shared by all metazoan mitochondrial genomes. Its protein-coding region exhibits codon usage common to all mammals and possesses the three cryptic stop codons shared by all vertebrates. C. bactrianus ferus together with the rest of mammalian species do not share a triplet nucleotide insertion (GCC) that encodes a proline residue found only in the nd1 gene of the New World camelid Lama pacos. This lineage-specific insertion in the L. pacos mtDNA occurred after the split between the Old and New World camelids suggests that it may have functional implication since a proline insertion in a protein backbone usually alters protein conformation significantly, and nd1 gene has not been seen as polymorphic as the rest of ND family genes among camelids. Our phylogenetic study based on complete mitochondrial genomes excluding the control region suggested that the divergence of the two tribes may occur in the early Miocene; it is much earlier than what was deduced from the fossil record (11 million years). An evolutionary history reconstructed for the family Camelidae based on cytb sequences suggested that the split of bactrian camel and dromedary may have occurred in North America before the tribe Camelini migrated from North America to Asia. Conclusion Molecular clock analysis of complete mitochondrial genomes from C. bactrianus ferus and L. pacos suggested that the two tribes diverged from their common ancestor about 25 million years ago, much earlier than what was predicted based on fossil records. PMID:17640355
The genome sequence and transcriptome of Potentilla micrantha and their comparison to Fragaria vesca (the woodland strawberry)

PubMed Central

Moretto, Marco; Barghini, Elena; Mascagni, Flavia; Natali, Lucia; Brilli, Matteo; Lomsadze, Alexandre; Sonego, Paolo; Giongo, Lara; Alonge, Michael; Velasco, Riccardo; Varotto, Claudio; Šurbanovski, Nada; Borodovsky, Mark; Ward, Judson A; Engelen, Kristof; Cavallini, Andrea; Cestaro, Alessandro

2018-01-01

Abstract Background The genus Potentilla is closely related to that of Fragaria, the economically important strawberry genus. Potentilla micrantha is a species that does not develop berries but shares numerous morphological and ecological characteristics with Fragaria vesca. These similarities make P. micrantha an attractive choice for comparative genomics studies with F. vesca. Findings In this study, the P. micrantha genome was sequenced and annotated, and RNA-Seq data from the different developmental stages of flowering and fruiting were used to develop a set of gene predictions. A 327 Mbp sequence and annotation of the genome of P. micrantha, spanning 2674 sequence contigs, with an N50 size of 335,712, estimated to cover 80% of the total genome size of the species was developed. The genus Potentilla has a characteristically larger genome size than Fragaria, but the recovered sequence scaffolds were remarkably collinear at the micro-syntenic level with the genome of F. vesca, its closest sequenced relative. A total of 33,602 genes were predicted, and 95.1% of bench-marking universal single-copy orthologous genes were complete within the presented sequence. Thus, we argue that the majority of the gene-rich regions of the genome have been sequenced. Conclusions Comparisons of RNA-Seq data from the stages of floral and fruit development revealed genes differentially expressed between P. micrantha and F. vesca.The data presented are a valuable resource for future studies of berry development in Fragaria and the Rosaceae and they also shed light on the evolution of genome size and organization in this family. PMID:29659812
The genome sequence and transcriptome of Potentilla micrantha and their comparison to Fragaria vesca (the woodland strawberry).

PubMed

Buti, Matteo; Moretto, Marco; Barghini, Elena; Mascagni, Flavia; Natali, Lucia; Brilli, Matteo; Lomsadze, Alexandre; Sonego, Paolo; Giongo, Lara; Alonge, Michael; Velasco, Riccardo; Varotto, Claudio; Šurbanovski, Nada; Borodovsky, Mark; Ward, Judson A; Engelen, Kristof; Cavallini, Andrea; Cestaro, Alessandro; Sargent, Daniel James

2018-04-01

The genus Potentilla is closely related to that of Fragaria, the economically important strawberry genus. Potentilla micrantha is a species that does not develop berries but shares numerous morphological and ecological characteristics with Fragaria vesca. These similarities make P. micrantha an attractive choice for comparative genomics studies with F. vesca. In this study, the P. micrantha genome was sequenced and annotated, and RNA-Seq data from the different developmental stages of flowering and fruiting were used to develop a set of gene predictions. A 327 Mbp sequence and annotation of the genome of P. micrantha, spanning 2674 sequence contigs, with an N50 size of 335,712, estimated to cover 80% of the total genome size of the species was developed. The genus Potentilla has a characteristically larger genome size than Fragaria, but the recovered sequence scaffolds were remarkably collinear at the micro-syntenic level with the genome of F. vesca, its closest sequenced relative. A total of 33,602 genes were predicted, and 95.1% of bench-marking universal single-copy orthologous genes were complete within the presented sequence. Thus, we argue that the majority of the gene-rich regions of the genome have been sequenced. Comparisons of RNA-Seq data from the stages of floral and fruit development revealed genes differentially expressed between P. micrantha and F. vesca.The data presented are a valuable resource for future studies of berry development in Fragaria and the Rosaceae and they also shed light on the evolution of genome size and organization in this family.
Plasmid Characterization and Chromosome Analysis of Two netF+ Clostridium perfringens Isolates Associated with Foal and Canine Necrotizing Enteritis.

PubMed

Mehdizadeh Gohari, Iman; Kropinski, Andrew M; Weese, Scott J; Parreira, Valeria R; Whitehead, Ashley E; Boerlin, Patrick; Prescott, John F

2016-01-01

The recent discovery of a novel beta-pore-forming toxin, NetF, which is strongly associated with canine and foal necrotizing enteritis should improve our understanding of the role of type A Clostridium perfringens associated disease in these animals. The current study presents the complete genome sequence of two netF-positive strains, JFP55 and JFP838, which were recovered from cases of foal necrotizing enteritis and canine hemorrhagic gastroenteritis, respectively. Genome sequencing was done using Single Molecule, Real-Time (SMRT) technology-PacBio and Illumina Hiseq2000. The JFP55 and JFP838 genomes include a single 3.34 Mb and 3.53 Mb chromosome, respectively, and both genomes include five circular plasmids. Plasmid annotation revealed that three plasmids were shared by the two newly sequenced genomes, including a NetF/NetE toxins-encoding tcp-conjugative plasmid, a CPE/CPB2 toxins-encoding tcp-conjugative plasmid and a putative bacteriocin-encoding plasmid. The putative beta-pore-forming toxin genes, netF, netE and netG, were located in unique pathogenicity loci on tcp-conjugative plasmids. The C. perfringens JFP55 chromosome carries 2,825 protein-coding genes whereas the chromosome of JFP838 contains 3,014 protein-encoding genes. Comparison of these two chromosomes with three available reference C. perfringens chromosome sequences identified 48 (~247 kb) and 81 (~430 kb) regions unique to JFP55 and JFP838, respectively. Some of these divergent genomic regions in both chromosomes are phage- and plasmid-related segments. Sixteen of these unique chromosomal regions (~69 kb) were shared between the two isolates. Five of these shared regions formed a mosaic of plasmid-integrated segments, suggesting that these elements were acquired early in a clonal lineage of netF-positive C. perfringens strains. These results provide significant insight into the basis of canine and foal necrotizing enteritis and are the first to demonstrate that netF resides on a large and unique plasmid-encoded locus.
Development of a PCR-based marker utilizing a deletion mutation in the dihydroflavonol 4-reductase (DFR) gene responsible for the lack of anthocyanin production in yellow onions (Allium cepa).

PubMed

Kim, Sunggil; Yoo, Kil Sun; Pike, Leonard M

2005-02-01

Bulb color in onions (Allium cepa) is an important trait, but the mechanism of color inheritance is poorly understood at the molecular level. A previous study showed that inactivation of the dihydroflavonol 4-reductase (DFR) gene at the transcriptional level resulted in a lack of anthocyanin production in yellow onions. The objectives of the present study were the identification of the critical mutations in the DFR gene (DFR-A) and the development of a PCR-based marker for allelic selection. We report the isolation of two additional DFR homologs (DFR-B and DFR-C). No unique sequences were identified in either DFR homolog, even in the untranslated region (UTR). Both genes shared more than 95% nucleotide sequence identity with the DFR-A gene. To obtain a unique sequence from each gene, we isolated the promoter regions. Sequences of the DFR-A and DFR-B promoters differed completely from one another, except for an approximately 100-bp sequence adjacent to the 5'UTR. It was possible to specifically amplify only the DFR-A gene using primers designed to anneal to the unique promoter region. The sequences of yellow and red DFR-A alleles were the same except for a single base-pair change in the promoter and an approximately 800-bp deletion within the 3' region of the yellow DFR-A allele. This deletion was used to develop a co-dominant PCR-based marker that segregated perfectly with color phenotypes in the F2 population. These results indicate that a deletion mutation in the yellow DFR-A gene results in the lack of anthocyanin production in yellow onions.
Genomic V exons from whole genome shotgun data in reptiles.

PubMed

Olivieri, D N; von Haeften, B; Sánchez-Espinel, C; Faro, J; Gambón-Deza, F

2014-08-01

Reptiles and mammals diverged over 300 million years ago, creating two parallel evolutionary lineages amongst terrestrial vertebrates. In reptiles, two main evolutionary lines emerged: one gave rise to Squamata, while the other gave rise to Testudines, Crocodylia, and Aves. In this study, we determined the genomic variable (V) exons from whole genome shotgun sequencing (WGS) data in reptiles corresponding to the three main immunoglobulin (IG) loci and the four main T cell receptor (TR) loci. We show that Squamata lack the TRG and TRD genes, and snakes lack the IGKV genes. In representative species of Testudines and Crocodylia, the seven major IG and TR loci are maintained. As in mammals, genes of the IG loci can be grouped into well-defined IMGT clans through a multi-species phylogenetic analysis. We show that the reptilian IGHV and IGLV genes are distributed amongst the established mammalian clans, while their IGKV genes are found within a single clan, nearly exclusive from the mammalian sequences. The reptilian and mammalian TRAV genes cluster into six common evolutionary clades (since IMGT clans have not been defined for TR). In contrast, the reptilian TRBV genes cluster into three clades, which have few mammalian members. In this locus, the V exon sequences from mammals appear to have undergone different evolutionary diversification processes that occurred outside these shared reptilian clans. These sequences can be obtained in a freely available public repository (http://vgenerepertoire.org).
Detecting novel genes with sparse arrays

PubMed Central

Haiminen, Niina; Smit, Bart; Rautio, Jari; Vitikainen, Marika; Wiebe, Marilyn; Martinez, Diego; Chee, Christine; Kunkel, Joe; Sanchez, Charles; Nelson, Mary Anne; Pakula, Tiina; Saloheimo, Markku; Penttilä, Merja; Kivioja, Teemu

2014-01-01

Species-specific genes play an important role in defining the phenotype of an organism. However, current gene prediction methods can only efficiently find genes that share features such as sequence similarity or general sequence characteristics with previously known genes. Novel sequencing methods and tiling arrays can be used to find genes without prior information and they have demonstrated that novel genes can still be found from extensively studied model organisms. Unfortunately, these methods are expensive and thus are not easily applicable, e.g., to finding genes that are expressed only in very specific conditions. We demonstrate a method for finding novel genes with sparse arrays, applying it on the 33.9 Mb genome of the filamentous fungus Trichoderma reesei. Our computational method does not require normalisations between arrays and it takes into account the multiple-testing problem typical for analysis of microarray data. In contrast to tiling arrays, that use overlapping probes, only one 25mer microarray oligonucleotide probe was used for every 100 b. Thus, only relatively little space on a microarray slide was required to cover the intergenic regions of a genome. The analysis was done as a by-product of a conventional microarray experiment with no additional costs. We found at least 23 good candidates for novel transcripts that could code for proteins and all of which were expressed at high levels. Candidate genes were found to neighbour ire1 and cre1 and many other regulatory genes. Our simple, low-cost method can easily be applied to finding novel species-specific genes without prior knowledge of their sequence properties. PMID:20691772
Structural analysis of the 5{prime} region of mouse and human Huntington disease genes reveals conservation of putative promoter region and Di- and trinucleotide polymorphisms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lin, Biaoyang; Nasir, J.; Kalchman, M.A.

1995-02-10

We have previously cloned and characterized the murine homologue of the Huntington disease (HD) gene and shown that it maps to mouse chromosome 5 within a region of conserved synteny with human chromosome 4p16.3. Here we present a detailed comparison of the sequence of the putative promoter and the organization of the 5{prime} genomic region of the murine (Hdh) and human HD genes encompassing the first five exons. We show that in this region these two genes share identical exon boundaries, but have different-size introns. Two dinucleotide (CT) and one trinucleotide intronic polymorphism in Hdh and an intronic CA polymorphismmore » in the HD gene were identified. Comparison of 940-bp sequence 5{prime} to the putative translation start site reveals a highly conserved region (78.8% nucleotide identity) between Hdh and the HD gene from nucleotide -56 to -206 (of Hdh). Neither Hdh nor the HD gene have typical TATA or CCAAT elements, but both show one putative AP2 binding site and numerous potential Sp1 binding sites. The high sequence identity between Hdh and the HD gene for approximately 200 bp 5{prime} to the putative translation start site indicates that these sequences may play a role in regulating expression of the Huntington disease gene. 30 refs., 4 figs., 2 tabs.« less
Strain diversity and host specificity in bee gut symbionts revealed by deep sampling of single copy protein-coding sequences

PubMed Central

Powell, J. Elijah; Ratnayeke, Nalin; Moran, Nancy A.

2017-01-01

High throughput rRNA amplicon surveys of bacterial communities provide a rapid snapshot of taxonomic composition. But strains with nearly identical rRNA sequences often differ in gene repertoires and metabolic capabilities. To assess strain-level variation within Snodgrassella alvi, a gut symbiont of corbiculate bees, we performed deep sequencing on amplicons of a single copy coding gene (minD) as well as the 16S rDNA V4 region. We surveyed honey bees (Apis mellifera) sampled globally and 12 bumble bee species (Bombus) sampled from two regions of the USA. The minD analyses reveal that S. alvi contains far more strain diversity than is evident from 16S rDNA analysis. Many taxa inferred on the basis of 16S rDNA are shared between A. mellifera and Bombus species, but taxa inferred on the basis of minD are never shared and often are restricted to particular Bombus species. Clustering based on minD revealed that gut communities often reflect host species and geographic location. Both minD and 16S rDNA analyses indicate that strain diversity is higher in A. mellifera than in Bombus species. The minD locus flanks a 16S gene, enabling development of strain-specific 16S fluorescent probes to illuminate the spatial relationship of strains within the bee gut. PMID:27482856
PreCisIon: PREdiction of CIS-regulatory elements improved by gene's positION.

PubMed

Elati, Mohamed; Nicolle, Rémy; Junier, Ivan; Fernández, David; Fekih, Rim; Font, Julio; Képès, François

2013-02-01

Conventional approaches to predict transcriptional regulatory interactions usually rely on the definition of a shared motif sequence on the target genes of a transcription factor (TF). These efforts have been frustrated by the limited availability and accuracy of TF binding site motifs, usually represented as position-specific scoring matrices, which may match large numbers of sites and produce an unreliable list of target genes. To improve the prediction of binding sites, we propose to additionally use the unrelated knowledge of the genome layout. Indeed, it has been shown that co-regulated genes tend to be either neighbors or periodically spaced along the whole chromosome. This study demonstrates that respective gene positioning carries significant information. This novel type of information is combined with traditional sequence information by a machine learning algorithm called PreCisIon. To optimize this combination, PreCisIon builds a strong gene target classifier by adaptively combining weak classifiers based on either local binding sequence or global gene position. This strategy generically paves the way to the optimized incorporation of any future advances in gene target prediction based on local sequence, genome layout or on novel criteria. With the current state of the art, PreCisIon consistently improves methods based on sequence information only. This is shown by implementing a cross-validation analysis of the 20 major TFs from two phylogenetically remote model organisms. For Bacillus subtilis and Escherichia coli, respectively, PreCisIon achieves on average an area under the receiver operating characteristic curve of 70 and 60%, a sensitivity of 80 and 70% and a specificity of 60 and 56%. The newly predicted gene targets are demonstrated to be functionally consistent with previously known targets, as assessed by analysis of Gene Ontology enrichment or of the relevant literature and databases.
Detection of the High-Level Aminoglycoside Resistance Gene aph(2")-Ib in Enterococcus faecium

PubMed Central

Kao, Susan J.; You, Il; Clewell, Don B.; Donabedian, Susan M.; Zervos, Marcus J.; Petrin, Joanne; Shaw, Karen J.; Chow, Joseph W.

2000-01-01

A new high-level gentamicin resistance gene, designated aph(2")-Ib, was cloned from Enterococcus faecium SF11770. The deduced amino acid sequence of the 897-bp open reading frame of aph(2")-Ib shares homology with the aminoglycoside-modifying enzymes AAC(6′)-APH(2"), APH(2")-Ic, and APH(2")-Id. The observed phosphotransferase activity is designated APH(2")-Ib. PMID:10991878
Transfer of Pseudomonas flectens Johnson 1956 to Phaseolibacter gen. nov., in the family Enterobacteriaceae, as Phaseolibacter flectens gen. nov., comb. nov.

PubMed

Halpern, Malka; Fridman, Svetlana; Aizenberg-Gershtein, Yana; Izhaki, Ido

2013-01-01

Pseudomonas flectens Johnson 1956, a plant-pathogenic bacterium on the pods of the French bean, is no longer considered to be a member of the genus Pseudomonas sensu stricto. A polyphasic approach that included examination of phenotypic properties and phylogenetic analyses based on 16S rRNA, rpoB and atpD gene sequences supported the transfer of Pseudomonas flectens Johnson 1956 to a new genus in the family Enterobacteriaceae as Phaseolibacter flectens gen. nov., comb. nov. Two strains of Phaseolibacter flectens were studied (ATCC 12775(T) and LMG 2186); the strains shared 99.8 % sequence similarity in their 16S rRNA genes and the housekeeping gene sequences were identical. Strains of Phaseolibacter flectens shared 96.6 % or less 16S rRNA gene sequence similarity with members of different genera in the family Enterobacteriaceae and only 84.7 % sequence similarity with Pseudomonas aeruginosa LMG 1242(T), demonstrating that they are not related to the genus Pseudomonas. As Phaseolibacter flectens formed an independent phyletic lineage in all of the phylogenetic analyses, it could not be affiliated to any of the recognized genera within the family Enterobacteriaceae and therefore was assigned to a new genus. Cells were Gram-negative, straight rods, motile by means of one or two polar flagella, fermentative, facultative anaerobes, oxidase-negative and catalase-positive. Growth occurred in the presence of 0-60 % sucrose. The DNA G+C content of the type strain was 44.3 mol%. On the basis of phenotypic properties and phylogenetic distinctiveness, Pseudomonas flectens Johnson 1956 is transferred to the novel genus Phaseolibacter gen. nov. as Phaseolibacter flectens gen. nov., comb. nov. The type strain of Phaseolibacter flectens is ATCC 12775(T) = CFBP 3281(T) = ICMP 745(T) = LMG 2187(T) = NCPPB 539(T).
CPm gene diversity in field isolates of Citrus tristeza virus from Colombia.

PubMed

Oliveros-Garay, Oscar Arturo; Martinez-Salazar, Natalhie; Torres-Ruiz, Yanneth; Acosta, Orlando

2009-01-01

The nucleotide sequence diversity of the CPm gene from 28 field isolates of Citrus tristeza virus (CTV) was assessed by SSCP and sequence analyses. These isolates showed two major shared haplotypes, which differed in distribution: A1 was the major haplotype in 23 isolates from different geographic regions, whereas R1 was found in isolates from a discrete region. Phylogenetic reconstruction clustered A1 within an independent group, while R1 was grouped with mild isolates T30 from Florida and T385 from Spain. Some isolates contained several minor haplotypes, which were very similar to, and associated with, the major haplotype.
The genome sequence of Leishmania (Leishmania) amazonensis: functional annotation and extended analysis of gene models.

PubMed

Real, Fernando; Vidal, Ramon Oliveira; Carazzolle, Marcelo Falsarella; Mondego, Jorge Maurício Costa; Costa, Gustavo Gilson Lacerda; Herai, Roberto Hirochi; Würtele, Martin; de Carvalho, Lucas Miguel; Carmona e Ferreira, Renata; Mortara, Renato Arruda; Barbiéri, Clara Lucia; Mieczkowski, Piotr; da Silveira, José Franco; Briones, Marcelo Ribeiro da Silva; Pereira, Gonçalo Amarante Guimarães; Bahia, Diana

2013-12-01

We present the sequencing and annotation of the Leishmania (Leishmania) amazonensis genome, an etiological agent of human cutaneous leishmaniasis in the Amazon region of Brazil. L. (L.) amazonensis shares features with Leishmania (L.) mexicana but also exhibits unique characteristics regarding geographical distribution and clinical manifestations of cutaneous lesions (e.g. borderline disseminated cutaneous leishmaniasis). Predicted genes were scored for orthologous gene families and conserved domains in comparison with other human pathogenic Leishmania spp. Carboxypeptidase, aminotransferase, and 3'-nucleotidase genes and ATPase, thioredoxin, and chaperone-related domains were represented more abundantly in L. (L.) amazonensis and L. (L.) mexicana species. Phylogenetic analysis revealed that these two species share groups of amastin surface proteins unique to the genus that could be related to specific features of disease outcomes and host cell interactions. Additionally, we describe a hypothetical hybrid interactome of potentially secreted L. (L.) amazonensis proteins and host proteins under the assumption that parasite factors mimic their mammalian counterparts. The model predicts an interaction between an L. (L.) amazonensis heat-shock protein and mammalian Toll-like receptor 9, which is implicated in important immune responses such as cytokine and nitric oxide production. The analysis presented here represents valuable information for future studies of leishmaniasis pathogenicity and treatment.

The Genome Sequence of Leishmania (Leishmania) amazonensis: Functional Annotation and Extended Analysis of Gene Models

PubMed Central

Real, Fernando; Vidal, Ramon Oliveira; Carazzolle, Marcelo Falsarella; Mondego, Jorge Maurício Costa; Costa, Gustavo Gilson Lacerda; Herai, Roberto Hirochi; Würtele, Martin; de Carvalho, Lucas Miguel; e Ferreira, Renata Carmona; Mortara, Renato Arruda; Barbiéri, Clara Lucia; Mieczkowski, Piotr; da Silveira, José Franco; Briones, Marcelo Ribeiro da Silva; Pereira, Gonçalo Amarante Guimarães; Bahia, Diana

2013-01-01

We present the sequencing and annotation of the Leishmania (Leishmania) amazonensis genome, an etiological agent of human cutaneous leishmaniasis in the Amazon region of Brazil. L. (L.) amazonensis shares features with Leishmania (L.) mexicana but also exhibits unique characteristics regarding geographical distribution and clinical manifestations of cutaneous lesions (e.g. borderline disseminated cutaneous leishmaniasis). Predicted genes were scored for orthologous gene families and conserved domains in comparison with other human pathogenic Leishmania spp. Carboxypeptidase, aminotransferase, and 3′-nucleotidase genes and ATPase, thioredoxin, and chaperone-related domains were represented more abundantly in L. (L.) amazonensis and L. (L.) mexicana species. Phylogenetic analysis revealed that these two species share groups of amastin surface proteins unique to the genus that could be related to specific features of disease outcomes and host cell interactions. Additionally, we describe a hypothetical hybrid interactome of potentially secreted L. (L.) amazonensis proteins and host proteins under the assumption that parasite factors mimic their mammalian counterparts. The model predicts an interaction between an L. (L.) amazonensis heat-shock protein and mammalian Toll-like receptor 9, which is implicated in important immune responses such as cytokine and nitric oxide production. The analysis presented here represents valuable information for future studies of leishmaniasis pathogenicity and treatment. PMID:23857904
Molecular analysis of the split cox1 gene from the Basidiomycota Agrocybe aegerita: relationship of its introns with homologous Ascomycota introns and divergence levels from common ancestral copies.

PubMed

Gonzalez, P; Barroso, G; Labarère, J

1998-10-05

The Basidiomycota Agrocybe aegerita (Aa) mitochondrial cox1 gene (6790 nucleotides), encoding a protein of 527aa (58377Da), is split by four large subgroup IB introns possessing site-specific endonucleases assumed to be involved in intron mobility. When compared to other fungal COX1 proteins, the Aa protein is closely related to the COX1 one of the Basidiomycota Schizophyllum commune (Sc). This clade reveals a relationship with the studied Ascomycota ones, with the exception of Schizosaccharomyces pombe (Sp) which ranges in an out-group position compared with both higher fungi divisions. When comparison is extended to other kingdoms, fungal COX1 sequences are found to be more related to algae and plant ones (more than 57.5% aa similarity) than to animal sequences (53.6% aa similarity), contrasting with the previously established close relationship between fungi and animals, based on comparisons of nuclear genes. The four Aa cox1 introns are homologous to Ascomycota or algae cox1 introns sharing the same location within the exonic sequences. The percentages of identity of the intronic nucleotide sequences suggest a possible acquisition by lateral transfers of ancestral copies or of their derived sequences. These identities extend over the whole intronic sequences, arguing in favor of a transfer of the complete intron rather than a transfer limited to the encoded ORF. The intron i4 shares 74% of identity, at the nucleotidic level, with the Podospora anserina (Pa) intron i14, and up to 90.5% of aa similarity between the encoded proteins, i.e. the highest values reported to date between introns of two phylogenetically distant species. This low divergence argues for a recent lateral transfer between the two species. On the contrary, the low sequence identities (below 36%) observed between Aa i1 and the homologous Sp i1 or Prototheca wickeramii (Pw) i1 suggest a long evolution time after the separation of these sequences. The introns i2 and i3 possessed intermediate percentages of identity with their homologous Ascomycota introns. This is the first report of the complete nucleotide sequence and molecular organization of a mitochondrial cox1 gene of any member of the Basidiomycota division.
Fetal origins of the TEL-AML1 fusion gene in identical twins with leukemia

PubMed Central

Ford, Anthony M.; Bennett, Caroline A.; Price, Cathy M.; Bruin, M. C. A.; Van Wering, Elisabeth R.; Greaves, Mel

1998-01-01

The TEL (ETV6)−AML1 (CBFA2) gene fusion is the most common reciprocal chromosomal rearrangement in childhood cancer occurring in ≈25% of the most predominant subtype of leukemia— common acute lymphoblastic leukemia. The TEL-AML1 genomic sequence has been characterized in a pair of monozygotic twins diagnosed at ages 3 years, 6 months and 4 years, 10 months with common acute lymphoblastic leukemia. The twin leukemic DNA shared the same unique (or clonotypic) but nonconstitutive TEL-AML1 fusion sequence. The most plausible explanation for this finding is a single cell origin of the TEL-AML fusion in one fetus in utero, probably as a leukemia-initiating mutation, followed by intraplacental metastasis of clonal progeny to the other twin. Clonal identity is further supported by the finding that the leukemic cells in the two twins shared an identical rearranged IGH allele. These data have implications for the etiology and natural history of childhood leukemia. PMID:9539781
Ancestral and more recently acquired syntenic relationships of MADS-box genes uncovered by the Physcomitrella patens pseudochromosomal genome assembly.

PubMed

Barker, Elizabeth I; Ashton, Neil W

2016-03-01

The Physcomitrella pseudochromosomal genome assembly revealed previously invisible synteny enabling realisation of the full potential of shared synteny as a tool for probing evolution of this plant's MADS-box gene family. Assembly of the sequenced genome of Physcomitrella patens into 27 mega-scaffolds (pseudochromosomes) has confirmed the major predictions of our earlier model of expansion of the MADS-box gene family in the Physcomitrella lineage. Additionally, microsynteny has been conserved in the immediate vicinity of some recent duplicates of MADS-box genes. However, comparison of non-syntenic MIKC MADS-box genes and neighbouring genes indicates that chromosomal rearrangements and/or sequence degeneration have destroyed shared synteny over longer distances (macrosynteny) around MADS-box genes despite subsets comprising two or three MIKC genes having remained syntenic. In contrast, half of the type I MADS-box genes have been transposed creating new syntenic relations with MIKC genes. This implies that conservation of ancient ancestral synteny of MIKC genes and of more recently acquired synteny of type I and MIKC genes may be selectively advantageous. Our revised model predicts the birth rate of MIKC genes in Physcomitrella is higher than that of type I genes. However, this difference is attributable to an early tandem duplication and an early segmental duplication of MIKC genes prior to the two polyploidisations that account for most of the expansion of the MADS-box gene family in Physcomitrella. Furthermore, this early segmental duplication spawned two chromosomal lineages: one with a MIKC (C) gene, belonging to the PPM2 clade, in close proximity to one or a pair of MIKC* genes and another with a MIKC (C) gene, belonging to the PpMADS-S clade, characterised by greater separation from syntenic MIKC* genes. Our model has evolutionary implications for the Physcomitrella karyotype.
Comparative Analysis of Genome Sequences Covering the Seven Cronobacter Species

PubMed Central

Cummings, Craig A.; Shih, Rita; Degoricija, Lovorka; Rico, Alain; Brzoska, Pius; Hamby, Stephen E.; Masood, Naqash; Hariri, Sumyya; Sonbol, Hana; Chuzhanova, Nadia; McClelland, Michael; Furtado, Manohar R.; Forsythe, Stephen J.

2012-01-01

Background Species of Cronobacter are widespread in the environment and are occasional food-borne pathogens associated with serious neonatal diseases, including bacteraemia, meningitis, and necrotising enterocolitis. The genus is composed of seven species: C. sakazakii, C. malonaticus, C. turicensis, C. dublinensis, C. muytjensii, C. universalis, and C. condimenti. Clinical cases are associated with three species, C. malonaticus, C. turicensis and, in particular, with C. sakazakii multilocus sequence type 4. Thus, it is plausible that virulence determinants have evolved in certain lineages. Methodology/Principal Findings We generated high quality sequence drafts for eleven Cronobacter genomes representing the seven Cronobacter species, including an ST4 strain of C. sakazakii. Comparative analysis of these genomes together with the two publicly available genomes revealed Cronobacter has over 6,000 genes in one or more strains and over 2,000 genes shared by all Cronobacter. Considerable variation in the presence of traits such as type six secretion systems, metal resistance (tellurite, copper and silver), and adhesins were found. C. sakazakii is unique in the Cronobacter genus in encoding genes enabling the utilization of exogenous sialic acid which may have clinical significance. The C. sakazakii ST4 strain 701 contained additional genes as compared to other C. sakazakii but none of them were known specific virulence-related genes. Conclusions/Significance Genome comparison revealed that pair-wise DNA sequence identity varies between 89 and 97% in the seven Cronobacter species, and also suggested various degrees of divergence. Sets of universal core genes and accessory genes unique to each strain were identified. These gene sequences can be used for designing genus/species specific detection assays. Genes encoding adhesins, T6SS, and metal resistance genes as well as prophages are found in only subsets of genomes and have contributed considerably to the variation of genomic content. Differences in gene content likely contribute to differences in the clinical and environmental distribution of species and sequence types. PMID:23166675
Microsporidia, amitochondrial protists, possess a 70-kDa heat shock protein gene of mitochondrial evolutionary origin.

PubMed

Peyretaillade, E; Broussolle, V; Peyret, P; Méténier, G; Gouy, M; Vivarès, C P

1998-06-01

An intronless gene encoding a protein of 592 amino acid residues with similarity to 70-kDa heat shock proteins (HSP70s) has been cloned and sequenced from the amitochondrial protist Encephalitozoon cuniculi (phylum Microsporidia). Southern blot analyses show the presence of a single gene copy located on chromosome XI. The encoded protein exhibits an N-terminal hydrophobic leader sequence and two motifs shared by proteobacterial and mitochondrially expressed HSP70 homologs. Phylogenetic analysis using maximum likelihood and evolutionary distances place the E. cuniculi sequence in the cluster of mitochondrially expressed HSP70s, with a higher evolutionary rate than those of homologous sequences. Similar results were obtained after cloning a fragment of the homologous gene in the closely related species E. hellem. The presence of a nuclear targeting signal-like sequence supports a role of the Encephalitozoon HSP70 as a molecular chaperone of nuclear proteins. No evidence for cytosolic or endoplasmic reticulum forms of HSP70 was obtained through PCR amplification. These data suggest that Encephalitozoon species have evolved from an ancestor bearing mitochondria, which is in disagreement with the postulated presymbiotic origin of Microsporidia. The specific role and intracellular localization of the mitochondrial HSP70-like protein remain to be elucidated.
Complete Genome Sequence of Akkermansia glycaniphila Strain PytT, a Mucin-Degrading Specialist of the Reticulated Python Gut

PubMed Central

Ouwerkerk, Janneke P.; Schaap, Peter J.; Ritari, Jarmo; Paulin, Lars; Belzer, Clara

2017-01-01

ABSTRACT Akkermansia glycaniphila is a novel Akkermansia species that was isolated from the intestine of the reticulated python and shares the capacity to degrade mucin with the human strain Akkermansia muciniphila MucT. Here, we report the complete genome sequence of strain PytT of 3,074,121 bp. The genomic analysis reveals genes for mucin degradation and aerobic respiration. PMID:28057747
Draft Genome Sequences of Endophytic Isolates of Klebsiella variicola and Klebsiella pneumoniae Obtained from the Same Sugarcane Plant.

PubMed

Reyna-Flores, Fernando; Barrios-Camacho, Humberto; Dantán-González, Edgar; Ramírez-Trujillo, José Augusto; Lozano Aguirre Beltrán, Luis Fernando; Rodríguez-Medina, Nadia; Garza-Ramos, Ulises; Suárez-Rodríguez, Ramón

2018-03-22

Endophytic Klebsiella variicola KvMx2 and Klebsiella pneumoniae KpMx1 isolates obtained from the same sugarcane stem were used for whole-genome sequencing. The genomes revealed clear differences in essential genes for plant growth, development, and detoxification, as well as nitrogen fixation, catalases, cellulases, and shared virulence factors described in the K. pneumoniae pathogen. Copyright © 2018 Reyna-Flores et al.
Draft genome sequence of Cicer reticulatum L., the wild progenitor of chickpea provides a resource for agronomic trait improvement.

PubMed

Gupta, Sonal; Nawaz, Kashif; Parween, Sabiha; Roy, Riti; Sahu, Kamlesh; Kumar Pole, Anil; Khandal, Hitaishi; Srivastava, Rishi; Kumar Parida, Swarup; Chattopadhyay, Debasis

2017-02-01

Cicer reticulatum L. is the wild progenitor of the fourth most important legume crop chickpea (C. arietinum L.). We assembled short-read sequences into 416 Mb draft genome of C. reticulatum and anchored 78% (327 Mb) of this assembly to eight linkage groups. Genome annotation predicted 25,680 protein-coding genes covering more than 90% of predicted gene space. The genome assembly shared a substantial synteny and conservation of gene orders with the genome of the model legume Medicago truncatula. Resistance gene homologs of wild and domesticated chickpeas showed high sequence homology and conserved synteny. Comparison of gene sequences and nucleotide diversity using 66 wild and domesticated chickpea accessions suggested that the desi type chickpea was genetically closer to the wild species than the kabuli type. Comparative analyses predicted gene flow between the wild and the cultivated species during domestication. Molecular diversity and population genetic structure determination using 15,096 genome-wide single nucleotide polymorphisms revealed an admixed domestication pattern among cultivated (desi and kabuli) and wild chickpea accessions belonging to three population groups reflecting significant influence of parentage or geographical origin for their cultivar-specific population classification. The assembly and the polymorphic sequence resources presented here would facilitate the study of chickpea domestication and targeted use of wild Cicer germplasms for agronomic trait improvement in chickpea. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Complexity in the cattle CD94/NKG2 gene families.

PubMed

Birch, James; Ellis, Shirley A

2007-04-01

Natural killer cell responses are controlled to a large extent by the interaction of an array of inhibitory and activating receptors with their ligands. The mostly nonpolymorphic CD94/NKG2 receptors in both humans and mice were shown to recognize a single nonclassical MHC class I molecule in each case. In this paper, we describe the CD94/NKG2 gene family in cattle. NKG2 and CD94 sequences were amplified from cDNA derived from four animals. Four CD94 sequences, ten NKG2A, and three NKG2C sequences were identified in total. In contrast to human, we show that cattle have multiple distinct NKG2A genes, some of which show minor allelic variation. All of the sequences designated NKG2A have two tyrosine-based inhibitory motifs in the cytoplasmic domain and one putative gene has, in addition, a charged residue in the transmembrane domain. NKG2C appears to be essentially monomorphic in cattle. All of the NKG2A sequences are similar apart from NKG2A-01, which, in contrast, shares the majority of its carbohydrate recognition domain with NKG2-C. Most of the genes appear to generate multiple alternatively spliced forms. These findings suggest that the CD94/NKG2A heterodimers in cattle, in contrast to other species, are binding several different ligands. Because NKG2C is not polymorphic, this raises questions as to the combined functional capacity of the CD94/NKG2 gene families in cattle.
Comparative genomics of Lactobacillus

PubMed Central

Kant, Ravi; Blom, Jochen; Palva, Airi; Siezen, Roland J.; de Vos, Willem M.

2011-01-01

Summary The genus Lactobacillus includes a diverse group of bacteria consisting of many species that are associated with fermentations of plants, meat or milk. In addition, various lactobacilli are natural inhabitants of the intestinal tract of humans and other animals. Finally, several Lactobacillus strains are marketed as probiotics as their consumption can confer a health benefit to host. Presently, 154 Lactobacillus species are known and a growing fraction of these are subject to draft genome sequencing. However, complete genome sequences are needed to provide a platform for detailed genomic comparisons. Therefore, we selected a total of 20 genomes of various Lactobacillus strains for which complete genomic sequences have been reported. These genomes had sizes varying from 1.8 to 3.3 Mb and other characteristic features, such as G+C content that ranged from 33% to 51%. The Lactobacillus pan genome was found to consist of approximately 14 000 protein‐encoding genes while all 20 genomes shared a total of 383 sets of orthologous genes that defined the Lactobacillus core genome (LCG). Based on advanced phylogeny of the proteins encoded by this LCG, we grouped the 20 strains into three main groups and defined core group genes present in all genomes of a single group, signature group genes shared in all genomes of one group but absent in all other Lactobacillus genomes, and Group‐specific ORFans present in core group genes of one group and absent in all other complete genomes. The latter are of specific value in defining the different groups of genomes. The study provides a platform for present individual comparisons as well as future analysis of new Lactobacillus genomes. PMID:21375712
Within-host whole genome analysis of an antibiotic resistant Pseudomonas aeruginosa strain sub-type in cystic fibrosis.

PubMed

Sherrard, Laura J; Tai, Anna S; Wee, Bryan A; Ramsay, Kay A; Kidd, Timothy J; Ben Zakour, Nouri L; Whiley, David M; Beatson, Scott A; Bell, Scott C

2017-01-01

A Pseudomonas aeruginosa AUST-02 strain sub-type (M3L7) has been identified in Australia, infects the lungs of some people with cystic fibrosis and is associated with antibiotic resistance. Multiple clonal lineages may emerge during treatment with mutations in chromosomally encoded antibiotic resistance genes commonly observed. Here we describe the within-host diversity and antibiotic resistance of M3L7 during and after antibiotic treatment of an acute pulmonary exacerbation using whole genome sequencing and show both variation and shared mutations in important genes. Eleven isolates from an M3L7 population (n = 134) isolated over 3 months from an individual with cystic fibrosis underwent whole genome sequencing. A phylogeny based on core genome SNPs identified three distinct phylogenetic groups comprising two groups with higher rates of mutation (hypermutators) and one non-hypermutator group. Genomes were screened for acquired antibiotic resistance genes with the result suggesting that M3L7 resistance is principally driven by chromosomal mutations as no acquired mechanisms were detected. Small genetic variations, shared by all 11 isolates, were found in 49 genes associated with antibiotic resistance including frame-shift mutations (mexA, mexT), premature stop codons (oprD, mexB) and mutations in quinolone-resistance determining regions (gyrA, parE). However, whole genome sequencing also revealed mutations in 21 genes that were acquired following divergence of groups, which may also impact the activity of antibiotics and multi-drug efflux pumps. Comparison of mutations with minimum inhibitory concentrations of anti-pseudomonal antibiotics could not easily explain all resistance profiles observed. These data further demonstrate the complexity of chronic and antibiotic resistant P. aeruginosa infection where a multitude of co-existing genotypically diverse sub-lineages might co-exist during and after intravenous antibiotic treatment.
A Case-by-Case Evolutionary Analysis of Four Imprinted Retrogenes

PubMed Central

McCole, Ruth B; Loughran, Noeleen B; Chahal, Mandeep; Fernandes, Luis P; Roberts, Roland G; Fraternali, Franca; O'Connell, Mary J; Oakey, Rebecca J

2011-01-01

Retroposition is a widespread phenomenon resulting in the generation of new genes that are initially related to a parent gene via very high coding sequence similarity. We examine the evolutionary fate of four retrogenes generated by such an event; mouse Inpp5f_v2, Mcts2, Nap1l5, and U2af1-rs1. These genes are all subject to the epigenetic phenomenon of parental imprinting. We first provide new data on the age of these retrogene insertions. Using codon-based models of sequence evolution, we show these retrogenes have diverse evolutionary trajectories, including divergence from the parent coding sequence under positive selection pressure, purifying selection pressure maintaining parent-retrogene similarity, and neutral evolution. Examination of the expression pattern of retrogenes shows an atypical, broad pattern across multiple tissues. Protein 3D structure modeling reveals that a positively selected residue in U2af1-rs1, not shared by its parent, may influence protein conformation. Our case-by-case analysis of the evolution of four imprinted retrogenes reveals that this interesting class of imprinted genes, while similar in regulation and sequence characteristics, follow very varied evolutionary paths. PMID:21166792
Rare Compound Heterozygous Frameshift Mutations in ALMS1 Gene Identified Through Exome Sequencing in a Taiwanese Patient With Alström Syndrome.

PubMed

Tsai, Meng-Che; Yu, Hui-Wen; Liu, Tsunglin; Chou, Yen-Yin; Chiou, Yuan-Yow; Chen, Peng-Chieh

2018-01-01

Alström syndrome (AS) is a rare autosomal recessive disorder that shares clinical features with other ciliopathy-related diseases. Genetic mutation analysis is often required in making differential diagnosis but usually costly in time and effort using conventional Sanger sequencing. Herein we describe a Taiwanese patient presenting cone-rod dystrophy and early-onset obesity that progressed to diabetes mellitus with marked insulin resistance during adolescence. Whole exome sequencing of the patient's genomic DNA identified a novel frameshift mutation in exons 15 (c.10290_10291delTA, p.Lys3431Serfs * 10) and a rare mutation in 16 (c.10823_10824delAG, p.Arg3609Alafs * 6) of ALMS1 gene. The compound heterozygous mutations were predicted to render truncated proteins. This report highlighted the clinical utility of exome sequencing and extended the knowledge of mutation spectrum in AS patients.
An innovative platform for quick and flexible joining of assorted DNA fragments

DOE PAGES

De Paoli, Henrique Cestari; Tuskan, Gerald A.; Yang, Xiaohan

2016-01-13

Successful synthetic biology efforts rely on conceptual and experimental designs in combination with testing of multi-gene constructs. Despite recent progresses, several limitations still hinder the ability to flexibly assemble and collectively share different types of DNA segments. We describe an advanced system for joining DNA fragments from a universal library that automatically maintains open reading frames (ORFs) and does not require linkers, adaptors, sequence homology, amplification or mutation (domestication) of fragments in order to work properly. Moreover, we find that this system, which is enhanced by a unique buffer formulation, provides unforeseen capabilities for testing, and sharing, complex multi-gene circuitrymore » assembled from different DNA fragments.« less
Delineation of Streptococcus dysgalactiae, Its Subspecies, and Its Clinical and Phylogenetic Relationship to Streptococcus pyogenes

PubMed Central

Jensen, Anders

2012-01-01

The taxonomic status and structure of Streptococcus dysgalactiae have been the object of much confusion. Bacteria belonging to this species are usually referred to as Lancefield group C or group G streptococci in clinical settings in spite of the fact that these terms lack precision and prevent recognition of the exact clinical relevance of these bacteria. The purpose of this study was to develop an improved basis for delineation and identification of the individual species of the pyogenic group of streptococci in the clinical microbiology laboratory, with a special focus on S. dysgalactiae. We critically reexamined the genetic relationships of the species S. dysgalactiae, Streptococcus pyogenes, Streptococcus canis, and Streptococcus equi, which may share Lancefield group antigens, by phylogenetic reconstruction based on multilocus sequence analysis (MLSA) and 16S rRNA gene sequences and by emm typing combined with phenotypic characterization. Analysis of concatenated sequences of seven genes previously used for examination of viridans streptococci distinguished robust and coherent clusters. S. dysgalactiae consists of two separate clusters consistent with the two recognized subspecies dysgalactiae and equisimilis. Both taxa share alleles with S. pyogenes in several housekeeping genes, which invalidates identification based on single-locus sequencing. S. dysgalactiae, S. canis, and S. pyogenes constitute a closely related branch within the genus Streptococcus indicative of recent descent from a common ancestor, while S. equi is highly divergent from other species of the pyogenic group streptococci. The results provide an improved basis for identification of clinically important pyogenic group streptococci and explain the overlapping spectrum of infections caused by the species associated with humans. PMID:22075580
Optimizing and benchmarking de novo transcriptome sequencing: from library preparation to assembly evaluation.

PubMed

Hara, Yuichiro; Tatsumi, Kaori; Yoshida, Michio; Kajikawa, Eriko; Kiyonari, Hiroshi; Kuraku, Shigehiro

2015-11-18

RNA-seq enables gene expression profiling in selected spatiotemporal windows and yields massive sequence information with relatively low cost and time investment, even for non-model species. However, there remains a large room for optimizing its workflow, in order to take full advantage of continuously developing sequencing capacity. Transcriptome sequencing for three embryonic stages of Madagascar ground gecko (Paroedura picta) was performed with the Illumina platform. The output reads were assembled de novo for reconstructing transcript sequences. In order to evaluate the completeness of transcriptome assemblies, we prepared a reference gene set consisting of vertebrate one-to-one orthologs. To take advantage of increased read length of >150 nt, we demonstrated shortened RNA fragmentation time, which resulted in a dramatic shift of insert size distribution. To evaluate products of multiple de novo assembly runs incorporating reads with different RNA sources, read lengths, and insert sizes, we introduce a new reference gene set, core vertebrate genes (CVG), consisting of 233 genes that are shared as one-to-one orthologs by all vertebrate genomes examined (29 species)., The completeness assessment performed by the computational pipelines CEGMA and BUSCO referring to CVG, demonstrated higher accuracy and resolution than with the gene set previously established for this purpose. As a result of the assessment with CVG, we have derived the most comprehensive transcript sequence set of the Madagascar ground gecko by means of assembling individual libraries followed by clustering the assembled sequences based on their overall similarities. Our results provide several insights into optimizing de novo RNA-seq workflow, including the coordination between library insert size and read length, which manifested in improved connectivity of assemblies. The approach and assembly assessment with CVG demonstrated here would be applicable to transcriptome analysis of other species as well as whole genome analyses.
Distinct profiles of expressed sequence tags during intestinal regeneration in the sea cucumber Holothuria glaberrima

PubMed Central

Rojas-Cartagena, Carmencita; Ortíz-Pineda, Pablo; Ramírez-Gómez, Francisco; Suárez-Castillo, Edna C.; Matos-Cruz, Vanessa; Rodríguez, Carlos; Ortíz-Zuazaga, Humberto; García-Arrarás, José E.

2010-01-01

Repair and regeneration are key processes for tissue maintenance, and their disruption may lead to disease states. Little is known about the molecular mechanisms that underline the repair and regeneration of the digestive tract. The sea cucumber Holothuria glaberrima represents an excellent model to dissect and characterize the molecular events during intestinal regeneration. To study the gene expression profile, cDNA libraries were constructed from normal, 3-day, and 7-day regenerating intestines of H. glaberrima. Clones were randomly sequenced and queried against the nonredundant protein database at the National Center for Biotechnology Information. RT-PCR analyses were made of several genes to determine their expression profile during intestinal regeneration. A total of 5,173 sequences from three cDNA libraries were obtained. About 46.2, 35.6, and 26.2% of the sequences for the normal, 3-days, and 7-days cDNA libraries, respectively, shared significant similarity with known sequences in the protein database of GenBank but only present 10% of similarity among them. Analysis of the libraries in terms of functional processes, protein domains, and most common sequences suggests that a differential expression profile is taking place during the regeneration process. Further examination of the expressed sequence tag dataset revealed that 12 putative genes are differentially expressed at significant level (R > 6). Experimental validation by RT-PCR analysis reveals that at least three genes (unknown C-4677-1, melanotransferrin, and centaurin) present a differential expression during regeneration. These findings strongly suggest that the gene expression profile varies among regeneration stages and provide evidence for the existence of differential gene expression. PMID:17579180
An ethylene-responsive enhancer element is involved in the senescence-related expression of the carnation glutathione-S-transferase (GST1) gene.

PubMed

Itzhaki, H; Maxson, J M; Woodson, W R

1994-09-13

The increased production of ethylene during carnation petal senescence regulates the transcription of the GST1 gene encoding a subunit of glutathione-S-transferase. We have investigated the molecular basis for this ethylene-responsive transcription by examining the cis elements and trans-acting factors involved in the expression of the GST1 gene. Transient expression assays following delivery of GST1 5' flanking DNA fused to a beta-glucuronidase receptor gene were used to functionally define sequences responsible for ethylene-responsive expression. Deletion analysis of the 5' flanking sequences of GST1 identified a single positive regulatory element of 197 bp between -667 and -470 necessary for ethylene-responsive expression. The sequences within this ethylene-responsive region were further localized to 126 bp between -596 and -470. The ethylene-responsive element (ERE) within this region conferred ethylene-regulated expression upon a minimal cauliflower mosaic virus-35S TATA-box promoter in an orientation-independent manner. Gel electrophoresis mobility-shift assays and DNase I footprinting were used to identify proteins that bind to sequences within the ERE. Nuclear proteins from carnation petals were shown to specifically interact with the 126-bp ERE and the presence and binding of these proteins were independent of ethylene or petal senescence. DNase I footprinting defined DNA sequences between -510 and -488 within the ERE specifically protected by bound protein. An 8-bp sequence (ATTTCAAA) within the protected region shares significant homology with promoter sequences required for ethylene responsiveness from the tomato fruit-ripening E4 gene.
Lactobacillus rodentium sp. nov., from the digestive tract of wild rodents.

PubMed

Killer, J; Havlík, J; Vlková, E; Rada, V; Pechar, R; Benada, O; Kopečný, J; Kofroňová, O; Sechovcová, H

2014-05-01

Three strains of regular, long, Gram-stain-positive bacterial rods were isolated using TPY, M.R.S. and Rogosa agar under anaerobic conditions from the digestive tract of wild mice (Mus musculus). All 16S rRNA gene sequences of these isolates were most similar to sequences of Lactobacillus gasseri ATCC 33323T and Lactobacillus johnsonii ATCC 33200T (97.3% and 97.2% sequence similarities, respectively). The novel strains shared 99.2-99.6% 16S rRNA gene sequence similarities. Type strains of L. gasseri and L. johnsonii were also most related to the newly isolated strains according to rpoA (83.9-84.0% similarities), pheS (84.6-87.8%), atpA (86.2-87.7%), hsp60 (89.4-90.4%) and tuf (92.7-93.6%) gene sequence similarities. Phylogenetic studies based on 16S rRNA, hsp60, rpoA, atpA and pheS gene sequences, other genotypic and many phenotypic characteristics (results of API 50 CHL, Rapid ID 32A and API ZYM biochemical tests; cellular fatty acid profiles; cellular polar lipid profiles; end products of glucose fermentation) showed that these bacterial strains represent a novel species within the genus Lactobacillus. The name Lactobacillus rodentium sp. nov. is proposed to accommodate this group of new isolates. The type strain is MYMRS/TLU1T (=DSM 24759T=CCM 7945T).

Sequence divergence of the red and green visual pigments in great apes and humans.

PubMed Central

Deeb, S S; Jorgensen, A L; Battisti, L; Iwasaki, L; Motulsky, A G

1994-01-01

We have determined the coding sequences of red and green visual pigment genes of the chimpanzee, gorilla, and orangutan. The deduced amino acid sequences of these pigments are highly homologous to the equivalent human pigments. None of the amino acid differences occurred at sites that were previously shown to influence pigment absorption characteristics. Therefore, we predict the spectra of red and green pigments of the apes to have wavelengths of maximum absorption that differ by < 2 nm from the equivalent human pigments and that color vision in these nonhuman primates will be very similar, if not identical, to that in humans. A total of 14 within-species polymorphisms (6 involving silent substitutions) were observed in the coding sequences of the red and green pigment genes of the great apes. Remarkably, the polymorphisms at 6 of these sites had been observed in human populations, suggesting that they predated the evolution of higher primates. Alleles at polymorphic sites were often shared between the red and green pigment genes. The average synonymous rate of divergence of red from green sequences was approximately 1/10th that estimated for other proteins of higher primates, indicating the involvement of gene conversion in generating these polymorphisms. The high degree of homology and juxtaposition of these two genes on the X chromosome has promoted unequal recombination and/or gene conversion that led to sequence homogenization. However, natural selection operated to maintain the degree of separation in peak absorbance between the red and green pigments that resulted in optimal chromatic discrimination. This represents a unique case of molecular coevolution between two homologous genes that functionally interact at the behavioral level. PMID:8041777
Comparing the Dictyostelium and Entamoeba Genomes Reveals an Ancient Split in the Conosa Lineage

PubMed Central

Song, Jie; Xu, Qikai; Olsen, Rolf; Loomis, William F; Shaulsky, Gad; Kuspa, Adam; Sucgang, Richard

2005-01-01

The Amoebozoa are a sister clade to the fungi and the animals, but are poorly sampled for completely sequenced genomes. The social amoeba Dictyostelium discoideum and amitochondriate pathogen Entamoeba histolytica are the first Amoebozoa with genomes completely sequenced. Both organisms are classified under the Conosa subphylum. To identify Amoebozoa-specific genomic elements, we compared these two genomes to each other and to other eukaryotic genomes. An expanded phylogenetic tree built from the complete predicted proteomes of 23 eukaryotes places the two amoebae in the same lineage, although the divergence is estimated to be greater than that between animals and fungi, and probably happened shortly after the Amoebozoa split from the opisthokont lineage. Most of the 1,500 orthologous gene families shared between the two amoebae are also shared with plant, animal, and fungal genomes. We found that only 42 gene families are distinct to the amoeba lineage; among these are a large number of proteins that contain repeats of the FNIP domain, and a putative transcription factor essential for proper cell type differentiation in D. discoideum. These Amoebozoa-specific genes may be useful in the design of novel diagnostics and therapies for amoebal pathologies. PMID:16362072
Molecular cloning and expression of the gene encoding the kinetoplast-associated type II DNA topoisomerase of Crithidia fasciculata.

PubMed

Pasion, S G; Hines, J C; Aebersold, R; Ray, D S

1992-01-01

A type II DNA topoisomerase, topoIImt, was shown previously to be associated with the kinetoplast DNA of the trypanosomatid Crithidia fasciculata. The gene encoding this kinetoplast-associated topoisomerase has been cloned by immunological screening of a Crithidia genomic expression library with monoclonal antibodies raised against the purified enzyme. The gene CfaTOP2 is a single copy gene and is expressed as a 4.8-kb polyadenylated transcript. The nucleotide sequence of CfaTOP2 has been determined and encodes a predicted polypeptide of 1239 amino acids with a molecular mass of 138,445. The identification of the cloned gene is supported by immunoblot analysis of the beta-galactosidase-CfaTOP2 fusion protein expressed in Escherichia coli and by analysis of tryptic peptide sequences derived from purified topoIImt. CfaTOP2 shares significant homology with nuclear type II DNA topoisomerases of other eukaryotes suggesting that in Crithidia both nuclear and mitochondrial forms of topoisomerase II are encoded by the same gene.
Cloning and characterization of a Candida albicans maltase gene involved in sucrose utilization.

PubMed Central

Geber, A; Williamson, P R; Rex, J H; Sweeney, E C; Bennett, J E

1992-01-01

In order to isolate the structural gene involved in sucrose utilization, we screened a sucrose-induced Candida albicans cDNA library for clones expressing alpha-glucosidase activity. The C. albicans maltase structural gene (CAMAL2) was isolated. No other clones expressing alpha-glucosidase activity. were detected. A genomic CAMAL2 clone was obtained by screening a size-selected genomic library with the cDNA clone. DNA sequence analysis reveals that CAMAL2 encodes a 570-amino-acid protein which shares 50% identity with the maltase structural gene (MAL62) of Saccharomyces carlsbergensis. The substrate specificity of the recombinant protein purified from Escherichia coli identifies the enzyme as a maltase. Northern (RNA) analysis reveals that transcription of CAMAL2 is induced by maltose and sucrose and repressed by glucose. These results suggest that assimilation of sucrose in C. albicans relies on an inducible maltase enzyme. The family of genes controlling sucrose utilization in C. albicans shares similarities with the MAL gene family of Saccharomyces cerevisiae and provides a model system for studying gene regulation in this pathogenic yeast. Images PMID:1400249
Genome-wide comparative analysis reveals human-mouse regulatory landscape and evolution.

PubMed

Denas, Olgert; Sandstrom, Richard; Cheng, Yong; Beal, Kathryn; Herrero, Javier; Hardison, Ross C; Taylor, James

2015-02-14

Because species-specific gene expression is driven by species-specific regulation, understanding the relationship between sequence and function of the regulatory regions in different species will help elucidate how differences among species arise. Despite active experimental and computational research, relationships among sequence, conservation, and function are still poorly understood. We compared transcription factor occupied segments (TFos) for 116 human and 35 mouse TFs in 546 human and 125 mouse cell types and tissues from the Human and the Mouse ENCODE projects. We based the map between human and mouse TFos on a one-to-one nucleotide cross-species mapper, bnMapper, that utilizes whole genome alignments (WGA). Our analysis shows that TFos are under evolutionary constraint, but a substantial portion (25.1% of mouse and 25.85% of human on average) of the TFos does not have a homologous sequence on the other species; this portion varies among cell types and TFs. Furthermore, 47.67% and 57.01% of the homologous TFos sequence shows binding activity on the other species for human and mouse respectively. However, 79.87% and 69.22% is repurposed such that it binds the same TF in different cells or different TFs in the same cells. Remarkably, within the set of repurposed TFos, the corresponding genome regions in the other species are preferred locations of novel TFos. These events suggest exaptation of some functional regulatory sequences into new function. Despite TFos repurposing, we did not find substantial changes in their predicted target genes, suggesting that CRMs buffer evolutionary events allowing little or no change in the TFos - target gene associations. Thus, the small portion of TFos with strictly conserved occupancy underestimates the degree of conservation of regulatory interactions. We mapped regulatory sequences from an extensive number of TFs and cell types between human and mouse using WGA. A comparative analysis of this correspondence unveiled the extent of the shared regulatory sequence across TFs and cell types under study. Importantly, a large part of the shared regulatory sequence is repurposed on the other species. This sequence, fueled by turnover events, provides a strong case for exaptation in regulatory elements.
The Genome of the Western Clawed Frog Xenopus tropicalis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hellsten, Uffe; Harland, Richard M.; Gilchrist, Michael J.

2009-10-01

The western clawed frog Xenopus tropicalis is an important model for vertebrate development that combines experimental advantages of the African clawed frog Xenopus laevis with more tractable genetics. Here we present a draft genome sequence assembly of X. tropicalis. This genome encodes over 20,000 protein-coding genes, including orthologs of at least 1,700 human disease genes. Over a million expressed sequence tags validated the annotation. More than one-third of the genome consists of transposable elements, with unusually prevalent DNA transposons. Like other tetrapods, the genome contains gene deserts enriched for conserved non-coding elements. The genome exhibits remarkable shared synteny with humanmore » and chicken over major parts of large chromosomes, broken by lineage-specific chromosome fusions and fissions, mainly in the mammalian lineage.« less
Isolation and characterization of cDNAs encoding wheat 3-hydroxy-3-methylglutaryl coenzyme A reductase.

PubMed Central

Aoyagi, K; Beyou, A; Moon, K; Fang, L; Ulrich, T

1993-01-01

The enzyme 3-hydroxy-3-methylglutaryl coenzyme A reductase (HMGR, EC 1.1.1.34) is a key enzyme in the isoprenoid biosynthetic pathway. We have isolated partial cDNAs from wheat (Triticum aestivum) using the polymerase chain reaction. Comparison of deduced amino acid sequences of these cDNAs shows that they represent a small family of genes that share a high degree of sequence homology among themselves as well as among genes from other organisms including tomato, Arabidopsis, hamster, human, Drosophila, and yeast. Southern blot analysis reveals the presence of at least four genes. Our results concerning the tissue-specific expression as well as developmental regulation of these HMGR cDNAs highlight the important role of this enzyme in the growth and development of wheat. PMID:8108513
Phylogenetic evidence for intratypic recombinant events in a novel human adenovirus C that causes severe acute respiratory infection in children.

PubMed

Wang, Yanqun; Li, Yamin; Lu, Roujian; Zhao, Yanjie; Xie, Zhengde; Shen, Jun; Tan, Wenjie

2016-03-10

Human adenoviruses (HAdVs) are prevalent in hospitalized children with severe acute respiratory infection (SARI). Here, we report a unique recombinant HAdV strain (CBJ113) isolated from a HAdV-positive child with SARI. The whole-genome sequence was determined using Sanger sequencing and high-throughput sequencing. A phylogenetic analysis of the complete genome indicated that the CBJ113 strain shares a common origin with HAdV-C2, HAdV-C6, HAdV-C1, HAdV-C5, and HAdV-C57 and formed a novel subclade on the same branch as other HAdV-C subtypes. BootScan and single nucleotide polymorphism analyses showed that the CBJ113 genome has an intra-subtype recombinant structure and comprises gene regions mainly originating from two circulating viral strains: HAdV-1 and HAdV-2. The parental penton base, pVI, and DBP genes of the recombinant strain clustered with the HAdV-1 prototype strain, and the E1B, hexon, fiber, and 100 K genes of the recombinant clustered within the HAdV-2 subtype, meanwhile the E4orf1 and DNA polymerase genes of the recombinant shared the greatest similarity with those of HAdV-5 and HAdV-6, respectively. All of these findings provide insight into our understanding of the dynamics of the complexity of the HAdV-C epidemic. More extensive studies should address the pathogenicity and clinical characteristics of the novel recombinant.
Phylogenetic evidence for intratypic recombinant events in a novel human adenovirus C that causes severe acute respiratory infection in children

PubMed Central

Wang, Yanqun; Li, Yamin; Lu, Roujian; Zhao, Yanjie; Xie, Zhengde; Shen, Jun; Tan, Wenjie

2016-01-01

Human adenoviruses (HAdVs) are prevalent in hospitalized children with severe acute respiratory infection (SARI). Here, we report a unique recombinant HAdV strain (CBJ113) isolated from a HAdV-positive child with SARI. The whole-genome sequence was determined using Sanger sequencing and high-throughput sequencing. A phylogenetic analysis of the complete genome indicated that the CBJ113 strain shares a common origin with HAdV-C2, HAdV-C6, HAdV-C1, HAdV-C5, and HAdV-C57 and formed a novel subclade on the same branch as other HAdV-C subtypes. BootScan and single nucleotide polymorphism analyses showed that the CBJ113 genome has an intra-subtype recombinant structure and comprises gene regions mainly originating from two circulating viral strains: HAdV-1 and HAdV-2. The parental penton base, pVI, and DBP genes of the recombinant strain clustered with the HAdV-1 prototype strain, and the E1B, hexon, fiber, and 100 K genes of the recombinant clustered within the HAdV-2 subtype, meanwhile the E4orf1 and DNA polymerase genes of the recombinant shared the greatest similarity with those of HAdV-5 and HAdV-6, respectively. All of these findings provide insight into our understanding of the dynamics of the complexity of the HAdV-C epidemic. More extensive studies should address the pathogenicity and clinical characteristics of the novel recombinant. PMID:26960434
The Effect on the Transcriptome of Anemone coronaria following Infection with Rust (Tranzschelia discolor)

PubMed Central

Laura, Marina; Borghi, Cristina; Bobbio, Valentina; Allavena, Andrea

2015-01-01

In order to understand plant/pathogen interaction, the transcriptome of uninfected (1S) and infected (2I) plant was sequenced at 3’end by the GS FLX 454 platform. De novo assembly of high-quality reads generated 27,231 contigs leaving 37,191 singletons in the 1S and 38,393 in the 2I libraries. ESTcalc tool suggested that 71% of the transcriptome had been captured, with 99% of the genes present being represented by at least one read. Unigene annotation showed that 50.5% of the predicted translation products shared significant homology with protein sequences in GenBank. In all 253 differential transcript abundance (DTAs) were in higher abundance and 52 in lower abundance in the 2I library. 128 higher abundance DTA genes were of fungal origin and 49 were clearly plant sequences. A tBLASTn-based search of the sequences using as query the full length predicted polypeptide product of 50 R genes identified 16 R gene products. Only one R gene (PGIP) was up-regulated. The response of the plant to fungal invasion included the up-regulation of several pathogenesis related protein (PR) genes involved in JA signaling and other genes associated with defense response and down regulation of cell wall associated genes, non-race-specific disease resistance1 (NDR1) and other genes like myb, presqualene diphosphate phosphatase (PSDPase), a UDP-glycosyltransferase 74E2-like (UGT). The DTA genes identified here should provide a basis for understanding the A. coronaria/T. discolor interaction and leads for biotechnology-based disease resistance breeding. PMID:25768012
The effect on the transcriptome of Anemone coronaria following infection with rust (Tranzschelia discolor).

PubMed

Laura, Marina; Borghi, Cristina; Bobbio, Valentina; Allavena, Andrea

2015-01-01

In order to understand plant/pathogen interaction, the transcriptome of uninfected (1S) and infected (2I) plant was sequenced at 3'end by the GS FLX 454 platform. De novo assembly of high-quality reads generated 27,231 contigs leaving 37,191 singletons in the 1S and 38,393 in the 2I libraries. ESTcalc tool suggested that 71% of the transcriptome had been captured, with 99% of the genes present being represented by at least one read. Unigene annotation showed that 50.5% of the predicted translation products shared significant homology with protein sequences in GenBank. In all 253 differential transcript abundance (DTAs) were in higher abundance and 52 in lower abundance in the 2I library. 128 higher abundance DTA genes were of fungal origin and 49 were clearly plant sequences. A tBLASTn-based search of the sequences using as query the full length predicted polypeptide product of 50 R genes identified 16 R gene products. Only one R gene (PGIP) was up-regulated. The response of the plant to fungal invasion included the up-regulation of several pathogenesis related protein (PR) genes involved in JA signaling and other genes associated with defense response and down regulation of cell wall associated genes, non-race-specific disease resistance1 (NDR1) and other genes like myb, presqualene diphosphate phosphatase (PSDPase), a UDP-glycosyltransferase 74E2-like (UGT). The DTA genes identified here should provide a basis for understanding the A. coronaria/T. discolor interaction and leads for biotechnology-based disease resistance breeding.
Cloning and sequence analysis of the LEU2 homologue gene from Pichia anomala.

PubMed

De la Rosa, J M; Pérez, J A; Gutiérrez, F; González, J M; Ruiz, T; Rodríguez, L

2001-11-01

The Pichia anomala LEU2 gene (PaLEU2) was isolated by complementation of a leu2 Saccharomyces cerevisiae mutant. The cloned gene also allowed growth of a Escherichia coli leuB mutant in leucine-lacking medium, indicating that it encodes a product able to complement the beta-isopropylmalate dehydrogenase deficiency of the mutants. The sequenced DNA fragment contains a complete ORF of 1092 bp, and the deduced polypeptide shares significant homologies with the products of the LEU2 genes from S. cerevisiae (84% identity) and other yeast species. A sequence resembling the GC-rich palindrome motif identified in the 5' region of S. cerevisiae LEU2 gene as the binding site for the transcription activating factor encoded by the LEU3 gene was found at the promoter region. In addition, upstream of the PaLEU2 the 3'-terminal half of a gene of the same orientation, encoding a homologue of the S. cerevisiae NFS1/SPL1 gene that encodes a mitochondrial cysteine desulphurase involved in both tRNA processing and mitochondrial metabolism, was found. The genomic organization of the PaNFS1-PaLEU2 gene pair is similar to that found in several other yeast species, including S. cerevisiae and Candida albicans, except that in some of them the LEU2 gene appears in the reverse orientation. Copyright 2001 John Wiley & Sons, Ltd.
Genomic organization, expression, and chromosome localization of a third aurora-related kinase gene, Aie1.

PubMed

Hu, H M; Chuang, C K; Lee, M J; Tseng, T C; Tang, T K

2000-11-01

We previously reported two novel testis-specific serine/threonine kinases, Aie1 (mouse) and AIE2 (human), that share high amino acid identities with the kinase domains of fly aurora and yeast Ipl1. Here, we report the entire intron-exon organization of the Aie1 gene and analyze the expression patterns of Aie1 mRNA during testis development. The mouse Aie1 gene spans approximately 14 kb and contains seven exons. The sequences of the exon-intron boundaries of the Aie1 gene conform to the consensus sequences (GT/AG) of the splicing donor and acceptor sites of most eukaryotic genes. Comparative genomic sequencing revealed that the gene structure is highly conserved between mouse Aie1 and human AIE2. However, much less homology was found in the sequence outside the kinase-coding domains. The Aie1 locus was mapped to mouse chromosome 7A2-A3 by fluorescent in situ hybridization. Northern blot analysis indicates that Aie1 mRNA likely is expressed at a low level on day 14 and reaches its plateau on day 21 in the developing postnatal testis. RNA in situ hybridization indicated that the expression of the Aie1 transcript was restricted to meiotically active germ cells, with the highest levels detected in spermatocytes at the late pachytene stage. These findings suggest that Aie1 plays a role in spermatogenesis.
Novel arrangement and comparative analysis of hsp90 family genes in three thermotolerant species of Stratiomyidae (Diptera).

PubMed

Astakhova, L N; Zatsepina, O G; Przhiboro, A A; Evgen'ev, M B; Garbuz, D G

2013-06-01

The heat shock proteins belonging to the Hsp90 family (Hsp83 in Diptera) play a crucial role in the protection of cells due to their chaperoning functions. We sequenced hsp90 genes from three species of the family Stratiomyidae (Diptera) living in thermally different habitats and characterized by extraordinarily high thermotolerance. The sequence variation and structure of the hsp90 family genes were compared with previously described features of hsp70 copies isolated from the same species. Two functional hsp83 genes were found in the species studied, that are arranged in tandem orientation at least in one of them. This organization was not previously described. Stratiomyidae hsp83 genes share a high level of identity with hsp83 of Drosophila, and the deduced protein possesses five conserved amino acid sequence motifs characteristic of the Hsp90 family as well as the C-terminus MEEVD sequence characteristic of the cytosolic isoform. A comparison of the hsp83 promoters of two Stratiomyidae species from thermally contrasting habitats demonstrated that while both species contain canonical heat shock elements in the same position, only one of the species contains functional GAF-binding elements. Our data indicate that in the same species, hsp83 family genes show a higher evolution rate than the hsp70 family. © 2013 Royal Entomological Society.
Molecular cloning and sequence analysis of the Anticarsia gemmatalis multicapsid nuclear polyhedrosis virus GP64 glycoprotein.

PubMed

Pilloff, Marcela Gabriela; Bilen, Marcos Fabián; Belaich, Mariano Nicolás; Lozano, Mario Enrique; Ghiringhelli, Pablo Daniel

2003-01-01

The gp64 locus of Anticarsia gemmatalis multicapsid nucleopolyhedrovirus isolate Santa Fe (AgMNPV-SF) was characterised molecularly in our laboratory. To this end, we have located and cloned a AgMNPV-SF genomic DNA fragment containing the gp64 gene and sequenced the complete gp64 locus. Nucleotide sequence analysis indicated that the AgMNPV gp64 gene consists of a 1500 nucleotide open reading frame (ORF), encoding a protein of 499 amino acids. Of the seven gp64 homologues identified to date, the AgMNPV gp64 ORF shared most sequence similarity with the gp64 gene of Orgyia pseudotsugata MNPV. The GP64 from AgMNPV is the smallest baculoviral envelope glycoprotein found to date, differing in 10 or more residues from the other group I nucleopolyhedroviruses. The biological activity of AgMNPV GP64 protein was assessed by cell fusion assays in UFL-AG-286 cells using the obtained recombinant plasmids. In the upstream and downstream regions, relative to the gp64 ORF, we found different conserved transcriptional and post-transcriptional regulatory elements, respectively.
Developing expressed sequence tag libraries and the discovery of simple sequence repeat markers for two species of raspberry (Rubus L.).

PubMed

Bushakra, Jill M; Lewers, Kim S; Staton, Margaret E; Zhebentyayeva, Tetyana; Saski, Christopher A

2015-10-26

Due to a relatively high level of codominant inheritance and transferability within and among taxonomic groups, simple sequence repeat (SSR) markers are important elements in comparative mapping and delineation of genomic regions associated with traits of economic importance. Expressed sequence tags (ESTs) are a source of SSRs that can be used to develop markers to facilitate plant breeding and for more basic research across genera and higher plant orders. Leaf and meristem tissue from 'Heritage' red raspberry (Rubus idaeus) and 'Bristol' black raspberry (R. occidentalis) were utilized for RNA extraction. After conversion to cDNA and library construction, ESTs were sequenced, quality verified, assembled and scanned for SSRs. Primers flanking the SSRs were designed and a subset tested for amplification, polymorphism and transferability across species. ESTs containing SSRs were functionally annotated using the GenBank non-redundant (nr) database and further classified using the gene ontology database. To accelerate development of EST-SSRs in the genus Rubus (Rosaceae), 1149 and 2358 cDNA sequences were generated from red raspberry and black raspberry, respectively. The cDNA sequences were screened using rigorous filtering criteria which resulted in the identification of 121 and 257 SSR loci for red and black raspberry, respectively. Primers were designed from the surrounding sequences resulting in 131 and 288 primer pairs, respectively, as some sequences contained more than one SSR locus. Sequence analysis revealed that the SSR-containing genes span a diversity of functions and share more sequence identity with strawberry genes than with other Rosaceous species. This resource of Rubus-specific, gene-derived markers will facilitate the construction of linkage maps composed of transferable markers for studying and manipulating important traits in this economically important genus.
Parentage determination of Vanda Miss Joaquim (Orchidaceae) through two chloroplast genes rbcL and matK

PubMed Central

Khew, Gillian Su-Wen; Chia, Tet Fatt

2011-01-01

Background and aims The popular hybrid orchid Vanda Miss Joaquim was made Singapore's national flower in 1981. It was originally described in the Gardeners’ Chronicle in 1893, as a cross between Vanda hookeriana and Vanda teres. However, no record had been kept as to which parent contributed the pollen. This study was conducted using DNA barcoding techniques to determine the pod parent of V. Miss Joaquim, thereby inferring the pollen parent of the hybrid by exclusion. Methodology Two chloroplast genes, matK and rbcL, from five related taxa, V. hookeriana, V. teres var. alba, V. teres var. andersonii, V. teres var. aurorea and V. Miss Joaquim ‘Agnes’, were sequenced. The matK gene from herbarium specimens of V. teres and V. Miss Joaquim, both collected in 1893, was also sequenced. Principal results No sequence variation was found in the 600-bp region of rbcL sequenced. Sequence variation was found in the matK gene of V. hookeriana, V. teres var. alba, V. teres var. aurorea and V. Miss Joaquim ‘Agnes’. Complete sequence identity was established between V. teres var. andersonii and V. Miss Joaquim ‘Agnes’. The matK sequences obtained from the herbarium specimens of V. teres and V. Miss Joaquim were completely identical to the sequences obtained from the fresh samples of V. teres var. andersonii and V. Miss Joaquim ‘Agnes’. Conclusions The pod parent of V. Miss Joaquim ‘Agnes’ is V. teres var. andersonii and, by exclusion, the pollen parent is V. hookeriana. The herbarium and fresh samples of V. teres var. andersonii and V. Miss Joaquim share the same inferred maternity. The matK gene was more informative than rbcL and facilitated differentiation of varieties of V. teres. PMID:22476488
The complete mitochondrial genomes of three parasitic nematodes of birds: a unique gene order and insights into nematode phylogeny

PubMed Central

2013-01-01

Background Analyses of mitochondrial (mt) genome sequences in recent years challenge the current working hypothesis of Nematoda phylogeny proposed from morphology, ecology and nuclear small subunit rRNA gene sequences, and raise the need to sequence additional mt genomes for a broad range of nematode lineages. Results We sequenced the complete mt genomes of three Ascaridia species (family Ascaridiidae) that infest chickens, pigeons and parrots, respectively. These three Ascaridia species have an identical arrangement of mt genes to each other but differ substantially from other nematodes. Phylogenetic analyses of the mt genome sequences of the Ascaridia species, together with 62 other nematode species, support the monophylies of seven high-level taxa of the phylum Nematoda: 1) the subclass Dorylaimia; 2) the orders Rhabditida, Trichinellida and Mermithida; 3) the suborder Rhabditina; and 4) the infraorders Spiruromorpha and Oxyuridomorpha. Analyses of mt genome sequences, however, reject the monophylies of the suborders Spirurina and Tylenchina, and the infraorders Rhabditomorpha, Panagrolaimomorpha and Tylenchomorpha. Monophyly of the infraorder Ascaridomorpha varies depending on the methods of phylogenetic analysis. The Ascaridomorpha was more closely related to the infraorders Rhabditomorpha and Diplogasteromorpha (suborder Rhabditina) than they were to the other two infraorders of the Spirurina: Oxyuridorpha and Spiruromorpha. The closer relationship among Ascaridomorpha, Rhabditomorpha and Diplogasteromorpha was also supported by a shared common pattern of mitochondrial gene arrangement. Conclusions Analyses of mitochondrial genome sequences and gene arrangement has provided novel insights into the phylogenetic relationships among several major lineages of nematodes. Many lineages of nematodes, however, are underrepresented or not represented in these analyses. Expanding taxon sampling is necessary for future phylogenetic studies of nematodes with mt genome sequences. PMID:23800363
Characterization of a Novel Simian Immunodeficiency Virus (SIVmonNG1) Genome Sequence from a Mona Monkey (Cercopithecus mona)

PubMed Central

Barlow, Katrina L.; Ajao, Adebowale Oluwafemi; Clewley, Jonathan P.

2003-01-01

A novel simian immunodeficiency virus (SIV) sequence has been recovered from RNA extracted from the serum of a mona monkey (Cercopithecus mona) wild born in Nigeria. The sequence was obtained by using novel generic (degenerate) PCR primers and spans from two-thirds into the gag gene to the 3′ poly(A) tail of the SIVmonNG1 RNA genome. Analysis of the open reading frames revealed that the SIVmonNG1 genome codes for a Vpu protein, in addition to Gag, Pol, Vif, Vpr, Tat, Rev, Env, and Nef proteins. Previously, only lentiviruses infecting humans (human immunodeficiency virus type 1 [HIV-1]) and chimpanzees (SIVcpz) were known to have a vpu gene; more recently, this has also been found in SIVgsn from Cercopithecus nictitans. Overall, SIVmonNG1 most closely resembles SIVgsn: the env gene sequence groups with HIV-1/SIVcpz env sequences, whereas the pol gene sequence clusters closely with the pol sequence of SIVsyk from Cercopithecus albogaris. By bootscanning and similarity plotting, the first half of pol resembles SIVsyk, whereas the latter part is closer to SIVcol from Colobus guereza. The similarities between the complex mosaic genomes of SIVmonNG1 and SIVgsn are consistent with a shared or common lineage. These data further highlight the intricate nature of the relationships between the SIVs from different primate species and will be helpful for unraveling these associations. PMID:12768007
Tibrogargan and Coastal Plains rhabdoviruses: genomic characterization, evolution of novel genes and seroprevalence in Australian livestock.

PubMed

Gubala, Aneta; Davis, Steven; Weir, Richard; Melville, Lorna; Cowled, Chris; Boyle, David

2011-09-01

Tibrogargan virus (TIBV) and Coastal Plains virus (CPV) were isolated from cattle in Australia and TIBV has also been isolated from the biting midge Culicoides brevitarsis. Complete genomic sequencing revealed that the viruses share a novel genome structure within the family Rhabdoviridae, each virus containing two additional putative genes between the matrix protein (M) and glycoprotein (G) genes and one between the G and viral RNA polymerase (L) genes. The predicted novel protein products are highly diverged at the sequence level but demonstrate clear conservation of secondary structure elements, suggesting conservation of biological functions. Phylogenetic analyses showed that TIBV and CPV form an independent group within the 'dimarhabdovirus supergroup'. Although no disease has been observed in association with these viruses, antibodies were detected at high prevalence in cattle and buffalo in northern Australia, indicating the need for disease monitoring and further study of this distinctive group of viruses.

[Characterization of ibeB gene of meningitic Escherichia coli strains in calves from Xinjiang].

PubMed

Ling, Chen; Jiang, Jianjun; Song, Kang; Zhang, Kun; Shi, Yanxia; Feng, Guangyu; Ni, Hongbin; Zhu, Ling; Wang, Pengyan; Yan, Genqiang

2016-06-04

To understand the molecular biology information of ibeB gene of meningitic Escherichia coli isolates in calves. The strain used was isolated from the brain and liver tissue of calves died from Meningitis. It was identified to be an O161-K99-STa pathogenic Escherichia coli strain and named as bovine-EN and bovine-EG. Based on the sequence of ibeB gene of meningitic Escherichia coli K1 RS218 strain in GenBank, a pair of primers was designed and the ibeB gene was cloned from isolates by PCR. Part molecular biology information of ibeB among different strains was compared. The sequence length of isolates ibeB gene was 1500 bp, containing a 1371 bp open reading frame (ORF) encoding 457 amino acids. Bioinformatics analysis showed that the nucleotide and amino acid homology of ibeB gene of bovine-EN strain shared 90.5% and 96.9% identity with Escherichia coli K1 RS218 ibeB gene, respectively, while bovine-EG strain shared 99.4% and 100.0% identity with Escherichia coli K12 respectively. The ibeB gene of bovine-E strains encoded water-soluble protein whose molecular weight was 50.26 kDa and isoelectric point was 6.05. This protein contained a signal peptide A but no transmembrane domain. Subcellular localization of ibeB belonged to the secreted protein, which secretory signal path site (SP) proportion was 0.939. The ibeB gene was cloned from meningitic E. coli isolates and had higher homology and similar biological characteristics with meningitis E. coli K1 RS218ibeB, which belongs to extraintestinal pathogenic Escherichia coli.
Heterogeneous conservation of Dlx paralog co-expression in jawed vertebrates.

PubMed

Debiais-Thibaud, Mélanie; Metcalfe, Cushla J; Pollack, Jacob; Germon, Isabelle; Ekker, Marc; Depew, Michael; Laurenti, Patrick; Borday-Birraux, Véronique; Casane, Didier

2013-01-01

The Dlx gene family encodes transcription factors involved in the development of a wide variety of morphological innovations that first evolved at the origins of vertebrates or of the jawed vertebrates. This gene family expanded with the two rounds of genome duplications that occurred before jawed vertebrates diversified. It includes at least three bigene pairs sharing conserved regulatory sequences in tetrapods and teleost fish, but has been only partially characterized in chondrichthyans, the third major group of jawed vertebrates. Here we take advantage of developmental and molecular tools applied to the shark Scyliorhinus canicula to fill in the gap and provide an overview of the evolution of the Dlx family in the jawed vertebrates. These results are analyzed in the theoretical framework of the DDC (Duplication-Degeneration-Complementation) model. The genomic organisation of the catshark Dlx genes is similar to that previously described for tetrapods. Conserved non-coding elements identified in bony fish were also identified in catshark Dlx clusters and showed regulatory activity in transgenic zebrafish. Gene expression patterns in the catshark showed that there are some expression sites with high conservation of the expressed paralog(s) and other expression sites with events of paralog sub-functionalization during jawed vertebrate diversification, resulting in a wide variety of evolutionary scenarios within this gene family. Dlx gene expression patterns in the catshark show that there has been little neo-functionalization in Dlx genes over gnathostome evolution. In most cases, one tandem duplication and two rounds of vertebrate genome duplication have led to at least six Dlx coding sequences with redundant expression patterns followed by some instances of paralog sub-functionalization. Regulatory constraints such as shared enhancers, and functional constraints including gene pleiotropy, may have contributed to the evolutionary inertia leading to high redundancy between gene expression patterns.
Structure of CARB-4 and AER-1 CarbenicillinHydrolyzing β-Lactamases

PubMed Central

Sanschagrin, François; Bejaoui, Noureddine; Levesque, Roger C.

1998-01-01

We determined the nucleotide sequences of blaCARB-4 encoding CARB-4 and deduced a polypeptide of 288 amino acids. The gene was characterized as a variant of group 2c carbenicillin-hydrolyzing β-lactamases such as PSE-4, PSE-1, and CARB-3. The level of DNA homology between the bla genes for these β-lactamases varied from 98.7 to 99.9%, while that between these genes and blaCARB-4 encoding CARB-4 was 86.3%. The blaCARB-4 gene was acquired from some other source because it has a G+C content of 39.1%, compared to a G+C content of 67% for typical Pseudomonas aeruginosa genes. DNA sequencing revealed that blaAER-1 shared 60.8% DNA identity with blaPSE-3 encoding PSE-3. The deduced AER-1 β-lactamase peptide was compared to class A, B, C, and D enzymes and had 57.6% identity with PSE-3, including an STHK tetrad at the active site. For CARB-4 and AER-1, conserved canonical amino acid boxes typical of class A β-lactamases were identified in a multiple alignment. Analysis of the DNA sequences flanking blaCARB-4 and blaAER-1 confirmed the importance of gene cassettes acquired via integrons in bla gene distribution. PMID:9687391
Mitochondrial Genomes of Kinorhyncha: trnM Duplication and New Gene Orders within Animals.

PubMed

Popova, Olga V; Mikhailov, Kirill V; Nikitin, Mikhail A; Logacheva, Maria D; Penin, Aleksey A; Muntyan, Maria S; Kedrova, Olga S; Petrov, Nikolai B; Panchin, Yuri V; Aleoshin, Vladimir V

2016-01-01

Many features of mitochondrial genomes of animals, such as patterns of gene arrangement, nucleotide content and substitution rate variation are extensively used in evolutionary and phylogenetic studies. Nearly 6,000 mitochondrial genomes of animals have already been sequenced, covering the majority of animal phyla. One of the groups that escaped mitogenome sequencing is phylum Kinorhyncha-an isolated taxon of microscopic worm-like ecdysozoans. The kinorhynchs are thought to be one of the early-branching lineages of Ecdysozoa, and their mitochondrial genomes may be important for resolving evolutionary relations between major animal taxa. Here we present the results of sequencing and analysis of mitochondrial genomes from two members of Kinorhyncha, Echinoderes svetlanae (Cyclorhagida) and Pycnophyes kielensis (Allomalorhagida). Their mitochondrial genomes are circular molecules approximately 15 Kbp in size. The kinorhynch mitochondrial gene sequences are highly divergent, which precludes accurate phylogenetic inference. The mitogenomes of both species encode a typical metazoan complement of 37 genes, which are all positioned on the major strand, but the gene order is distinct and unique among Ecdysozoa or animals as a whole. We predict four types of start codons for protein-coding genes in E. svetlanae and five in P. kielensis with a consensus DTD in single letter code. The mitochondrial genomes of E. svetlanae and P. kielensis encode duplicated methionine tRNA genes that display compensatory nucleotide substitutions. Two distant species of Kinorhyncha demonstrate similar patterns of gene arrangements in their mitogenomes. Both genomes have duplicated methionine tRNA genes; the duplication predates the divergence of two species. The kinorhynchs share a few features pertaining to gene order that align them with Priapulida. Gene order analysis reveals that gene arrangement specific of Priapulida may be ancestral for Scalidophora, Ecdysozoa, and even Protostomia.
Mitochondrial Genomes of Kinorhyncha: trnM Duplication and New Gene Orders within Animals

PubMed Central

Popova, Olga V.; Mikhailov, Kirill V.; Nikitin, Mikhail A.; Logacheva, Maria D.; Penin, Aleksey A.; Muntyan, Maria S.; Kedrova, Olga S.; Petrov, Nikolai B.; Panchin, Yuri V.

2016-01-01

Many features of mitochondrial genomes of animals, such as patterns of gene arrangement, nucleotide content and substitution rate variation are extensively used in evolutionary and phylogenetic studies. Nearly 6,000 mitochondrial genomes of animals have already been sequenced, covering the majority of animal phyla. One of the groups that escaped mitogenome sequencing is phylum Kinorhyncha—an isolated taxon of microscopic worm-like ecdysozoans. The kinorhynchs are thought to be one of the early-branching lineages of Ecdysozoa, and their mitochondrial genomes may be important for resolving evolutionary relations between major animal taxa. Here we present the results of sequencing and analysis of mitochondrial genomes from two members of Kinorhyncha, Echinoderes svetlanae (Cyclorhagida) and Pycnophyes kielensis (Allomalorhagida). Their mitochondrial genomes are circular molecules approximately 15 Kbp in size. The kinorhynch mitochondrial gene sequences are highly divergent, which precludes accurate phylogenetic inference. The mitogenomes of both species encode a typical metazoan complement of 37 genes, which are all positioned on the major strand, but the gene order is distinct and unique among Ecdysozoa or animals as a whole. We predict four types of start codons for protein-coding genes in E. svetlanae and five in P. kielensis with a consensus DTD in single letter code. The mitochondrial genomes of E. svetlanae and P. kielensis encode duplicated methionine tRNA genes that display compensatory nucleotide substitutions. Two distant species of Kinorhyncha demonstrate similar patterns of gene arrangements in their mitogenomes. Both genomes have duplicated methionine tRNA genes; the duplication predates the divergence of two species. The kinorhynchs share a few features pertaining to gene order that align them with Priapulida. Gene order analysis reveals that gene arrangement specific of Priapulida may be ancestral for Scalidophora, Ecdysozoa, and even Protostomia. PMID:27755612
Complete Genome Sequence of Akkermansia glycaniphila Strain PytT, a Mucin-Degrading Specialist of the Reticulated Python Gut.

PubMed

Ouwerkerk, Janneke P; Koehorst, Jasper J; Schaap, Peter J; Ritari, Jarmo; Paulin, Lars; Belzer, Clara; de Vos, Willem M

2017-01-05

Akkermansia glycaniphila is a novel Akkermansia species that was isolated from the intestine of the reticulated python and shares the capacity to degrade mucin with the human strain Akkermansia muciniphila Muc T Here, we report the complete genome sequence of strain Pyt T of 3,074,121 bp. The genomic analysis reveals genes for mucin degradation and aerobic respiration. Copyright © 2017 Ouwerkerk et al.
The cytochrome p450 homepage.

PubMed

Nelson, David R

2009-10-01

The Cytochrome P450 Homepage is a universal resource for nomenclature and sequence information on cytochrome P450 ( CYP ) genes. The site has been in continuous operation since February 1995. Currently, naming information for 11,512 CYPs are available on the web pages. The P450 sequences are manually curated by David Nelson, and the nomenclature system conforms to an evolutionary scheme such that members of CYP families and subfamilies share common ancestors. The organisation and content of the Homepage are described.
Ribosomal DNA replication fork barrier and HOT1 recombination hot spot: shared sequences but independent activities.

PubMed

Ward, T R; Hoang, M L; Prusty, R; Lau, C K; Keil, R L; Fangman, W L; Brewer, B J

2000-07-01

In the ribosomal DNA of Saccharomyces cerevisiae, sequences in the nontranscribed spacer 3' of the 35S ribosomal RNA gene are important to the polar arrest of replication forks at a site called the replication fork barrier (RFB) and also to the cis-acting, mitotic hyperrecombination site called HOT1. We have found that the RFB and HOT1 activity share some but not all of their essential sequences. Many of the mutations that reduce HOT1 recombination also decrease or eliminate fork arrest at one of two closely spaced RFB sites, RFB1 and RFB2. A simple model for the juxtaposition of RFB and HOT1 sequences is that the breakage of strands in replication forks arrested at RFB stimulates recombination. Contrary to this model, we show here that HOT1-stimulated recombination does not require the arrest of forks at the RFB. Therefore, while HOT1 activity is independent of replication fork arrest, HOT1 and RFB require some common sequences, suggesting the existence of a common trans-acting factor(s).
Agarivorans gilvus sp. nov. Isolated From Seaweed

USDA-ARS?s Scientific Manuscript database

A novel agarase-producing, non-endospore-forming marine bacterium WH0801T was isolated from a fresh seaweed sample collected from the coast of Weihai, China. Preliminary characterization based on 16S rRNA gene sequence analysis showed that WH0801T shared 96.1% identity with Agarivorans albus MKT 10...
Phylogenetic analysis of of Sarcocystis nesbitti (Coccidia: Sarcocystidae) suggests a snake as its probable definitive host

USDA-ARS?s Scientific Manuscript database

Sarcocystis nesbitti was first described by Mandour in 1969 from rhesus monkey muscle. Its definitive host remains unknown. 18SrRNA gene of Sarcocystis nesbitti was amplified, sequenced, and subjected to phylogenetic analysis. Among those congeners available for comparison, it shares closest affinit...
Single sea urchin phagocytes express messages of a single sequence from the diverse Sp185/333 gene family in response to bacterial challenge.

PubMed

Majeske, Audrey J; Oren, Matan; Sacchi, Sandro; Smith, L Courtney

2014-12-01

Immune systems in animals rely on fast and efficient responses to a wide variety of pathogens. The Sp185/333 gene family in the purple sea urchin, Strongylocentrotus purpuratus, consists of an estimated 50 (±10) members per genome that share a basic gene structure but show high sequence diversity, primarily due to the mosaic appearance of short blocks of sequence called elements. The genes show significantly elevated expression in three subpopulations of phagocytes responding to marine bacteria. The encoded Sp185/333 proteins are highly diverse and have central effector functions in the immune system. In this study we report the Sp185/333 gene expression in single sea urchin phagocytes. Sea urchins challenged with heat-killed marine bacteria resulted in a typical increase in coelomocyte concentration within 24 h, which included an increased proportion of phagocytes expressing Sp185/333 proteins. Phagocyte fractions enriched from coelomocytes were used in limiting dilutions to obtain samples of single cells that were evaluated for Sp185/333 gene expression by nested RT-PCR. Amplicon sequences showed identical or nearly identical Sp185/333 amplicon sequences in single phagocytes with matches to six known Sp185/333 element patterns, including both common and rare element patterns. This suggested that single phagocytes show restricted expression from the Sp185/333 gene family and infers a diverse, flexible, and efficient response to pathogens. This type of expression pattern from a family of immune response genes in single cells has not been identified previously in other invertebrates. Copyright © 2014 by The American Association of Immunologists, Inc.
Prediction of Human Disease Genes by Human-Mouse Conserved Coexpression Analysis

PubMed Central

Grassi, Elena; Damasco, Christian; Silengo, Lorenzo; Oti, Martin; Provero, Paolo; Di Cunto, Ferdinando

2008-01-01

Background Even in the post-genomic era, the identification of candidate genes within loci associated with human genetic diseases is a very demanding task, because the critical region may typically contain hundreds of positional candidates. Since genes implicated in similar phenotypes tend to share very similar expression profiles, high throughput gene expression data may represent a very important resource to identify the best candidates for sequencing. However, so far, gene coexpression has not been used very successfully to prioritize positional candidates. Methodology/Principal Findings We show that it is possible to reliably identify disease-relevant relationships among genes from massive microarray datasets by concentrating only on genes sharing similar expression profiles in both human and mouse. Moreover, we show systematically that the integration of human-mouse conserved coexpression with a phenotype similarity map allows the efficient identification of disease genes in large genomic regions. Finally, using this approach on 850 OMIM loci characterized by an unknown molecular basis, we propose high-probability candidates for 81 genetic diseases. Conclusion Our results demonstrate that conserved coexpression, even at the human-mouse phylogenetic distance, represents a very strong criterion to predict disease-relevant relationships among human genes. PMID:18369433
Construction of a Lotus japonicus late nodulin expressed sequence tag library and identification of novel nodule-specific genes.

PubMed Central

Szczyglowski, K; Hamburger, D; Kapranov, P; de Bruijn, F J

1997-01-01

A range of novel expressed sequence tags (ESTs) associated with late developmental events during nodule organogenesis in the legume Lotus japonicus were identified using mRNA differential display; 110 differentially displayed polymerase chain reaction products were cloned and analyzed. Of 88 unique cDNAs obtained, 22 shared significant homology to DNA/protein sequences in the respective databases. This group comprises, among others, a nodule-specific homolog of protein phosphatase 2C, a peptide transporter protein, and a nodule-specific form of cytochrome P450. RNA gel-blot analysis of 16 differentially displayed ESTs confirmed their nodule-specific expression pattern. The kinetics of mRNA accumulation of the majority of the ESTs analyzed were found to resemble the expression pattern observed for the L. japonicus leghemoglobin gene. These results indicate that the newly isolated molecular markers correspond to genes induced during late developmental stages of L. japonicus nodule organogenesis and provide important, novel tools for the study of nodulation. PMID:9276951
Expression of glutathione peroxidase I gene in selenium-deficient rats.

PubMed Central

Reddy, A P; Hsu, B L; Reddy, P S; Li, N Q; Thyagaraju, K; Reddy, C C; Tam, M F; Tu, C P

1988-01-01

We have characterized a cDNA pGPX1211 encoding rat glutathione peroxidase I. The selenocysteine in the protein corresponded to a TGA codon in the coding region of the cDNA, similar to earlier findings in mouse and human genes, and a gene encoding the formate dehydrogenase from E. coli, another selenoenzyme. The rat GSH peroxidase I has a calculated subunit molecular weight of 22,155 daltons and shares 95% and 86% sequence homology with the mouse and human subunits, respectively. The 3'-noncoding sequence (greater than 930 bp) in pGPX1211 is much longer than that of the human sequences. We found that glutathione peroxidase I mRNA, but not the polypeptide, was expressed under nutritional stress of selenium deficiency where no glutathione peroxidase I activity can be detected. The failure of detecting any apoprotein for the glutathione peroxidase I under selenium deficiency and results published from other laboratories supports the proposal that selenium may be incorporated into the glutathione peroxidase I co-translationally. Images PMID:2838821
A 21.7 kb DNA segment on the left arm of yeast chromosome XIV carries WHI3, GCR2, SPX18, SPX19, an homologue to the heat shock gene SSB1 and 8 new open reading frames of unknown function.

PubMed

Jonniaux, J L; Coster, F; Purnelle, B; Goffeau, A

1994-12-01

We report the amino acid sequence of 13 open reading frames (ORF > 299 bp) located on a 21.7 kb DNA segment from the left arm of chromosome XIV of Saccharomyces cerevisiae. Five open reading frames had been entirely or partially sequenced previously: WHI3, GCR2, SPX19, SPX18 and a heat shock gene similar to SSB1. The products of 8 other ORFs are new putative proteins among which N1394 is probably a membrane protein. N1346 contains a leucine zipper pattern and the corresponding ORF presents an HAP (global regulator of respiratory genes) upstream activating sequence in the promoting region. N1386 shares homologies with the DNA structure-specific recognition protein family SSRPs and the corresponding ORF is preceded by an MCB (MluI cell cycle box) upstream activating factor.
Expression of the Pasteurella haemolytica leukotoxin is inhibited by a locus that encodes an ATP-binding cassette homolog.

PubMed Central

Highlander, S K; Wickersham, E A; Garza, O; Weinstock, G M

1993-01-01

Multicopy and single-copy chromosomal fusions between the Pasteurella haemolytica leukotoxin regulatory region and the Escherichia coli beta-galactosidase gene have been constructed. These fusions were used as reporters to identify and isolate regulators of leukotoxin expression from a P. haemolytica cosmid library. A cosmid clone, which inhibited leukotoxin expression from multicopy and single-copy protein fusions, was isolated and found to contain the complete leukotoxin gene cluster plus additional upstream sequences. The locus responsible for inhibition of expression from leukotoxin-beta-galactosidase fusions was mapped within these upstream sequences, by transposon mutagenesis with Tn5, and its DNA sequence was determined. The inhibitory activity was found to be associated with a predicted 440-amino-acid reading frame (lapA) that lies within a four-gene arginine transport locus. LapA is predicted to be the nucleotide-binding component of this transport system and shares homology with the Clp family of proteases. Images PMID:8359916
The complete genome sequence of the African buffalo (Syncerus caffer).

PubMed

Glanzmann, Brigitte; Möller, Marlo; le Roex, Nikki; Tromp, Gerard; Hoal, Eileen G; van Helden, Paul D

2016-12-07

The African buffalo (Syncerus caffer) is an important role player in the savannah ecosystem. It has become a species of relevance because of its role as a wildlife maintenance host for an array of infectious and zoonotic diseases some of which include corridor disease, foot-and-mouth disease and bovine tuberculosis. To date, no complete genome sequence for S. caffer had been available for study and the genomes of other species such as the domestic cow (Bos taurus) had been used as a proxy for any genetics analysis conducted on this species. Here, the high coverage genome sequence of the African buffalo (S. caffer) is presented. A total of 19,765 genes were predicted and 19,296 genes could be successfully annotated to S. caffer while 469 genes remained unannotated. Moreover, in order to extend a detailed annotation of S. caffer, gene clusters were constructed using twelve additional mammalian genomes. The S. caffer genome contains 10,988 gene clusters, of which 62 are shared exclusively between B. taurus and S. caffer. This study provides a unique genomic perspective for the S. caffer, allowing for the identification of novel variants that may play a role in the natural history and physiological adaptations.
Inducible in vivo DNA footprints define sequences necessary for UV light activation of the parsley chalcone synthase gene.

PubMed Central

Schulze-Lefert, P; Dangl, J L; Becker-André, M; Hahlbrock, K; Schulz, W

1989-01-01

We began characterization of the protein--DNA interactions necessary for UV light induced transcriptional activation of the gene encoding chalcone synthase (CHS), a key plant defense enzyme. Three light dependent in vivo footprints appear on a 90 bp stretch of the CHS promoter with a time course correlated with the onset of CHS transcription. We define a minimal light responsive promoter by functional analysis of truncated CHS promoter fusions with a reporter gene in transient expression experiments in parsley protoplasts. Two of the three footprinted sequence 'boxes' reside within the minimal promoter. Replacement of 10 bp within either of these 'boxes' leads to complete loss of light responsiveness. We conclude that these sequences define the necessary cis elements of the minimal CHS promoter's light responsive element. One of the functionally defined 'boxes' is homologous to an element implicated in regulation of genes involved in photosynthesis. These data represent the first example in a plant defense gene of an induced change in protein--DNA contacts necessary for transcriptional activation. Also, our data argue strongly that divergent light induced biosynthetic pathways share common regulatory units. Images PMID:2566481
Versatile Gene-Specific Sequence Tags for Arabidopsis Functional Genomics: Transcript Profiling and Reverse Genetics Applications

PubMed Central

Hilson, Pierre; Allemeersch, Joke; Altmann, Thomas; Aubourg, Sébastien; Avon, Alexandra; Beynon, Jim; Bhalerao, Rishikesh P.; Bitton, Frédérique; Caboche, Michel; Cannoot, Bernard; Chardakov, Vasil; Cognet-Holliger, Cécile; Colot, Vincent; Crowe, Mark; Darimont, Caroline; Durinck, Steffen; Eickhoff, Holger; de Longevialle, Andéol Falcon; Farmer, Edward E.; Grant, Murray; Kuiper, Martin T.R.; Lehrach, Hans; Léon, Céline; Leyva, Antonio; Lundeberg, Joakim; Lurin, Claire; Moreau, Yves; Nietfeld, Wilfried; Paz-Ares, Javier; Reymond, Philippe; Rouzé, Pierre; Sandberg, Goran; Segura, Maria Dolores; Serizet, Carine; Tabrett, Alexandra; Taconnat, Ludivine; Thareau, Vincent; Van Hummelen, Paul; Vercruysse, Steven; Vuylsteke, Marnik; Weingartner, Magdalena; Weisbeek, Peter J.; Wirta, Valtteri; Wittink, Floyd R.A.; Zabeau, Marc; Small, Ian

2004-01-01

Microarray transcript profiling and RNA interference are two new technologies crucial for large-scale gene function studies in multicellular eukaryotes. Both rely on sequence-specific hybridization between complementary nucleic acid strands, inciting us to create a collection of gene-specific sequence tags (GSTs) representing at least 21,500 Arabidopsis genes and which are compatible with both approaches. The GSTs were carefully selected to ensure that each of them shared no significant similarity with any other region in the Arabidopsis genome. They were synthesized by PCR amplification from genomic DNA. Spotted microarrays fabricated from the GSTs show good dynamic range, specificity, and sensitivity in transcript profiling experiments. The GSTs have also been transferred to bacterial plasmid vectors via recombinational cloning protocols. These cloned GSTs constitute the ideal starting point for a variety of functional approaches, including reverse genetics. We have subcloned GSTs on a large scale into vectors designed for gene silencing in plant cells. We show that in planta expression of GST hairpin RNA results in the expected phenotypes in silenced Arabidopsis lines. These versatile GST resources provide novel and powerful tools for functional genomics. PMID:15489341
A new genome of Acidithiobacillus thiooxidans provides insights into adaptation to a bioleaching environment.

PubMed

Travisany, Dante; Cortés, María Paz; Latorre, Mauricio; Di Genova, Alex; Budinich, Marko; Bobadilla-Fazzini, Roberto A; Parada, Pilar; González, Mauricio; Maass, Alejandro

2014-11-01

Acidithiobacillus thiooxidans is a sulfur oxidizing acidophilic bacterium found in many sulfur-rich environments. It is particularly interesting due to its role in bioleaching of sulphide minerals. In this work, we report the genome sequence of At. thiooxidans Licanantay, the first strain from a copper mine to be sequenced and currently used in bioleaching industrial processes. Through comparative genomic analysis with two other At. thiooxidans non-metal mining strains (ATCC 19377 and A01) we determined that these strains share a large core genome of 2109 coding sequences and a high average nucleotide identity over 98%. Nevertheless, the presence of 841 strain-specific genes (absent in other At. thiooxidans strains) suggests a particular adaptation of Licanantay to its specific biomining environment. Among this group, we highlight genes encoding for proteins involved in heavy metal tolerance, mineral cell attachment and cysteine biosynthesis. Several of these genes were located near genetic motility genes (e.g. transposases and integrases) in genomic regions of over 10 kbp absent in the other strains, suggesting the presence of genomic islands in the Licanantay genome probably produced by horizontal gene transfer in mining environments. Copyright © 2014 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.

An extremely thermophilic anaerobic bacterium Caldicellulosiruptor sp. F32 exhibits distinctive properties in growth and xylanases during xylan hydrolysis.

PubMed

Ying, Yu; Meng, Dongdong; Chen, Xiaohua; Li, Fuli

2013-08-15

An anaerobic, extremely thermophilic, and cellulose- and xylan-degrading bacterium F32 was isolated from biocompost. Sequence analysis of the 16S rRNA gene of this strain showed that it was closely related to Caldicellulosiruptor saccharolyticus DSM 8903 (99.0% identity). Physiological and biochemical data also supported that identification of strain F32 as a Caldicellulosiruptor species. The proteins secreted by Caldicellulosiruptor sp. F32 grown on xylan showed a xylanase activity of 7.74U/mg, which was 2.5 times higher than that of C. saccharolyticus DSM 8903. Based on the genomic sequencing data, 2 xylanase genes, JX030400 and JX030401, were identified in Caldicellulosiruptor sp. F32. The xylanase encoded by JX030401 shared 97% identity with Csac_0696 of C. saccharolyticus DSM 8903, while that encoded by JX030400 shared 94% identity with Athe_0089 of C. bescii DSM 6725, which was not found in the genome of strain DSM 8903. Xylanse encoded by JX030400 had 9-fold higher specific activity than JX030401. Our results indicated that although the 2 strains shared high identity, the xylanase system in Caldicellulosiruptor sp. F32 was more efficient than that in C. saccharolyticus DSM 8903. Copyright © 2013 Elsevier Inc. All rights reserved.
MHC class I loci of the Bar-Headed goose (Anser indicus)

PubMed Central

2010-01-01

MHC class I proteins mediate functions in anti-pathogen defense. MHC diversity has already been investigated by many studies in model avian species, but here we chose the bar-headed goose, a worldwide migrant bird, as a non-model avian species. Sequences from exons encoding the peptide-binding region (PBR) of MHC class I molecules were isolated from liver genomic DNA, to investigate variation in these genes. These are the first MHC class I partial sequences of the bar-headed goose to be reported. A preliminary analysis suggests the presence of at least four MHC class I genes, which share great similarity with those of the goose and duck. A phylogenetic analysis of bar-headed goose, goose and duck MHC class I sequences using the NJ method supports the idea that they all cluster within the anseriforms clade. PMID:21637434
Molecular Cloning and Characterization of cDNA Encoding a Putative Stress-Induced Heat-Shock Protein from Camelus dromedarius

PubMed Central

Elrobh, Mohamed S.; Alanazi, Mohammad S.; Khan, Wajahatullah; Abduljaleel, Zainularifeen; Al-Amri, Abdullah; Bazzi, Mohammad D.

2011-01-01

Heat shock proteins are ubiquitous, induced under a number of environmental and metabolic stresses, with highly conserved DNA sequences among mammalian species. Camelus dromedaries (the Arabian camel) domesticated under semi-desert environments, is well adapted to tolerate and survive against severe drought and high temperatures for extended periods. This is the first report of molecular cloning and characterization of full length cDNA of encoding a putative stress-induced heat shock HSPA6 protein (also called HSP70B′) from Arabian camel. A full-length cDNA (2417 bp) was obtained by rapid amplification of cDNA ends (RACE) and cloned in pET-b expression vector. The sequence analysis of HSPA6 gene showed 1932 bp-long open reading frame encoding 643 amino acids. The complete cDNA sequence of the Arabian camel HSPA6 gene was submitted to NCBI GeneBank (accession number HQ214118.1). The BLAST analysis indicated that C. dromedaries HSPA6 gene nucleotides shared high similarity (77–91%) with heat shock gene nucleotide of other mammals. The deduced 643 amino acid sequences (accession number ADO12067.1) showed that the predicted protein has an estimated molecular weight of 70.5 kDa with a predicted isoelectric point (pI) of 6.0. The comparative analyses of camel HSPA6 protein sequences with other mammalian heat shock proteins (HSPs) showed high identity (80–94%). Predicted camel HSPA6 protein structure using Protein 3D structural analysis high similarities with human and mouse HSPs. Taken together, this study indicates that the cDNA sequences of HSPA6 gene and its amino acid and protein structure from the Arabian camel are highly conserved and have similarities with other mammalian species. PMID:21845074
Genomic and genetic analyses of diversity and plant interactions of Pseudomonas fluorescens

PubMed Central

Silby, Mark W; Cerdeño-Tárraga, Ana M; Vernikos, Georgios S; Giddens, Stephen R; Jackson, Robert W; Preston, Gail M; Zhang, Xue-Xian; Moon, Christina D; Gehrig, Stefanie M; Godfrey, Scott AC; Knight, Christopher G; Malone, Jacob G; Robinson, Zena; Spiers, Andrew J; Harris, Simon; Challis, Gregory L; Yaxley, Alice M; Harris, David; Seeger, Kathy; Murphy, Lee; Rutter, Simon; Squares, Rob; Quail, Michael A; Saunders, Elizabeth; Mavromatis, Konstantinos; Brettin, Thomas S; Bentley, Stephen D; Hothersall, Joanne; Stephens, Elton; Thomas, Christopher M; Parkhill, Julian; Levy, Stuart B; Rainey, Paul B; Thomson, Nicholas R

2009-01-01

Background Pseudomonas fluorescens are common soil bacteria that can improve plant health through nutrient cycling, pathogen antagonism and induction of plant defenses. The genome sequences of strains SBW25 and Pf0-1 were determined and compared to each other and with P. fluorescens Pf-5. A functional genomic in vivo expression technology (IVET) screen provided insight into genes used by P. fluorescens in its natural environment and an improved understanding of the ecological significance of diversity within this species. Results Comparisons of three P. fluorescens genomes (SBW25, Pf0-1, Pf-5) revealed considerable divergence: 61% of genes are shared, the majority located near the replication origin. Phylogenetic and average amino acid identity analyses showed a low overall relationship. A functional screen of SBW25 defined 125 plant-induced genes including a range of functions specific to the plant environment. Orthologues of 83 of these exist in Pf0-1 and Pf-5, with 73 shared by both strains. The P. fluorescens genomes carry numerous complex repetitive DNA sequences, some resembling Miniature Inverted-repeat Transposable Elements (MITEs). In SBW25, repeat density and distribution revealed 'repeat deserts' lacking repeats, covering approximately 40% of the genome. Conclusions P. fluorescens genomes are highly diverse. Strain-specific regions around the replication terminus suggest genome compartmentalization. The genomic heterogeneity among the three strains is reminiscent of a species complex rather than a single species. That 42% of plant-inducible genes were not shared by all strains reinforces this conclusion and shows that ecological success requires specialized and core functions. The diversity also indicates the significant size of genetic information within the Pseudomonas pan genome. PMID:19432983
Complete chloroplast genome sequence and comparative analysis of loblolly pine (Pinus taeda L.) with related species

PubMed Central

Khan, Abdul Latif; Khan, Muhammad Aaqil; Shahzad, Raheem; Lubna; Kang, Sang Mo; Al-Harrasi, Ahmed; Al-Rawahi, Ahmed; Lee, In-Jung

2018-01-01

Pinaceae, the largest family of conifers, has a diversified organization of chloroplast (cp) genomes with two typical highly reduced inverted repeats (IRs). In the current study, we determined the complete sequence of the cp genome of an economically and ecologically important conifer tree, the loblolly pine (Pinus taeda L.), using Illumina paired-end sequencing and compared the sequence with those of other pine species. The results revealed a genome size of 121,531 base pairs (bp) containing a pair of 830-bp IR regions, distinguished by a small single copy (42,258 bp) and large single copy (77,614 bp) region. The chloroplast genome of P. taeda encodes 120 genes, comprising 81 protein-coding genes, four ribosomal RNA genes, and 35 tRNA genes, with 151 randomly distributed microsatellites. Approximately 6 palindromic, 34 forward, and 22 tandem repeats were found in the P. taeda cp genome. Whole cp genome comparison with those of other Pinus species exhibited an overall high degree of sequence similarity, with some divergence in intergenic spacers. Higher and lower numbers of indels and single-nucleotide polymorphism substitutions were observed relative to P. contorta and P. monophylla, respectively. Phylogenomic analyses based on the complete genome sequence revealed that 60 shared genes generated trees with the same topologies, and P. taeda was closely related to P. contorta in the subgenus Pinus. Thus, the complete P. taeda genome provided valuable resources for population and evolutionary studies of gymnosperms and can be used to identify related species. PMID:29596414
Divergent evolution of arrested development in the dauer stage of Caenorhabditis elegans and the infective stage of Heterodera glycines

PubMed Central

Elling, Axel A; Mitreva, Makedonka; Recknor, Justin; Gai, Xiaowu; Martin, John; Maier, Thomas R; McDermott, Jeffrey P; Hewezi, Tarek; McK Bird, David; Davis, Eric L; Hussey, Richard S; Nettleton, Dan; McCarter, James P; Baum, Thomas J

2007-01-01

Background The soybean cyst nematode Heterodera glycines is the most important parasite in soybean production worldwide. A comprehensive analysis of large-scale gene expression changes throughout the development of plant-parasitic nematodes has been lacking to date. Results We report an extensive genomic analysis of H. glycines, beginning with the generation of 20,100 expressed sequence tags (ESTs). In-depth analysis of these ESTs plus approximately 1,900 previously published sequences predicted 6,860 unique H. glycines genes and allowed a classification by function using InterProScan. Expression profiling of all 6,860 genes throughout the H. glycines life cycle was undertaken using the Affymetrix Soybean Genome Array GeneChip. Our data sets and results represent a comprehensive resource for molecular studies of H. glycines. Demonstrating the power of this resource, we were able to address whether arrested development in the Caenorhabditis elegans dauer larva and the H. glycines infective second-stage juvenile (J2) exhibits shared gene expression profiles. We determined that the gene expression profiles associated with the C. elegans dauer pathway are not uniformly conserved in H. glycines and that the expression profiles of genes for metabolic enzymes of C. elegans dauer larvae and H. glycines infective J2 are dissimilar. Conclusion Our results indicate that hallmark gene expression patterns and metabolism features are not shared in the developmentally arrested life stages of C. elegans and H. glycines, suggesting that developmental arrest in these two nematode species has undergone more divergent evolution than previously thought and pointing to the need for detailed genomic analyses of individual parasite species. PMID:17919324
Tracing common origins of Genomic Islands in prokaryotes based on genome signature analyses.

PubMed

van Passel, Mark Wj

2011-09-01

Horizontal gene transfer constitutes a powerful and innovative force in evolution, but often little is known about the actual origins of transferred genes. Sequence alignments are generally of limited use in tracking the original donor, since still only a small fraction of the total genetic diversity is thought to be uncovered. Alternatively, approaches based on similarities in the genome specific relative oligonucleotide frequencies do not require alignments. Even though the exact origins of horizontally transferred genes may still not be established using these compositional analyses, it does suggest that compositionally very similar regions are likely to have had a common origin. These analyses have shown that up to a third of large acquired gene clusters that reside in the same genome are compositionally very similar, indicative of a shared origin. This brings us closer to uncovering the original donors of horizontally transferred genes, and could help in elucidating possible regulatory interactions between previously unlinked sequences.
A conserved genetic module that encodes the major virion components in both the coliphage T4 and the marine cyanophage S-PM2

PubMed Central

Hambly, Emma; Tétart, Francoise; Desplats, Carine; Wilson, William H.; Krisch, Henry M.; Mann, Nicholas H.

2001-01-01

Sequence analysis of a 10-kb region of the genome of the marine cyanomyovirus S-PM2 reveals a homology to coliphage T4 that extends as a contiguous block from gene (g)18 to g23. The order of the S-PM2 genes in this region is similar to that of T4, but there are insertions and deletions of small ORFs of unknown function. In T4, g18 codes for the tail sheath, g19, the tail tube, g20, the head portal protein, g21, the prohead core protein, g22, a scaffolding protein, and g23, the major capsid protein. Thus, the entire module that determines the structural components of the phage head and contractile tail is conserved between T4 and this cyanophage. The significant differences in the morphology of these phages must reflect the considerable divergence of the amino acid sequence of their homologous virion proteins, which uniformly exceeds 50%. We suggest that their enormous diversity in the sea could be a result of genetic shuffling between disparate phages mediated by such commonly shared modules. These conserved sequences could facilitate genetic exchange by providing partially homologous substrates for recombination between otherwise divergent phage genomes. Such a mechanism would thus expand the pool of phage genes accessible by recombination to all those phages that share common modules. PMID:11553768
Round and pointed-head grenadier fishes (Actinopterygii: Gadiformes) represent a single sister group: evidence from the complete mitochondrial genome sequences.

PubMed

Satoh, Takashi P; Miya, Masaki; Endo, Hiromitsu; Nishida, Mutsumi

2006-07-01

The gene order of mitochondrial genomes (mitogenomes) has been employed as a useful phylogenetic marker in various metazoan animals, because it may represent uniquely derived characters shared by members of monophyletic groups. During the course of molecular phylogenetic studies of the order Gadiformes (cods and their relatives) based on whole mitogenome sequences, we found that two deep-sea grenadiers (Squalogadus modificatus and Trachyrincus murrayi: family Macrouridae) revealed a unusually identical gene order (translocation of the tRNA(Leu (UUR))). Both are members of the same family, although their external morphologies differed so greatly (e.g., round vs. pointed head) that they have been placed in different subfamilies Macrouroidinae and Trachyrincinae, respectively. Additionally, we determined the whole mitogenome sequences of two other species, Bathygadus antrodes and Ventrifossa garmani, representing a total of four subfamilies currently recognized within Macrouridae. The latter two species also exhibited gene rearrangements, resulting in a total of three different patterns of unique gene order being observed in the four subfamilies. Partitioned Bayesian analysis was conducted using available whole mitogenome sequences from five macrourids plus five outgroups. The resultant trees clearly indicated that S. modificatus and T. murrayi formed a monophyletic group, having a sister relationship to other macrourids. Thus, monophyly of the two species with disparate head morphologies was corroborated by two different lines of evidence (nucleotide sequences and gene order). The overall topology of the present tree differed from any of the previously proposed, morphology-based phylogenetic hypotheses.
An aureobasidin A resistance gene isolated from Aspergillus is a homolog of yeast AUR1, a gene responsible for inositol phosphorylceramide (IPC) synthase activity.

PubMed

Kuroda, M; Hashida-Okado, T; Yasumoto, R; Gomi, K; Kato, I; Takesako, K

1999-03-01

The AUR1 gene of Saccharomyces cerevisiae, mutations in which confer resistance to the antibiotic aureobasidin A, is necessary for inositol phosphorylceramide (IPC) synthase activity. We report the molecular cloning and characterization of the Aspergillus nidulans aurA gene, which is homologous to AUR1. A single point mutation in the aurA gene of A. nidulans confers a high level of resistance to aureobasidin A. The A. nidulans aurA gene was used to identify its homologs in other Aspergillus species, including A. fumigatus, A. niger, and A. oryzae. The deduced amino acid sequence of an aurA homolog from the pathogenic fungus A. fumigatus showed 87% identity to that of A. nidulans. The AurA proteins of A. nidulans and A. fumigatus shared common characteristics in primary structure, including sequence, hydropathy profile, and N-glycosylation sites, with their S. cerevisiae, Schizosaccharomyces pombe, and Candida albicans counterparts. These results suggest that the aureobasidin resistance gene is conserved evolutionarily in various fungi.
Gene structure and expression characteristic of a novel odorant receptor gene cluster in the parasitoid wasp Microplitis mediator (Hymenoptera: Braconidae).

PubMed

Wang, S-N; Shan, S; Zheng, Y; Peng, Y; Lu, Z-Y; Yang, Y-Q; Li, R-J; Zhang, Y-J; Guo, Y-Y

2017-08-01

Odorant receptors (ORs) expressed in the antennae of parasitoid wasps are responsible for detection of various lipophilic airborne molecules. In the present study, 107 novel OR genes were identified from Microplitis mediator antennal transcriptome data. Phylogenetic analysis of the set of OR genes from M. mediator and Microplitis demolitor revealed that M. mediator OR (MmedOR) genes can be classified into different subfamilies, and the majority of MmedORs in each subfamily shared high sequence identities and clear orthologous relationships to M. demolitor ORs. Within a subfamily, six MmedOR genes, MmedOR98, 124, 125, 126, 131 and 155, shared a similar gene structure and were tightly linked in the genome. To evaluate whether the clustered MmedOR genes share common regulatory features, the transcription profile and expression characteristics of the six closely related OR genes were investigated in M. mediator. Rapid amplification of cDNA ends-PCR experiments revealed that the OR genes within the cluster were transcribed as single mRNAs, and a bicistronic mRNA for two adjacent genes (MmedOR124 and MmedOR98) was also detected in female antennae by reverse transcription PCR. In situ hybridization experiments indicated that each OR gene within the cluster was expressed in a different number of cells. Moreover, there was no co-expression of the two highly related OR genes, MmedOR124 and MmedOR98, which appeared to be individually expressed in a distinct population of neurons. Overall, there were distinct expression profiles of closely related MmedOR genes from the same cluster in M. mediator. These data provide a basic understanding of the olfactory coding in parasitoid wasps. © 2017 The Royal Entomological Society.
Metagenomic gene annotation by a homology-independent approach

DOE Office of Scientific and Technical Information (OSTI.GOV)

Froula, Jeff; Zhang, Tao; Salmeen, Annette

2011-06-02

Fully understanding the genetic potential of a microbial community requires functional annotation of all the genes it encodes. The recently developed deep metagenome sequencing approach has enabled rapid identification of millions of genes from a complex microbial community without cultivation. Current homology-based gene annotation fails to detect distantly-related or structural homologs. Furthermore, homology searches with millions of genes are very computational intensive. To overcome these limitations, we developed rhModeller, a homology-independent software pipeline to efficiently annotate genes from metagenomic sequencing projects. Using cellulases and carbonic anhydrases as two independent test cases, we demonstrated that rhModeller is much faster than HMMERmore » but with comparable accuracy, at 94.5percent and 99.9percent accuracy, respectively. More importantly, rhModeller has the ability to detect novel proteins that do not share significant homology to any known protein families. As {approx}50percent of the 2 million genes derived from the cow rumen metagenome failed to be annotated based on sequence homology, we tested whether rhModeller could be used to annotate these genes. Preliminary results suggest that rhModeller is robust in the presence of missense and frameshift mutations, two common errors in metagenomic genes. Applying the pipeline to the cow rumen genes identified 4,990 novel cellulases candidates and 8,196 novel carbonic anhydrase candidates.In summary, we expect rhModeller to dramatically increase the speed and quality of metagnomic gene annotation.« less
Multiple ice-binding proteins of probable prokaryotic origin in an Antarctic lake alga, Chlamydomonas sp. ICE-MDV (Chlorophyceae).

PubMed

Raymond, James A; Morgan-Kiss, Rachael

2017-08-01

Ice-associated algae produce ice-binding proteins (IBPs) to prevent freezing damage. The IBPs of the three chlorophytes that have been examined so far share little similarity across species, making it likely that they were acquired by horizontal gene transfer (HGT). To clarify the importance and source of IBPs in chlorophytes, we sequenced the IBP genes of another Antarctic chlorophyte, Chlamydomonas sp. ICE-MDV (Chlamy-ICE). Genomic DNA and total RNA were sequenced and screened for known ice-associated genes. Chlamy-ICE has as many as 50 IBP isoforms, indicating that they have an important role in survival. The IBPs are of the DUF3494 type and have similar exon structures. The DUF3494 sequences are much more closely related to prokaryotic sequences than they are to sequences in other chlorophytes, and the chlorophyte IBP and ribosomal 18S phylogenies are dissimilar. The multiple IBP isoforms found in Chlamy-ICE and other algae may allow the algae to adapt to a greater variety of ice conditions than prokaryotes, which typically have a single IBP gene. The predicted structure of the DUF3494 domain has an ice-binding face with an orderly array of hydrophilic side chains. The results indicate that Chlamy-ICE acquired its IBP genes by HGT in a single event. The acquisitions of IBP genes by this and other species of Antarctic algae by HGT appear to be key evolutionary events that allowed algae to extend their ranges into polar environments. © 2017 Phycological Society of America.
Genome-wide analysis of putative peroxiredoxin in unicellular and filamentous cyanobacteria.

PubMed

Cui, Hongli; Wang, Yipeng; Wang, Yinchu; Qin, Song

2012-11-16

Cyanobacteria are photoautotrophic prokaryotes with wide variations in genome sizes and ecological habitats. Peroxiredoxin (PRX) is an important protein that plays essential roles in protecting own cells against reactive oxygen species (ROS). PRXs have been identified from mammals, fungi and higher plants. However, knowledge on cyanobacterial PRXs still remains obscure. With the availability of 37 sequenced cyanobacterial genomes, we performed a comprehensive comparative analysis of PRXs and explored their diversity, distribution, domain structure and evolution. Overall 244 putative prx genes were identified, which were abundant in filamentous diazotrophic cyanobacteria, Acaryochloris marina MBIC 11017, and unicellular cyanobacteria inhabiting freshwater and hot-springs, while poor in all Prochlorococcus and marine Synechococcus strains. Among these putative genes, 25 open reading frames (ORFs) encoding hypothetical proteins were identified as prx gene family members and the others were already annotated as prx genes. All 244 putative PRXs were classified into five major subfamilies (1-Cys, 2-Cys, BCP, PRX5_like, and PRX-like) according to their domain structures. The catalytic motifs of the cyanobacterial PRXs were similar to those of eukaryotic PRXs and highly conserved in all but the PRX-like subfamily. Classical motif (CXXC) of thioredoxin was detected in protein sequences from the PRX-like subfamily. Phylogenetic tree constructed of catalytic domains coincided well with the domain structures of PRXs and the phylogenies based on 16s rRNA. The distribution of genes encoding PRXs in different unicellular and filamentous cyanobacteria especially those sub-families like PRX-like or 1-Cys PRX correlate with the genome size, eco-physiology, and physiological properties of the organisms. Cyanobacterial and eukaryotic PRXs share similar conserved motifs, indicating that cyanobacteria adopt similar catalytic mechanisms as eukaryotes. All cyanobacterial PRX proteins share highly similar structures, implying that these genes may originate from a common ancestor. In this study, a general framework of the sequence-structure-function connections of the PRXs was revealed, which may facilitate functional investigations of PRXs in various organisms.
Genome-wide analysis of putative peroxiredoxin in unicellular and filamentous cyanobacteria

PubMed Central

2012-01-01

Background Cyanobacteria are photoautotrophic prokaryotes with wide variations in genome sizes and ecological habitats. Peroxiredoxin (PRX) is an important protein that plays essential roles in protecting own cells against reactive oxygen species (ROS). PRXs have been identified from mammals, fungi and higher plants. However, knowledge on cyanobacterial PRXs still remains obscure. With the availability of 37 sequenced cyanobacterial genomes, we performed a comprehensive comparative analysis of PRXs and explored their diversity, distribution, domain structure and evolution. Results Overall 244 putative prx genes were identified, which were abundant in filamentous diazotrophic cyanobacteria, Acaryochloris marina MBIC 11017, and unicellular cyanobacteria inhabiting freshwater and hot-springs, while poor in all Prochlorococcus and marine Synechococcus strains. Among these putative genes, 25 open reading frames (ORFs) encoding hypothetical proteins were identified as prx gene family members and the others were already annotated as prx genes. All 244 putative PRXs were classified into five major subfamilies (1-Cys, 2-Cys, BCP, PRX5_like, and PRX-like) according to their domain structures. The catalytic motifs of the cyanobacterial PRXs were similar to those of eukaryotic PRXs and highly conserved in all but the PRX-like subfamily. Classical motif (CXXC) of thioredoxin was detected in protein sequences from the PRX-like subfamily. Phylogenetic tree constructed of catalytic domains coincided well with the domain structures of PRXs and the phylogenies based on 16s rRNA. Conclusions The distribution of genes encoding PRXs in different unicellular and filamentous cyanobacteria especially those sub-families like PRX-like or 1-Cys PRX correlate with the genome size, eco-physiology, and physiological properties of the organisms. Cyanobacterial and eukaryotic PRXs share similar conserved motifs, indicating that cyanobacteria adopt similar catalytic mechanisms as eukaryotes. All cyanobacterial PRX proteins share highly similar structures, implying that these genes may originate from a common ancestor. In this study, a general framework of the sequence-structure-function connections of the PRXs was revealed, which may facilitate functional investigations of PRXs in various organisms. PMID:23157370
Xanthomonas adaptation to common bean is associated with horizontal transfers of genes encoding TAL effectors.

PubMed

Ruh, Mylène; Briand, Martial; Bonneau, Sophie; Jacques, Marie-Agnès; Chen, Nicolas W G

2017-08-30

Common bacterial blight is a devastating bacterial disease of common bean (Phaseolus vulgaris) caused by Xanthomonas citri pv. fuscans and Xanthomonas phaseoli pv. phaseoli. These phylogenetically distant strains are able to cause similar symptoms on common bean, suggesting that they have acquired common genetic determinants of adaptation to common bean. Transcription Activator-Like (TAL) effectors are bacterial type III effectors that are able to induce the expression of host genes to promote infection or resistance. Their capacity to bind to a specific host DNA sequence suggests that they are potential candidates for host adaption. To study the diversity of tal genes from Xanthomonas strains responsible for common bacterial blight of bean, whole genome sequences of 17 strains representing the diversity of X. citri pv. fuscans and X. phaseoli pv. phaseoli were obtained by single molecule real time sequencing. Analysis of these genomes revealed the existence of four tal genes named tal23A, tal20F, tal18G and tal18H, respectively. While tal20F and tal18G were chromosomic, tal23A and tal18H were carried on plasmids and shared between phylogenetically distant strains, therefore suggesting recent horizontal transfers of these genes between X. citri pv. fuscans and X. phaseoli pv. phaseoli strains. Strikingly, tal23A was present in all strains studied, suggesting that it played an important role in adaptation to common bean. In silico predictions of TAL effectors targets in the common bean genome suggested that TAL effectors shared by X. citri pv. fuscans and X. phaseoli pv. phaseoli strains target the promoters of genes of similar functions. This could be a trace of convergent evolution among TAL effectors from different phylogenetic groups, and comforts the hypothesis that TAL effectors have been implied in the adaptation to common bean. Altogether, our results favour a model where plasmidic TAL effectors are able to contribute to host adaptation by being horizontally transferred between distant lineages.
Molecular characterization of feline panleukopenia virus isolated from mink and its pathogenesis in mink.

PubMed

Fei-Fei, Diao; Yong-Feng, Zhao; Jian-Li, Wang; Xue-Hua, Wei; Kai, Cui; Chuan-Yi, Liu; Shou-Yu, Guo; Jiang, Shijin; Zhi-Jing, Xie

2017-06-01

Six feline panleukopenia viruses (FPV) were detected in the intestinal samples from the 176 mink collected in China during 2015 to 2016, named MEV-SD1, MEV-SD2, MEV-SD3, MEV-SD4, MEV-SD5 and MEV-SD6. The VP2 genes of the isolates shared 98.9%-100% identity with the reference sequences. The substitution of residue V300A in VP2 protein differentiates the isolates from the reference MEVs, and A300 is a characteristic of FPV. Furthermore, phylogenetic analysis of VP2 genes indicated that the six isolates were clustered into the same branch of all the reference FPVs. The NS1 genes of the isolates shared 98.2%-100% identity with the reference sequences. The NS1 genes of the six isolates and the three reference FPVs formed one unique evolutionary branch. To clarify the pathogenicity of the isolates, animal experiments were performed on healthy mink, using MEV-SD1. As a result, the morbidity of the inoculated animals was 100% and the mortality was as high as 38.9%. It was implied that the FPV infection caused a high morbidity and mortality in mink and the inoculation dose had an effect on pathogenicity of MEV-SD1 in mink. Copyright © 2017 Elsevier B.V. All rights reserved.
MOXD2, a Gene Possibly Associated with Olfaction, Is Frequently Inactivated in Birds

PubMed Central

Goh, Chul Jun; Choi, Dongjin; Park, Dong-Bin; Kim, Hyein; Hahn, Yoonsoo

2016-01-01

Vertebrate MOXD2 encodes a monooxygenase DBH-like 2 protein that could be involved in neurotransmitter metabolism, potentially during olfactory transduction. Loss of MOXD2 in apes and whales has been proposed to be associated with evolution of olfaction in these clades. We analyzed 57 bird genomes to identify MOXD2 sequences and found frequent loss of MOXD2 in 38 birds. Among the 57 birds, 19 species appeared to have an intact MOXD2 that encoded a full-length protein; 32 birds had a gene with open reading frame-disrupting point mutations and/or exon deletions; and the remaining 6 species did not show any MOXD2 sequence, suggesting a whole-gene deletion. Notably, among 10 passerine birds examined, 9 species shared a common genomic deletion that spanned several exons, implying the gene loss occurred in a common ancestor of these birds. However, 2 closely related penguin species, each of which had an inactive MOXD2, did not share any mutation, suggesting an independent loss after their divergence. Distribution of the 38 birds without an intact MOXD2 in the bird phylogenetic tree clearly indicates that MOXD2 loss is widespread and independent in bird lineages. We propose that widespread MOXD2 loss in some bird lineages may be implicated in the evolution of olfactory perception in these birds. PMID:27074048
MOXD2, a Gene Possibly Associated with Olfaction, Is Frequently Inactivated in Birds.

PubMed

Goh, Chul Jun; Choi, Dongjin; Park, Dong-Bin; Kim, Hyein; Hahn, Yoonsoo

2016-01-01

Vertebrate MOXD2 encodes a monooxygenase DBH-like 2 protein that could be involved in neurotransmitter metabolism, potentially during olfactory transduction. Loss of MOXD2 in apes and whales has been proposed to be associated with evolution of olfaction in these clades. We analyzed 57 bird genomes to identify MOXD2 sequences and found frequent loss of MOXD2 in 38 birds. Among the 57 birds, 19 species appeared to have an intact MOXD2 that encoded a full-length protein; 32 birds had a gene with open reading frame-disrupting point mutations and/or exon deletions; and the remaining 6 species did not show any MOXD2 sequence, suggesting a whole-gene deletion. Notably, among 10 passerine birds examined, 9 species shared a common genomic deletion that spanned several exons, implying the gene loss occurred in a common ancestor of these birds. However, 2 closely related penguin species, each of which had an inactive MOXD2, did not share any mutation, suggesting an independent loss after their divergence. Distribution of the 38 birds without an intact MOXD2 in the bird phylogenetic tree clearly indicates that MOXD2 loss is widespread and independent in bird lineages. We propose that widespread MOXD2 loss in some bird lineages may be implicated in the evolution of olfactory perception in these birds.
Lineage-Specific Biology Revealed by a Finished Genome Assembly of the Mouse

PubMed Central

Hillier, LaDeana W.; Zody, Michael C.; Goldstein, Steve; She, Xinwe; Bult, Carol J.; Agarwala, Richa; Cherry, Joshua L.; DiCuccio, Michael; Hlavina, Wratko; Kapustin, Yuri; Meric, Peter; Maglott, Donna; Birtle, Zoë; Marques, Ana C.; Graves, Tina; Zhou, Shiguo; Teague, Brian; Potamousis, Konstantinos; Churas, Christopher; Place, Michael; Herschleb, Jill; Runnheim, Ron; Forrest, Daniel; Amos-Landgraf, James; Schwartz, David C.; Cheng, Ze; Lindblad-Toh, Kerstin; Eichler, Evan E.; Ponting, Chris P.

2009-01-01

The mouse (Mus musculus) is the premier animal model for understanding human disease and development. Here we show that a comprehensive understanding of mouse biology is only possible with the availability of a finished, high-quality genome assembly. The finished clone-based assembly of the mouse strain C57BL/6J reported here has over 175,000 fewer gaps and over 139 Mb more of novel sequence, compared with the earlier MGSCv3 draft genome assembly. In a comprehensive analysis of this revised genome sequence, we are now able to define 20,210 protein-coding genes, over a thousand more than predicted in the human genome (19,042 genes). In addition, we identified 439 long, non–protein-coding RNAs with evidence for transcribed orthologs in human. We analyzed the complex and repetitive landscape of 267 Mb of sequence that was missing or misassembled in the previously published assembly, and we provide insights into the reasons for its resistance to sequencing and assembly by whole-genome shotgun approaches. Duplicated regions within newly assembled sequence tend to be of more recent ancestry than duplicates in the published draft, correcting our initial understanding of recent evolution on the mouse lineage. These duplicates appear to be largely composed of sequence regions containing transposable elements and duplicated protein-coding genes; of these, some may be fixed in the mouse population, but at least 40% of segmentally duplicated sequences are copy number variable even among laboratory mouse strains. Mouse lineage-specific regions contain 3,767 genes drawn mainly from rapidly-changing gene families associated with reproductive functions. The finished mouse genome assembly, therefore, greatly improves our understanding of rodent-specific biology and allows the delineation of ancestral biological functions that are shared with human from derived functions that are not. PMID:19468303

Novel molecular approach to define pest species status and tritrophic interactions from historical Bemisia specimens.

PubMed

Tay, W T; Elfekih, S; Polaszek, A; Court, L N; Evans, G A; Gordon, K H J; De Barro, P J

2017-03-27

Museum specimens represent valuable genomic resources for understanding host-endosymbiont/parasitoid evolutionary relationships, resolving species complexes and nomenclatural problems. However, museum collections suffer DNA degradation, making them challenging for molecular-based studies. Here, the mitogenomes of a single 1912 Sri Lankan Bemisia emiliae cotype puparium, and of a 1942 Japanese Bemisia puparium are characterised using a Next-Generation Sequencing approach. Whiteflies are small sap-sucking insects including B. tabaci pest species complex. Bemisia emiliae's draft mitogenome showed a high degree of homology with published B. tabaci mitogenomes, and exhibited 98-100% partial mitochondrial DNA Cytochrome Oxidase I (mtCOI) gene identity with the B. tabaci species known as Asia II-7. The partial mtCOI gene of the Japanese specimen shared 99% sequence identity with the Bemisia 'JpL' genetic group. Metagenomic analysis identified bacterial sequences in both Bemisia specimens, while hymenopteran sequences were also identified in the Japanese Bemisia puparium, including complete mtCOI and rRNA genes, and various partial mtDNA genes. At 88-90% mtCOI sequence identity to Aphelinidae wasps, we concluded that the 1942 Bemisia nymph was parasitized by an Eretmocerus parasitoid wasp. Our approach enables the characterisation of genomes and associated metagenomic communities of museum specimens using 1.5 ng gDNA, and to infer historical tritrophic relationships in Bemisia whiteflies.
Molecular characterization and expression of the M6 gene of grass carp hemorrhage virus (GCHV), an aquareovirus.

PubMed

Qiu, T; Lu, R H; Zhang, J; Zhu, Z Y

2001-07-01

The complete nucleotide sequence of M6 gene of grass carp hemorrhage virus (GCHV) was determined. It is 2039 nucleotides in length and contains a single large open reading frame that could encode a protein of 648 amino acids with predicted molecular mass of 68.7 kDa. Amino acid sequence comparison revealed that the protein encoded by GCHV M6 is closely related to the protein mu1 of mammalian reovirus. The M6 gene, encoding the major outer-capsid protein, was expressed using the pET fusion protein vector in Escherichia coli and detected by Western blotting using chicken anti-GCHV immunoglobulin (IgY). The result indicates that the protein encoded by M6 may share a putative Asn-42-Pro-43 proteolytic cleavage site with mu1.
A Comparison of the First Two Sequenced Chloroplast Genomes in Asteraceae: Lettuce and Sunflower

DOE Office of Scientific and Technical Information (OSTI.GOV)

Timme, Ruth E.; Kuehl, Jennifer V.; Boore, Jeffrey L.

2006-01-20

Asteraceae is the second largest family of plants, with over 20,000 species. For the past few decades, numerous phylogenetic studies have contributed to our understanding of the evolutionary relationships within this family, including comparisons of the fast evolving chloroplast gene, ndhF, rbcL, as well as non-coding DNA from the trnL intron plus the trnLtrnF intergenic spacer, matK, and, with lesser resolution, psbA-trnH. This culminated in a study by Panero and Funk in 2002 that used over 13,000 bp per taxon for the largest taxonomic revision of Asteraceae in over a hundred years. Still, some uncertainties remain, and it would bemore » very useful to have more information on the relative rates of sequence evolution among various genes and on genome structure as a potential set of phylogenetic characters to help guide future phylogenetic structures. By way of contributing to this, we report the first two complete chloroplast genome sequences from members of the Asteraceae, those of Helianthus annuus and Lactuca sativa. These plants belong to two distantly related subfamilies, Asteroideae and Cichorioideae, respectively. In addition to these, there is only one other published chloroplast genome sequence for any plant within the larger group called Eusterids II, that of Panax ginseng (Araliaceae, 156,318 bps, AY582139). Early chloroplast genome mapping studies demonstrated that H. annuus and L. sativa share a 22 kb inversion relative to members of the subfamily Barnadesioideae. By comparison to outgroups, this inversion was shown to be derived, indicating that the Asteroideae and Cichorioideae are more closely related than either is to the Barnadesioideae. Later sequencing study found that taxa that share this 22 kb inversion also contain within this region a second, smaller, 3.3 kb inversion. These sequences also enable an analysis of patterns of shared repeats in the genomes at fine level and of RNA editing by comparison to available EST sequences. In addition, since both of these genomes are crop plants, their complete genome sequence will facilitate development of chloroplast genetic engineering technology, as in recent studies from Daniell's lab. Knowing the exact sequence from spacer regions is crucial for introducing transgenes into the chloroplast genome.« less
A Comprehensive Strategy for Accurate Mutation Detection of the Highly Homologous PMS2.

PubMed

Li, Jianli; Dai, Hongzheng; Feng, Yanming; Tang, Jia; Chen, Stella; Tian, Xia; Gorman, Elizabeth; Schmitt, Eric S; Hansen, Terah A A; Wang, Jing; Plon, Sharon E; Zhang, Victor Wei; Wong, Lee-Jun C

2015-09-01

Germline mutations in the DNA mismatch repair gene PMS2 underlie the cancer susceptibility syndrome, Lynch syndrome. However, accurate molecular testing of PMS2 is complicated by a large number of highly homologous sequences. To establish a comprehensive approach for mutation detection of PMS2, we have designed a strategy combining targeted capture next-generation sequencing (NGS), multiplex ligation-dependent probe amplification, and long-range PCR followed by NGS to simultaneously detect point mutations and copy number changes of PMS2. Exonic deletions (E2 to E9, E5 to E9, E8, E10, E14, and E1 to E15), duplications (E11 to E12), and a nonsense mutation, p.S22*, were identified. Traditional multiplex ligation-dependent probe amplification and Sanger sequencing approaches cannot differentiate the origin of the exonic deletions in the 3' region when PMS2 and PMS2CL share identical sequences as a result of gene conversion. Our approach allows unambiguous identification of mutations in the active gene with a straightforward long-range-PCR/NGS method. Breakpoint analysis of multiple samples revealed that recurrent exon 14 deletions are mediated by homologous Alu sequences. Our comprehensive approach provides a reliable tool for accurate molecular analysis of genes containing multiple copies of highly homologous sequences and should improve PMS2 molecular analysis for patients with Lynch syndrome. Copyright © 2015 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
Mitochondrial DNA of Vitis vinifera and the issue of rampant horizontal gene transfer.

PubMed

Goremykin, Vadim V; Salamini, Francesco; Velasco, Riccardo; Viola, Roberto

2009-01-01

The mitochondrial genome of grape (Vitis vinifera), the largest organelle genome sequenced so far, is presented. The genome is 773,279 nt long and has the highest coding capacity among known angiosperm mitochondrial DNAs (mtDNAs). The proportion of promiscuous DNA of plastid origin in the genome is also the largest ever reported for an angiosperm mtDNA, both in absolute and relative terms. In all, 42.4% of chloroplast genome of Vitis has been incorporated into its mitochondrial genome. In order to test if horizontal gene transfer (HGT) has also contributed to the gene content of the grape mtDNA, we built phylogenetic trees with the coding sequences of mitochondrial genes of grape and their homologs from plant mitochondrial genomes. Many incongruent gene tree topologies were obtained. However, the extent of incongruence between these gene trees is not significantly greater than that observed among optimal trees for chloroplast genes, the common ancestry of which has never been in doubt. In both cases, we attribute this incongruence to artifacts of tree reconstruction, insufficient numbers of characters, and gene paralogy. This finding leads us to question the recent phylogenetic interpretation of Bergthorsson et al. (2003, 2004) and Richardson and Palmer (2007) that rampant HGT into the mtDNA of Amborella best explains phylogenetic incongruence between mitochondrial gene trees for angiosperms. The only evidence for HGT into the Vitis mtDNA found involves fragments of two coding sequences stemming from two closteroviruses that cause the leaf roll disease of this plant. We also report that analysis of sequences shared by both chloroplast and mitochondrial genomes provides evidence for a previously unknown gene transfer route from the mitochondrion to the chloroplast.
The human MCP-2 gene (SCYA8): Cloning, sequence analysis, tissue expression, and assignment to the CC chemokine gene contig on chromosome 17q11.2

DOE Office of Scientific and Technical Information (OSTI.GOV)

Van Coillie, E.; Fiten, P.; Van Damme, J.

1997-03-01

Monocyte chemotactic proteins (MCPs) form a subfamily of chemokines that recruit leukocytes to sites of inflammation and that may contribute to tumor-associated leukocyte infiltration and to the antiviral state against HIV infection. With the use of degenerate primers that were based on CC chemokine consensus sequences, the known MIP-1{alpha}/LD78{alpha}, MCP-1, and MCP-3 genes and the previously unidentified eotaxin and MCP-2 genes were isolated from a YAC contig from human chromosome 17q11.2. The amplified genomic MCP-2 fragment was used to isolate an MCP-2 cosmid from which the gene sequence was determined. The MCP-2 gene shares with the MCP-1 and MCP-3 genesmore » a conserved intron-exon structure and a coding nucleotide sequence homology of 77%. By Northern blot analysis the 1.0-kb MCP-2 mRNA was predominantly detectable in the small intestine, peripheral blood, heart, placenta, lung, skeletal muscle, ovary, colon, spinal cord, pancreas, and thymus. Transcripts of 1.5 and 2.4 kb were found in the testis, the small intestine, and the colon. The isolation of the MCP-2 gene from the chemokine contig localized it on YAC clones of chromosome 17q11.2, which also contain the eotaxin, MCP-1, MCP-3, and NCC-1/MCP-4 genes. The combination of using degenerate primer PCR and YACs illustrates that novel genes can efficiently be isolated from gene cluster contigs with less redundancy and effort than the isolation of novel ESTs. 42 refs., 5 figs., 2 tabs.« less
The mitochondrial genome of the Japanese skeleton shrimp Caprella mutica (Amphipoda: Caprellidea) reveals a unique gene order and shared apomorphic translocations with Gammaridea.

PubMed

Kilpert, Fabian; Podsiadlowski, Lars

2010-06-01

This study presents the complete mitochondrial (mt) genome of the amphipod Caprella mutica, an east-Asian species, which recently invaded the coastal regions of North America, Europe, and New Zealand. It is the first complete sequence of a member of the amphipod subclade Caprellidea. The mt genome has a total length of 15,427 bp and is organized in a circular double-strand molecule. All 37 mt genes are present, including the common set of 22 tRNAs. Particularly noticeable is the duplication of the control region (CR). The additional CR is located between nad6 and cob, and is almost identical to the original one. The most extensive changes in the gene order affect nad5 and a block consisting of trnH, nad4, nad4L, and trnP-all inserted near the original CR. The gene nad5 is also inverted. Furthermore, a comparison with the pancrustacean ground pattern reveals additional changes of individual tRNA genes. Some of these changes are also shared by Metacrangonyx longipes and Parhyale hawaiensis. These arrangements were found only in amphipods and might be considered as apomorphic by character states of Amphipoda. In all the three species, there is good evidence that trnG originated from a rare duplication/remolding event of the adjacent trnW gene. Nevertheless, each of the three available amphipod mitogenome sequences also bears unique rearrangements. C. mutica, however, shows the most extensive rearrangement in comparison with the pancrustacean ground pattern.
Promoters, toll like receptors and microRNAs: a strange association.

PubMed

Korla, Kalyani; Arrigo, Patrizio; Mitra, Chanchal K

2013-06-01

Toll-like receptors (TLRs) are proteins that play key role in the innate immune system. In the present study, -1000 base pairs upstream are taken from the transcription start site of the various TLR genes (10 known) in human. About 40 microRNAs have been identified that share 12-19 nucleotide sequence similarity with the promoter regions of 10 TLRs. It is proposed that the microRNA performs potential role in identification of promoter sequence and initiation of transcription.
The Cytochrome P450 Homepage

PubMed Central

2009-01-01

The Cytochrome P450 Homepage is a universal resource for nomenclature and sequence information on cytochrome P450 (CYP) genes. The site has been in continuous operation since February 1995. Currently, naming information for 11,512 CYPs are available on the web pages. The P450 sequences are manually curated by David Nelson, and the nomenclature system conforms to an evolutionary scheme such that members of CYP families and subfamilies share common ancestors. The organisation and content of the Homepage are described. PMID:19951895
Defining the Estimated Core Genome of Bacterial Populations Using a Bayesian Decision Model

PubMed Central

van Tonder, Andries J.; Mistry, Shilan; Bray, James E.; Hill, Dorothea M. C.; Cody, Alison J.; Farmer, Chris L.; Klugman, Keith P.; von Gottberg, Anne; Bentley, Stephen D.; Parkhill, Julian; Jolley, Keith A.; Maiden, Martin C. J.; Brueggemann, Angela B.

2014-01-01

The bacterial core genome is of intense interest and the volume of whole genome sequence data in the public domain available to investigate it has increased dramatically. The aim of our study was to develop a model to estimate the bacterial core genome from next-generation whole genome sequencing data and use this model to identify novel genes associated with important biological functions. Five bacterial datasets were analysed, comprising 2096 genomes in total. We developed a Bayesian decision model to estimate the number of core genes, calculated pairwise evolutionary distances (p-distances) based on nucleotide sequence diversity, and plotted the median p-distance for each core gene relative to its genome location. We designed visually-informative genome diagrams to depict areas of interest in genomes. Case studies demonstrated how the model could identify areas for further study, e.g. 25% of the core genes with higher sequence diversity in the Campylobacter jejuni and Neisseria meningitidis genomes encoded hypothetical proteins. The core gene with the highest p-distance value in C. jejuni was annotated in the reference genome as a putative hydrolase, but further work revealed that it shared sequence homology with beta-lactamase/metallo-beta-lactamases (enzymes that provide resistance to a range of broad-spectrum antibiotics) and thioredoxin reductase genes (which reduce oxidative stress and are essential for DNA replication) in other C. jejuni genomes. Our Bayesian model of estimating the core genome is principled, easy to use and can be applied to large genome datasets. This study also highlighted the lack of knowledge currently available for many core genes in bacterial genomes of significant global public health importance. PMID:25144616
Nucleotide sequences and regulational analysis of genes involved in conversion of aniline to catechol in Pseudomonas putida UCC22(pTDN1).

PubMed Central

Fukumori, F; Saint, C P

1997-01-01

A 9,233-bp HindIII fragment of the aromatic amine catabolic plasmid pTDN1, isolated from a derivative of Pseudomonas putida mt-2 (UCC22), confers the ability to degrade aniline on P. putida KT2442. The fragment encodes six open reading frames which are arranged in the same direction. Their 5' upstream region is part of the direct-repeat sequence of pTDN1. Nucleotide sequence of 1.8 kb of the repeat sequence revealed only a single base pair change compared to the known sequence of IS1071 which is involved in the transposition of the chlorobenzoate genes (C. Nakatsu, J. Ng, R. Singh, N. Straus, and C. Wyndham, Proc. Natl. Acad. Sci. USA 88:8312-8316, 1991). Four open reading frames encode proteins with considerable homology to proteins found in other aromatic-compound degradation pathways. On the basis of sequence similarity, these genes are proposed to encode the large and small subunits of aniline oxygenase (tdnA1 and tdnA2, respectively), a reductase (tdnB), and a LysR-type regulatory gene (tdnR). The putative large subunit has a conserved [2Fe-2S]R Rieske-type ligand center. Two genes, tdnQ and tdnT, which may be involved in amino group transfer, are localized upstream of the putative oxygenase genes. The tdnQ gene product shares about 30% similarity with glutamine synthetases; however, a pUC-based plasmid carrying tdnQ did not support the growth of an Escherichia coli glnA strain in the absence of glutamine. TdnT possesses domains that are conserved among amidotransferases. The tdnQ, tdnA1, tdnA2, tdnB, and tdnR genes are essential for the conversion of aniline to catechol. PMID:8990291
Rare deleterious mutations are associated with disease in bipolar disorder families.

PubMed

Rao, A R; Yourshaw, M; Christensen, B; Nelson, S F; Kerner, B

2017-07-01

Bipolar disorder (BD) is a common, complex and heritable psychiatric disorder characterized by episodes of severe mood swings. The identification of rare, damaging genomic mutations in families with BD could inform about disease mechanisms and lead to new therapeutic interventions. To determine whether rare, damaging mutations shared identity-by-descent in families with BD could be associated with disease, exome sequencing was performed in multigenerational families of the NIMH BD Family Study followed by in silico functional prediction. Disease association and disease specificity was determined using 5090 exomes from the Sweden-Schizophrenia (SZ) Population-Based Case-Control Exome Sequencing study. We identified 14 rare and likely deleterious mutations in 14 genes that were shared identity-by-descent among affected family members. The variants were associated with BD (P<0.05 after Bonferroni's correction) and disease specificity was supported by the absence of the mutations in patients with SZ. In addition, we found rare, functional mutations in known causal genes for neuropsychiatric disorders including holoprosencephaly and epilepsy. Our results demonstrate that exome sequencing in multigenerational families with BD is effective in identifying rare genomic variants of potential clinical relevance and also disease modifiers related to coexisting medical conditions. Replication of our results and experimental validation are required before disease causation could be assumed.
Sequence-based evidence for major histocompatibility complex-disassortative mating in a colonial seabird.

PubMed

Juola, Frans A; Dearborn, Donald C

2012-01-07

The major histocompatibility complex (MHC) is a polymorphic gene family associated with immune defence, and it can play a role in mate choice. Under the genetic compatibility hypothesis, females choose mates that differ genetically from their own MHC genotypes, avoiding inbreeding and/or enhancing the immunocompetence of their offspring. We tested this hypothesis of disassortative mating based on MHC genotypes in a population of great frigatebirds (Fregata minor) by sequencing the second exon of MHC class II B. Extensive haploid cloning yielded two to four alleles per individual, suggesting the amplification of two genes. MHC similarity between mates was not significantly different between pairs that did (n = 4) or did not (n = 42) exhibit extra-pair paternity. Comparing all 46 mated pairs to a distribution based on randomized re-pairings, we observed the following (i): no evidence for mate choice based on maximal or intermediate levels of MHC allele sharing (ii), significantly disassortative mating based on similarity of MHC amino acid sequences, and (iii) no evidence for mate choice based on microsatellite alleles, as measured by either allele sharing or similarity in allele size. This suggests that females choose mates that differ genetically from themselves at MHC loci, but not as an inbreeding-avoidance mechanism.
Linkage Study Revealed Complex Haplotypes in a Multifamily due to Different Mutations in CAPN3 Gene in an Iranian Ethnic Group.

PubMed

Mojbafan, Marzieh; Tonekaboni, Seyed Hassan; Abiri, Maryam; Kianfar, Soudeh; Sarhadi, Ameneh; Nilipour, Yalda; Tavakkoly-Bazzaz, Javad; Zeinali, Sirous

2016-07-01

Calpainopathy is an autosomal recessive form of limb girdle muscular dystrophies which is caused by mutation in CAPN3 gene. In the present study, co-segregation of this disorder was analyzed with four short tandem repeat markers linked to the CAPN3 gene. Three apparently unrelated Iranian families with same ethnicity were investigated. Haplotype analysis and sequencing of the CAPN3 gene were performed. DNA sample from one of the patients was simultaneously sent for next-generation sequencing. DNA sequencing identified two mutations. It was seen as a homozygous c.2105C>T in exon 19 in one family, a homozygous novel mutation c.380G>A in exon 3 in another family, and a compound heterozygote form of these two mutations in the third family. Next-generation sequencing also confirmed our results. It was expected that, due to the rare nature of limb girdle muscular dystrophies, affected individuals from the same ethnic group share similar mutations. Haplotype analysis showed two different homozygote patterns in two families, yet a compound heterozygote pattern in the third family as seen in the mutation analysis. This study shows that haplotype analysis would help in determining presence of different founders.
Microbial diversity in an Armenian geothermal spring assessed by molecular and culture-based methods.

PubMed

Panosyan, Hovik; Birkeland, Nils-Kåre

2014-11-01

The phylogenetic diversity of the prokaryotic community thriving in the Arzakan hot spring in Armenia was studied using molecular and culture-based methods. A sequence analysis of 16S rRNA gene clone libraries demonstrated the presence of a diversity of microorganisms belonging to the Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Epsilonproteobacteria, Firmicutes, Bacteroidetes phyla, and Cyanobacteria. Proteobacteria was the dominant group, representing 52% of the bacterial clones. Denaturing gradient gel electrophoresis profiles of the bacterial 16S rRNA gene fragments also indicated the abundance of Proteobacteria, Bacteroidetes, and Cyanobacteria populations. Most of the sequences were most closely related to uncultivated microorganisms and shared less than 96% similarity with their closest matches in GenBank, indicating that this spring harbors a unique community of novel microbial species or genera. The majority of the sequences of an archaeal 16S rRNA gene library, generated from a methanogenic enrichment, were close relatives of members of the genus Methanoculleus. Aerobic endospore-forming bacteria mainly belonging to Bacillus and Geobacillus were detected only by culture-dependent methods. Three isolates were successfully obtained having 99, 96, and 96% 16S rRNA gene sequence similarities to Arcobacter sp., Methylocaldum sp., and Methanoculleus sp., respectively. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Comparative genomics and prediction of conditionally dispensable sequences in legume-infecting Fusarium oxysporum formae speciales facilitates identification of candidate effectors.

PubMed

Williams, Angela H; Sharma, Mamta; Thatcher, Louise F; Azam, Sarwar; Hane, James K; Sperschneider, Jana; Kidd, Brendan N; Anderson, Jonathan P; Ghosh, Raju; Garg, Gagan; Lichtenzveig, Judith; Kistler, H Corby; Shea, Terrance; Young, Sarah; Buck, Sally-Anne G; Kamphuis, Lars G; Saxena, Rachit; Pande, Suresh; Ma, Li-Jun; Varshney, Rajeev K; Singh, Karam B

2016-03-05

Soil-borne fungi of the Fusarium oxysporum species complex cause devastating wilt disease on many crops including legumes that supply human dietary protein needs across many parts of the globe. We present and compare draft genome assemblies for three legume-infecting formae speciales (ff. spp.): F. oxysporum f. sp. ciceris (Foc-38-1) and f. sp. pisi (Fop-37622), significant pathogens of chickpea and pea respectively, the world's second and third most important grain legumes, and lastly f. sp. medicaginis (Fom-5190a) for which we developed a model legume pathosystem utilising Medicago truncatula. Focusing on the identification of pathogenicity gene content, we leveraged the reference genomes of Fusarium pathogens F. oxysporum f. sp. lycopersici (tomato-infecting) and F. solani (pea-infecting) and their well-characterised core and dispensable chromosomes to predict genomic organisation in the newly sequenced legume-infecting isolates. Dispensable chromosomes are not essential for growth and in Fusarium species are known to be enriched in host-specificity and pathogenicity-associated genes. Comparative genomics of the publicly available Fusarium species revealed differential patterns of sequence conservation across F. oxysporum formae speciales, with legume-pathogenic formae speciales not exhibiting greater sequence conservation between them relative to non-legume-infecting formae speciales, possibly indicating the lack of a common ancestral source for legume pathogenicity. Combining predicted dispensable gene content with in planta expression in the model legume-infecting isolate, we identified small conserved regions and candidate effectors, four of which shared greatest similarity to proteins from another legume-infecting ff. spp. We demonstrate that distinction of core and potential dispensable genomic regions of novel F. oxysporum genomes is an effective tool to facilitate effector discovery and the identification of gene content possibly linked to host specificity. While the legume-infecting isolates didn't share large genomic regions of pathogenicity-related content, smaller regions and candidate effector proteins were highly conserved, suggesting that they may play specific roles in inducing disease on legume hosts.
Knock down of Whitefly Gut Gene Expression and Mortality by Orally Delivered Gut Gene-Specific dsRNAs.

PubMed

Vyas, Meenal; Raza, Amir; Ali, Muhammad Yousaf; Ashraf, Muhammad Aleem; Mansoor, Shahid; Shahid, Ahmad Ali; Brown, Judith K

2017-01-01

Control of the whitefly Bemisia tabaci (Genn.) agricultural pest and plant virus vector relies on the use of chemical insecticides. RNA-interference (RNAi) is a homology-dependent innate immune response in eukaryotes, including insects, which results in degradation of the corresponding transcript following its recognition by a double-stranded RNA (dsRNA) that shares 100% sequence homology. In this study, six whitefly 'gut' genes were selected from an in silico-annotated transcriptome library constructed from the whitefly alimentary canal or 'gut' of the B biotype of B. tabaci, and tested for knock down efficacy, post-ingestion of dsRNAs that share 100% sequence homology to each respective gene target. Candidate genes were: Acetylcholine receptor subunit α, Alpha glucosidase 1, Aquaporin 1, Heat shock protein 70, Trehalase1, and Trehalose transporter1. The efficacy of RNAi knock down was further tested in a gene-specific functional bioassay, and mortality was recorded in 24 hr intervals, six days, post-treatment. Based on qPCR analysis, all six genes tested showed significantly reduced gene expression. Moderate-to-high whitefly mortality was associated with the down-regulation of osmoregulation, sugar metabolism and sugar transport-associated genes, demonstrating that whitefly survivability was linked with RNAi results. Silenced Acetylcholine receptor subunit α and Heat shock protein 70 genes showed an initial low whitefly mortality, however, following insecticide or high temperature treatments, respectively, significantly increased knockdown efficacy and death was observed, indicating enhanced post-knockdown sensitivity perhaps related to systemic silencing. The oral delivery of gut-specific dsRNAs, when combined with qPCR analysis of gene expression and a corresponding gene-specific bioassay that relates knockdown and mortality, offers a viable approach for functional genomics analysis and the discovery of prospective dsRNA biopesticide targets. The approach can be applied to functional genomics analyses to facilitate, species-specific dsRNA-mediated control of other non-model hemipterans.
Genome-wide patterns of promoter sharing and co-expression in bovine skeletal muscle.

PubMed

Gu, Quan; Nagaraj, Shivashankar H; Hudson, Nicholas J; Dalrymple, Brian P; Reverter, Antonio

2011-01-12

Gene regulation by transcription factors (TF) is species, tissue and time specific. To better understand how the genetic code controls gene expression in bovine muscle we associated gene expression data from developing Longissimus thoracis et lumborum skeletal muscle with bovine promoter sequence information. We created a highly conserved genome-wide promoter landscape comprising 87,408 interactions relating 333 TFs with their 9,242 predicted target genes (TGs). We discovered that the complete set of predicted TGs share an average of 2.75 predicted TF binding sites (TFBSs) and that the average co-expression between a TF and its predicted TGs is higher than the average co-expression between the same TF and all genes. Conversely, pairs of TFs sharing predicted TGs showed a co-expression correlation higher that pairs of TFs not sharing TGs. Finally, we exploited the co-occurrence of predicted TFBS in the context of muscle-derived functionally-coherent modules including cell cycle, mitochondria, immune system, fat metabolism, muscle/glycolysis, and ribosome. Our findings enabled us to reverse engineer a regulatory network of core processes, and correctly identified the involvement of E2F1, GATA2 and NFKB1 in the regulation of cell cycle, fat, and muscle/glycolysis, respectively. The pivotal implication of our research is two-fold: (1) there exists a robust genome-wide expression signal between TFs and their predicted TGs in cattle muscle consistent with the extent of promoter sharing; and (2) this signal can be exploited to recover the cellular mechanisms underpinning transcription regulation of muscle structure and development in bovine. Our study represents the first genome-wide report linking tissue specific co-expression to co-regulation in a non-model vertebrate.
Quaranfil, Johnston Atoll, and Lake Chad viruses are novel members of the family Orthomyxoviridae.

PubMed

Presti, Rachel M; Zhao, Guoyan; Beatty, Wandy L; Mihindukulasuriya, Kathie A; da Rosa, Amelia P A Travassos; Popov, Vsevolod L; Tesh, Robert B; Virgin, Herbert W; Wang, David

2009-11-01

Arboviral infections are an important cause of emerging infections due to the movements of humans, animals, and hematophagous arthropods. Quaranfil virus (QRFV) is an unclassified arbovirus originally isolated from children with mild febrile illness in Quaranfil, Egypt, in 1953. It has subsequently been isolated in multiple geographic areas from ticks and birds. We used high-throughput sequencing to classify QRFV as a novel orthomyxovirus. The genome of this virus is comprised of multiple RNA segments; five were completely sequenced. Proteins with limited amino acid similarity to conserved domains in polymerase (PA, PB1, and PB2) and hemagglutinin (HA) genes from known orthomyxoviruses were predicted to be present in four of the segments. The fifth sequenced segment shared no detectable similarity to any protein and is of uncertain function. The end-terminal sequences of QRFV are conserved between segments and are different from those of the known orthomyxovirus genera. QRFV is known to cross-react serologically with two other unclassified viruses, Johnston Atoll virus (JAV) and Lake Chad virus (LKCV). The complete open reading frames of PB1 and HA were sequenced for JAV, while a fragment of PB1 of LKCV was identified by mass sequencing. QRFV and JAV PB1 and HA shared 80% and 70% amino acid identity to each other, respectively; the LKCV PB1 fragment shared 83% amino acid identity with the corresponding region of QRFV PB1. Based on phylogenetic analyses, virion ultrastructural features, and the unique end-terminal sequences identified, we propose that QRFV, JAV, and LKCV comprise a novel genus of the family Orthomyxoviridae.
Quaranfil, Johnston Atoll, and Lake Chad Viruses Are Novel Members of the Family Orthomyxoviridae▿

PubMed Central

Presti, Rachel M.; Zhao, Guoyan; Beatty, Wandy L.; Mihindukulasuriya, Kathie A.; Travassos da Rosa, Amelia P. A.; Popov, Vsevolod L.; Tesh, Robert B.; Virgin, Herbert W.; Wang, David

2009-01-01

Arboviral infections are an important cause of emerging infections due to the movements of humans, animals, and hematophagous arthropods. Quaranfil virus (QRFV) is an unclassified arbovirus originally isolated from children with mild febrile illness in Quaranfil, Egypt, in 1953. It has subsequently been isolated in multiple geographic areas from ticks and birds. We used high-throughput sequencing to classify QRFV as a novel orthomyxovirus. The genome of this virus is comprised of multiple RNA segments; five were completely sequenced. Proteins with limited amino acid similarity to conserved domains in polymerase (PA, PB1, and PB2) and hemagglutinin (HA) genes from known orthomyxoviruses were predicted to be present in four of the segments. The fifth sequenced segment shared no detectable similarity to any protein and is of uncertain function. The end-terminal sequences of QRFV are conserved between segments and are different from those of the known orthomyxovirus genera. QRFV is known to cross-react serologically with two other unclassified viruses, Johnston Atoll virus (JAV) and Lake Chad virus (LKCV). The complete open reading frames of PB1 and HA were sequenced for JAV, while a fragment of PB1 of LKCV was identified by mass sequencing. QRFV and JAV PB1 and HA shared 80% and 70% amino acid identity to each other, respectively; the LKCV PB1 fragment shared 83% amino acid identity with the corresponding region of QRFV PB1. Based on phylogenetic analyses, virion ultrastructural features, and the unique end-terminal sequences identified, we propose that QRFV, JAV, and LKCV comprise a novel genus of the family Orthomyxoviridae. PMID:19726499

Rapid evolution of avirulence genes in rice blast fungus Magnaporthe oryzae

PubMed Central

2014-01-01

Background Rice blast fungus Magnaporthe oryzae is one of the most devastating pathogens in rice. Avirulence genes in this fungus share a gene-for-gene relationship with the resistance genes in its host rice. Although numerous studies have shown that rice blast R-genes are extremely diverse and evolve rapidly in their host populations, little is known about the evolutionary patterns of the Avr-genes in the pathogens. Results Here, six well-characterized Avr-genes and seven randomly selected non-Avr control genes were used to investigate the genetic variations in 62 rice blast strains from different parts of China. Frequent presence/absence polymorphisms, high levels of nucleotide variation (~10-fold higher than non-Avr genes), high non-synonymous to synonymous substitution ratios, and frequent shared non-synonymous substitution were observed in the Avr-genes of these diversified blast strains. In addition, most Avr-genes are closely associated with diverse repeated sequences, which may partially explain the frequent presence/absence polymorphisms in Avr-genes. Conclusion The frequent deletion and gain of Avr-genes and rapid non-synonymous variations might be the primary mechanisms underlying rapid adaptive evolution of pathogens toward virulence to their host plants, and these features can be used as the indicators for identifying additional Avr-genes. The high number of nucleotide polymorphisms among Avr-gene alleles could also be used to distinguish genetic groups among different strains. PMID:24725999
DroSpeGe: rapid access database for new Drosophila species genomes.

PubMed

Gilbert, Donald G

2007-01-01

The Drosophila species comparative genome database DroSpeGe (http://insects.eugenes.org/DroSpeGe/) provides genome researchers with rapid, usable access to 12 new and old Drosophila genomes, since its inception in 2004. Scientists can use, with minimal computing expertise, the wealth of new genome information for developing new insights into insect evolution. New genome assemblies provided by several sequencing centers have been annotated with known model organism gene homologies and gene predictions to provided basic comparative data. TeraGrid supplies the shared cyberinfrastructure for the primary computations. This genome database includes homologies to Drosophila melanogaster and eight other eukaryote model genomes, and gene predictions from several groups. BLAST searches of the newest assemblies are integrated with genome maps. GBrowse maps provide detailed views of cross-species aligned genomes. BioMart provides for data mining of annotations and sequences. Common chromosome maps identify major synteny among species. Potential gain and loss of genes is suggested by Gene Ontology groupings for genes of the new species. Summaries of essential genome statistics include sizes, genes found and predicted, homology among genomes, phylogenetic trees of species and comparisons of several gene predictions for sensitivity and specificity in finding new and known genes.
Characterization of perch rhabdovirus (PRV) in farmed grayling Thymallus thymallus.

PubMed

Gadd, Tuija; Viljamaa-Dirks, Satu; Holopainen, Riikka; Koski, Perttu; Jakava-Viljanen, Miia

2013-10-11

Two Finnish fish farms experienced elevated mortality rates in farmed grayling Thymallus thymallus fry during the summer months, most typically in July. The mortalities occurred during several years and were connected with a few neurological disorders and peritonitis. Virological investigation detected an infection with an unknown rhabdovirus. Based on the entire glycoprotein (G) and partial RNA polymerase (L) gene sequences, the virus was classified as a perch rhabdovirus (PRV). Pairwise comparisons of the G and L gene regions of grayling isolates revealed that all isolates were very closely related, with 99 to 100% nucleotide identity, which suggests the same origin of infection. Phylogenetic analysis demonstrated that they were closely related to the strain isolated from perch Perca fluviatilis and sea trout Salmo trutta trutta caught from the Baltic Sea. The entire G gene sequences revealed that all Finnish grayling isolates, and both the perch and sea trout isolates, were most closely related to a PRV isolated in France in 2004. According to the partial L gene sequences, all of the Finnish grayling isolates were most closely related to the Danish isolate DK5533 from pike. The genetic analysis of entire G gene and partial L gene sequences showed that the Finnish brown trout isolate ka907_87 shared only approximately 67 and 78% identity, respectively, with our grayling isolates. The grayling isolates were also analysed by an immunofluorescence antibody test. This is the first report of a PRV causing disease in grayling in Finland.
Whole-exome sequencing identified a variant in EFTUD2 gene in establishing a genetic diagnosis.

PubMed

Rengasamy Venugopalan, S; Farrow, E G; Lypka, M

2017-06-01

Craniofacial anomalies are complex and have an overlapping phenotype. Mandibulofacial Dysostosis and Oculo-Auriculo-Vertebral Spectrum are conditions that share common craniofacial phenotype and present a challenge in arriving at a diagnosis. In this report, we present a case of female proband who was given a differential diagnosis of Treacher Collins syndrome or Hemifacial Microsomia without certainty. Prior genetic testing reported negative for 22q deletion and FGFR screenings. The objective of this study was to demonstrate the critical role of whole-exome sequencing in establishing a genetic diagnosis of the proband. The participants were 14½-year-old affected female proband/parent trio. Proband/parent trio were enrolled in the study. Surgical tissue sample from the proband and parental blood samples were collected and prepared for whole-exome sequencing. Illumina HiSeq 2500 instrument was used for sequencing (125 nucleotide reads/84X coverage). Analyses of variants were performed using custom-developed software, RUNES and VIKING. Variant analyses following whole-exome sequencing identified a heterozygous de novo pathogenic variant, c.259C>T (p.Gln87*), in EFTUD2 (NM_004247.3) gene in the proband. Previous studies have reported that the variants in EFTUD2 gene were associated with Mandibulofacial Dysostosis with Microcephaly. Patients with facial asymmetry, micrognathia, choanal atresia and microcephaly should be analyzed for variants in EFTUD2 gene. Next-generation sequencing techniques, such as whole-exome sequencing offer great promise to improve the understanding of etiologies of sporadic genetic diseases. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Plasmid Characterization and Chromosome Analysis of Two netF+ Clostridium perfringens Isolates Associated with Foal and Canine Necrotizing Enteritis

PubMed Central

Mehdizadeh Gohari, Iman; Kropinski, Andrew M.; Weese, Scott J.; Parreira, Valeria R.; Whitehead, Ashley E.; Boerlin, Patrick; Prescott, John F.

2016-01-01

The recent discovery of a novel beta-pore-forming toxin, NetF, which is strongly associated with canine and foal necrotizing enteritis should improve our understanding of the role of type A Clostridium perfringens associated disease in these animals. The current study presents the complete genome sequence of two netF-positive strains, JFP55 and JFP838, which were recovered from cases of foal necrotizing enteritis and canine hemorrhagic gastroenteritis, respectively. Genome sequencing was done using Single Molecule, Real-Time (SMRT) technology-PacBio and Illumina Hiseq2000. The JFP55 and JFP838 genomes include a single 3.34 Mb and 3.53 Mb chromosome, respectively, and both genomes include five circular plasmids. Plasmid annotation revealed that three plasmids were shared by the two newly sequenced genomes, including a NetF/NetE toxins-encoding tcp-conjugative plasmid, a CPE/CPB2 toxins-encoding tcp-conjugative plasmid and a putative bacteriocin-encoding plasmid. The putative beta-pore-forming toxin genes, netF, netE and netG, were located in unique pathogenicity loci on tcp-conjugative plasmids. The C. perfringens JFP55 chromosome carries 2,825 protein-coding genes whereas the chromosome of JFP838 contains 3,014 protein-encoding genes. Comparison of these two chromosomes with three available reference C. perfringens chromosome sequences identified 48 (~247 kb) and 81 (~430 kb) regions unique to JFP55 and JFP838, respectively. Some of these divergent genomic regions in both chromosomes are phage- and plasmid-related segments. Sixteen of these unique chromosomal regions (~69 kb) were shared between the two isolates. Five of these shared regions formed a mosaic of plasmid-integrated segments, suggesting that these elements were acquired early in a clonal lineage of netF-positive C. perfringens strains. These results provide significant insight into the basis of canine and foal necrotizing enteritis and are the first to demonstrate that netF resides on a large and unique plasmid-encoded locus. PMID:26859667
Genevar: a database and Java application for the analysis and visualization of SNP-gene associations in eQTL studies.

PubMed

Yang, Tsun-Po; Beazley, Claude; Montgomery, Stephen B; Dimas, Antigone S; Gutierrez-Arcelus, Maria; Stranger, Barbara E; Deloukas, Panos; Dermitzakis, Emmanouil T

2010-10-01

Genevar (GENe Expression VARiation) is a database and Java tool designed to integrate multiple datasets, and provides analysis and visualization of associations between sequence variation and gene expression. Genevar allows researchers to investigate expression quantitative trait loci (eQTL) associations within a gene locus of interest in real time. The database and application can be installed on a standard computer in database mode and, in addition, on a server to share discoveries among affiliations or the broader community over the Internet via web services protocols. http://www.sanger.ac.uk/resources/software/genevar.
Evidence for 5S rDNA horizontal transfer in the toadfish Halobatrachus didactylus (Schneider, 1801) based on the analysis of three multigene families.

PubMed

Merlo, Manuel A; Cross, Ismael; Palazón, José L; Ubeda-Manzanaro, María; Sarasquete, Carmen; Rebordinos, Laureana

2012-10-07

The Batrachoididae family is a group of marine teleosts that includes several species with more complicated physiological characteristics, such as their excretory, reproductive, cardiovascular and respiratory systems. Previous studies of the 5S rDNA gene family carried out in four species from the Western Atlantic showed two types of this gene in two species but only one in the other two, under processes of concerted evolution and birth-and-death evolution with purifying selection. Here we present results of the 5S rDNA and another two gene families in Halobatrachus didactylus, an Eastern Atlantic species, and draw evolutionary inferences regarding the gene families. In addition we have also mapped the genes on the chromosomes by two-colour fluorescence in situ hybridization (FISH). Two types of 5S rDNA were observed, named type α and type β. Molecular analysis of the 5S rDNA indicates that H. didactylus does not share the non-transcribed spacer (NTS) sequences with four other species of the family; therefore, it must have evolved in isolation. Amplification with the type β specific primers amplified a specific band in 9 specimens of H. didactylus and two of Sparus aurata. Both types showed regulatory regions and a secondary structure which mark them as functional genes. However, the U2 snRNA gene and the ITS-1 sequence showed one electrophoretic band and with one type of sequence. The U2 snRNA sequence was the most variable of the three multigene families studied. Results from two-colour FISH showed no co-localization of the gene coding from three multigene families and provided the first map of the chromosomes of the species. A highly significant finding was observed in the analysis of the 5S rDNA, since two such distant species as H. didactylus and Sparus aurata share a 5S rDNA type. This 5S rDNA type has been detected in other species belonging to the Batrachoidiformes and Perciformes orders, but not in the Pleuronectiformes and Clupeiformes orders. Two hypotheses have been outlined: one is the possible vertical permanence of the shared type in some fish lineages, and the other is the possibility of a horizontal transference event between ancient species of the Perciformes and Batrachoidiformes orders. This finding opens a new perspective in fish evolution and in the knowledge of the dynamism of the 5S rDNA. Cytogenetic analysis allowed some evolutionary trends to be roughed out, such as the progressive change in the U2 snDNA and the organization of (GATA)n repeats, from dispersed to localized in one locus. The accumulation of (GATA)n repeats in one chromosome pair could be implicated in the evolution of a pair of proto-sex chromosomes. This possibility could situate H. didactylus as the most highly evolved of the Batrachoididae family in terms of sex chromosome biology.
Complexities of gene expression patterns in natural populations of an extremophile fish (Poecilia mexicana, Poeciliidae)

PubMed Central

Passow, Courtney N.; Brown, Anthony P.; Arias-Rodriguez, Lenin; Yee, Muh-Ching; Sockell, Alexandra; Schartl, Manfred; Warren, Wesley C.; Bustamante, Carlos; Kelley, Joanna L.; Tobler, Michael

2017-01-01

Variation in gene expression can provide insights into organismal responses to environmental stress and physiological mechanisms mediating adaptation to habitats with contrasting environmental conditions. We performed an RNA-sequencing experiment to quantify gene expression patterns in fish adapted to habitats with different combinations of environmental stressors, including the presence of toxic hydrogen sulphide (H2S) and the absence of light in caves. We specifically asked how gene expression varies among populations living in different habitats, whether population differences were consistent among organs, and whether there is evidence for shared expression responses in populations exposed to the same stressors. We analysed organ-specific transcriptome-wide data from four ecotypes of Poecilia mexicana (nonsulphidic surface, sulphidic surface, nonsulphidic cave and sulphidic cave). The majority of variation in gene expression was correlated with organ type, and the presence of specific environmental stressors elicited unique expression differences among organs. Shared patterns of gene expression between populations exposed to the same environmental stressors increased with levels of organismal organization (from transcript to gene to physiological pathway). In addition, shared patterns of gene expression were more common between populations from sulphidic than populations from cave habitats, potentially indicating that physiochemical stressors with clear biochemical consequences can constrain the diversity of adaptive solutions that mitigate their adverse effects. Overall, our analyses provided insights into transcriptional variation in a unique system, in which adaptation to H2S and darkness coincide. Functional annotations of differentially expressed genes provide a springboard for investigating physiological mechanisms putatively underlying adaptation to extreme environments. PMID:28598519
Fanconi Anemia Core Complex Gene Promoters Harbor Conserved Transcription Regulatory Elements

PubMed Central

Meier, Daniel; Schindler, Detlev

2011-01-01

The Fanconi anemia (FA) gene family is a recent addition to the complex network of proteins that respond to and repair certain types of DNA damage in the human genome. Since little is known about the regulation of this novel group of genes at the DNA level, we characterized the promoters of the eight genes (FANCA, B, C, E, F, G, L and M) that compose the FA core complex. The promoters of these genes show the characteristic attributes of housekeeping genes, such as a high GC content and CpG islands, a lack of TATA boxes and a low conservation. The promoters functioned in a monodirectional way and were, in their most active regions, comparable in strength to the SV40 promoter in our reporter plasmids. They were also marked by a distinctive transcriptional start site (TSS). In the 5′ region of each promoter, we identified a region that was able to negatively regulate the promoter activity in HeLa and HEK 293 cells in isolation. The central and 3′ regions of the promoter sequences harbor binding sites for several common and rare transcription factors, including STAT, SMAD, E2F, AP1 and YY1, which indicates that there may be cross-connections to several established regulatory pathways. Electrophoretic mobility shift assays and siRNA experiments confirmed the shared regulatory responses between the prominent members of the TGF-β and JAK/STAT pathways and members of the FA core complex. Although the promoters are not well conserved, they share region and sequence specific regulatory motifs and transcription factor binding sites (TBFs), and we identified a bi-partite nature to these promoters. These results support a hypothesis based on the co-evolution of the FA core complex genes that was expanded to include their promoters. PMID:21826217
Fanconi anemia core complex gene promoters harbor conserved transcription regulatory elements.

PubMed

Meier, Daniel; Schindler, Detlev

2011-01-01

The Fanconi anemia (FA) gene family is a recent addition to the complex network of proteins that respond to and repair certain types of DNA damage in the human genome. Since little is known about the regulation of this novel group of genes at the DNA level, we characterized the promoters of the eight genes (FANCA, B, C, E, F, G, L and M) that compose the FA core complex. The promoters of these genes show the characteristic attributes of housekeeping genes, such as a high GC content and CpG islands, a lack of TATA boxes and a low conservation. The promoters functioned in a monodirectional way and were, in their most active regions, comparable in strength to the SV40 promoter in our reporter plasmids. They were also marked by a distinctive transcriptional start site (TSS). In the 5' region of each promoter, we identified a region that was able to negatively regulate the promoter activity in HeLa and HEK 293 cells in isolation. The central and 3' regions of the promoter sequences harbor binding sites for several common and rare transcription factors, including STAT, SMAD, E2F, AP1 and YY1, which indicates that there may be cross-connections to several established regulatory pathways. Electrophoretic mobility shift assays and siRNA experiments confirmed the shared regulatory responses between the prominent members of the TGF-β and JAK/STAT pathways and members of the FA core complex. Although the promoters are not well conserved, they share region and sequence specific regulatory motifs and transcription factor binding sites (TBFs), and we identified a bi-partite nature to these promoters. These results support a hypothesis based on the co-evolution of the FA core complex genes that was expanded to include their promoters.
The temperate marine phage PhiHAP-1 of Halomonas aquamarina possesses a linear plasmid-like prophage genome.

PubMed

Mobberley, Jennifer M; Authement, R Nathan; Segall, Anca M; Paul, John H

2008-07-01

A myovirus-like temperate phage, PhiHAP-1, was induced with mitomycin C from a Halomonas aquamarina strain isolated from surface waters in the Gulf of Mexico. The induced cultures produced significantly more virus-like particles (VLPs) (3.73 x 10(10) VLP ml(-1)) than control cultures (3.83 x 10(7) VLP ml(-1)) when observed with epifluorescence microscopy. The induced phage was sequenced by using linker-amplified shotgun libraries and contained a genome 39,245 nucleotides in length with a G+C content of 59%. The PhiHAP-1 genome contained 46 putative open reading frames (ORFs), with 76% sharing significant similarity (E value of <10(-3)) at the protein level with other sequences in GenBank. Putative functional gene assignments included small and large terminase subunits, capsid and tail genes, an N6-DNA adenine methyltransferase, and lysogeny-related genes. Although no integrase was found, the PhiHAP-1 genome contained ORFs similar to protelomerase and parA genes found in linear plasmid-like phages with telomeric ends. Southern probing and PCR analysis of host genomic, plasmid, and PhiHAP-1 DNA indicated a lack of integration of the prophage with the host chromosome and a difference in genome arrangement between the prophage and virion forms. The linear plasmid prophage form of PhiHAP-1 begins with the protelomerase gene, presumably due to the activity of the protelomerase, while the induced phage particle has a circularly permuted genome that begins with the terminase genes. The PhiHAP-1 genome shares synteny and gene similarity with coliphage N15 and vibriophages VP882 and VHML, suggesting an evolutionary heritage from an N15-like linear plasmid prophage ancestor.
Complete sequence analysis reveals two distinct poleroviruses infecting cucurbits in China.

PubMed

Xiang, Hai-ying; Shang, Qiao-xia; Han, Cheng-gui; Li, Da-wei; Yu, Jia-lin

2008-01-01

The complete RNA genomes of a Chinese isolate of cucurbit aphid-borne yellows virus (CABYV-CHN) and a new polerovirus tentatively referred to as melon aphid-borne yellows virus (MABYV) were determined. The entire genome of CABYV-CHN shared 89.0% nucleotide sequence identity with the French CABYV isolate. In contrast, nucleotide sequence identities between MABYV and CABYV and other poleroviruses were in the range of 50.7-74.2%, with amino acid sequence identities ranging from 24.8 to 82.9% for individual gene products. We propose that CABYV-CHN is a strain of CABYV and that MABYV is a member of a tentative distinct species within the genus Polerovirus.
Reassortant group A rotavirus from straw-colored fruit bat (Eidolon helvum).

PubMed

Esona, Mathew D; Mijatovic-Rustempasic, Slavica; Conrardy, Christina; Tong, Suxiang; Kuzmin, Ivan V; Agwanda, Bernard; Breiman, Robert F; Banyai, Krisztian; Niezgoda, Michael; Rupprecht, Charles E; Gentsch, Jon R; Bowen, Michael D

2010-12-01

Bats are known reservoirs of viral zoonoses. We report genetic characterization of a bat rotavirus (Bat/KE4852/07) detected in the feces of a straw-colored fruit bat (Eidolon helvum). Six bat rotavirus genes (viral protein [VP] 2, VP6, VP7, nonstructural protein [NSP] 2, NSP3, and NSP5) shared ancestry with other mammalian rotaviruses but were distantly related. The VP4 gene was nearly identical to that of human P[6] rotavirus strains, and the NSP4 gene was closely related to those of previously described mammalian rotaviruses, including human strains. Analysis of partial sequence of the VP1 gene indicated that it was distinct from cognate genes of other rotaviruses. No sequences were obtained for the VP3 and NSP1 genes of the bat rotavirus. This rotavirus was designated G25-P[6]-I15-R8(provisional)-C8-Mx-Ax-N8-T11-E2-H10. Results suggest that several reassortment events have occurred between human, animal, and bat rotaviruses. Several additional rotavirus strains were detected in bats.
Reassortant Group A Rotavirus from Straw-colored Fruit Bat (Eidolon helvum)

PubMed Central

Esona, Mathew D.; Mijatovic-Rustempasic, Slavica; Conrardy, Christina; Tong, Suxiang; Kuzmin, Ivan V.; Agwanda, Bernard; Breiman, Robert F.; Banyai, Krisztian; Niezgoda, Michael; Rupprecht, Charles E.; Gentsch, Jon R.

2010-01-01

Bats are known reservoirs of viral zoonoses. We report genetic characterization of a bat rotavirus (Bat/KE4852/07) detected in the feces of a straw-colored fruit bat (Eidolon helvum). Six bat rotavirus genes (viral protein [VP] 2, VP6, VP7, nonstructural protein [NSP] 2, NSP3, and NSP5) shared ancestry with other mammalian rotaviruses but were distantly related. The VP4 gene was nearly identical to that of human P[6] rotavirus strains, and the NSP4 gene was closely related to those of previously described mammalian rotaviruses, including human strains. Analysis of partial sequence of the VP1 gene indicated that it was distinct from cognate genes of other rotaviruses. No sequences were obtained for the VP3 and NSP1 genes of the bat rotavirus. This rotavirus was designated G25-P[6]-I15-R8(provisional)-C8-Mx-Ax-N8-T11-E2-H10. Results suggest that several reassortment events have occurred between human, animal, and bat rotaviruses. Several additional rotavirus strains were detected in bats. PMID:21122212
Origins of domestication and polyploidy in oca (Oxalis Tuberosa: Oxalidaceae). 2. Chloroplast-expressed glutamine synthetase data.

PubMed

Emshwiller, Eve; Doyle, Jeff J

2002-07-01

In continuing study of the origins of the octoploid tuber crop oca, Oxalis tuberosa Molina, we used phylogenetic analysis of DNA sequences of the chloroplast-active (nuclear encoded) isozyme of glutamine synthetase (ncpGS) from cultivated oca, its allies in the "Oxalis tuberosa alliance," and other Andean Oxalis. Multiple ncpGS sequences found within individuals of both the cultigen and a yet unnamed wild tuber-bearing taxon of Bolivia were separated by molecular cloning, but some cloned sequences appeared to be artifacts of polymerase chain reaction (PCR) recombination and/or Taq error. Nonetheless, three classes of nonrecombinant sequences each joined a different part of the O. tuberosa alliance clade on the ncpGS gene tree. Octoploid oca shares two sequence classes with the Bolivian tuber-bearing taxon (of unknown ploidy level). Fixed heterozygosity of these two sequence classes in all ocas sampled suggests that they represent homeologous loci and that oca is allopolyploid. A third sequence class, found in eight of nine oca plants sampled, might represent a third homeologous locus, suggesting that oca may be autoallopolyploid, and is shared with another wild tuber-bearing species, tetraploid O. picchensis of southern Peru. Thus, ncpGS data identify these two taxa as the best candidates as progenitors of cultivated oca.
Genetic diversity of merozoite surface antigens in Babesia bovis detected from Sri Lankan cattle.

PubMed

Sivakumar, Thillaiampalam; Okubo, Kazuhiro; Igarashi, Ikuo; de Silva, Weligodage Kumarawansa; Kothalawala, Hemal; Silva, Seekkuge Susil Priyantha; Vimalakumar, Singarayar Caniciyas; Meewewa, Asela Sanjeewa; Yokoyama, Naoaki

2013-10-01

Babesia bovis, the causative agent of severe bovine babesiosis, is endemic in Sri Lanka. The live attenuated vaccine (K-strain), which was introduced in the early 1990s, has been used to immunize cattle populations in endemic areas of the country. The present study was undertaken to determine the genetic diversity of merozoite surface antigens (MSAs) in B. bovis isolates from Sri Lankan cattle, and to compare the gene sequences obtained from such isolates against those of the K-strain. Forty-four bovine blood samples isolated from different geographical regions of Sri Lanka and judged to be B. bovis-positive by PCR screening were used to amplify MSAs (MSA-1, MSA-2c, MSA-2a1, MSA-2a2, and MSA-2b), AMA-1, and 12D3 genes from parasite DNA. Although the AMA-1 and 12D3 gene sequences were highly conserved among the Sri Lankan isolates, the MSA gene sequences from the same isolates were highly diverse. Sri Lankan MSA-1, MSA-2c, MSA-2a1, MSA-2a2, and MSA-2b sequences clustered within 5, 2, 4, 1, and 9 different clades in the gene phylograms, respectively, while the minimum similarity values among the deduced amino acid sequences of these genes were 36.8%, 68.7%, 80.3%, 100%, and 68.3%, respectively. In the phylograms, none of the Sri Lankan sequences fell within clades containing the respective K-strain sequences. Additionally, the similarity values for MSA-1 and MSA-2c were 40-61.8% and 90.9-93.2% between the Sri Lankan isolates and the K-strain, respectively, while the K-strain MSA-2a/b sequence shared 64.5-69.8%, 69.3%, and 70.5-80.3% similarities with the Sri Lankan MSA-2a1, MSA-2a2, and MSA-2b sequences, respectively. The present study has shown that genetic diversity among MSAs of Sri Lankan B. bovis isolates is very high, and that the sequences of field isolates diverged genetically from the K-strain. Copyright © 2013 Elsevier B.V. All rights reserved.
B-Bolivia, an Allele of the Maize b1 Gene with Variable Expression, Contains a High Copy Retrotransposon-Related Sequence Immediately Upstream1

PubMed Central

Selinger, David A.; Chandler, Vicki L.

2001-01-01

The maize (Zea mays) b1 gene encodes a transcription factor that regulates the anthocyanin pigment pathway. Of the b1 alleles with distinct tissue-specific expression, B-Peru and B-Bolivia are the only alleles that confer seed pigmentation. B-Bolivia produces variable and weaker seed expression but darker, more regular plant expression relative to B-Peru. Our experiments demonstrated that B-Bolivia is not expressed in the seed when transmitted through the male. When transmitted through the female the proportion of kernels pigmented and the intensity of pigment varied. Molecular characterization of B-Bolivia demonstrated that it shares the first 530 bp of the upstream region with B-Peru, a region sufficient for seed expression. Immediately upstream of 530 bp, B-Bolivia is completely divergent from B-Peru. These sequences share sequence similarity to retrotransposons. Transient expression assays of various promoter constructs identified a 33-bp region in B-Bolivia that can account for the reduced aleurone pigment amounts (40%) observed with B-Bolivia relative to B-Peru. Transgenic plants carrying the B-Bolivia promoter proximal region produced pigmented seeds. Similar to native B-Bolivia, some transgene loci are variably expressed in seeds. In contrast to native B-Bolivia, the transgene loci are expressed in seeds when transmitted through both the male and female. Some transgenic lines produced pigment in vegetative tissues, but the tissue-specificity was different from B-Bolivia, suggesting the introduced sequences do not contain the B-Bolivia plant-specific regulatory sequences. We hypothesize that the chromatin context of the B-Bolivia allele controls its epigenetic seed expression properties, which could be influenced by the adjacent highly repeated retrotransposon sequence. PMID:11244116
Structures of two Arabidopsis thaliana major latex proteins represent novel helix-grip folds

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lytle, Betsy L.; Song, Jikui; de la Cruz, Norberto B.

2009-06-02

Here we report the first structures of two major latex proteins (MLPs) which display unique structural differences from the canonical Bet v 1 fold described earlier. MLP28 (SwissProt/TrEMBL ID Q9SSK9), the product of gene At1g70830.1, and the At1g24000.1 gene product (Swiss- Prot/TrEMBL ID P0C0B0), proteins which share 32% sequence identity, were independently selected as foldspace targets by the Center for Eukaryotic Structural Genomics. The structure of a single domain (residues 17-173) of MLP28 was solved by NMR spectroscopy, while the full-length At1g24000.1 structure was determined by X-ray crystallography. MLP28 displays greater than 30% sequence identity to at least eight MLPsmore » from other species. For example, the MLP28 sequence shares 64% identity to peach Pp-MLP119 and 55% identity to cucumber Csf2.20 In contrast, the At1g24000.1 sequence is highly divergent (see Fig. 1), containing a gap of 33 amino acids when compared with all other known MLPs. Even when the gap is excluded, the sequence identity with MLPs from other species is less than 30%. Unlike some of the MLPs from other species, none of the A. thaliana MLPs have been characterized biochemically. We show by NMR chemical shift mapping that At1g24000.1 binds progesterone, demonstrating that despite its sequence dissimilarity, the hydrophobic binding pocket is conserved and, therefore, may play a role in its biological function and that of the MLP family in general.« less
The Histone Modification H3K27me3 Is Retained after Gene Duplication and Correlates with Conserved Noncoding Sequences in Arabidopsis

PubMed Central

Berke, Lidija; Snel, Berend

2014-01-01

The histone modification H3K27me3 is involved in repression of transcription and plays a crucial role in developmental transitions in both animals and plants. It is deposited by PRC2 (Polycomb repressive complex 2), a conserved protein complex. In Arabidopsis thaliana, H3K27me3 is found at 15% of all genes. These tend to encode transcription factors and other regulators important for development. However, it is not known how PRC2 is recruited to target loci nor how this set of target genes arose during Arabidopsis evolution. To resolve the latter, we integrated A. thaliana gene families with five independent genome-wide H3K27me3 data sets. Gene families were either significantly enriched or depleted of H3K27me3, showing a strong impact of shared ancestry to H3K27me3 distribution. To quantify this, we performed ancestral state reconstruction of H3K27me3 on phylogenetic trees of gene families. The set of H3K27me3-marked genes changed less than expected by chance, suggesting that H3K27me3 was retained after gene duplication. This retention suggests that the PRC2-recruiting signal could be encoded in the DNA and also conserved among certain duplicated genes. Indeed, H3K27me3-marked genes were overrepresented among paralogs sharing conserved noncoding sequences (CNSs) that are enriched with transcription factor binding sites. The association of upstream CNSs with H3K27me3-marked genes represents the first genome-wide connection between H3K27me3 and potential regulatory elements in plants. Thus, we propose that CNSs likely function as part of the PRC2 recruitment in plants. PMID:24567304
Microbial genomic taxonomy

PubMed Central

2013-01-01

A need for a genomic species definition is emerging from several independent studies worldwide. In this commentary paper, we discuss recent studies on the genomic taxonomy of diverse microbial groups and a unified species definition based on genomics. Accordingly, strains from the same microbial species share >95% Average Amino Acid Identity (AAI) and Average Nucleotide Identity (ANI), >95% identity based on multiple alignment genes, <10 in Karlin genomic signature, and > 70% in silico Genome-to-Genome Hybridization similarity (GGDH). Species of the same genus will form monophyletic groups on the basis of 16S rRNA gene sequences, Multilocus Sequence Analysis (MLSA) and supertree analysis. In addition to the established requirements for species descriptions, we propose that new taxa descriptions should also include at least a draft genome sequence of the type strain in order to obtain a clear outlook on the genomic landscape of the novel microbe. The application of the new genomic species definition put forward here will allow researchers to use genome sequences to define simultaneously coherent phenotypic and genomic groups. PMID:24365132

Polynucleobacter bacteria in the brackish-water species Euplotes harpa (Ciliata Hypotrichia).

PubMed

Vannini, Claudia; Petroni, Giulio; Verni, Franco; Rosati, Giovanna

2005-01-01

We have found a Polynucleobacter bacterium in the cytoplasm of Euplotes harpa, a species living in a brackish-water habitat, with a cirral pattern not corresponding to that of the freshwater Euplotes species known to harbor this type of bacteria. The symbiont has been found in three strains of the species, obtained by clonal cultures from ciliates collected in different geographic regions. The 16S rRNA gene sequence of this bacterium identifies it as a member of the beta-proteobacterial genus Polynucleobacter. This sequence shares a high similarity value (98.4-98.5%) with P. necessarius, the type species of the genus, and is associated with 16S rRNA gene sequences of environmental clones and bacterial strains included in the Polynucleobacter cluster (>95%). An oligonucleotide probe was designed to corroborate the assignment of the retrieved sequence to the symbiont and to detect similar bacteria rapidly. Antibiotic experiments showed that the elimination of the bacteria stops the reproductive cycle in E. harpa, as has been shown for the freshwater Euplotes species.
The importance of de novo mutations for pediatric neurological disease--It is not all in utero or birth trauma.

PubMed

Erickson, Robert P

2016-01-01

The advent of next generation sequencing (NGS, which consists of massively parallel sequencing to perform TGS (total genome sequencing) or WES (whole exome sequencing)) has abundantly discovered many causative mutations in patients with pediatric neurological disease. A surprisingly high number of these are de novo mutations which have not been inherited from either parent. For epilepsy, autism spectrum disorders, and neuromotor disorders, including cerebral palsy, initial estimates put the frequency of causative de novo mutations at about 15% and about 10% of these are somatic. There are some shared mutated genes between these three classes of disease. Studies of copy number variation by comparative genomic hybridization (CGH) proceded the NGS approaches but they also detect de novo variation which is especially important for ASDs. There are interesting differences between the mutated genes detected by CGS and NGS. In summary, de novo mutations cause a very significant proportion of pediatric neurological disease. Copyright © 2015 Elsevier B.V. All rights reserved.
The DNA sequence of the human X chromosome

PubMed Central

Ross, Mark T.; Grafham, Darren V.; Coffey, Alison J.; Scherer, Steven; McLay, Kirsten; Muzny, Donna; Platzer, Matthias; Howell, Gareth R.; Burrows, Christine; Bird, Christine P.; Frankish, Adam; Lovell, Frances L.; Howe, Kevin L.; Ashurst, Jennifer L.; Fulton, Robert S.; Sudbrak, Ralf; Wen, Gaiping; Jones, Matthew C.; Hurles, Matthew E.; Andrews, T. Daniel; Scott, Carol E.; Searle, Stephen; Ramser, Juliane; Whittaker, Adam; Deadman, Rebecca; Carter, Nigel P.; Hunt, Sarah E.; Chen, Rui; Cree, Andrew; Gunaratne, Preethi; Havlak, Paul; Hodgson, Anne; Metzker, Michael L.; Richards, Stephen; Scott, Graham; Steffen, David; Sodergren, Erica; Wheeler, David A.; Worley, Kim C.; Ainscough, Rachael; Ambrose, Kerrie D.; Ansari-Lari, M. Ali; Aradhya, Swaroop; Ashwell, Robert I. S.; Babbage, Anne K.; Bagguley, Claire L.; Ballabio, Andrea; Banerjee, Ruby; Barker, Gary E.; Barlow, Karen F.; Barrett, Ian P.; Bates, Karen N.; Beare, David M.; Beasley, Helen; Beasley, Oliver; Beck, Alfred; Bethel, Graeme; Blechschmidt, Karin; Brady, Nicola; Bray-Allen, Sarah; Bridgeman, Anne M.; Brown, Andrew J.; Brown, Mary J.; Bonnin, David; Bruford, Elspeth A.; Buhay, Christian; Burch, Paula; Burford, Deborah; Burgess, Joanne; Burrill, Wayne; Burton, John; Bye, Jackie M.; Carder, Carol; Carrel, Laura; Chako, Joseph; Chapman, Joanne C.; Chavez, Dean; Chen, Ellson; Chen, Guan; Chen, Yuan; Chen, Zhijian; Chinault, Craig; Ciccodicola, Alfredo; Clark, Sue Y.; Clarke, Graham; Clee, Chris M.; Clegg, Sheila; Clerc-Blankenburg, Kerstin; Clifford, Karen; Cobley, Vicky; Cole, Charlotte G.; Conquer, Jen S.; Corby, Nicole; Connor, Richard E.; David, Robert; Davies, Joy; Davis, Clay; Davis, John; Delgado, Oliver; DeShazo, Denise; Dhami, Pawandeep; Ding, Yan; Dinh, Huyen; Dodsworth, Steve; Draper, Heather; Dugan-Rocha, Shannon; Dunham, Andrew; Dunn, Matthew; Durbin, K. James; Dutta, Ireena; Eades, Tamsin; Ellwood, Matthew; Emery-Cohen, Alexandra; Errington, Helen; Evans, Kathryn L.; Faulkner, Louisa; Francis, Fiona; Frankland, John; Fraser, Audrey E.; Galgoczy, Petra; Gilbert, James; Gill, Rachel; Glöckner, Gernot; Gregory, Simon G.; Gribble, Susan; Griffiths, Coline; Grocock, Russell; Gu, Yanghong; Gwilliam, Rhian; Hamilton, Cerissa; Hart, Elizabeth A.; Hawes, Alicia; Heath, Paul D.; Heitmann, Katja; Hennig, Steffen; Hernandez, Judith; Hinzmann, Bernd; Ho, Sarah; Hoffs, Michael; Howden, Phillip J.; Huckle, Elizabeth J.; Hume, Jennifer; Hunt, Paul J.; Hunt, Adrienne R.; Isherwood, Judith; Jacob, Leni; Johnson, David; Jones, Sally; de Jong, Pieter J.; Joseph, Shirin S.; Keenan, Stephen; Kelly, Susan; Kershaw, Joanne K.; Khan, Ziad; Kioschis, Petra; Klages, Sven; Knights, Andrew J.; Kosiura, Anna; Kovar-Smith, Christie; Laird, Gavin K.; Langford, Cordelia; Lawlor, Stephanie; Leversha, Margaret; Lewis, Lora; Liu, Wen; Lloyd, Christine; Lloyd, David M.; Loulseged, Hermela; Loveland, Jane E.; Lovell, Jamieson D.; Lozado, Ryan; Lu, Jing; Lyne, Rachael; Ma, Jie; Maheshwari, Manjula; Matthews, Lucy H.; McDowall, Jennifer; McLaren, Stuart; McMurray, Amanda; Meidl, Patrick; Meitinger, Thomas; Milne, Sarah; Miner, George; Mistry, Shailesh L.; Morgan, Margaret; Morris, Sidney; Müller, Ines; Mullikin, James C.; Nguyen, Ngoc; Nordsiek, Gabriele; Nyakatura, Gerald; O’Dell, Christopher N.; Okwuonu, Geoffery; Palmer, Sophie; Pandian, Richard; Parker, David; Parrish, Julia; Pasternak, Shiran; Patel, Dina; Pearce, Alex V.; Pearson, Danita M.; Pelan, Sarah E.; Perez, Lesette; Porter, Keith M.; Ramsey, Yvonne; Reichwald, Kathrin; Rhodes, Susan; Ridler, Kerry A.; Schlessinger, David; Schueler, Mary G.; Sehra, Harminder K.; Shaw-Smith, Charles; Shen, Hua; Sheridan, Elizabeth M.; Shownkeen, Ratna; Skuce, Carl D.; Smith, Michelle L.; Sotheran, Elizabeth C.; Steingruber, Helen E.; Steward, Charles A.; Storey, Roy; Swann, R. Mark; Swarbreck, David; Tabor, Paul E.; Taudien, Stefan; Taylor, Tineace; Teague, Brian; Thomas, Karen; Thorpe, Andrea; Timms, Kirsten; Tracey, Alan; Trevanion, Steve; Tromans, Anthony C.; d’Urso, Michele; Verduzco, Daniel; Villasana, Donna; Waldron, Lenee; Wall, Melanie; Wang, Qiaoyan; Warren, James; Warry, Georgina L.; Wei, Xuehong; West, Anthony; Whitehead, Siobhan L.; Whiteley, Mathew N.; Wilkinson, Jane E.; Willey, David L.; Williams, Gabrielle; Williams, Leanne; Williamson, Angela; Williamson, Helen; Wilming, Laurens; Woodmansey, Rebecca L.; Wray, Paul W.; Yen, Jennifer; Zhang, Jingkun; Zhou, Jianling; Zoghbi, Huda; Zorilla, Sara; Buck, David; Reinhardt, Richard; Poustka, Annemarie; Rosenthal, André; Lehrach, Hans; Meindl, Alfons; Minx, Patrick J.; Hillier, LaDeana W.; Willard, Huntington F.; Wilson, Richard K.; Waterston, Robert H.; Rice, Catherine M.; Vaudin, Mark; Coulson, Alan; Nelson, David L.; Weinstock, George; Sulston, John E.; Durbin, Richard; Hubbard, Tim; Gibbs, Richard A.; Beck, Stephan; Rogers, Jane; Bentley, David R.

2009-01-01

The human X chromosome has a unique biology that was shaped by its evolution as the sex chromosome shared by males and females. We have determined 99.3% of the euchromatic sequence of the X chromosome. Our analysis illustrates the autosomal origin of the mammalian sex chromosomes, the stepwise process that led to the progressive loss of recombination between X and Y, and the extent of subsequent degradation of the Y chromosome. LINE1 repeat elements cover one-third of the X chromosome, with a distribution that is consistent with their proposed role as way stations in the process of X-chromosome inactivation. We found 1,098 genes in the sequence, of which 99 encode proteins expressed in testis and in various tumour types. A disproportionately high number of mendelian diseases are documented for the X chromosome. Of this number, 168 have been explained by mutations in 113 X-linked genes, which in many cases were characterized with the aid of the DNA sequence. PMID:15772651
Short communication: Conservation of Streptococcus uberis adhesion molecule and the sua gene in strains of Streptococcus uberis isolated from geographically diverse areas.

PubMed

Yuan, Ying; Dego, Oudessa Kerro; Chen, Xueyan; Abadin, Eurife; Chan, Shangfeng; Jory, Lauren; Kovacevic, Steven; Almeida, Raul A; Oliver, Stephen P

2014-12-01

The objective was to identify and sequence the sua gene (GenBank no. DQ232760; http://www.ncbi.nlm.nih.gov/genbank/) and detect Streptococcus uberis adhesion molecule (SUAM) expression by Western blot using serum from naturally S. uberis-infected cows in strains of S. uberis isolated in milk from cows with mastitis from geographically diverse areas of the world. All strains evaluated yielded a 4.4-kb sua-containing PCR fragment that was subsequently sequenced. Deduced SUAM AA sequences from those S. uberis strains evaluated shared >97% identity. The pepSUAM sequence located at the N terminus of SUAM was >99% identical among strains of S. uberis. Streptococcus uberis adhesion molecule expression was detected in all strains of S. uberis tested. These results suggest that sua is ubiquitous among strains of S. uberis isolated from diverse geographic locations and that SUAM is immunogenic. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Bph32, a novel gene encoding an unknown SCR domain-containing protein, confers resistance against the brown planthopper in rice.

PubMed

Ren, Juansheng; Gao, Fangyuan; Wu, Xianting; Lu, Xianjun; Zeng, Lihua; Lv, Jianqun; Su, Xiangwen; Luo, Hong; Ren, Guangjun

2016-11-23

An urgent need exists to identify more brown planthopper (Nilaparvata lugens Stål, BPH) resistance genes, which will allow the development of rice varieties with resistance to BPH to counteract the increased incidence of this pest species. Here, using bioinformatics and DNA sequencing approaches, we identified a novel BPH resistance gene, LOC_Os06g03240 (MSU LOCUS ID), from the rice variety Ptb33 in the interval between the markers RM19291 and RM8072 on the short arm of chromosome 6, where a gene for resistance to BPH was mapped by Jirapong Jairin et al. and renamed as "Bph32". This gene encodes a unique short consensus repeat (SCR) domain protein. Sequence comparison revealed that the Bph32 gene shares 100% sequence identity with its allele in Oryza latifolia. The transgenic introgression of Bph32 into a susceptible rice variety significantly improved resistance to BPH. Expression analysis revealed that Bph32 was highly expressed in the leaf sheaths, where BPH primarily settles and feeds, at 2 and 24 h after BPH infestation, suggesting that Bph32 may inhibit feeding in BPH. Western blotting revealed the presence of Pph (Ptb33) and Tph (TN1) proteins using a Penta-His antibody, and both proteins were insoluble. This study provides information regarding a valuable gene for rice defence against insect pests.
Bph32, a novel gene encoding an unknown SCR domain-containing protein, confers resistance against the brown planthopper in rice

PubMed Central

Ren, Juansheng; Gao, Fangyuan; Wu, Xianting; Lu, Xianjun; Zeng, Lihua; Lv, Jianqun; Su, Xiangwen; Luo, Hong; Ren, Guangjun

2016-01-01

An urgent need exists to identify more brown planthopper (Nilaparvata lugens Stål, BPH) resistance genes, which will allow the development of rice varieties with resistance to BPH to counteract the increased incidence of this pest species. Here, using bioinformatics and DNA sequencing approaches, we identified a novel BPH resistance gene, LOC_Os06g03240 (MSU LOCUS ID), from the rice variety Ptb33 in the interval between the markers RM19291 and RM8072 on the short arm of chromosome 6, where a gene for resistance to BPH was mapped by Jirapong Jairin et al. and renamed as “Bph32”. This gene encodes a unique short consensus repeat (SCR) domain protein. Sequence comparison revealed that the Bph32 gene shares 100% sequence identity with its allele in Oryza latifolia. The transgenic introgression of Bph32 into a susceptible rice variety significantly improved resistance to BPH. Expression analysis revealed that Bph32 was highly expressed in the leaf sheaths, where BPH primarily settles and feeds, at 2 and 24 h after BPH infestation, suggesting that Bph32 may inhibit feeding in BPH. Western blotting revealed the presence of Pph (Ptb33) and Tph (TN1) proteins using a Penta-His antibody, and both proteins were insoluble. This study provides information regarding a valuable gene for rice defence against insect pests. PMID:27876888
Surface Diversity in Mycoplasma agalactiae Is Driven by Site-Specific DNA Inversions within the vpma Multigene Locus

PubMed Central

Glew, Michelle D.; Marenda, Marc; Rosengarten, Renate; Citti, Christine

2002-01-01

The ruminant pathogen Mycoplasma agalactiae possesses a family of abundantly expressed variable surface lipoproteins called Vpmas. Phenotypic switches between Vpma members have previously been correlated with DNA rearrangements within a locus of vpma genes and are proposed to play an important role in disease pathogenesis. In this study, six vpma genes were characterized in the M. agalactiae type strain PG2. All vpma genes clustered within an 8-kb region and shared highly conserved 5′ untranslated regions, lipoprotein signal sequences, and short N-terminal sequences. Analyses of the vpma loci from consecutive clonal isolates showed that vpma DNA rearrangements were site specific and that cleavage and strand exchange occurred within a minimal region of 21 bp located within the 5′ untranslated region of all vpma genes. This process controlled expression of vpma genes by effectively linking the open reading frame (ORF) of a silent gene to a unique active promoter sequence within the locus. An ORF (xer1) immediately adjacent to one end of the vpma locus did not undergo rearrangement and had significant homology to a distinct subset of genes belonging to the λ integrase family of site-specific xer recombinases. It is proposed that xer1 codes for a site-specific recombinase that is not involved in chromosome dimer resolution but rather is responsible for the observed vpma-specific recombination in M. agalactiae. PMID:12374833
Transcriptional regulation of three EIN3-like genes of carnation (Dianthus caryophyllus L. cv. Improved White Sim) during flower development and upon wounding, pollination, and ethylene exposure.

PubMed

Iordachescu, Mihaela; Verlinden, Sven

2005-08-01

Using a combination of approaches, three EIN3-like (EIL) genes DC-EIL1/2 (AY728191), DC-EIL3 (AY728192), and DC-EIL4 (AY728193) were isolated from carnation (Dianthus caryophyllus) petals. DC-EIL1/2 deduced amino acid sequence shares 98% identity with the previously cloned and characterized carnation DC-EIL1 (AF261654), 62% identity with DC-EIL3, and 60% identity with DC-EIL4. DC-EIL3 deduced amino acid sequence shares 100% identity with a previously cloned carnation gene fragment, Dc106 (CF259543), 61% identity with Dianthus caryophyllus DC-EIL1 (AF261654), and 59% identity with DC-EIL4. DC-EIL4 shared 60% identity with DC-EIL1 (AF261654). Expression analyses performed on vegetative and flower tissues (petals, ovaries, and styles) during growth and development and senescence (natural and ethylene-induced) indicated that the mRNA accumulation of the DC-EIL family of genes in carnation is regulated developmentally and by ethylene. DC-EIL3 mRNA showed significant accumulation upon ethylene exposure, during flower development, and upon pollination in petals and styles. Interestingly, decreasing levels of DC-EIL3 mRNA were found in wounded leaves and ovaries of senescing flowers whenever ethylene levels increased. Flowers treated with sucrose showed a 2 d delay in the accumulation of DC-EIL3 transcripts when compared with control flowers. These observations suggest an important role for DC-EIL3 during growth and development. Changes in DC-EIL1/2 and DC-EIL4 mRNA levels during flower development, and upon ethylene exposure and pollination were very similar. mRNA levels of the DC-EILs in styles of pollinated flowers showed a positive correlation with ethylene production after pollination. The cloning and characterization of the EIN3-like genes in the present study showed their transcriptional regulation not previously observed for EILs.
Targeted deletion of miR-132/-212 impairs memory and alters the hippocampal transcriptome.

PubMed

Hansen, Katelin F; Sakamoto, Kensuke; Aten, Sydney; Snider, Kaitlin H; Loeser, Jacob; Hesse, Andrea M; Page, Chloe E; Pelz, Carl; Arthur, J Simon C; Impey, Soren; Obrietan, Karl

2016-02-01

miR-132 and miR-212 are structurally related microRNAs that have been found to exert powerful modulatory effects within the central nervous system (CNS). Notably, these microRNAs are tandomly processed from the same noncoding transcript, and share a common seed sequence: thus it has been difficult to assess the distinct contribution of each microRNA to gene expression within the CNS. Here, we employed a combination of conditional knockout and transgenic mouse models to examine the contribution of the miR-132/-212 gene locus to learning and memory, and then to assess the distinct effects that each microRNA has on hippocampal gene expression. Using a conditional deletion approach, we show that miR-132/-212 double-knockout mice exhibit significant cognitive deficits in spatial memory, recognition memory, and in tests of novel object recognition. Next, we utilized transgenic miR-132 and miR-212 overexpression mouse lines and the miR-132/-212 double-knockout line to explore the distinct effects of these two miRNAs on the transcriptional profile of the hippocampus. Illumina sequencing revealed that miR-132/-212 deletion increased the expression of 1138 genes; Venn analysis showed that 96 of these genes were also downregulated in mice overexpressing miR-132. Of the 58 genes that were decreased in animals overexpressing miR-212, only four of them were also increased in the knockout line. Functional gene ontology analysis of downregulated genes revealed significant enrichment of genes related to synaptic transmission, neuronal proliferation, and morphogenesis, processes known for their roles in learning, and memory formation. These data, coupled with previous studies, firmly establish a role for the miR-132/-212 gene locus as a key regulator of cognitive capacity. Further, although miR-132 and miR-212 share a seed sequence, these data indicate that these miRNAs do not exhibit strongly overlapping mRNA targeting profiles, thus indicating that these two genes may function in a complex, nonredundant manner to shape the transcriptional profile of the CNS. The dysregulation of miR-132/-212 expression could contribute to signaling mechanisms that are involved in an array of cognitive disorders. © 2016 Hansen et al.; Published by Cold Spring Harbor Laboratory Press.
Targeted deletion of miR-132/-212 impairs memory and alters the hippocampal transcriptome

PubMed Central

Hansen, Katelin F.; Sakamoto, Kensuke; Aten, Sydney; Snider, Kaitlin H.; Loeser, Jacob; Hesse, Andrea M.; Page, Chloe E.; Pelz, Carl; Arthur, J. Simon C.; Impey, Soren

2016-01-01

miR-132 and miR-212 are structurally related microRNAs that have been found to exert powerful modulatory effects within the central nervous system (CNS). Notably, these microRNAs are tandomly processed from the same noncoding transcript, and share a common seed sequence: thus it has been difficult to assess the distinct contribution of each microRNA to gene expression within the CNS. Here, we employed a combination of conditional knockout and transgenic mouse models to examine the contribution of the miR-132/-212 gene locus to learning and memory, and then to assess the distinct effects that each microRNA has on hippocampal gene expression. Using a conditional deletion approach, we show that miR-132/-212 double-knockout mice exhibit significant cognitive deficits in spatial memory, recognition memory, and in tests of novel object recognition. Next, we utilized transgenic miR-132 and miR-212 overexpression mouse lines and the miR-132/-212 double-knockout line to explore the distinct effects of these two miRNAs on the transcriptional profile of the hippocampus. Illumina sequencing revealed that miR-132/-212 deletion increased the expression of 1138 genes; Venn analysis showed that 96 of these genes were also downregulated in mice overexpressing miR-132. Of the 58 genes that were decreased in animals overexpressing miR-212, only four of them were also increased in the knockout line. Functional gene ontology analysis of downregulated genes revealed significant enrichment of genes related to synaptic transmission, neuronal proliferation, and morphogenesis, processes known for their roles in learning, and memory formation. These data, coupled with previous studies, firmly establish a role for the miR-132/-212 gene locus as a key regulator of cognitive capacity. Further, although miR-132 and miR-212 share a seed sequence, these data indicate that these miRNAs do not exhibit strongly overlapping mRNA targeting profiles, thus indicating that these two genes may function in a complex, nonredundant manner to shape the transcriptional profile of the CNS. The dysregulation of miR-132/-212 expression could contribute to signaling mechanisms that are involved in an array of cognitive disorders. PMID:26773099
Nucleic acids encoding plant glutamine phenylpyruvate transaminase (GPT) and uses thereof

DOEpatents

Unkefer, Pat J.; Anderson, Penelope S.; Knight, Thomas J.

2016-03-29

Glutamine phenylpyruvate transaminase (GPT) proteins, nucleic acid molecules encoding GPT proteins, and uses thereof are disclosed. Provided herein are various GPT proteins and GPT gene coding sequences isolated from a number of plant species. As disclosed herein, GPT proteins share remarkable structural similarity within plant species, and are active in catalyzing the synthesis of 2-hydroxy-5-oxoproline (2-oxoglutaramate), a powerful signal metabolite which regulates the function of a large number of genes involved in the photosynthesis apparatus, carbon fixation and nitrogen metabolism.
VDJServer: A Cloud-Based Analysis Portal and Data Commons for Immune Repertoire Sequences and Rearrangements.

PubMed

Christley, Scott; Scarborough, Walter; Salinas, Eddie; Rounds, William H; Toby, Inimary T; Fonner, John M; Levin, Mikhail K; Kim, Min; Mock, Stephen A; Jordan, Christopher; Ostmeyer, Jared; Buntzman, Adam; Rubelt, Florian; Davila, Marco L; Monson, Nancy L; Scheuermann, Richard H; Cowell, Lindsay G

2018-01-01

Recent technological advances in immune repertoire sequencing have created tremendous potential for advancing our understanding of adaptive immune response dynamics in various states of health and disease. Immune repertoire sequencing produces large, highly complex data sets, however, which require specialized methods and software tools for their effective analysis and interpretation. VDJServer is a cloud-based analysis portal for immune repertoire sequence data that provide access to a suite of tools for a complete analysis workflow, including modules for preprocessing and quality control of sequence reads, V(D)J gene segment assignment, repertoire characterization, and repertoire comparison. VDJServer also provides sophisticated visualizations for exploratory analysis. It is accessible through a standard web browser via a graphical user interface designed for use by immunologists, clinicians, and bioinformatics researchers. VDJServer provides a data commons for public sharing of repertoire sequencing data, as well as private sharing of data between users. We describe the main functionality and architecture of VDJServer and demonstrate its capabilities with use cases from cancer immunology and autoimmunity. VDJServer provides a complete analysis suite for human and mouse T-cell and B-cell receptor repertoire sequencing data. The combination of its user-friendly interface and high-performance computing allows large immune repertoire sequencing projects to be analyzed with no programming or software installation required. VDJServer is a web-accessible cloud platform that provides access through a graphical user interface to a data management infrastructure, a collection of analysis tools covering all steps in an analysis, and an infrastructure for sharing data along with workflows, results, and computational provenance. VDJServer is a free, publicly available, and open-source licensed resource.
The floral transcriptomes of four bamboo species (Bambusoideae; Poaceae): support for common ancestry among woody bamboos.

PubMed

Wysocki, William P; Ruiz-Sanchez, Eduardo; Yin, Yanbin; Duvall, Melvin R

2016-05-20

Next-generation sequencing now allows for total RNA extracts to be sequenced in non-model organisms such as bamboos, an economically and ecologically important group of grasses. Bamboos are divided into three lineages, two of which are woody perennials with bisexual flowers, which undergo gregarious monocarpy. The third lineage, which are herbaceous perennials, possesses unisexual flowers that undergo annual flowering events. Transcriptomes were assembled using both reference-based and de novo methods. These two methods were tested by characterizing transcriptome content using sequence alignment to previously characterized reference proteomes and by identifying Pfam domains. Because of the striking differences in floral morphology and phenology between the herbaceous and woody bamboo lineages, MADS-box genes, transcription factors that control floral development and timing, were characterized and analyzed in this study. Transcripts were identified using phylogenetic methods and categorized as A, B, C, D or E-class genes, which control floral development, or SOC or SVP-like genes, which control the timing of flowering events. Putative nuclear orthologues were also identified in bamboos to use as phylogenetic markers. Instances of gene copies exhibiting topological patterns that correspond to shared phenotypes were observed in several gene families including floral development and timing genes. Alignments and phylogenetic trees were generated for 3,878 genes and for all genes in a concatenated analysis. Both the concatenated analysis and those of 2,412 separate gene trees supported monophyly among the woody bamboos, which is incongruent with previous phylogenetic studies using plastid markers.
New enzymes from environmental cassette arrays: Functional attributes of a phosphotransferase and an RNA-methyltransferase

PubMed Central

Nield, Blair S.; Willows, Robert D.; Torda, Andrew E.; Gillings, Michael R.; Holmes, Andrew J.; Nevalainen, K.M. Helena; Stokes, H.W.; Mabbutt, Bridget C.

2004-01-01

By targeting gene cassettes by polymerase chain reaction (PCR) directly from environmentally derived DNA, we are able to amplify entire open reading frames (ORFs) independently of prior sequence knowledge. Approximately 10% of the mobile genes recovered by these means can be attributed to known protein families. Here we describe the characterization of two ORFs which show moderate homology to known proteins: (1) an aminoglycoside phosphotransferase displaying 25% sequence identity with APH(7″) from Streptomyces hygroscopicus, and (2) an RNA methyltransferase sharing 25%–28% identity with a group of recently defined bacterial RNA methyltransferases distinct from the SpoU enzyme family. Our novel genes were expressed as recombinant products and assayed for appropriate enzyme activity. The aminoglycoside phosphotransferase displayed ATPase activity, consistent with the presence of characteristic Mg2+-binding residues. Unlike related APH(4) or APH(7″) enzymes, however, this activity was not enhanced by hygromycin B or kanamycin, suggesting the normal substrate to be a different aminoglycoside. The RNA methyltransferase contains sequence motifs of the RNA methyltransferase superfamily, and our recombinant version showed methyltransferase activity with RNA. Our data confirm that gene cassettes present in the environment encode folded enzymes with novel sequence variation and demonstrable catalytic activity. Our PCR approach (cassette PCR) may be used to identify a diverse range of ORFs from any environmental sample, as well as to directly access the gene pool found in mobile gene cassettes commonly associated with integrons. This gene pool can be accessed from both cultured and uncultured microbial samples as a source of new enzymes and proteins. PMID:15152095
New enzymes from environmental cassette arrays: functional attributes of a phosphotransferase and an RNA-methyltransferase.

PubMed

Nield, Blair S; Willows, Robert D; Torda, Andrew E; Gillings, Michael R; Holmes, Andrew J; Nevalainen, K M Helena; Stokes, H W; Mabbutt, Bridget C

2004-06-01

By targeting gene cassettes by polymerase chain reaction (PCR) directly from environmentally derived DNA, we are able to amplify entire open reading frames (ORFs) independently of prior sequence knowledge. Approximately 10% of the mobile genes recovered by these means can be attributed to known protein families. Here we describe the characterization of two ORFs which show moderate homology to known proteins: (1) an aminoglycoside phosphotransferase displaying 25% sequence identity with APH(7") from Streptomyces hygroscopicus, and (2) an RNA methyltransferase sharing 25%-28% identity with a group of recently defined bacterial RNA methyltransferases distinct from the SpoU enzyme family. Our novel genes were expressed as recombinant products and assayed for appropriate enzyme activity. The aminoglycoside phosphotransferase displayed ATPase activity, consistent with the presence of characteristic Mg(2+)-binding residues. Unlike related APH(4) or APH(7") enzymes, however, this activity was not enhanced by hygromycin B or kanamycin, suggesting the normal substrate to be a different aminoglycoside. The RNA methyltransferase contains sequence motifs of the RNA methyltransferase superfamily, and our recombinant version showed methyltransferase activity with RNA. Our data confirm that gene cassettes present in the environment encode folded enzymes with novel sequence variation and demonstrable catalytic activity. Our PCR approach (cassette PCR) may be used to identify a diverse range of ORFs from any environmental sample, as well as to directly access the gene pool found in mobile gene cassettes commonly associated with integrons. This gene pool can be accessed from both cultured and uncultured microbial samples as a source of new enzymes and proteins.
Comparative analysis of chloroplast genomes of the genus Citrus and its close relatives.

PubMed

Liu, Xiaogang; Wu, Hongkun; Luo, Yan; Xi, Wanpeng; Zhou, Zhiqin

2017-01-01

The genus Citrus and its close relatives are economically and nutritionally important fruit trees. However, the huge controversy over the phylogeny of key wild species, as well as the genetic relationship between the cultivated species and their putative wild progenitors, remains unresolved. Comparative analyses of chloroplast (cp) genomes have been useful in resolving various phylogenetic issues. Thus far, the cp genomes of only two Citrus species have been sequenced. In this study, we sequenced six complete cp genomes, four belonging to the genus Citrus, and two belonging to the genera Fortunella and Poncirus, respectively. These newly sequenced genomes together with the two publicly available were used for comparative analyses of the genus Citrus and its close relatives. All eight cp genomes share similar basic structure, gene order and gene content. Phylogenetic analyses supported the monophyly of the three genera in the order Sapindales within the major clade Malvidae.
Evolution of the bovine lysozyme gene family: changes in gene expression and reversion of function.

PubMed

Irwin, D M

1995-09-01

Recruitment of lysozyme to a digestive function in ruminant artiodactyls is associated with amplification of the gene. At least four of the approximately ten genes are expressed in the stomach, and several are expressed in nonstomach tissues. Characterization of additional lysozymelike sequences in the bovine genome has identified most, if not all, of the members of this gene family. There are at least six stomachlike lysozyme genes, two of which are pseudogenes. The stomach lysozyme pseudogenes show a pattern of concerted evolution similar to that of the functional stomach genes. At least four nonstomach lysozyme genes exist. The nonstomach lysozyme genes are not monophyletic. A gene encoding a tracheal lysozyme was isolated, and the stomach lysozyme of advanced ruminants was found to be more closely related to the tracheal lysozyme than to the stomach lysozyme of the camel or other nonstomach lysozyme genes of ruminants. The tracheal lysozyme shares with stomach lysozymes of advanced ruminants the deletion of amino acid 103, and several other adaptive sequence characteristics of stomach lysozymes. I suggest here that tracheal lysozyme has reverted from a functional stomach lysozyme. Tracheal lysozyme then represents a second instance of a change in lysozyme gene expression and function within ruminants.
Statistical inference of the generation probability of T-cell receptors from sequence repertoires.

PubMed

Murugan, Anand; Mora, Thierry; Walczak, Aleksandra M; Callan, Curtis G

2012-10-02

Stochastic rearrangement of germline V-, D-, and J-genes to create variable coding sequence for certain cell surface receptors is at the origin of immune system diversity. This process, known as "VDJ recombination", is implemented via a series of stochastic molecular events involving gene choices and random nucleotide insertions between, and deletions from, genes. We use large sequence repertoires of the variable CDR3 region of human CD4+ T-cell receptor beta chains to infer the statistical properties of these basic biochemical events. Because any given CDR3 sequence can be produced in multiple ways, the probability distribution of hidden recombination events cannot be inferred directly from the observed sequences; we therefore develop a maximum likelihood inference method to achieve this end. To separate the properties of the molecular rearrangement mechanism from the effects of selection, we focus on nonproductive CDR3 sequences in T-cell DNA. We infer the joint distribution of the various generative events that occur when a new T-cell receptor gene is created. We find a rich picture of correlation (and absence thereof), providing insight into the molecular mechanisms involved. The generative event statistics are consistent between individuals, suggesting a universal biochemical process. Our probabilistic model predicts the generation probability of any specific CDR3 sequence by the primitive recombination process, allowing us to quantify the potential diversity of the T-cell repertoire and to understand why some sequences are shared between individuals. We argue that the use of formal statistical inference methods, of the kind presented in this paper, will be essential for quantitative understanding of the generation and evolution of diversity in the adaptive immune system.
Genome sequence and analysis of a stress-tolerant, wild-derived strain of Saccharomyces cerevisiae used in biofuels research

DOE Office of Scientific and Technical Information (OSTI.GOV)

McIlwain, Sean J.; Peris, Davis; Sardi, Maria

The genome sequences of more than 100 strains of the yeast Saccharomyces cerevisiae have been published. Unfortunately, most of these genome assemblies contain dozens to hundreds of gaps at repetitive sequences, including transposable elements, tRNAs, and subtelomeric regions, which is where novel genes generally reside. Relatively few strains have been chosen for genome sequencing based on their biofuel production potential, leaving an additional knowledge gap. Here, we describe the nearly complete genome sequence of GLBRCY22-3 (Y22-3), a strain of S. cerevisiae derived from the stress-tolerant wild strain NRRL YB-210 and subsequently engineered for xylose metabolism. After benchmarking several genome assemblymore » approaches, we developed a pipeline to integrate Pacific Biosciences (PacBio) and Illumina sequencing data and achieved one of the highest quality genome assemblies for any S. cerevisiae strain. Specifically, the contig N50 is 693 kbp, and the sequences of most chromosomes, the mitochondrial genome, and the 2-micron plasmid are complete. Our annotation predicts 92 genes that are not present in the reference genome of the laboratory strain S288c, over 70% of which were expressed. We predicted functions for 43 of these genes, 28 of which were previously uncharacterized and unnamed. Remarkably, many of these genes are predicted to be involved in stress tolerance and carbon metabolism and are shared with a Brazilian bioethanol production strain, even though the strains differ dramatically at most genetic loci. Lastly, the Y22-3 genome sequence provides an exceptionally high-quality resource for basic and applied research in bioenergy and genetics.« less
Genome sequence and analysis of a stress-tolerant, wild-derived strain of Saccharomyces cerevisiae used in biofuels research

DOE PAGES

McIlwain, Sean J.; Peris, Davis; Sardi, Maria; ...

2016-04-20

The genome sequences of more than 100 strains of the yeast Saccharomyces cerevisiae have been published. Unfortunately, most of these genome assemblies contain dozens to hundreds of gaps at repetitive sequences, including transposable elements, tRNAs, and subtelomeric regions, which is where novel genes generally reside. Relatively few strains have been chosen for genome sequencing based on their biofuel production potential, leaving an additional knowledge gap. Here, we describe the nearly complete genome sequence of GLBRCY22-3 (Y22-3), a strain of S. cerevisiae derived from the stress-tolerant wild strain NRRL YB-210 and subsequently engineered for xylose metabolism. After benchmarking several genome assemblymore » approaches, we developed a pipeline to integrate Pacific Biosciences (PacBio) and Illumina sequencing data and achieved one of the highest quality genome assemblies for any S. cerevisiae strain. Specifically, the contig N50 is 693 kbp, and the sequences of most chromosomes, the mitochondrial genome, and the 2-micron plasmid are complete. Our annotation predicts 92 genes that are not present in the reference genome of the laboratory strain S288c, over 70% of which were expressed. We predicted functions for 43 of these genes, 28 of which were previously uncharacterized and unnamed. Remarkably, many of these genes are predicted to be involved in stress tolerance and carbon metabolism and are shared with a Brazilian bioethanol production strain, even though the strains differ dramatically at most genetic loci. Lastly, the Y22-3 genome sequence provides an exceptionally high-quality resource for basic and applied research in bioenergy and genetics.« less

The Genome of Gryllus bimaculatus Nudivirus Indicates an Ancient Diversification of Baculovirus-Related Nonoccluded Nudiviruses of Insects▿

PubMed Central

Wang, Yongjie; Kleespies, Regina G.; Huger, Alois M.; Jehle, Johannes A.

2007-01-01

The Gryllus bimaculatus nudivirus (GbNV) infects nymphs and adults of the cricket Gryllus bimaculatus (Orthoptera: Gryllidae). GbNV and other nudiviruses such as Heliothis zea nudivirus 1 (HzNV-1) and Oryctes rhinoceros nudivirus (OrNV) were previously called “nonoccluded baculoviruses” as they share some similar structural, genomic, and replication aspects with members of the family Baculoviridae. Their relationships to each other and to baculoviruses are elucidated by the sequence of the complete genome of GbNV, which is 96,944 bp, has an AT content of 72%, and potentially contains 98 predicted protein-coding open reading frames (ORFs). Forty-one ORFs of GbNV share sequence similarities with ORFs found in OrNV, HzNV-1, baculoviruses, and bacteria. Most notably, 15 GbNV ORFs are homologous to the baculovirus core genes, which are associated with transcription (lef-8, lef-9, lef-4, vlf-1, and lef-5), replication (dnapol), structural proteins (p74, pif-1, pif-2, pif-3, vp91, and odv-e56), and proteins of unknown function (38K, ac81, and 19kda). Homologues to these baculovirus core genes have been predicted in HzNV-1 as well. Six GbNV ORFs are homologous to nonconserved baculovirus genes dnaligase, helicase 2, rr1, rr2, iap-3, and desmoplakin. However, the remaining 57 ORFs revealed no homology or poor similarities to the current gene databases. No homologous repeat (hr) sequences but fourteen short direct repeat (dr) regions were detected in the GbNV genome. Gene content and sequence similarity suggest that the nudiviruses GbNV, HzNV-1, and OrNV form a monophyletic group of nonoccluded double-stranded DNA viruses, which separated from the baculovirus lineage before this radiated into dipteran-, hymenopteran-, and lepidopteran-specific clades of occluded nucleopolyhedroviruses and granuloviruses. The accumulated information on the GbNV genome suggests that nudiviruses form a highly diverse and phylogenetically ancient sister group of the baculoviruses, which have evolved in a variety of highly divergent host orders. PMID:17360757
Second generation DNA sequencing of the mitogenome of the Chinstrap penguin and comparative genomics of Antarctic penguins.

PubMed

Subramanian, Sankar; Lingala, Syamala Gowri; Swaminathan, Siva; Huynen, Leon; Lambert, David

2014-08-01

The complete mitochondrial genome of the Chinstrap penguin (Pygoscelis antarcticus) was sequenced and compared with other penguin mitogenomes. The genome is 15,972 bp in length with the number and order of protein coding genes and RNAs being very similar to that of other known penguin mitogenomes. Comparative nucleotide analysis showed the Chinstrap mitogenome shares 94% homology with the mitogenome of its sister species, Pygoscelis adelie (Adélie penguin). Divergence at nonsynonymous nucleotide positions was found to be up to 23 times less than that observed in synonymous positions of protein coding genes, suggesting high selection constraints. The complete mitogenome data will be useful for genetic and evolutionary studies of penguins.
Multiple copies of a bile acid-inducible gene in Eubacterium sp. strain VPI 12708.

PubMed Central

Gopal-Srivastava, R; Mallonee, D H; White, W B; Hylemon, P B

1990-01-01

Eubacterium sp. strain VPI 12708 is an anaerobic intestinal bacterium which possesses inducible bile acid 7-dehydroxylation activity. Several new polypeptides are produced in this strain following induction with cholic acid. Genes coding for two copies of a bile acid-inducible 27,000-dalton polypeptide (baiA1 and baiA2) have been previously cloned and sequenced. We now report on a gene coding for a third copy of this 27,000-dalton polypeptide (baiA3). The baiA3 gene has been cloned in lambda DASH on an 11.2-kilobase DNA fragment from a partial Sau3A digest of the Eubacterium DNA. DNA sequence analysis of the baiA3 gene revealed 100% homology with the baiA1 gene within the coding region of the 27,000-dalton polypeptides. The baiA2 gene shares 81% sequence identity with the other two genes at the nucleotide level. The flanking nucleotide sequences associated with the baiA1 and baiA3 genes are identical for 930 bases in the 5' direction from the initiation codon and for at least 325 bases in the 3' direction from the stop codon, including the putative promoter regions for the genes. An additional open reading frame (occupying from 621 to 648 bases, depending on the correct start codon) was found in the identical 5' regions associated with the baiA1 and baiA3 clones. The 5' sequence 930 bases upstream from the baiA1 and baiA3 genes was totally divergent. The baiA2 gene, which is part of a large bile acid-inducible operon, showed no homology with the other two genes either in the 5' or 3' direction from the polypeptide coding region, except for a 15-base-pair presumed ribosome-binding site in the 5' region. These studies strongly suggest that a gene duplication (baiA1 and baiA3) has occurred and is stably maintained in this bacterium. Images PMID:2376563
Metabolic Pathway Assignment of Plant Genes based on Phylogenetic Profiling–A Feasibility Study

PubMed Central

Weißenborn, Sandra; Walther, Dirk

2017-01-01

Despite many developed experimental and computational approaches, functional gene annotation remains challenging. With the rapidly growing number of sequenced genomes, the concept of phylogenetic profiling, which predicts functional links between genes that share a common co-occurrence pattern across different genomes, has gained renewed attention as it promises to annotate gene functions based on presence/absence calls alone. We applied phylogenetic profiling to the problem of metabolic pathway assignments of plant genes with a particular focus on secondary metabolism pathways. We determined phylogenetic profiles for 40,960 metabolic pathway enzyme genes with assigned EC numbers from 24 plant species based on sequence and pathway annotation data from KEGG and Ensembl Plants. For gene sequence family assignments, needed to determine the presence or absence of particular gene functions in the given plant species, we included data of all 39 species available at the Ensembl Plants database and established gene families based on pairwise sequence identities and annotation information. Aside from performing profiling comparisons, we used machine learning approaches to predict pathway associations from phylogenetic profiles alone. Selected metabolic pathways were indeed found to be composed of gene families of greater than expected phylogenetic profile similarity. This was particularly evident for primary metabolism pathways, whereas for secondary pathways, both the available annotation in different species as well as the abstraction of functional association via distinct pathways proved limiting. While phylogenetic profile similarity was generally not found to correlate with gene co-expression, direct physical interactions of proteins were reflected by a significantly increased profile similarity suggesting an application of phylogenetic profiling methods as a filtering step in the identification of protein-protein interactions. This feasibility study highlights the potential and challenges associated with phylogenetic profiling methods for the detection of functional relationships between genes as well as the need to enlarge the set of plant genes with proven secondary metabolism involvement as well as the limitations of distinct pathways as abstractions of relationships between genes. PMID:29163570
A comparison of microbial communities in deep-sea polymetallic nodules and the surrounding sediments in the Pacific Ocean

NASA Astrophysics Data System (ADS)

Wu, Yue-Hong; Liao, Li; Wang, Chun-Sheng; Ma, Wei-Lin; Meng, Fan-Xu; Wu, Min; Xu, Xue-Wei

2013-09-01

Deep-sea polymetallic nodules, rich in metals such as Fe, Mn, and Ni, are potential resources for future exploitation. Early culturing and microscopy studies suggest that polymetallic nodules are at least partially biogenic. To understand the microbial communities in this environment, we compared microbial community composition and diversity inside nodules and in the surrounding sediments. Three sampling sites in the Pacific Ocean containing polymetallic nodules were used for culture-independent investigations of microbial diversity. A total of 1013 near full-length bacterial 16S rRNA gene sequences and 640 archaeal 16S rRNA gene sequences with ~650 bp from nodules and the surrounding sediments were analyzed. Bacteria showed higher diversity than archaea. Interestingly, sediments contained more diverse bacterial communities than nodules, while the opposite was detected for archaea. Bacterial communities tend to be mostly unique to sediments or nodules, with only 13.3% of sequences shared. The most abundant bacterial groups detected only in nodules were Pseudoalteromonas and Alteromonas, which were predicted to play a role in building matrix outside cells to induce or control mineralization. However, archaeal communities were mostly shared between sediments and nodules, including the most abundant OTU containing 290 sequences from marine group I Thaumarchaeota. PcoA analysis indicated that microhabitat (i.e., nodule or sediment) seemed to be a major factor influencing microbial community composition, rather than sampling locations or distances between locations.
Genome sequencing identifies Listeria fleischmannii subsp. coloradonensis subsp. nov., isolated from a ranch.

PubMed

den Bakker, Henk C; Manuel, Clyde S; Fortes, Esther D; Wiedmann, Martin; Nightingale, Kendra K

2013-09-01

Twenty Listeria-like isolates were obtained from environmental samples collected on a cattle ranch in northern Colorado; all of these isolates were found to share an identical partial sigB sequence, suggesting close relatedness. The isolates were similar to members of the genus Listeria in that they were Gram-stain-positive, short rods, oxidase-negative and catalase-positive; the isolates were similar to Listeria fleischmannii because they were non-motile at 25 °C. 16S rRNA gene sequencing for representative isolates and whole genome sequencing for one isolate was performed. The genome of the type strain of Listeria fleischmannii (strain LU2006-1(T)) was also sequenced. The draft genomes were very similar in size and the average MUMmer nucleotide identity across 91% of the genomes was 95.16%. Genome sequence data were used to design primers for a six-gene multi-locus sequence analysis (MLSA) scheme. Phylogenies based on (i) the near-complete 16S rRNA gene, (ii) 31 core genes and (iii) six housekeeping genes illustrated the close relationship of these Listeria-like isolates to Listeria fleischmannii LU2006-1(T). Sufficient genetic divergence of the Listeria-like isolates from the type strain of Listeria fleischmannii and differing phenotypic characteristics warrant these isolates to be classified as members of a distinct infraspecific taxon, for which the name Listeria fleischmannii subsp. coloradonensis subsp. nov. is proposed. The type strain is TTU M1-001(T) ( =BAA-2414(T) =DSM 25391(T)). The isolates of Listeria fleischmannii subsp. coloradonensis subsp. nov. differ from the nominate subspecies by the inability to utilize melezitose, turanose and sucrose, and the ability to utilize inositol. The results also demonstrate the utility of whole genome sequencing to facilitate identification of novel taxa within a well-described genus. The genomes of both subspecies of Listeria fleischmannii contained putative enhancin genes; the Listeria fleischmannii subsp. coloradonensis subsp. nov. genome also encoded a putative mosquitocidal toxin. The presence of these genes suggests possible adaptation to an insect host, and further studies are needed to probe niche adaptation of Listeria fleischmannii.
An unexpectedly large and loosely packed mitochondrial genome in the charophycean green alga Chlorokybus atmophyticus

PubMed Central

Turmel, Monique; Otis, Christian; Lemieux, Claude

2007-01-01

Background The Streptophyta comprises all land plants and six groups of charophycean green algae. The scaly biflagellate Mesostigma viride (Mesostigmatales) and the sarcinoid Chlorokybus atmophyticus (Chlorokybales) represent the earliest diverging lineages of this phylum. In trees based on chloroplast genome data, these two charophycean green algae are nested in the same clade. To validate this relationship and gain insight into the ancestral state of the mitochondrial genome in the Charophyceae, we sequenced the mitochondrial DNA (mtDNA) of Chlorokybus and compared this genome sequence with those of three other charophycean green algae and the bryophytes Marchantia polymorpha and Physcomitrella patens. Results The Chlorokybus genome differs radically from its 42,424-bp Mesostigma counterpart in size, gene order, intron content and density of repeated elements. At 201,763-bp, it is the largest mtDNA yet reported for a green alga. The 70 conserved genes represent 41.4% of the genome sequence and include nad10 and trnL(gag), two genes reported for the first time in a streptophyte mtDNA. At the gene order level, the Chlorokybus genome shares with its Chara, Chaetosphaeridium and bryophyte homologues eight to ten gene clusters including about 20 genes. Notably, some of these clusters exhibit gene linkages not previously found outside the Streptophyta, suggesting that they originated early during streptophyte evolution. In addition to six group I and 14 group II introns, short repeated sequences accounting for 7.5% of the genome were identified. Mitochondrial trees were unable to resolve the correct position of Mesostigma, due to analytical problems arising from accelerated sequence evolution in this lineage. Conclusion The Chlorokybus and Mesostigma mtDNAs exemplify the marked fluidity of the mitochondrial genome in charophycean green algae. The notion that the mitochondrial genome was constrained to remain compact during charophycean evolution is no longer tenable. Our data raise the possibility that the emergence of land plants was not associated with a substantial gain of intergenic sequences by the mitochondrial genome. PMID:17537252
Gene discovery in an invasive tephritid model pest species, the Mediterranean fruit fly, Ceratitis capitata

PubMed Central

Gomulski, Ludvik M; Dimopoulos, George; Xi, Zhiyong; Soares, Marcelo B; Bonaldo, Maria F; Malacrida, Anna R; Gasperi, Giuliano

2008-01-01

Background The medfly, Ceratitis capitata, is a highly invasive agricultural pest that has become a model insect for the development of biological control programs. Despite research into the behavior and classical and population genetics of this organism, the quantity of sequence data available is limited. We have utilized an expressed sequence tag (EST) approach to obtain detailed information on transcriptome signatures that relate to a variety of physiological systems in the medfly; this information emphasizes on reproduction, sex determination, and chemosensory perception, since the study was based on normalized cDNA libraries from embryos and adult heads. Results A total of 21,253 high-quality ESTs were obtained from the embryo and head libraries. Clustering analyses performed separately for each library resulted in 5201 embryo and 6684 head transcripts. Considering an estimated 19% overlap in the transcriptomes of the two libraries, they represent about 9614 unique transcripts involved in a wide range of biological processes and molecular functions. Of particular interest are the sequences that share homology with Drosophila genes involved in sex determination, olfaction, and reproductive behavior. The medfly transformer2 (tra2) homolog was identified among the embryonic sequences, and its genomic organization and expression were characterized. Conclusion The sequences obtained in this study represent the first major dataset of expressed genes in a tephritid species of agricultural importance. This resource provides essential information to support the investigation of numerous questions regarding the biology of the medfly and other related species and also constitutes an invaluable tool for the annotation of complete genome sequences. Our study has revealed intriguing findings regarding the transcript regulation of tra2 and other sex determination genes, as well as insights into the comparative genomics of genes implicated in chemosensory reception and reproduction. PMID:18500975
Sequences of two related multiple antibiotic resistance virulence plasmids sharing a unique IS26-related molecular signature isolated from different Escherichia coli pathotypes from different hosts.

PubMed

Venturini, Carola; Hassan, Karl A; Roy Chowdhury, Piklu; Paulsen, Ian T; Walker, Mark J; Djordjevic, Steven P

2013-01-01

Enterohemorrhagic Escherichia coli (EHEC) and atypical enteropathogenic E. coli (aEPEC) are important zoonotic pathogens that increasingly are becoming resistant to multiple antibiotics. Here we describe two plasmids, pO26-CRL125 (125 kb) from a human O26:H- EHEC, and pO111-CRL115 (115kb) from a bovine O111 aEPEC, that impart resistance to ampicillin, kanamycin, neomycin, streptomycin, sulfathiazole, trimethoprim and tetracycline and both contain atypical class 1 integrons with an identical IS26-mediated deletion in their 3´-conserved segment. Complete sequence analysis showed that pO26-CRL125 and pO111-CRL115 are essentially identical except for a 9.7 kb fragment, present in the backbone of pO26-CRL125 but absent in pO111-CRL115, and several indels. The 9.7 kb fragment encodes IncI-associated genes involved in plasmid stability during conjugation, a putative transposase gene and three imperfect repeats. Contiguous sequence identical to regions within these pO26-CRL125 imperfect repeats was identified in pO111-CRL115 precisely where the 9.7 kb fragment is missing, suggesting it may be mobile. Sequences shared between the plasmids include a complete IncZ replicon, a unique toxin/antitoxin system, IncI stability and maintenance genes, a novel putative serine protease autotransporter, and an IncI1 transfer system including a unique shufflon. Both plasmids carry a derivate Tn21 transposon with an atypical class 1 integron comprising a dfrA5 gene cassette encoding resistance to trimethoprim, and 24 bp of the 3´-conserved segment followed by Tn6026, which encodes resistance to ampicillin, kanymycin, neomycin, streptomycin and sulfathiazole. The Tn21-derivative transposon is linked to a truncated Tn1721, encoding resistance to tetracycline, via a region containing the IncP-1α oriV. Absence of the 5 bp direct repeats flanking Tn3-family transposons, indicates that homologous recombination events played a key role in the formation of this complex antibiotic resistance gene locus. Comparative sequence analysis of these closely related plasmids reveals aspects of plasmid evolution in pathogenic E. coli from different hosts.
Identification and Sequencing of a Novel Rodent Gammaherpesvirus That Establishes Acute and Latent Infection in Laboratory Mice ▿

PubMed Central

Loh, Joy; Zhao, Guoyan; Nelson, Christopher A.; Coder, Penny; Droit, Lindsay; Handley, Scott A.; Johnson, L. Steven; Vachharajani, Punit; Guzman, Hilda; Tesh, Robert B.; Wang, David; Fremont, Daved H.; Virgin, Herbert W.

2011-01-01

Gammaherpesviruses encode numerous immunomodulatory molecules that contribute to their ability to evade the host immune response and establish persistent, lifelong infections. As the human gammaherpesviruses are strictly species specific, small animal models of gammaherpesvirus infection, such as murine gammaherpesvirus 68 (γHV68) infection, are important for studying the roles of gammaherpesvirus immune evasion genes in in vivo infection and pathogenesis. We report here the genome sequence and characterization of a novel rodent gammaherpesvirus, designated rodent herpesvirus Peru (RHVP), that shares conserved genes and genome organization with γHV68 and the primate gammaherpesviruses but is phylogenetically distinct from γHV68. RHVP establishes acute and latent infection in laboratory mice. Additionally, RHVP contains multiple open reading frames (ORFs) not present in γHV68 that have sequence similarity to primate gammaherpesvirus immunomodulatory genes or cellular genes. These include ORFs with similarity to major histocompatibility complex class I (MHC-I), C-type lectins, and the mouse mammary tumor virus and herpesvirus saimiri superantigens. As these ORFs may function as immunomodulatory or virulence factors, RHVP presents new opportunities for the study of mechanisms of immune evasion by gammaherpesviruses. PMID:21209105
Seasonal and regional diversity of maple sap microbiota revealed using community PCR fingerprinting and 16S rRNA gene clone libraries.

PubMed

Filteau, Marie; Lagacé, Luc; LaPointe, Gisèle; Roy, Denis

2010-04-01

An arbitrary primed community PCR fingerprinting technique based on capillary electrophoresis was developed to study maple sap microbial community characteristics among 19 production sites in Québec over the tapping season. Presumptive fragment identification was made with corresponding fingerprint profiles of bacterial isolate cultures. Maple sap microbial communities were subsequently compared using a representative subset of 13 16S rRNA gene clone libraries followed by gene sequence analysis. Results from both methods indicated that all maple sap production sites and flow periods shared common microbiota members, but distinctive features also existed. Changes over the season in relative abundance of predominant populations showed evidence of a common pattern. Pseudomonas (64%) and Rahnella (8%) were the most abundantly and frequently represented genera of the 2239 sequences analyzed. Janthinobacterium, Leuconostoc, Lactococcus, Weissella, Epilithonimonas and Sphingomonas were revealed as occasional contaminants in maple sap. Maple sap microbiota showed a low level of deep diversity along with a high variation of similar 16S rRNA gene sequences within the Pseudomonas genus. Predominance of Pseudomonas is suggested as a typical feature of maple sap microbiota across geographical regions, production sites, and sap flow periods.
Sequence analyses reveal that a TPR-DP module, surrounded by recombinable flanking introns, could be at the origin of eukaryotic Hop and Hip TPR-DP domains and prokaryotic GerD proteins.

PubMed

Hernández Torres, Jorge; Papandreou, Nikolaos; Chomilier, Jacques

2009-05-01

The co-chaperone Hop [heat shock protein (HSP) organising protein] is known to bind both Hsp70 and Hsp90. Hop comprises three repeats of a tetratricopeptide repeat (TPR) domain, each consisting of three TPR motifs. The first and last TPR domains are followed by a domain containing several dipeptide (DP) repeats called the DP domain. These analyses suggest that the hop genes result from successive recombination events of an ancestral TPR-DP module. From a hydrophobic cluster analysis of homologous Hop protein sequences derived from gene families, we can postulate that shifts in the open reading frames are at the origin of the present sequences. Moreover, these shifts can be related to the presence or absence of biological function. We propose to extend the family of Hop co-chaperons into the kingdom of bacteria, as several structurally related genes have been identified by hydrophobic cluster analysis. We also provide evidence of common structural characteristics between hop and hip genes, suggesting a shared precursor of ancestral TPR-DP domains.
Cloning and expression of a small heat and salt tolerant protein (Hsp22) from Chaetomium globosum.

PubMed

Aggarwal, Rashmi; Gupta, Sangeeta; Sharma, Sapna; Banerjee, Sagar; Singh, Priyanka

2012-11-01

The present study reports molecular characterization of small heat shock protein gene in Indian isolates of Chaetomium globosum, C. perlucidum, C. reflexum, C. cochlioides and C. cupreum. Six isolates of C. globosum and other species showed a band of 630bp using specific primers. Amplified cDNA product of C. globosum (Cg 1) cloned and sequenced showed 603bp open reading frame encoding 200 amino-acids. The protein sequence had a molecular mass of 22 kDa and was therefore, named Hsp22. BlastX analysis revealed that the gene codes for a protein homologous to previously characterized Hsp22.4 gene from C. globosum (AAR36902.1, XP 001229241.1) and shared 95% identity in amino acid sequence. It also showed varying degree of similarities with small Hsp protein from Neurospora spp. (60%), Myceliophthora sp. (59%), Glomerella sp. (50%), Hypocrea sp. (52%), and Fusarium spp. (51%). This gene was further cloned into pET28a (+) and transformed E. coli BL21 cells were induced by IPTG, and the expressed protein of 30 kDa was analyzed by SDS-PAGE. The IPTG induced transformants displayed significantly greater resistance to NaCl and Na2CO3 stresses.
Biodiversity and phylogenetic analysis of culturable bacteria indigenous to Khewra salt mine of pakistan and their industrial importance

PubMed Central

Akhtar, Nasrin; Ghauri, Muhammad A.; Iqbal, Aamira; Anwar, Munir A.; Akhtar, Kalsoom

2008-01-01

Culturable bacterial biodiversity and industrial importance of the isolates indigenous to Khewra salt mine, Pakistan was assessed. PCR Amplification of 16S rDNA of isolates was carried out by using universal primers FD1 and rP1and products were sequenced commercially. These gene sequences were compared with other gene sequences in the GenBank databases to find the closely related sequences. The alignment of these sequences with sequences available from GenBank database was carried out to construct a phylogenetic tree for these bacteria. These genes were deposited to GenBank and accession numbers were obtained. Most of the isolates belonged to different species of genus Bacillus, sharing 92-99% 16S rDNA identity with the respective type strain. Other isolates had close similarities with Escherichia coli, Staphylococcus arlettae and Staphylococcus gallinarum with 97%, 98% and 99% 16S rDNA similarity respectively. The abilities of isolates to produce industrial enzymes (amylase, carboxymethylcellulase, xylanase, cellulase and protease) were checked. All isolates were tested against starch, carboxymethylcellulose (CMC), xylane, cellulose, and casein degradation in plate assays. BPT-5, 11,18,19 and 25 indicated the production of copious amounts of carbohydrates and protein degrading enzymes. Based on this study it can be concluded that Khewra salt mine is populated with diverse bacterial groups, which are potential source of industrial enzymes for commercial applications. PMID:24031194
First Staphylococcal Cassette Chromosome mec Containing a mecB-Carrying Gene Complex Independent of Transposon Tn6045 in a Macrococcus caseolyticus Isolate from a Canine Infection

PubMed Central

Gómez-Sanz, Elena; Schwendener, Sybille; Thomann, Andreas; Gobeli Brawand, Stefanie

2015-01-01

A methicillin-resistant mecB-positive Macrococcus caseolyticus (strain KM45013) was isolated from the nares of a dog with rhinitis. It contained a novel 39-kb transposon-defective complete mecB-carrying staphylococcal cassette chromosome mec element (SCCmecKM45013). SCCmecKM45013 contained 49 coding sequences (CDSs), was integrated at the 3′ end of the chromosomal orfX gene, and was delimited at both ends by imperfect direct repeats functioning as integration site sequences (ISSs). SCCmecKM45013 presented two discontinuous regions of homology (SCCmec coverage of 35%) to the chromosomal and transposon Tn6045-associated SCCmec-like element of M. caseolyticus JCSC7096: (i) the mec gene complex (98.8% identity) and (ii) the ccr-carrying segment (91.8% identity). The mec gene complex, located at the right junction of the cassette, also carried the β-lactamase gene blaZm (mecRm-mecIm-mecB-blaZm). SCCmecKM45013 contained two cassette chromosome recombinase genes, ccrAm2 and ccrBm2, which shared 94.3% and 96.6% DNA identity with those of the SCCmec-like element of JCSC7096 but shared less than 52% DNA identity with the staphylococcal ccrAB and ccrC genes. Three distinct extrachromosomal circularized elements (the entire SCCmecKM45013, ΨSCCmecKM45013 lacking the ccr genes, and SCCKM45013 lacking mecB) flanked by one ISS copy, as well as the chromosomal regions remaining after excision, were detected. An unconventional circularized structure carrying the mecB gene complex was associated with two extensive direct repeat regions, which enclosed two open reading frames (ORFs) (ORF46 and ORF51) flanking the chromosomal mecB-carrying gene complex. This study revealed M. caseolyticus as a potential disease-associated bacterium in dogs and also unveiled an SCCmec element carrying mecB not associated with Tn6045 in the genus Macrococcus. PMID:25987634
KpnBI is the prototype of a new family (IE) of bacterial type I restriction-modification system

PubMed Central

Chin, V.; Valinluck, V.; Magaki, S.; Ryu, J.

2004-01-01

KpnBI is a restriction-modification (R-M) system recognized in the GM236 strain of Klebsiella pneumoniae. Here, the KpnBI modification genes were cloned into a plasmid using a modification expression screening method. The modification genes that consist of both hsdM (2631 bp) and hsdS (1344 bp) genes were identified on an 8.2 kb EcoRI chromosomal fragment. These two genes overlap by one base and share the same promoter located upstream of the hsdM gene. Using recently developed plasmid R-M tests and a computer program RM Search, the DNA recognition sequence for the KpnBI enzymes was identified as a new 8 nt sequence containing one degenerate base with a 6 nt spacer, CAAANNNNNNRTCA. From Dam methylation and HindIII sensitivity tests, the methylation loci were predicted to be the italicized third adenine in the 5′ specific region and the adenine opposite the italicized thymine in the 3′ specific region. Combined with previous sequence data for hsdR, we concluded that the KpnBI system is a typical type I R-M system. The deduced amino acid sequences of the three subunits of the KpnBI system show only limited homologies (25 to 33% identity) at best, to the four previously categorized type I families (IA, IB, IC, and ID). Furthermore, their identity scores to other uncharacterized putative genome type I sequences were 53% at maximum. Therefore, we propose that KpnBI is the prototype of a new ‘type IE’ family. PMID:15475385
Tracing outbreaks of Streptococcus equi infection (strangles) in horses using sequence variation in the seM gene and pulsed-field gel electrophoresis.

PubMed

Lindahl, Susanne; Söderlund, Robert; Frosth, Sara; Pringle, John; Båverud, Viveca; Aspán, Anna

2011-11-21

Strangles is a serious respiratory disease in horses caused by Streptococcus equi subspecies equi (S. equi). Transmission of the disease occurs by direct contact with an infected horse or contaminated equipment. Genetically, S. equi strains are highly homogenous and differentiation of strains has proven difficult. However, the S. equi M-protein SeM contains a variable N-terminal region and has been proposed as a target gene to distinguish between different strains of S. equi and determine the source of an outbreak. In this study, strains of S. equi (n=60) from 32 strangles outbreaks in Sweden during 1998-2003 and 2008-2009 were genetically characterized by sequencing the SeM protein gene (seM), and by pulsed-field gel electrophoresis (PFGE). Swedish strains belonged to 10 different seM types, of which five have not previously been described. Most were identical or highly similar to allele types from strangles outbreaks in the UK. Outbreaks in 2008/2009 sharing the same seM type were associated by geographic location and/or type of usage of the horses (racing stables). Sequencing of the seM gene generally agreed with pulsed-field gel electrophoresis profiles. Our data suggest that seM sequencing as a epidemiological tool is supported by the agreement between seM and PFGE and that sequencing of the SeM protein gene is more sensitive than PFGE in discriminating strains of S. equi. Copyright © 2011 Elsevier B.V. All rights reserved.
Characterizing the genetic diversity of the monkey malaria parasite Plasmodium cynomolgi

PubMed Central

Sutton, Patrick L.; Luo, Zunping; Divis, Paul C. S.; Friedrich, Volney K.; Conway, David J.; Singh, Balbir; Barnwell, John W.; Carlton, Jane M.; Sullivan, Steven A.

2016-01-01

Plasmodium cynomolgi is a malaria parasite that typically infects Asian macaque monkeys, and humans on rare occasions. P. cynomolgi serves as a model system for the human malaria parasite Plasmodium vivax, with which it shares such important biological characteristics as formation of a dormant liver stage and a preference to invade reticulocytes. While genomes of three P. cynomolgi strains have been sequenced, genetic diversity of P. cynomolgi has not been widely investigated. To address this we developed the first panel of P. cynomolgi microsatellite markers to genotype eleven P. cynomolgi laboratory strains and 18 field isolates from Sarawak, Malaysian Borneo. We found diverse genotypes among most of the laboratory strains, though two nominally different strains were found to be genetically identical, We also investigated sequence polymorphism in two erythrocyte invasion gene families, the reticulocyte binding protein and Duffy binding protein genes, in these strains. We also observed copy number variation in rbp genes. PMID:26980604
Expression of three mammalian cDNAs that interfere with RAS function in Saccharomyces cerevisiae.

PubMed Central

Colicelli, J; Nicolette, C; Birchmeier, C; Rodgers, L; Riggs, M; Wigler, M

1991-01-01

Saccharomyces cerevisiae strains expressing the activated RAS2Val19 gene or lacking both cAMP phosphodiesterase genes, PDE1 and PDE2, have impaired growth control and display an acute sensitivity to heat shock. We have isolated two classes of mammalian cDNAs from yeast expression libraries that suppress the heat shock-sensitive phenotype of RAS2Val19 strain. Members of the first class of cDNAs also suppress the heat shock-sensitive phenotype of pde1- pde2- strains and encode cAMP phosphodiesterases. Members of the second class fail to suppress the phenotype of pde1- pde2- strains and therefore are candidate cDNAs encoding proteins that interact with RAS proteins. We report the nucleotide sequence of three members of this class. Two of these cDNAs share considerable sequence similarity, but none are clearly similar to previously isolated genes. Images PMID:1849280
[Personal genome research and neurological diseases: overview].

PubMed

Toda, Tatsushi

2013-03-01

Neurological diseases include those caused by a single defective gene,e.g., Huntington's disease, other polyglutamine diseases, and muscular dystrophies, and those that are mostly sporadic but rarely show Mendelian inheritance in some families, e.g., Alzheimer's disease, Parkinson's disease, amyotrophic lateral sclerosis, and epilepsy. The latter diseases are considered polygenic disorders. Both sporadic and Mendelian cases of these diseases are believed to share some common pathological mechanisms. Since the detection of causal genes for the Mendelian cases, studies have been initiated on disease pathology. SNPs and rare gene variants play important roles in common neurological diseases. From a technological perspective, next-generation sequencers have become widely available and have contributed to the advancement of research based on individual genome sequences (personal genome). This paper presents an overview, as well as a historical context, of the contribution of personal genome research to neurological disease studies.

Contribution of human growth hormone-releasing hormone receptor (GHRHR) gene sequence variation to isolated severe growth hormone deficiency (ISGHD) and normal adult height.

PubMed

Camats, Núria; Fernández-Cancio, Mónica; Carrascosa, Antonio; Andaluz, Pilar; Albisu, M Ángeles; Clemente, María; Gussinyé, Miquel; Yeste, Diego; Audí, Laura

2012-10-01

Molecular causes of isolated severe growth hormone deficiency (ISGHD) in several genes have been established. The aim of this study was to analyse the contribution of growth hormone-releasing hormone receptor (GHRHR) gene sequence variation to GH deficiency in a series of prepubertal ISGHD patients and to normal adult height. A systematic GHRHR gene sequence analysis was performed in 69 ISGHD patients and 60 normal adult height controls (NAHC). Four GHRHR single-nucleotide polymorphisms (SNPs) were genotyped in 248 additional NAHC. An analysis was performed on individual SNPs and combined genotype associations with diagnosis in ISGHD patients and with height-SDS in NAHC. Twenty-one SNPs were found. P3, P13, P15 and P20 had not been previously described. Patients and controls shared 12 SNPs (P1, P2, P4-P11, P16 and P21). Significantly different frequencies of the heterozygous genotype and alternate allele were detected in P9 (exon 4, rs4988498) and P12 (intron 6, rs35609199); P9 heterozygous genotype frequencies were similar in patients and the shortest control group (heights between -2 and -1 SDS) and significantly different in controls (heights between -1 and +2 SDS). GHRHR P9 together with 4 GH1 SNP genotypes contributed to 6·2% of height-SDS variation in the entire 308 NAHC. This study established the GHRHR gene sequence variation map in ISGHD patients and NAHC. No evidence of GHRHR mutation contribution to ISGHD was found in this population, although P9 and P12 SNP frequencies were significantly different between ISGHD and NAHC. Thus, the gene sequence may contribute to normal adult height, as demonstrated in NAHC. © 2012 Blackwell Publishing Ltd.
A Nomadic Subtelomeric Disease Resistance Gene Cluster in Common Bean1[W

PubMed Central

David, Perrine; Chen, Nicolas W.G.; Pedrosa-Harand, Andrea; Thareau, Vincent; Sévignac, Mireille; Cannon, Steven B.; Debouck, Daniel; Langin, Thierry; Geffroy, Valérie

2009-01-01

The B4 resistance (R) gene cluster is one of the largest clusters known in common bean (Phaseolus vulgaris [Pv]). It is located in a peculiar genomic environment in the subtelomeric region of the short arm of chromosome 4, adjacent to two heterochromatic blocks (knobs). We sequenced 650 kb spanning this locus and annotated 97 genes, 26 of which correspond to Coiled-Coil-Nucleotide-Binding-Site-Leucine-Rich-Repeat (CNL). Conserved microsynteny was observed between the Pv B4 locus and corresponding regions of Medicago truncatula and Lotus japonicus in chromosomes Mt6 and Lj2, respectively. The notable exception was the CNL sequences, which were completely absent in these regions. The origin of the Pv B4-CNL sequences was investigated through phylogenetic analysis, which reveals that, in the Pv genome, paralogous CNL genes are shared among nonhomologous chromosomes (4 and 11). Together, our results suggest that Pv B4-CNL was derived from CNL sequences from another cluster, the Co-2 cluster, through an ectopic recombination event. Integration of the soybean (Glycine max) genome data enables us to date more precisely this event and also to infer that a single CNL moved from the Co-2 to the B4 cluster. Moreover, we identified a new 528-bp satellite repeat, referred to as khipu, specific to the Phaseolus genus, present both between B4-CNL sequences and in the two knobs identified at the B4 R gene cluster. The khipu repeat is present on most chromosomal termini, indicating the existence of frequent ectopic recombination events in Pv subtelomeric regions. Our results highlight the importance of ectopic recombination in R gene evolution. PMID:19776165
Identification of a Transcriptionally Forward α Gene and Two υ Genes within the Pigeon (Columba livia) IgH Gene Locus.

PubMed

Huang, Tian; Wang, Xifeng; Si, Run; Chi, Hao; Han, Binyue; Han, Haitang; Cao, Gengsheng; Zhao, Yaofeng

2018-06-01

Compared with mammals, the bird Ig genetic system relies on gene conversion to create an Ab repertoire, with inversion of the IgA-encoding gene and very few cases of Ig subclass diversification. Although gene conversion has been studied intensively, class-switch recombination, a mechanism by which the IgH C region is exchanged, has rarely been investigated in birds. In this study, based on the published genome of pigeon ( Columba livia ) and high-throughput transcriptome sequencing of immune-related tissues, we identified a transcriptionally forward α gene and found that the pigeon IgH gene locus is arranged as μ-α-υ1-υ2. In this article, we show that both DNA deletion and inversion may result from IgA and IgY class switching, and similar junction patterns were observed for both types of class-switch recombination. We also identified two subclasses of υ genes in pigeon, which share low sequence identity. Phylogenetic analysis suggests that divergence of the two pigeon υ genes occurred during the early stage of bird evolution. The data obtained in this study provide new insight into class-switch recombination and Ig gene evolution in birds. Copyright © 2018 by The American Association of Immunologists, Inc.
Acquisition and dissemination of cephalosporin-resistant E. coli in migratory birds sampled at an Alaska landfill as inferred through genomic analysis.

PubMed

Ahlstrom, Christina A; Bonnedahl, Jonas; Woksepp, Hanna; Hernandez, Jorge; Olsen, Björn; Ramey, Andrew M

2018-05-09

Antimicrobial resistance (AMR) in bacterial pathogens threatens global health, though the spread of AMR bacteria and AMR genes between humans, animals, and the environment is still largely unknown. Here, we investigated the role of wild birds in the epidemiology of AMR Escherichia coli. Using next-generation sequencing, we characterized cephalosporin-resistant E. coli cultured from sympatric gulls and bald eagles inhabiting a landfill habitat in Alaska to identify genetic determinants conferring AMR, explore potential transmission pathways of AMR bacteria and genes at this site, and investigate how their genetic diversity compares to isolates reported in other taxa. We found genetically diverse E. coli isolates with sequence types previously associated with human infections and resistance genes of clinical importance, including bla CTX-M and bla CMY . Identical resistance profiles were observed in genetically unrelated E. coli isolates from both gulls and bald eagles. Conversely, isolates with indistinguishable core-genomes were found to have different resistance profiles. Our findings support complex epidemiological interactions including bacterial strain sharing between gulls and bald eagles and horizontal gene transfer among E. coli harboured by birds. Results suggest that landfills may serve as a source for AMR acquisition and/or maintenance, including bacterial sequence types and AMR genes relevant to human health.
Genomic and proteomic characterization of a thermophilic Geobacillus bacteriophage GBSV1.

PubMed

Liu, Bin; Zhou, Fengfeng; Wu, Suijie; Xu, Ying; Zhang, Xiaobo

2009-03-01

Phages are present wherever life is found, and play roles in many biogeochemical and ecological processes. The thermophilic bacteriophages, however, have not been well studied. In this study, phage GBSV1 was obtained from a thermophilic bacterium Geobacillus sp. 6k51 isolated from a hot spring. GBSV1 contains a double-stranded linear DNA of 34,683bp, which encodes 54 putative open reading frames (ORFs). Thirty three of these 54 ORFs exhibit sequence similarities to genes from 7 species of Geobacillus or Bacillus bacteria, as well as of bacteriophages infecting these bacteria. Twenty-two ORFs have been functionally annotated based on both their sequence similarities to known genes and predicted Pfam protein domains. Five structural proteins of the purified GBSV1 virion have been identified by proteomic analyses. Surprisingly, 7 of the GBSV1 ORFs share sequence similarities with genes from bacteria relevant to human diseases. This is the first report that genes of human disease-inducing bacteria are found in a thermophilic phage. It is suggested that thermophilic phages may be the potential evolutionary link between thermophiles and human pathogens. The characterization of GBSV1 may possibly lead to new insights into virus-host interactions and to a better understanding of gene transfers and evolution of life on earth in general.
The role of prophage for genome diversification within a clonal lineage of Lactobacillus johnsonii: characterization of the defective prophage LJ771.

PubMed

Denou, Emmanuel; Pridmore, Raymond David; Ventura, Marco; Pittet, Anne-Cécile; Zwahlen, Marie-Camille; Berger, Bernard; Barretto, Caroline; Panoff, Jean-Michel; Brüssow, Harald

2008-09-01

Two independent isolates of the gut commensal Lactobacillus johnsonii were sequenced. These isolates belonged to the same clonal lineage and differed mainly by a 40.8-kb prophage, LJ771, belonging to the Sfi11 phage lineage. LJ771 shares close DNA sequence identity with Lactobacillus gasseri prophages. LJ771 coexists as an integrated prophage and excised circular phage DNA, but phage DNA packaged into extracellular phage particles was not detected. Between the phage lysin gene and attR a likely mazE ("antitoxin")/pemK ("toxin") gene cassette was detected in LJ771 but not in the L. gasseri prophages. Expressed pemK could be cloned in Escherichia coli only together with the mazE gene. LJ771 was shown to be highly stable and could be cured only by coexpression of mazE from a plasmid. The prophage was integrated into the methionine sulfoxide reductase gene (msrA) and complemented the 5' end of this gene, creating a protein with a slightly altered N-terminal sequence. The two L. johnsonii strains had identical in vitro growth and in vivo gut persistence phenotypes. Also, in an isogenic background, the presence of the prophage resulted in no growth disadvantage.
The genome of Mesobuthus martensii reveals a unique adaptation model of arthropods

PubMed Central

Cao, Zhijian; Yu, Yao; Wu, Yingliang; Hao, Pei; Di, Zhiyong; He, Yawen; Chen, Zongyun; Yang, Weishan; Shen, Zhiyong; He, Xiaohua; Sheng, Jia; Xu, Xiaobo; Pan, Bohu; Feng, Jing; Yang, Xiaojuan; Hong, Wei; Zhao, Wenjuan; Li, Zhongjie; Huang, Kai; Li, Tian; Kong, Yimeng; Liu, Hui; Jiang, Dahe; Zhang, Binyan; Hu, Jun; Hu, Youtian; Wang, Bin; Dai, Jianliang; Yuan, Bifeng; Feng, Yuqi; Huang, Wei; Xing, Xiaojing; Zhao, Guoping; Li, Xuan; Li, Yixue; Li, Wenxin

2013-01-01

Representing a basal branch of arachnids, scorpions are known as ‘living fossils’ that maintain an ancient anatomy and are adapted to have survived extreme climate changes. Here we report the genome sequence of Mesobuthus martensii, containing 32,016 protein-coding genes, the most among sequenced arthropods. Although M. martensii appears to evolve conservatively, it has a greater gene family turnover than the insects that have undergone diverse morphological and physiological changes, suggesting the decoupling of the molecular and morphological evolution in scorpions. Underlying the long-term adaptation of scorpions is the expansion of the gene families enriched in basic metabolic pathways, signalling pathways, neurotoxins and cytochrome P450, and the different dynamics of expansion between the shared and the scorpion lineage-specific gene families. Genomic and transcriptomic analyses further illustrate the important genetic features associated with prey, nocturnal behaviour, feeding and detoxification. The M. martensii genome reveals a unique adaptation model of arthropods, offering new insights into the genetic bases of the living fossils. PMID:24129506
FOXI2: a possible gene contributing to ectodermal dysplasia.

PubMed

Kurban, Mazen; Zeineddine, Savo Bou; Hamie, Lamiaa; Safi, Remi; Abbas, Ossama; Kibbi, Abdul Ghani; Bitar, Fadi; Nemer, Georges

2017-12-01

Cardio-facio-cutaneous syndrome (CFC), Noonan syndrome (NS), and Costello syndrome are a group of diseases that belong to the RASopathies. The syndromes share clinical features making diagnosis a challenge. To investigate the phenotype and genotype of a 10-year-old Iraqi girl with overlapping features of CFC, NS, and Costello syndromes, with additional features of ectodermal dysplasia. DNA was examined by exome sequencing and protein expression by immunohistochemistry. Exome sequencing identified a mutation in the SOS1 gene and a de novo deletion in the FOXI2 gene which was neither present in the international databases, nor in 400 chromosomes from the same population. Based on immunohistochemical staining, FOXI2 was identified in the basal cell layer of the skin and overlapped with the expression of P63, a major player in ectodermal dysplasia. We therefore suggest screening for FOXI2 mutation in the setting of ectodermal features that are not associated with genes known to contribute to ectodermal dysplasia.
Comparative genome analysis of Prevotella intermedia strain isolated from infected root canal reveals features related to pathogenicity and adaptation.

PubMed

Ruan, Yunfeng; Shen, Lu; Zou, Yan; Qi, Zhengnan; Yin, Jun; Jiang, Jie; Guo, Liang; He, Lin; Chen, Zijiang; Tang, Zisheng; Qin, Shengying

2015-02-25

Many species of the genus Prevotella are pathogens that cause oral diseases. Prevotella intermedia is known to cause various oral disorders e.g. periodontal disease, periapical periodontitis and noma as well as colonize in the respiratory tract and be associated with cystic fibrosis and chronic bronchitis. It is of clinical significance to identify the main drive of its various adaptation and pathogenicity. In order to explore the intra-species genetic differences among strains of Prevotella intermedia of different niches, we isolated a strain Prevotella intermedia ZT from the infected root canal of a Chinese patient with periapical periodontitis and gained a draft genome sequence. We annotated the genome and compared it with the genomes of other taxa in the genus Prevotella. The raw data set, consisting of approximately 65X-coverage reads, was trimmed and assembled into contigs from which 2165 ORFs were predicted. The comparison of the Prevotella intermedia ZT genome sequence with the published genome sequence of Prevotella intermedia 17 and Prevotella intermedia ATCC25611 revealed that ~14% of the genes were strain-specific. The Preveotella intermedia strains share a set of conserved genes contributing to its adaptation and pathogenic and possess strain-specific genes especially those involved in adhesion and secreting bacteriocin. The Prevotella intermedia ZT shares similar gene content with other taxa of genus Prevotella. The genomes of the genus Prevotella is highly dynamic with relative conserved parts: on average, about half of the genes in one Prevotella genome were not included in another genome of the different Prevotella species. The degree of conservation varied with different pathways: the ability of amino acid biosynthesis varied greatly with species but the pathway of cell wall components biosynthesis were nearly constant. Phylogenetic tree shows that the taxa from different niches are scarcely distributed among clades. Prevotella intermedia ZT belongs to a genus marked with highly dynamic genomes. The specific genes of Prevotella intermedia indicate that adhesion, competing with surrounding microbes and horizontal gene transfer are the main drive of the evolution of Prevotella intermedia.
Comparative analysis of the 5{prime} genomic and promoter regions between the mouse (Hdh) and human Huntington disease (HD) gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kalchman, M.; Lin, B.; Nasir, J.

1994-09-01

The mouse homologue of the Huntington disease gene (Hdh) has recently been cloned and mapped to a region of synteny with the human, on mouse chromosome 5. The two genes share a high degree of both coding (90% amino acid) and nucleotide (86.2%) identity. We have subsequently performed a detailed comparison of the genomic organization of the 5{prime} region of the two genes encompassing the promoter region and first five exons of both the human and mouse genes. The comparative sequence analysis of the promoter region between HD and Hdh reveals two highly conserved regions. One region (-56 to -118)more » (+1 is the ATG start codon), shared 84% nucleotide identity and another region (-130 to -206) had 81% nucleotide identity. Nine putative Sp1 sites appear in the human promoter region contrasted with only 3 in a similar region in the mouse. Furthermore, 17 and 20 base pair direct repeats present in the HD 5{prime} region are absent in the similar Hdh region. Although both the mouse and human intron/exon boundaries conform to the GT/AG rule, the intron sizes between HD and Hdh are markedly different. The first four introns in Hdh are 15, 7, 5 and 0.5 kb compared to sizes of 10, 15, 7 and 0.5 kb, respectively. Comparison between the mouse and human intronic sequences immediately adjacent to the first five exons (excluding exon 1) reveals only about 46 to 50% identity within the first 60 bp of intronic sequence. Furthermore, we have identified novel polymorphic di-, tri- and tetra-nucleotide repeats in Hdh introns of various mouse strains that are not present in the human. For example, polymorphic CT repeats are present in introns 2 and 4 of Hdh and a novel mouse 56 AAG trinucleotide repeat (interrupted by an AAGG) is also located within intron 2. This information concerning the promoter and genomic organization of both HD and Hdh is critical for designing appropriate gene targetting vectors for studying the normal function of the HD and Hdh genes in model systems.« less
Whole genome sequencing as a tool to investigate a cluster of seven cases of listeriosis in Austria and Germany, 2011-2013.

PubMed

Schmid, D; Allerberger, F; Huhulescu, S; Pietzka, A; Amar, C; Kleta, S; Prager, R; Preußel, K; Aichinger, E; Mellmann, A

2014-05-01

A cluster of seven human cases of listeriosis occurred in Austria and in Germany between April 2011 and July 2013. The Listeria monocytogenes serovar (SV) 1/2b isolates shared pulsed-field gel electrophoresis (PFGE) and fluorescent amplified fragment length polymorphism (fAFLP) patterns indistinguishable from those from five food producers. The seven human isolates, a control strain with a different PFGE/fAFLP profile and ten food isolates were subjected to whole genome sequencing (WGS) in a blinded fashion. A gene-by-gene comparison (multilocus sequence typing (MLST)+) was performed, and the resulting whole genome allelic profiles were compared using SeqSphere(+) software version 1.0. On analysis of 2298 genes, the four human outbreak isolates from 2012 to 2013 had different alleles at ≤6 genes, i.e. differed by ≤6 genes from each other; the dendrogram placed these isolates in between five Austrian unaged soft cheese isolates from producer A (≤19-gene difference from the human cluster) and two Austrian ready-to-eat meat isolates from producer B (≤8-gene difference from the human cluster). Both food products appeared on grocery bills prospectively collected by these outbreak cases after hospital discharge. Epidemiological results on food consumption and MLST+ clearly separated the three cases in 2011 from the four 2012-2013 outbreak cases (≥48 different genes). We showed that WGS is capable of discriminating L. monocytogenes SV1/2b clones not distinguishable by PFGE and fAFLP. The listeriosis outbreak described clearly underlines the potential of sequence-based typing methods to offer enhanced resolution and comparability of typing systems for public health applications. © 2014 The Authors Clinical Microbiology and Infection © 2014 European Society of Clinical Microbiology and Infectious Diseases.
Genome Sequence, Assembly and Characterization of Two Metschnikowia fructicola Strains Used as Biocontrol Agents of Postharvest Diseases

PubMed Central

Piombo, Edoardo; Sela, Noa; Wisniewski, Michael; Hoffmann, Maria; Gullino, Maria L.; Allard, Marc W.; Levin, Elena; Spadaro, Davide; Droby, Samir

2018-01-01

The yeast Metschnikowia fructicola was reported as an efficient biological control agent of postharvest diseases of fruits and vegetables, and it is the bases of the commercial formulated product “Shemer.” Several mechanisms of action by which M. fructicola inhibits postharvest pathogens were suggested including iron-binding compounds, induction of defense signaling genes, production of fungal cell wall degrading enzymes and relatively high amounts of superoxide anions. We assembled the whole genome sequence of two strains of M. fructicola using PacBio and Illumina shotgun sequencing technologies. Using the PacBio, a high-quality draft genome consisting of 93 contigs, with an estimated genome size of approximately 26 Mb, was obtained. Comparative analysis of M. fructicola proteins with the other three available closely related genomes revealed a shared core of homologous proteins coded by 5,776 genes. Comparing the genomes of the two M. fructicola strains using a SNP calling approach resulted in the identification of 564,302 homologous SNPs with 2,004 predicted high impact mutations. The size of the genome is exceptionally high when compared with those of available closely related organisms, and the high rate of homology among M. fructicola genes points toward a recent whole-genome duplication event as the cause of this large genome. Based on the assembled genome, sequences were annotated with a gene description and gene ontology (GO term) and clustered in functional groups. Analysis of CAZymes family genes revealed 1,145 putative genes, and transcriptomic analysis of CAZyme expression levels in M. fructicola during its interaction with either grapefruit peel tissue or Penicillium digitatum revealed a high level of CAZyme gene expression when the yeast was placed in wounded fruit tissue. PMID:29666611
Candidate new rotavirus species in Schreiber's bats, Serbia.

PubMed

Bányai, Krisztián; Kemenesi, Gábor; Budinski, Ivana; Földes, Fanni; Zana, Brigitta; Marton, Szilvia; Varga-Kugler, Renáta; Oldal, Miklós; Kurucz, Kornélia; Jakab, Ferenc

2017-03-01

The genus Rotavirus comprises eight species designated A to H and one tentative species, Rotavirus I. In a virus metagenomic analysis of Schreiber's bats sampled in Serbia in 2014 we obtained sequences likely representing novel rotavirus species. Whole genome sequencing and phylogenetic analysis classified the representative strain into a tentative tenth rotavirus species, we provisionally called Rotavirus J. The novel virus shared a maximum of 50% amino acid sequence identity within the VP6 gene to currently known members of the genus. This study extends our understanding of the genetic diversity of rotaviruses in bats. Copyright © 2016 Elsevier B.V. All rights reserved.
Cloning of Giardia lamblia heat shock protein HSP70 homologs: implications regarding origin of eukaryotic cells and of endoplasmic reticulum.

PubMed Central

Gupta, R S; Aitken, K; Falah, M; Singh, B

1994-01-01

The genes for two different 70-kDa heat shock protein (HSP70) homologs have been cloned and sequenced from the protozoan Giardia lamblia. On the basis of their sequence features, one of these genes corresponds to the cytoplasmic form of HSP70. The second gene, on the basis of its characteristic N-terminal hydrophobic signal sequence and C-terminal endoplasmic reticulum (ER) retention sequence (Lys-Asp-Glu-Leu), is the equivalent of ER-resident GRP78 or the Bip family of proteins. Phylogenetic trees based on HSP70 sequences show that G. lamblia homologs show the deepest divergence among eukaryotic species. The identification of a GRP78 or Bip homolog in G. lamblia strongly suggests the existence of ER in this ancient eukaryote. Detailed phylogenetic analyses of HSP70 sequences by boot-strap neighbor-joining and maximum-parsimony methods show that the cytoplasmic and ER homologs form distinct subfamilies that evolved from a common eukaryotic ancestor by gene duplication that occurred very early in the evolution of eukaryotic cells. It is postulated that because of the essential "molecular chaperone" function of these proteins in translocation of other proteins across membranes, duplication of their genes accompanied the evolution of ER or nucleus in the eukaryotic cell ancestor. The presence in all eukaryotic cytoplasmic HSP70 homologs (including the cognate, heat-induced, and ER forms) of a number of autapomorphic sequence signatures that are not present in any prokaryotic or organellar homologs provides strong evidence regarding the monophyletic nature of eukaryotic lineage. Further, all eukaryotic HSP70 homologs share in common with the Gram-negative group of eubacteria a number of sequence features that are not present in any archaebacterium or Gram-positive bacterium, indicating their evolution from this group of organisms. Some implications of these findings regarding the evolution of eukaryotic cells and ER are discussed. Images PMID:8159675
Genome-wide analysis of WRKY transcription factors in Solanum lycopersicum.

PubMed

Huang, Shengxiong; Gao, Yongfeng; Liu, Jikai; Peng, Xiaoli; Niu, Xiangli; Fei, Zhangjun; Cao, Shuqing; Liu, Yongsheng

2012-06-01

The WRKY transcription factors have been implicated in multiple biological processes in plants, especially in regulating defense against biotic and abiotic stresses. However, little information is available about the WRKYs in tomato (Solanum lycopersicum). The recent release of the whole-genome sequence of tomato allowed us to perform a genome-wide investigation for tomato WRKY proteins, and to compare these positively identified proteins with their orthologs in model plants, such as Arabidopsis and rice. In the present study, based on the recently released tomato whole-genome sequences, we identified 81 SlWRKY genes that were classified into three main groups, with the second group further divided into five subgroups. Depending on WRKY domains' sequences derived from tomato, Arabidopsis and rice, construction of a phylogenetic tree demonstrated distinct clustering and unique gene expansion of WRKY genes among the three species. Genome mapping analysis revealed that tomato WRKY genes were enriched on several chromosomes, especially on chromosome 5, and 16 % of the family members were tandemly duplicated genes. The tomato WRKYs from each group were shown to share similar motif compositions. Furthermore, tomato WRKY genes showed distinct temporal and spatial expression patterns in different developmental processes and in response to various biotic and abiotic stresses. The expression of 18 selected tomato WRKY genes in response to drought and salt stresses and Pseudomonas syringae invasion, respectively, was validated by quantitative RT-PCR. Our results will provide a platform for functional identification and molecular breeding study of WRKY genes in tomato and probably other Solanaceae plants.
Comparative genomics and the role of lateral gene transfer in the evolution of bovine adapted Streptococcus agalactiae

PubMed Central

Richards, Vincent P.; Lang, Ping; Pavinski Bitar, Paulina D.; Lefébure, Tristan; Schukken, Ynte H.; Zadoks, Ruth N.; Stanhope, Michael J.

2011-01-01

In addition to causing severe invasive infections in humans, Streptococcus agalactiae, or group B Streptococcus (GBS), is also a major cause of bovine mastitis. Here we provide the first genome sequence for S. agalactiae isolated from a cow diagnosed with clinical mastitis (strain FSL S3-026). Comparison to eight S. agalactiae genomes obtained from human disease isolates revealed 183 genes specific to the bovine strain. Subsequent polymerase chain reaction (PCR) screening for the presence/absence of a subset of these loci in additional bovine and human strains revealed strong differentiation between the two groups (Fisher exact test: p < 0.0001). The majority of the bovine strain-specific genes (~85%) clustered tightly into eight genomic islands, suggesting these genes were acquired through lateral gene transfer (LGT). This bovine GBS also contained an unusually high proportion of insertion sequences (4.3% of the total genome), suggesting frequent genomic rearrangement. Comparison to other mastitis-causing species of bacteria provided strong evidence for two cases of interspecies LGT within the shared bovine environment: bovine S. agalactiae with Streptococcus uberis (nisin U operon) and Streptococcus dysgalactiae subsp. dysgalactiae (lactose operon). We also found evidence for LGT, involving the salivaricin operon, between the bovine S. agalactiae strain and either Streptococcus pyogenes or Streptococcus salivarius. Our findings provide insight intomechanismsfacilitatingenvironmentaladaptationandacquisitionofpotential virulence factors, while highlighting both the key role LGT has played in the recent evolution of the bovine S. agalactiae strain, and the importance of LGT among pathogens within a shared environment. PMID:21536150
Center for Regenerative Biology and Medicine at Mount Desert Island Biological Laboratory

DTIC Science & Technology

2012-06-01

Code Axolotl microRNAs Zebrafish Polypterus 16. SECURITY CLASSIFICATION OF: 17. LIMITATION OF...controlled in both Polypterus and axolotl samples. These comparisons revealed a total of 2779 shared genes that are significantly upregulated during...UPREGULATED DOWNREGULATED Figure 1: Venn diagram of UniProt protein sequence IDs among Axolotl and Polypterus contigs that were up-regulated
Preferential Allele Expression Analysis Identifies Shared Germline and Somatic Driver Genes in Advanced Ovarian Cancer

PubMed Central

Halabi, Najeeb M.; Martinez, Alejandra; Al-Farsi, Halema; Mery, Eliane; Puydenus, Laurence; Pujol, Pascal; Khalak, Hanif G.; McLurcan, Cameron; Ferron, Gwenael; Querleu, Denis; Al-Azwani, Iman; Al-Dous, Eman; Mohamoud, Yasmin A.; Malek, Joel A.; Rafii, Arash

2016-01-01

Identifying genes where a variant allele is preferentially expressed in tumors could lead to a better understanding of cancer biology and optimization of targeted therapy. However, tumor sample heterogeneity complicates standard approaches for detecting preferential allele expression. We therefore developed a novel approach combining genome and transcriptome sequencing data from the same sample that corrects for sample heterogeneity and identifies significant preferentially expressed alleles. We applied this analysis to epithelial ovarian cancer samples consisting of matched primary ovary and peritoneum and lymph node metastasis. We find that preferentially expressed variant alleles include germline and somatic variants, are shared at a relatively high frequency between patients, and are in gene networks known to be involved in cancer processes. Analysis at a patient level identifies patient-specific preferentially expressed alleles in genes that are targets for known drugs. Analysis at a site level identifies patterns of site specific preferential allele expression with similar pathways being impacted in the primary and metastasis sites. We conclude that genes with preferentially expressed variant alleles can act as cancer drivers and that targeting those genes could lead to new therapeutic strategies. PMID:26735499
Strain variation in Mycobacterium marinum fish isolates.

PubMed

Ucko, M; Colorni, A; Kvitt, H; Diamant, A; Zlotkin, A; Knibb, W R

2002-11-01

A molecular characterization of two Mycobacterium marinum genes, 16S rRNA and hsp65, was carried out with a total of 21 isolates from various species of fish from both marine and freshwater environments of Israel, Europe, and the Far East. The nucleotide sequences of both genes revealed that all M. marinum isolates from fish in Israel belonged to two different strains, one infecting marine (cultured and wild) fish and the other infecting freshwater (cultured) fish. A restriction enzyme map based on the nucleotide sequences of both genes confirmed the divergence of the Israeli marine isolates from the freshwater isolates and differentiated the Israeli isolates from the foreign isolates, with the exception of one of three Greek isolates from marine fish which was identical to the Israeli marine isolates. The second isolate from Greece exhibited a single base alteration in the 16S rRNA sequence, whereas the third isolate was most likely a new Mycobacterium species. Isolates from Denmark and Thailand shared high sequence homology to complete identity with reference strain ATCC 927. Combined analysis of the two gene sequences increased the detection of intraspecific variations and was thus of importance in studying the taxonomy and epidemiology of this aquatic pathogen. Whether the Israeli M. marinum strain infecting marine fish is endemic to the Red Sea and found extremely susceptible hosts in the exotic species imported for aquaculture or rather was accidentally introduced with occasional imports of fingerlings from the Mediterranean Sea could not be determined.
Characterization and Nucleotide Sequence of CARB-6, a New Carbenicillin-Hydrolyzing β-Lactamase from Vibrio cholerae

PubMed Central

Choury, Danièle; Aubert, Gérald; Szajnert, Marie-France; Azibi, Kemal; Delpech, Marc; Paul, Gérard

1999-01-01

A clinical strain of Vibrio cholerae non-O1 non-O139 isolated in France produced a new β-lactamase with a pI of 5.35. The purified enzyme, with a molecular mass of 33,000 Da, was characterized. Its kinetic constants show it to be a carbenicillin-hydrolyzing enzyme comparable to the five previously reported CARB β-lactamases and to SAR-1, another carbenicillin-hydrolyzing β-lactamase that has a pI of 4.9 and that is produced by a V. cholerae strain from Tanzania. This β-lactamase is designated CARB-6, and the gene for CARB-6 could not be transferred to Escherichia coli K-12 by conjugation. The nucleotide sequence of the structural gene was determined by direct sequencing of PCR-generated fragments from plasmid DNA with four pairs of primers covering the whole sequence of the reference CARB-3 gene. The gene encodes a 288-amino-acid protein that shares 94% homology with the CARB-1, CARB-2, and CARB-3 enzymes, 93% homology with the Proteus mirabilis N29 enzyme, and 86.5% homology with the CARB-4 enzyme. The sequence of CARB-6 differs from those of CARB-3, CARB-2, CARB-1, N29, and CARB-4 at 15, 16, 17, 19, and 37 amino acid positions, respectively. All these mutations are located in the C-terminal region of the sequence and at the surface of the molecule, according to the crystal structure of the Staphylococcus aureus PC-1 β-lactamase. PMID:9925522

Distribution and molecular diversity of three cucurbit-infecting poleroviruses in China.

PubMed

Shang, Qiao-xia; Xiang, Hai-ying; Han, Cheng-gui; Li, Da-wei; Yu, Jia-lin

2009-11-01

Cucurbit aphid-borne yellows virus (CABYV) and Melon aphid-borne yellows virus (MABYV) have been found to be associated with cucurbit yellowing disease in China. Our report identifies for the first time a third distinct polerovirus, tentatively named Suakwa aphid-borne yellows virus (SABYV), infecting Suakwa vegetable sponge. To better understand the distribution and molecular diversity of these three poleroviruses infecting cucurbits, a total of 214 cucurbitaceous crop samples were collected from 25 provinces in China, and were investigated by RT-PCR and sequencing. Of these, 108 samples tested positive for CABYV, while 40 samples from five provinces were positive for MABYV, and SABYV was detected in only 4 samples which were collected in the southern part of China. Forty-one PCR-amplified fragments containing a portion of the RdRp gene, intergenic NCR and CP gene were cloned and sequenced. Sequence comparisons showed that CABYV isolates shared 78.0-79.2% nucleotide sequence identity with MABYV isolates, and 69.7-70.8% with SABYV. Sequence identity between MABYV and SABYV was 73.3-76.5%. In contrast, the nucleotide identities within each species were 93.2-98.7% (CABYV), 98.1-99.9% (MABYV), and 96.1-98.6% (SABYV). Phylogenetic analyses revealed that the polerovirus isolates fit into three distinct groups, corresponding to the three species. The CABYV group could be further divided into two subgroups: the Asia subgroup and the Mediterranean subgroup, based on CP gene and partial RdRp gene sequences. Recombination analysis suggested that MABYV may be a recombinant virus.
Expression profiles of urbilaterian genes uniquely shared between honey bee and vertebrates

PubMed Central

Matsui, Toshiaki; Yamamoto, Toshiyuki; Wyder, Stefan; Zdobnov, Evgeny M; Kadowaki, Tatsuhiko

2009-01-01

Background Large-scale comparison of metazoan genomes has revealed that a significant fraction of genes of the last common ancestor of Bilateria (Urbilateria) is lost in each animal lineage. This event could be one of the underlying mechanisms involved in generating metazoan diversity. However, the present functions of these ancient genes have not been addressed extensively. To understand the functions and evolutionary mechanisms of such ancient Urbilaterian genes, we carried out comprehensive expression profile analysis of genes shared between vertebrates and honey bees but not with the other sequenced ecdysozoan genomes (honey bee-vertebrate specific, HVS genes) as a model. Results We identified 30 honey bee and 55 mouse HVS genes. Many HVS genes exhibited tissue-selective expression patterns; intriguingly, the expression of 60% of honey bee HVS genes was found to be brain enriched, and 24% of mouse HVS genes were highly expressed in either or both the brain and testis. Moreover, a minimum of 38% of mouse HVS genes demonstrated neuron-enriched expression patterns, and 62% of them exhibited expression in selective brain areas, particularly the forebrain and cerebellum. Furthermore, gene ontology (GO) analysis of HVS genes predicted that 35% of genes are associated with DNA transcription and RNA processing. Conclusion These results suggest that HVS genes include genes that are biased towards expression in the brain and gonads. They also demonstrate that at least some of Urbilaterian genes retained in the specific animal lineage may be selectively maintained to support the species-specific phenotypes. PMID:19138430
Expression profiles of urbilaterian genes uniquely shared between honey bee and vertebrates.

PubMed

Matsui, Toshiaki; Yamamoto, Toshiyuki; Wyder, Stefan; Zdobnov, Evgeny M; Kadowaki, Tatsuhiko

2009-01-12

Large-scale comparison of metazoan genomes has revealed that a significant fraction of genes of the last common ancestor of Bilateria (Urbilateria) is lost in each animal lineage. This event could be one of the underlying mechanisms involved in generating metazoan diversity. However, the present functions of these ancient genes have not been addressed extensively. To understand the functions and evolutionary mechanisms of such ancient Urbilaterian genes, we carried out comprehensive expression profile analysis of genes shared between vertebrates and honey bees but not with the other sequenced ecdysozoan genomes (honey bee-vertebrate specific, HVS genes) as a model. We identified 30 honey bee and 55 mouse HVS genes. Many HVS genes exhibited tissue-selective expression patterns; intriguingly, the expression of 60% of honey bee HVS genes was found to be brain enriched, and 24% of mouse HVS genes were highly expressed in either or both the brain and testis. Moreover, a minimum of 38% of mouse HVS genes demonstrated neuron-enriched expression patterns, and 62% of them exhibited expression in selective brain areas, particularly the forebrain and cerebellum. Furthermore, gene ontology (GO) analysis of HVS genes predicted that 35% of genes are associated with DNA transcription and RNA processing. These results suggest that HVS genes include genes that are biased towards expression in the brain and gonads. They also demonstrate that at least some of Urbilaterian genes retained in the specific animal lineage may be selectively maintained to support the species-specific phenotypes.
RNAi Experiments in D. melanogaster: Solutions to the Overlooked Problem of Off-Targets Shared by Independent dsRNAs

PubMed Central

Seinen, Erwin; Burgerhof, Johannes G. M.; Jansen, Ritsert C.; Sibon, Ody C. M.

2010-01-01

Background RNAi technology is widely used to downregulate specific gene products. Investigating the phenotype induced by downregulation of gene products provides essential information about the function of the specific gene of interest. When RNAi is applied in Drosophila melanogaster or Caenorhabditis elegans, often large dsRNAs are used. One of the drawbacks of RNAi technology is that unwanted gene products with sequence similarity to the gene of interest can be down regulated too. To verify the outcome of an RNAi experiment and to avoid these unwanted off-target effects, an additional non-overlapping dsRNA can be used to down-regulate the same gene. However it has never been tested whether this approach is sufficient to reduce the risk of off-targets. Methodology We created a novel tool to analyse the occurance of off-target effects in Drosophila and we analyzed 99 randomly chosen genes. Principal Findings Here we show that nearly all genes contain non-overlapping internal sequences that do show overlap in a common off-target gene. Conclusion Based on our in silico findings, off-target effects should not be ignored and our presented on-line tool enables the identification of two RNA interference constructs, free of overlapping off-targets, from any gene of interest. PMID:20957038
The mitochondrial genome of Priapulus caudatus Lamarck (Priapulida: Priapulidae).

PubMed

Webster, Bonnie L; Mackenzie-Dodds, Jacqueline A; Telford, Maximilian J; Littlewood, D Timothy J

2007-03-01

We sequenced and annotated the complete mitochondrial (mt) genome of the priapulid Priapulus caudatus in order to provide a source of phylogenetic characters including an assessment of gene order arrangement. The genome was 14,919 bp in its entirety with few, short non-coding regions. A number of protein-coding and tRNA genes overlapped, making the genome relatively compact. The gene order was: cox1, cox2, trnK, trnD, atp8, atp6, cox3, trnG, nad3, trnA, trnR, trnN, rrnS, trnV, rrnL, trnL(yaa), trnL(nag), nad1, -trnS(nga), -cob, -nad6, trnP, -trnT, nad4L, nad4, trnH, nad5, trnF, -trnE, -trnS(nct), trnI, -trnQ, trnM, nad2, trnW, -trnC, -trnY; where '-' indicates genes transcribed on the opposite strand. The gene order, although unique amongst Metazoa, shared the greatest number of gene boundaries and the longest contiguous fragments with the chelicerate Limulus polyphemus. The mt genomes of these taxa differed only by a single inversion of 18 contiguous genes bounded by rrnS and trnS(nct). Other arthropods and nematodes shared fewer gene boundaries but considerably more than the most similar non-ecdysozoan.
Parallel and convergent evolution of the dim-light vision gene RH1 in bats (Order: Chiroptera).

PubMed

Shen, Yong-Yi; Liu, Jie; Irwin, David M; Zhang, Ya-Ping

2010-01-21

Rhodopsin, encoded by the gene Rhodopsin (RH1), is extremely sensitive to light, and is responsible for dim-light vision. Bats are nocturnal mammals that inhabit poor light environments. Megabats (Old-World fruit bats) generally have well-developed eyes, while microbats (insectivorous bats) have developed echolocation and in general their eyes were degraded, however, dramatic differences in the eyes, and their reliance on vision, exist in this group. In this study, we examined the rod opsin gene (RH1), and compared its evolution to that of two cone opsin genes (SWS1 and M/LWS). While phylogenetic reconstruction with the cone opsin genes SWS1 and M/LWS generated a species tree in accord with expectations, the RH1 gene tree united Pteropodidae (Old-World fruit bats) and Yangochiroptera, with very high bootstrap values, suggesting the possibility of convergent evolution. The hypothesis of convergent evolution was further supported when nonsynonymous sites or amino acid sequences were used to construct phylogenies. Reconstructed RH1 sequences at internal nodes of the bat species phylogeny showed that: (1) Old-World fruit bats share an amino acid change (S270G) with the tomb bat; (2) Miniopterus share two amino acid changes (V104I, M183L) with Rhinolophoidea; (3) the amino acid replacement I123V occurred independently on four branches, and the replacements L99M, L266V and I286V occurred each on two branches. The multiple parallel amino acid replacements that occurred in the evolution of bat RH1 suggest the possibility of multiple convergences of their ecological specialization (i.e., various photic environments) during adaptation for the nocturnal lifestyle, and suggest that further attention is needed on the study of the ecology and behavior of bats.
High mature grain phytase activity in the Triticeae has evolved by duplication followed by neofunctionalization of the purple acid phosphatase phytase (PAPhy) gene

PubMed Central

Brinch-Pedersen, Henrik

2013-01-01

The phytase activity in food and feedstuffs is an important nutritional parameter. Members of the Triticeae tribe accumulate purple acid phosphatase phytases (PAPhy) during grain filling. This accumulation elevates mature grain phytase activities (MGPA) up to levels between ~650 FTU/kg for barley and 6000 FTU/kg for rye. This is notably more than other cereals. For instance, rice, maize, and oat have MGPAs below 100 FTU/kg. The cloning and characterization of the PAPhy gene complement from wheat, barley, rye, einkorn, and Aegilops tauschii is reported here. The Triticeae PAPhy genes generally consist of a set of paralogues, PAPhy_a and PAPhy_b, and have been mapped to Triticeae chromosomes 5 and 3, respectively. The promoters share a conserved core but the PAPhy_a promoter have acquired a novel cis-acting regulatory element for expression during grain filling while the PAPhy_b promoter has maintained the archaic function and drives expression during germination. Brachypodium is the only sequenced Poaceae sharing the PAPhy duplication. As for the Triticeae, the duplication is reflected in a high MGPA of ~4200 FTU/kg in Brachypodium. The sequence conservation of the paralogous loci on Brachypodium chromosomes 1 and 2 does not extend beyond the PAPhy gene. The results indicate that a single-gene segmental duplication may have enabled the evolution of high MGPA by creating functional redundancy of the parent PAPhy gene. This implies that similar MGPA levels may be out of reach in breeding programs for some Poaceae, e.g. maize and rice, whereas Triticeae breeders should focus on PAPhy_a. PMID:23918958
Parallel and Convergent Evolution of the Dim-Light Vision Gene RH1 in Bats (Order: Chiroptera)

PubMed Central

Shen, Yong-Yi; Liu, Jie; Irwin, David M.; Zhang, Ya-Ping

2010-01-01

Rhodopsin, encoded by the gene Rhodopsin (RH1), is extremely sensitive to light, and is responsible for dim-light vision. Bats are nocturnal mammals that inhabit poor light environments. Megabats (Old-World fruit bats) generally have well-developed eyes, while microbats (insectivorous bats) have developed echolocation and in general their eyes were degraded, however, dramatic differences in the eyes, and their reliance on vision, exist in this group. In this study, we examined the rod opsin gene (RH1), and compared its evolution to that of two cone opsin genes (SWS1 and M/LWS). While phylogenetic reconstruction with the cone opsin genes SWS1 and M/LWS generated a species tree in accord with expectations, the RH1 gene tree united Pteropodidae (Old-World fruit bats) and Yangochiroptera, with very high bootstrap values, suggesting the possibility of convergent evolution. The hypothesis of convergent evolution was further supported when nonsynonymous sites or amino acid sequences were used to construct phylogenies. Reconstructed RH1 sequences at internal nodes of the bat species phylogeny showed that: (1) Old-World fruit bats share an amino acid change (S270G) with the tomb bat; (2) Miniopterus share two amino acid changes (V104I, M183L) with Rhinolophoidea; (3) the amino acid replacement I123V occurred independently on four branches, and the replacements L99M, L266V and I286V occurred each on two branches. The multiple parallel amino acid replacements that occurred in the evolution of bat RH1 suggest the possibility of multiple convergences of their ecological specialization (i.e., various photic environments) during adaptation for the nocturnal lifestyle, and suggest that further attention is needed on the study of the ecology and behavior of bats. PMID:20098620
Homez, a homeobox leucine zipper gene specific to the vertebrate lineage.

PubMed

Bayarsaihan, Dashzeveg; Enkhmandakh, Badam; Makeyev, Aleksandr; Greally, John M; Leckman, James F; Ruddle, Frank H

2003-09-02

This work describes a vertebrate homeobox gene, designated Homez (homeodomain leucine zipper-encoding gene), that encodes a protein with an unusual structural organization. There are several regions within Homez, including three atypical homeodomains, two leucine zipper-like motifs, and an acidic domain. The gene is ubiquitously expressed in human and murine tissues, although the expression pattern is more restricted during mouse development. Genomic analysis revealed that human and mouse genes are located at 14q11.2 and 14C, respectively, and are composed of two exons. The zebrafish and pufferfish homologs share high similarity to mammalian sequences, particularly within the homeodomain sequences. Based on homology of homeodomains and on the similarity in overall protein structure, we delineate Homez and members of ZHX family of zinc finger homeodomain factors as a subset within the superfamily of homeobox-containing proteins. The type and composition of homeodomains in the Homez subfamily are vertebrate-specific. Phylogenetic analysis indicates that Homez lineage was separated from related genes >400 million years ago before separation of ray- and lobe-finned fishes. We apply a duplication-degeneration-complementation model to explain how this family of genes has evolved.
Diversity of bacteria and glycosyl hydrolase family 48 genes in cellulolytic consortia enriched from thermophilic biocompost.

PubMed

Izquierdo, Javier A; Sizova, Maria V; Lynd, Lee R

2010-06-01

The enrichment from nature of novel microbial communities with high cellulolytic activity is useful in the identification of novel organisms and novel functions that enhance the fundamental understanding of microbial cellulose degradation. In this work we identify predominant organisms in three cellulolytic enrichment cultures with thermophilic compost as an inoculum. Community structure based on 16S rRNA gene clone libraries featured extensive representation of clostridia from cluster III, with minor representation of clostridial clusters I and XIV and a novel Lutispora species cluster. Our studies reveal different levels of 16S rRNA gene diversity, ranging from 3 to 18 operational taxonomic units (OTUs), as well as variability in community membership across the three enrichment cultures. By comparison, glycosyl hydrolase family 48 (GHF48) diversity analyses revealed a narrower breadth of novel clostridial genes associated with cultured and uncultured cellulose degraders. The novel GHF48 genes identified in this study were related to the novel clostridia Clostridium straminisolvens and Clostridium clariflavum, with one cluster sharing as little as 73% sequence similarity with the closest known relative. In all, 14 new GHF48 gene sequences were added to the known diversity of 35 genes from cultured species.
Mitochondrial DNA analyses reveal low genetic diversity in Culex quinquefasciatus from residential areas in Malaysia.

PubMed

Low, V L; Lim, P E; Chen, C D; Lim, Y A L; Tan, T K; Norma-Rashid, Y; Lee, H L; Sofian-Azirun, M

2014-06-01

The present study explored the intraspecific genetic diversity, dispersal patterns and phylogeographic relationships of Culex quinquefasciatus Say (Diptera: Culicidae) in Malaysia using reference data available in GenBank in order to reveal this species' phylogenetic relationships. A statistical parsimony network of 70 taxa aligned as 624 characters of the cytochrome c oxidase subunit I (COI) gene and 685 characters of the cytochrome c oxidase subunit II (COII) gene revealed three haplotypes (A1-A3) and four haplotypes (B1-B4), respectively. The concatenated sequences of both COI and COII genes with a total of 1309 characters revealed seven haplotypes (AB1-AB7). Analysis using tcs indicated that haplotype AB1 was the common ancestor and the most widespread haplotype in Malaysia. The genetic distance based on concatenated sequences of both COI and COII genes ranged from 0.00076 to 0.00229. Sequence alignment of Cx. quinquefasciatus from Malaysia and other countries revealed four haplotypes (AA1-AA4) by the COI gene and nine haplotypes (BB1-BB9) by the COII gene. Phylogenetic analyses demonstrated that Malaysian Cx. quinquefasciatus share the same genetic lineage as East African and Asian Cx. quinquefasciatus. This study has inferred the genetic lineages, dispersal patterns and hypothetical ancestral genotypes of Cx. quinquefasciatus. © 2013 The Royal Entomological Society.
Recombination–deletion between homologous cassettes in retrovirus is suppressed via a strategy of degenerate codon substitution

PubMed Central

Im, Eung Jun; Bais, Anthony J; Yang, Wen; Ma, Qiangzhong; Guo, Xiuyang; Sepe, Steven M; Junghans, Richard P

2014-01-01

Transduction and expression procedures in gene therapy protocols may optimally transfer more than a single gene to correct a defect and/or transmit new functions to recipient cells or organisms. This may be accomplished by transduction with two (or more) vectors, or, more efficiently, in a single vector. Occasionally, it may be useful to coexpress homologous genes or chimeric proteins with regions of shared homology. Retroviridae include the dominant vector systems for gene transfer (e.g., gamma-retro and lentiviruses) and are capable of such multigene expression. However, these same viruses are known for efficient recombination–deletion when domains are duplicated within the viral genome. This problem can be averted by resorting to two-vector strategies (two-chain two-vector), but at a penalty to cost, convenience, and efficiency. Employing a chimeric antigen receptor system as an example, we confirm that coexpression of two genes with homologous domains in a single gamma-retroviral vector (two-chain single-vector) leads to recombination–deletion between repeated sequences, excising the equivalent of one of the chimeric antigen receptors. Here, we show that a degenerate codon substitution strategy in the two-chain single-vector format efficiently suppressed intravector deletional loss with rescue of balanced gene coexpression by minimizing sequence homology between repeated domains and preserving the final protein sequence. PMID:25419532
Sequencing and analysis of 10,967 full-length cDNA clones from Xenopus laevis and Xenopus tropicalis reveals post-tetraploidization transcriptome remodeling

PubMed Central

Morin, Ryan D.; Chang, Elbert; Petrescu, Anca; Liao, Nancy; Griffith, Malachi; Kirkpatrick, Robert; Butterfield, Yaron S.; Young, Alice C.; Stott, Jeffrey; Barber, Sarah; Babakaiff, Ryan; Dickson, Mark C.; Matsuo, Corey; Wong, David; Yang, George S.; Smailus, Duane E.; Wetherby, Keith D.; Kwong, Peggy N.; Grimwood, Jane; Brinkley, Charles P.; Brown-John, Mabel; Reddix-Dugue, Natalie D.; Mayo, Michael; Schmutz, Jeremy; Beland, Jaclyn; Park, Morgan; Gibson, Susan; Olson, Teika; Bouffard, Gerard G.; Tsai, Miranda; Featherstone, Ruth; Chand, Steve; Siddiqui, Asim S.; Jang, Wonhee; Lee, Ed; Klein, Steven L.; Blakesley, Robert W.; Zeeberg, Barry R.; Narasimhan, Sudarshan; Weinstein, John N.; Pennacchio, Christa Prange; Myers, Richard M.; Green, Eric D.; Wagner, Lukas; Gerhard, Daniela S.; Marra, Marco A.; Jones, Steven J.M.; Holt, Robert A.

2006-01-01

Sequencing of full-insert clones from full-length cDNA libraries from both Xenopus laevis and Xenopus tropicalis has been ongoing as part of the Xenopus Gene Collection Initiative. Here we present 10,967 full ORF verified cDNA clones (8049 from X. laevis and 2918 from X. tropicalis) as a community resource. Because the genome of X. laevis, but not X. tropicalis, has undergone allotetraploidization, comparison of coding sequences from these two clawed (pipid) frogs provides a unique angle for exploring the molecular evolution of duplicate genes. Within our clone set, we have identified 445 gene trios, each comprised of an allotetraploidization-derived X. laevis gene pair and their shared X. tropicalis ortholog. Pairwise dN/dS, comparisons within trios show strong evidence for purifying selection acting on all three members. However, dN/dS ratios between X. laevis gene pairs are elevated relative to their X. tropicalis ortholog. This difference is highly significant and indicates an overall relaxation of selective pressures on duplicated gene pairs. We have found that the paralogs that have been lost since the tetraploidization event are enriched for several molecular functions, but have found no such enrichment in the extant paralogs. Approximately 14% of the paralogous pairs analyzed here also show differential expression indicative of subfunctionalization. PMID:16672307
Evolution of plant virus movement proteins from the 30K superfamily and of their homologs integrated in plant genomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mushegian, Arcady R., E-mail: mushegian2@gmail.com; Elena, Santiago F., E-mail: sfelena@ibmcp.upv.es; The Santa Fe Institute, Santa Fe, NM 87501

Homologs of Tobacco mosaic virus 30K cell-to-cell movement protein are encoded by diverse plant viruses. Mechanisms of action and evolutionary origins of these proteins remain obscure. We expand the picture of conservation and evolution of the 30K proteins, producing sequence alignment of the 30K superfamily with the broadest phylogenetic coverage thus far and illuminating structural features of the core all-beta fold of these proteins. Integrated copies of pararetrovirus 30K movement genes are prevalent in euphyllophytes, with at least one copy intact in nearly every examined species, and mRNAs detected for most of them. Sequence analysis suggests repeated integrations, pseudogenizations, andmore » positive selection in those provirus genes. An unannotated 30K-superfamily gene in Arabidopsis thaliana genome is likely expressed as a fusion with the At1g37113 transcript. This molecular background of endopararetrovirus gene products in plants may change our view of virus infection and pathogenesis, and perhaps of cellular homeostasis in the hosts. - Highlights: • Sequence region shared by plant virus “30K” movement proteins has an all-beta fold. • Most euphyllophyte genomes contain integrated copies of pararetroviruses. • These integrated virus genomes often include intact movement protein genes. • Molecular evidence suggests that these “30K” genes may be selected for function.« less
Differentiation of Trypanosoma cruzi I subgroups through characterization of cytochrome b gene sequences.

PubMed

Spotorno O, Angel E; Córdova, Luis; Solari I, Aldo

2008-12-01

To identify and characterize chilean samples of Trypanosoma cruzi and their association with hosts, the first 516 bp of the mitochondrial cytochrome b gene were sequenced from eight biological samples, and phylogenetically compared with other known 20 American sequences. The molecular characterization of these 28 sequences in a maximum likelihood phylogram (-lnL = 1255.12, tree length = 180, consistency index = 0.79) allowed the robust identification (bootstrap % > 99) of three previously known discrete typing units (DTU): DTU IIb, IIa, and I. An apparently undescribed new sequence found in four new chilean samples was detected and designated as DTU Ib; they were separated by 24.7 differences, but robustly related (bootstrap % = 97 in 500 replicates) to those of DTU I by sharing 12 substitutions, among which four were nonsynonymous ones. Such new DTU Ib was also robust (bootstrap % = 100), and characterized by 10 unambiguous substitutions, with a single nonsynonymous G to T change at site 409. The fact that two of such new sequences were found in parasites from a chilean endemic caviomorph rodent, Octodon degus, and that they were closely related to the ancient DTU I suggested old origins and a long association to caviomorph hosts.
Novel Molecular Method for Identification of Streptococcus pneumoniae Applicable to Clinical Microbiology and 16S rRNA Sequence-Based Microbiome Studies

PubMed Central

Scholz, Christian F. P.; Poulsen, Knud

2012-01-01

The close phylogenetic relationship of the important pathogen Streptococcus pneumoniae and several species of commensal streptococci, particularly Streptococcus mitis and Streptococcus pseudopneumoniae, and the recently demonstrated sharing of genes and phenotypic traits previously considered specific for S. pneumoniae hamper the exact identification of S. pneumoniae. Based on sequence analysis of 16S rRNA genes of a collection of 634 streptococcal strains, identified by multilocus sequence analysis, we detected a cytosine at position 203 present in all 440 strains of S. pneumoniae but replaced by an adenosine residue in all strains representing other species of mitis group streptococci. The S. pneumoniae-specific sequence signature could be demonstrated by sequence analysis or indirectly by restriction endonuclease digestion of a PCR amplicon covering the site. The S. pneumoniae-specific signature offers an inexpensive means for validation of the identity of clinical isolates and should be used as an integrated marker in the annotation procedure employed in 16S rRNA-based molecular studies of complex human microbiotas. This may avoid frequent misidentifications such as those we demonstrate to have occurred in previous reports and in reference sequence databases. PMID:22442329
Next Generation Sequencing of Actinobacteria for the Discovery of Novel Natural Products

PubMed Central

Gomez-Escribano, Juan Pablo; Alt, Silke; Bibb, Mervyn J.

2016-01-01

Like many fields of the biosciences, actinomycete natural products research has been revolutionised by next-generation DNA sequencing (NGS). Hundreds of new genome sequences from actinobacteria are made public every year, many of them as a result of projects aimed at identifying new natural products and their biosynthetic pathways through genome mining. Advances in these technologies in the last five years have meant not only a reduction in the cost of whole genome sequencing, but also a substantial increase in the quality of the data, having moved from obtaining a draft genome sequence comprised of several hundred short contigs, sometimes of doubtful reliability, to the possibility of obtaining an almost complete and accurate chromosome sequence in a single contig, allowing a detailed study of gene clusters and the design of strategies for refactoring and full gene cluster synthesis. The impact that these technologies are having in the discovery and study of natural products from actinobacteria, including those from the marine environment, is only starting to be realised. In this review we provide a historical perspective of the field, analyse the strengths and limitations of the most relevant technologies, and share the insights acquired during our genome mining projects. PMID:27089350
Defining the ABC of gene essentiality in streptococci.

PubMed

Charbonneau, Amelia R L; Forman, Oliver P; Cain, Amy K; Newland, Graham; Robinson, Carl; Boursnell, Mike; Parkhill, Julian; Leigh, James A; Maskell, Duncan J; Waller, Andrew S

2017-05-31

Utilising next generation sequencing to interrogate saturated bacterial mutant libraries provides unprecedented information for the assignment of genome-wide gene essentiality. Exposure of saturated mutant libraries to specific conditions and subsequent sequencing can be exploited to uncover gene essentiality relevant to the condition. Here we present a barcoded transposon directed insertion-site sequencing (TraDIS) system to define an essential gene list for Streptococcus equi subsp. equi, the causative agent of strangles in horses, for the first time. The gene essentiality data for this group C Streptococcus was compared to that of group A and B streptococci. Six barcoded variants of pGh9:ISS1 were designed and used to generate mutant libraries containing between 33,000-66,000 unique mutants. TraDIS was performed on DNA extracted from each library and data were analysed separately and as a combined master pool. Gene essentiality determined that 19.5% of the S. equi genome was essential. Gene essentialities were compared to those of group A and group B streptococci, identifying concordances of 90.2% and 89.4%, respectively and an overall concordance of 83.7% between the three species. The use of barcoded pGh9:ISS1 to generate mutant libraries provides a highly useful tool for the assignment of gene function in S. equi and other streptococci. The shared essential gene set of group A, B and C streptococci provides further evidence of the close genetic relationships between these important pathogenic bacteria. Therefore, the ABC of gene essentiality reported here provides a solid foundation towards reporting the functional genome of streptococci.
Genome of turbot rhabdovirus exhibits unusual non-coding regions and an additional ORF that could be expressed in fish cell.

PubMed

Zhu, Ruo-Lin; Lei, Xiao-Ying; Ke, Fei; Yuan, Xiu-Ping; Zhang, Qi-Ya

2011-02-01

Genomic sequence of Scophthalmus maximus rhabdovirus (SMRV) isolated from diseased turbot has been characterized. The complete genome of SMRV comprises 11,492 nucleotides and encodes five typical rhabdovirus genes N, P, M, G and L. In addition, two open reading frames (ORF) are predicted overlapping with P gene, one upstream of P and smaller than P (temporarily called Ps), and another in P gene which may encodes a protein similar to the vesicular stomatitis virus C protein. The C ORF is contained within the P ORF. The five typical proteins share the highest sequence identities (48.9%) with the corresponding proteins of rhabdoviruses in genus Vesiculovirus. Phylogenetic analysis of partial L protein sequence indicates that SMRV is close to genus Vesiculovirus. The first 13 nucleotides at the ends of the SMRV genome are absolutely inverse complementarity. The gene junctions between the five genes show conserved polyadenylation signal (CATGA(7)) and intergenic dinucleotide (CT) followed by putative transcription initiation sequence A(A/G)(C/G)A(A/G/T), which are different from known rhabdoviruses. The entire Ps ORF was cloned and expressed, and used to generate polyclonal antibody in mice. One obvious band could be detected in SMRV-infected carp leucocyte cells (CLCs) by anti-Ps/C serum via Western blot, and the subcellular localization of Ps-GFP fusion protein exhibited cytoplasm distribution as multiple punctuate or doughnut shaped foci of uneven size. Copyright Â© 2010 Elsevier B.V. All rights reserved.
K-shuff: A Novel Algorithm for Characterizing Structural and Compositional Diversity in Gene Libraries.

PubMed

Jangid, Kamlesh; Kao, Ming-Hung; Lahamge, Aishwarya; Williams, Mark A; Rathbun, Stephen L; Whitman, William B

2016-01-01

K-shuff is a new algorithm for comparing the similarity of gene sequence libraries, providing measures of the structural and compositional diversity as well as the significance of the differences between these measures. Inspired by Ripley's K-function for spatial point pattern analysis, the Intra K-function or IKF measures the structural diversity, including both the richness and overall similarity of the sequences, within a library. The Cross K-function or CKF measures the compositional diversity between gene libraries, reflecting both the number of OTUs shared as well as the overall similarity in OTUs. A Monte Carlo testing procedure then enables statistical evaluation of both the structural and compositional diversity between gene libraries. For 16S rRNA gene libraries from complex bacterial communities such as those found in seawater, salt marsh sediments, and soils, K-shuff yields reproducible estimates of structural and compositional diversity with libraries greater than 50 sequences. Similarly, for pyrosequencing libraries generated from a glacial retreat chronosequence and Illumina® libraries generated from US homes, K-shuff required >300 and 100 sequences per sample, respectively. Power analyses demonstrated that K-shuff is sensitive to small differences in Sanger or Illumina® libraries. This extra sensitivity of K-shuff enabled examination of compositional differences at much deeper taxonomic levels, such as within abundant OTUs. This is especially useful when comparing communities that are compositionally very similar but functionally different. K-shuff will therefore prove beneficial for conventional microbiome analysis as well as specific hypothesis testing.

Multiple genome sequences reveal adaptations of a phototrophic bacterium to sediment microenvironments.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Oda, Yasuhiro; Larimer, Frank W; Chain, Patrick S. G.

The bacterial genus Rhodopseudomonas is comprised of photosynthetic bacteria found widely distributed in aquatic sediments. Members of the genus catalyze hydrogen gas production, carbon dioxide sequestration, and biomass turnover. The genome sequence of Rhodopseudomonas palustris CGA009 revealed a surprising richness of metabolic versatility that would seem to explain its ability to live in a heterogeneous environment like sediment. However, there is considerable genotypic diversity among Rhodopseudomonas isolates. Here we report the complete genome sequences of four additional members of the genus isolated from a restricted geographical area. The sequences confirm that the isolates belong to a coherent taxonomic unit, butmore » they also have significant differences. Whole genome alignments show that the circular chromosomes of the isolates consist of a collinear backbone with a moderate number of genomic rearrangements that impact local gene order and orientation. There are 3,319 genes, 70% of the genes in each genome, shared by four or more strains. Between 10% and 18% of the genes in each genome are strain specific. Some of these genes suggest specialized physiological traits, which we verified experimentally, that include expanded light harvesting, oxygen respiration, and nitrogen fixation capabilities, as well as anaerobic fermentation. Strain-specific adaptations include traits that may be useful in bioenergy applications. This work suggests that against a backdrop of metabolic versatility that is a defining characteristic of Rhodopseudomonas, different ecotypes have evolved to take advantage of physical and chemical conditions in sediment microenvironments that are too small for human observation.« less
Complete chloroplast genome sequence of a tree fern Alsophila spinulosa: insights into evolutionary changes in fern chloroplast genomes.

PubMed

Gao, Lei; Yi, Xuan; Yang, Yong-Xia; Su, Ying-Juan; Wang, Ting

2009-06-11

Ferns have generally been neglected in studies of chloroplast genomics. Before this study, only one polypod and two basal ferns had their complete chloroplast (cp) genome reported. Tree ferns represent an ancient fern lineage that first occurred in the Late Triassic. In recent phylogenetic analyses, tree ferns were shown to be the sister group of polypods, the most diverse group of living ferns. Availability of cp genome sequence from a tree fern will facilitate interpretation of the evolutionary changes of fern cp genomes. Here we have sequenced the complete cp genome of a scaly tree fern Alsophila spinulosa (Cyatheaceae). The Alsophila cp genome is 156,661 base pairs (bp) in size, and has a typical quadripartite structure with the large (LSC, 86,308 bp) and small single copy (SSC, 21,623 bp) regions separated by two copies of an inverted repeat (IRs, 24,365 bp each). This genome contains 117 different genes encoding 85 proteins, 4 rRNAs and 28 tRNAs. Pseudogenes of ycf66 and trnT-UGU are also detected in this genome. A unique trnR-UCG gene (derived from trnR-CCG) is found between rbcL and accD. The Alsophila cp genome shares some unusual characteristics with the previously sequenced cp genome of the polypod fern Adiantum capillus-veneris, including the absence of 5 tRNA genes that exist in most other cp genomes. The genome shows a high degree of synteny with that of Adiantum, but differs considerably from two basal ferns (Angiopteris evecta and Psilotum nudum). At one endpoint of an ancient inversion we detected a highly repeated 565-bp-region that is absent from the Adiantum cp genome. An additional minor inversion of the trnD-GUC, which is possibly shared by all ferns, was identified by comparison between the fern and other land plant cp genomes. By comparing four fern cp genome sequences it was confirmed that two major rearrangements distinguish higher leptosporangiate ferns from basal fern lineages. The Alsophila cp genome is very similar to that of the polypod fern Adiantum in terms of gene content, gene order and GC content. However, there exist some striking differences between them: the trnR-UCG gene represents a putative molecular apomorphy of tree ferns; and the repeats observed at one inversion endpoint may be a vestige of some unknown rearrangement(s). This work provided fresh insights into the fern cp genome evolution as well as useful data for future phylogenetic studies.
Characterization and Comparative Overview of Complete Sequences of the First Plasmids of Pandoraea across Clinical and Non-clinical Strains

PubMed Central

Yong, Delicia; Tee, Kok Keng; Yin, Wai-Fong; Chan, Kok-Gan

2016-01-01

To date, information on plasmid analysis in Pandoraea spp. is scarce. To address the gap of knowledge on this, the complete sequences of eight plasmids from Pandoraea spp. namely Pandoraea faecigallinarum DSM 23572T (pPF72-1, pPF72-2), Pandoraea oxalativorans DSM 23570T (pPO70-1, pPO70-2, pPO70-3, pPO70-4), Pandoraea vervacti NS15 (pPV15) and Pandoraea apista DSM 16535T (pPA35) were studied for the first time in this study. The information on plasmid sequences in Pandoraea spp. is useful as the sequences did not match any known plasmid sequence deposited in public databases. Replication genes were not identified in some plasmids, a situation that has led to the possibility of host interaction involvement. Some plasmids were also void of par genes and intriguingly, repA gene was also not discovered in these plasmids. This further leads to the hypothesis of host-plasmid interaction. Plasmid stabilization/stability protein-encoding genes were observed in some plasmids but were not established for participating in plasmid segregation. Toxin-antitoxin systems MazEF, VapBC, RelBE, YgiT-MqsR, HigBA, and ParDE were identified across the plasmids and their presence would improve plasmid maintenance. Conjugation genes were identified portraying the conjugation ability amongst Pandoraea plasmids. Additionally, we found a shared region amongst some of the plasmids that consists of conjugation genes. The identification of genes involved in replication, segregation, toxin-antitoxin systems and conjugation, would aid the design of drugs to prevent the survival or transmission of plasmids carrying pathogenic properties. Additionally, genes conferring virulence and antibiotic resistance were identified amongst the plasmids. The observed features in the plasmids shed light on the Pandoraea spp. as opportunistic pathogens. PMID:27790203
Haplotype Detection from Next-Generation Sequencing in High-Ploidy-Level Species: 45S rDNA Gene Copies in the Hexaploid Spartina maritima

PubMed Central

Boutte, Julien; Aliaga, Benoît; Lima, Oscar; Ferreira de Carvalho, Julie; Ainouche, Abdelkader; Macas, Jiri; Rousseau-Gueutin, Mathieu; Coriton, Olivier; Ainouche, Malika; Salmon, Armel

2015-01-01

Gene and whole-genome duplications are widespread in plant nuclear genomes, resulting in sequence heterogeneity. Identification of duplicated genes may be particularly challenging in highly redundant genomes, especially when there are no diploid parents as a reference. Here, we developed a pipeline to detect the different copies in the ribosomal RNA gene family in the hexaploid grass Spartina maritima from next-generation sequencing (Roche-454) reads. The heterogeneity of the different domains of the highly repeated 45S unit was explored by identifying single nucleotide polymorphisms (SNPs) and assembling reads based on shared polymorphisms. SNPs were validated using comparisons with Illumina sequence data sets and by cloning and Sanger (re)sequencing. Using this approach, 29 validated polymorphisms and 11 validated haplotypes were reported (out of 34 and 20, respectively, that were initially predicted by our program). The rDNA domains of S. maritima have similar lengths as those found in other Poaceae, apart from the 5′-ETS, which is approximately two-times longer in S. maritima. Sequence homogeneity was encountered in coding regions and both internal transcribed spacers (ITS), whereas high intragenomic variability was detected in the intergenic spacer (IGS) and the external transcribed spacer (ETS). Molecular cytogenetic analysis by fluorescent in situ hybridization (FISH) revealed the presence of one pair of 45S rDNA signals on the chromosomes of S. maritima instead of three expected pairs for a hexaploid genome, indicating loss of duplicated homeologous loci through the diploidization process. The procedure developed here may be used at any ploidy level and using different sequencing technologies. PMID:26530424
Complete genome sequencing and analysis of a Lancefield group G Streptococcus dysgalactiae subsp. equisimilis strain causing streptococcal toxic shock syndrome (STSS)

PubMed Central

2011-01-01

Background Streptococcus dysgalactiae subsp. equisimilis (SDSE) causes invasive streptococcal infections, including streptococcal toxic shock syndrome (STSS), as does Lancefield group A Streptococcus pyogenes (GAS). We sequenced the entire genome of SDSE strain GGS_124 isolated from a patient with STSS. Results We found that GGS_124 consisted of a circular genome of 2,106,340 bp. Comparative analyses among bacterial genomes indicated that GGS_124 was most closely related to GAS. GGS_124 and GAS, but not other streptococci, shared a number of virulence factor genes, including genes encoding streptolysin O, NADase, and streptokinase A, distantly related to SIC (DRS), suggesting the importance of these factors in the development of invasive disease. GGS_124 contained 3 prophages, with one containing a virulence factor gene for streptodornase. All 3 prophages were significantly similar to GAS prophages that carry virulence factor genes, indicating that these prophages had transferred these genes between pathogens. SDSE was found to contain a gene encoding a superantigen, streptococcal exotoxin type G, but lacked several genes present in GAS that encode virulence factors, such as other superantigens, cysteine protease speB, and hyaluronan synthase operon hasABC. Similar to GGS_124, the SDSE strains contained larger numbers of clustered, regularly interspaced, short palindromic repeats (CRISPR) spacers than did GAS, suggesting that horizontal gene transfer via streptococcal phages between SDSE and GAS is somewhat restricted, although they share phage species. Conclusion Genome wide comparisons of SDSE with GAS indicate that SDSE is closely and quantitatively related to GAS. SDSE, however, lacks several virulence factors of GAS, including superantigens, SPE-B and the hasABC operon. CRISPR spacers may limit the horizontal transfer of phage encoded GAS virulence genes into SDSE. These findings may provide clues for dissecting the pathological roles of the virulence factors in SDSE and GAS that cause STSS. PMID:21223537
Complete genome sequencing and analysis of a Lancefield group G Streptococcus dysgalactiae subsp. equisimilis strain causing streptococcal toxic shock syndrome (STSS).

PubMed

Shimomura, Yumi; Okumura, Kayo; Murayama, Somay Yamagata; Yagi, Junji; Ubukata, Kimiko; Kirikae, Teruo; Miyoshi-Akiyama, Tohru

2011-01-11

Streptococcus dysgalactiae subsp. equisimilis (SDSE) causes invasive streptococcal infections, including streptococcal toxic shock syndrome (STSS), as does Lancefield group A Streptococcus pyogenes (GAS). We sequenced the entire genome of SDSE strain GGS_124 isolated from a patient with STSS. We found that GGS_124 consisted of a circular genome of 2,106,340 bp. Comparative analyses among bacterial genomes indicated that GGS_124 was most closely related to GAS. GGS_124 and GAS, but not other streptococci, shared a number of virulence factor genes, including genes encoding streptolysin O, NADase, and streptokinase A, distantly related to SIC (DRS), suggesting the importance of these factors in the development of invasive disease. GGS_124 contained 3 prophages, with one containing a virulence factor gene for streptodornase. All 3 prophages were significantly similar to GAS prophages that carry virulence factor genes, indicating that these prophages had transferred these genes between pathogens. SDSE was found to contain a gene encoding a superantigen, streptococcal exotoxin type G, but lacked several genes present in GAS that encode virulence factors, such as other superantigens, cysteine protease speB, and hyaluronan synthase operon hasABC. Similar to GGS_124, the SDSE strains contained larger numbers of clustered, regularly interspaced, short palindromic repeats (CRISPR) spacers than did GAS, suggesting that horizontal gene transfer via streptococcal phages between SDSE and GAS is somewhat restricted, although they share phage species. Genome wide comparisons of SDSE with GAS indicate that SDSE is closely and quantitatively related to GAS. SDSE, however, lacks several virulence factors of GAS, including superantigens, SPE-B and the hasABC operon. CRISPR spacers may limit the horizontal transfer of phage encoded GAS virulence genes into SDSE. These findings may provide clues for dissecting the pathological roles of the virulence factors in SDSE and GAS that cause STSS.
Mitochondrial Genes of Dinoflagellates Are Transcribed by a Nuclear-Encoded Single-Subunit RNA Polymerase.

PubMed

Teng, Chang Ying; Dang, Yunkun; Danne, Jillian C; Waller, Ross F; Green, Beverley R

2013-01-01

Dinoflagellates are a large group of algae that contribute significantly to marine productivity and are essential photosynthetic symbionts of corals. Although these algae have fully-functioning mitochondria and chloroplasts, both their organelle genomes have been highly reduced and the genes fragmented and rearranged, with many aberrant transcripts. However, nothing is known about their RNA polymerases. We cloned and sequenced the gene for the nuclear-encoded mitochondrial polymerase (RpoTm) of the dinoflagellate Heterocapsa triquetra and showed that the protein presequence targeted a GFP construct into yeast mitochondria. The gene belongs to a small gene family, which includes a variety of 3'-truncated copies that may have originated by retroposition. The catalytic C-terminal domain of the protein shares nine conserved sequence blocks with other single-subunit polymerases and is predicted to have the same fold as the human enzyme. However, the N-terminal (promoter binding/transcription initiation) domain is not well-conserved. In conjunction with the degenerate nature of the mitochondrial genome, this suggests a requirement for novel accessory factors to ensure the accurate production of functional mRNAs.
A new method for detection and discrimination of Pepino mosaic virus isolates using high resolution melting analysis of the triple gene block 3.

PubMed

Hasiów-Jaroszewska, Beata; Komorowska, Beata

2013-10-01

Diagnostic methods distinguished different Pepino mosaic virus (PepMV) genotypes but the methods do not detect sequence variation in particular gene segments. The necrotic and non-necrotic isolates (pathotypes) of PepMV share a 99% sequence similarity. These isolates differ from each other at one nucleotide site in the triple gene block 3. In this study, a combination of real-time reverse transcription polymerase chain reaction and high resolution melting curve analysis of triple gene block 3 was developed for simultaneous detection and differentiation of PepMV pathotypes. The triple gene block 3 region carrying a transition A → G was amplified using two primer pairs from twelve virus isolates, and was subjected to high resolution melting curve analysis. The results showed two distinct melting curve profiles related to each pathotype. The results also indicated that the high resolution melting method could readily differentiate between necrotic and non-necrotic PepMV pathotypes. Copyright © 2013 Elsevier B.V. All rights reserved.
Genevar: a database and Java application for the analysis and visualization of SNP-gene associations in eQTL studies

PubMed Central

Yang, Tsun-Po; Beazley, Claude; Montgomery, Stephen B.; Dimas, Antigone S.; Gutierrez-Arcelus, Maria; Stranger, Barbara E.; Deloukas, Panos; Dermitzakis, Emmanouil T.

2010-01-01

Summary: Genevar (GENe Expression VARiation) is a database and Java tool designed to integrate multiple datasets, and provides analysis and visualization of associations between sequence variation and gene expression. Genevar allows researchers to investigate expression quantitative trait loci (eQTL) associations within a gene locus of interest in real time. The database and application can be installed on a standard computer in database mode and, in addition, on a server to share discoveries among affiliations or the broader community over the Internet via web services protocols. Availability: http://www.sanger.ac.uk/resources/software/genevar Contact: emmanouil.dermitzakis@unige.ch PMID:20702402
Molecular mechanisms of floral organ specification by MADS domain proteins.

PubMed

Yan, Wenhao; Chen, Dijun; Kaufmann, Kerstin

2016-02-01

Flower development is a model system to understand organ specification in plants. The identities of different types of floral organs are specified by homeotic MADS transcription factors that interact in a combinatorial fashion. Systematic identification of DNA-binding sites and target genes of these key regulators show that they have shared and unique sets of target genes. DNA binding by MADS proteins is not based on 'simple' recognition of a specific DNA sequence, but depends on DNA structure and combinatorial interactions. Homeotic MADS proteins regulate gene expression via alternative mechanisms, one of which may be to modulate chromatin structure and accessibility in their target gene promoters. Copyright © 2015 Elsevier Ltd. All rights reserved.
Assembling networks of microbial genomes using linear programming.

PubMed

Holloway, Catherine; Beiko, Robert G

2010-11-20

Microbial genomes exhibit complex sets of genetic affinities due to lateral genetic transfer. Assessing the relative contributions of parent-to-offspring inheritance and gene sharing is a vital step in understanding the evolutionary origins and modern-day function of an organism, but recovering and showing these relationships is a challenging problem. We have developed a new approach that uses linear programming to find between-genome relationships, by treating tables of genetic affinities (here, represented by transformed BLAST e-values) as an optimization problem. Validation trials on simulated data demonstrate the effectiveness of the approach in recovering and representing vertical and lateral relationships among genomes. Application of the technique to a set comprising Aquifex aeolicus and 75 other thermophiles showed an important role for large genomes as 'hubs' in the gene sharing network, and suggested that genes are preferentially shared between organisms with similar optimal growth temperatures. We were also able to discover distinct and common genetic contributors to each sequenced representative of genus Pseudomonas. The linear programming approach we have developed can serve as an effective inference tool in its own right, and can be an efficient first step in a more-intensive phylogenomic analysis.
Characterizing partial AZFc deletions of the Y chromosome with amplicon-specific sequence markers

PubMed Central

Navarro-Costa, Paulo; Pereira, Luísa; Alves, Cíntia; Gusmão, Leonor; Proença, Carmen; Marques-Vidal, Pedro; Rocha, Tiago; Correia, Sónia C; Jorge, Sónia; Neves, António; Soares, Ana P; Nunes, Joaquim; Calhaz-Jorge, Carlos; Amorim, António; Plancha, Carlos E; Gonçalves, João

2007-01-01

Background The AZFc region of the human Y chromosome is a highly recombinogenic locus containing multi-copy male fertility genes located in repeated DNA blocks (amplicons). These AZFc gene families exhibit slight sequence variations between copies which are considered to have functional relevance. Yet, partial AZFc deletions yield phenotypes ranging from normospermia to azoospermia, thwarting definite conclusions on their real impact on fertility. Results The amplicon content of partial AZFc deletion products was characterized with novel amplicon-specific sequence markers. Data indicate that partial AZFc deletions are a male infertility risk [odds ratio: 5.6 (95% CI: 1.6–30.1)] and although high diversity of partial deletion products and sequence conversion profiles were recorded, the AZFc marker profiles detected in fertile men were also observed in infertile men. Additionally, the assessment of rearrangement recurrence by Y-lineage analysis indicated that while partial AZFc deletions occurred in highly diverse samples, haplotype diversity was minimal in fertile men sharing identical marker profiles. Conclusion Although partial AZFc deletion products are highly heterogeneous in terms of amplicon content, this plasticity is not sufficient to account for the observed phenotypical variance. The lack of causative association between the deletion of specific gene copies and infertility suggests that AZFc gene content might be part of a multifactorial network, with Y-lineage evolution emerging as a possible phenotype modulator. PMID:17903263
Cloning and sequence analysis of the Antheraea pernyi nucleopolyhedrovirus gp64 gene.

PubMed

Wang, Wenbing; Zhu, Shanying; Wang, Liqun; Yu, Feng; Shen, Weide

2005-12-01

Frequent outbreaks of the purulence disease of Chinese oak silkworm are reported in Middle and Northeast China. The disease is produced by the pathogen Antheraea pernyi nucleopolyhedrovirus (AnpeNPV). To obtain molecular information of the virus, the polyhedra of AnpeNPV were purified and characterized. The genomic DNA of AnpeNPV was extracted and digested with HindIII. The genome size of AnpeNPV is estimated at 128 kb. Based on the analysis of DNA fragments digested with HindIII, 23 fragments were bigger than 564 bp. A genomic library was generated using HindIII and the positive clones were sequenced and analysed. The gp64 gene, encoding the baculovirus envelope protein GP64, was found in an insert. The nucleotide sequence analysis indicated that the AnpeNPV gp64 gene consists of a 1,530 nucleotide open reading frame (ORF), encoding a protein of 509 amino acids. Of the eight gp64 homologues, the AnpeNPV gp64 ORF shared the most sequence similarity with the gp64 gene of Anticarsia gemmatalis NPV, but not Bombyx mori NPV. The upstream region of the AnpeNPV gp64 ORF encoded the conserved transcriptional elements for early and late stage of the viral infection cycle. These results indicated that AnpeNPV belongs to group I NPV and was far removed in molecular phylogeny from the BmNPV.
A new species of cellular slime mold from southern Portugal based on morphology, ITS and SSU sequences.

PubMed

Romeralo, M; Baldauf, S L; Cavender, J C

2009-01-01

Sampling soils to look for dictyostelids in southern Portugal we found an isolate that has a morphology that differed from any previously described species of the group. We sequenced the internally transcribed spacer (ITS) and small subunit (SSU) genes of the nuclear ribosomal RNA and found that both sequences are distinct from all previously described sequences. Phylogenetic analyses place the new species in dictyostelid Group 3 (Rhizostelids) together with D. potamoides, with which it shares 65.8% identity for ITS and 96.6% for SSU. In this paper we describe a new species of cellular slime mold, Dictyostelium ibericum, based on morphological and molecular characters. It is a small species with polar granules in its spores.
A cluster of novel serotonin receptor 3-like genes on human chromosome 3.

PubMed

Karnovsky, Alla M; Gotow, Lisa F; McKinley, Denise D; Piechan, Julie L; Ruble, Cara L; Mills, Cynthia J; Schellin, Kathleen A B; Slightom, Jerry L; Fitzgerald, Laura R; Benjamin, Christopher W; Roberds, Steven L

2003-11-13

The ligand-gated ion channel family includes receptors for serotonin (5-hydroxytryptamine, 5-HT), acetylcholine, GABA, and glutamate. Drugs targeting subtypes of these receptors have proven useful for the treatment of various neuropsychiatric and neurological disorders. To identify new ligand-gated ion channels as potential therapeutic targets, drafts of human genome sequence were interrogated. Portions of four novel genes homologous to 5-HT(3A) and 5-HT(3B) receptors were identified within human sequence databases. We named the genes 5-HT(3C1)-5-HT(3C4). Radiation hybrid (RH) mapping localized these genes to chromosome 3q27-28. All four genes shared similar intron-exon organizations and predicted protein secondary structure with 5-HT(3A) and 5-HT(3B). Orthologous genes were detected by Southern blotting in several species including dog, cow, and chicken, but not in rodents, suggesting that these novel genes are not present in rodents or are very poorly conserved. Two of the novel genes are predicted to be pseudogenes, but two other genes are transcribed and spliced to form appropriate open reading frames. The 5-HT(3C1) transcript is expressed almost exclusively in small intestine and colon, suggesting a possible role in the serotonin-responsiveness of the gut.
Sequence and expression of the genes for HPr (ptsH) and enzyme I (ptsI) of the phosphoenolpyruvate-dependent phosphotransferase transport system from Streptococcus mutans.

PubMed Central

Boyd, D A; Cvitkovitch, D G; Hamilton, I R

1994-01-01

We report the sequencing of a 2,242-bp region of the Streptococcus mutants NG5 genome containing the genes for ptsH and ptsI, which encode HPr and enzyme I (EI), respectively, of the phosphoenolpyruvate-dependent phosphotransferase transport system. The sequence was obtained from two cloned overlapping genomic fragments; one expresses HPr and a truncated EI, while the other expresses a full-length EI in Escherichia coli, as determined by Western immunoblotting. The ptsI gene appeared to be expressed from a region located in the ptsH gene. The S. mutans NG5 pts operon does not appear to be linked to other phosphotransferase transport system proteins as has been found in other bacteria. A positive fermentation pattern on MacConkey-glucose plates by an E. coli ptsI mutant harboring the S. mutans NG5 ptsI gene on a plasmid indicated that the S. mutans NG5 EI can complement a defect in the E. coli gene. This was confirmed by protein phosphorylation experiments with 32P-labeled phosphoenolpyruvate indicating phosphotransfer from the S. mutans NG5 EI to the E. coli HPr. Two forms of the cloned EI, both truncated to varying degrees in the C-terminal region, were inefficiently phosphorylated and unable to complement fully the ptsI defect in the E. coli mutant. The deduced amino acid sequence of HPr shows a high degree of homology, particularly around the active site, to the same protein from other gram-positive bacteria, notably, S. salivarius, and to a lesser extent with those of gram-negative bacteria. The deduced amino acid sequence of S. mutans NG5 EI also shares several regions of homology with other sequenced EIs, notably, with the region around the active site, a region that contains the only conserved cystidyl residue among the various proteins and which may be involved in substrate binding. Images PMID:8132321
High-quality draft genome sequence of Ensifer meliloti Mlalz-1, a microsymbiont of Medicago laciniata (L.) miller collected in Lanzarote, Canary Islands, Spain

DOE Office of Scientific and Technical Information (OSTI.GOV)

Osman, Wan Adnawani Meor; van Berkum, Peter; León-Barrios, Milagros

Ensifer meliloti Mlalz-1 (INSDC = ATZD00000000) is an aerobic, motile, Gram-negative, non-spore-forming rod that was isolated from an effective nitrogen-fixing nodule of Medicago laciniata (L.) Miller from a soil sample collected near the town of Guatiza on the island of Lanzarote, the Canary Islands, Spain. This strain nodulates and forms an effective symbiosis with the highly specific host M. laciniata. This rhizobial genome was sequenced as part of the DOE Joint Genome Institute 2010 Genomic Encyclopedia for Bacteria and Archaea-Root Nodule Bacteria (GEBA-RNB) sequencing project. Here in this paper, the features of E. meliloti Mlalz-1 are described, together with high-qualitymore » permanent draft genome sequence information and annotation. The 6,664,116 bp high-quality draft genome is arranged in 99 scaffolds of 100 contigs, containing 6314 protein-coding genes and 74 RNA-only encoding genes. Strain Mlalz-1 is closely related to Ensifer meliloti IAM 12611 T, Ensifer medicae A 321T and Ensifer numidicus ORS 1407 T, based on 16S rRNA gene sequences. gANI values of ≥98.1% support the classification of strain Mlalz-1 as E. meliloti . Nodulation of M. laciniata requires a specific nodC allele, and the nodC gene of strain Mlalz-1 shares ≥98% sequence identity with nodC of M. laciniata-nodulating Ensifer strains, but ≤93% with nodC of Ensifer strains that nodulate other Medicago species. Strain Mlalz-1 is unique among sequenced E. meliloti strains in possessing genes encoding components of a T2SS and in having two versions of the adaptive acid tolerance response lpiA-acvB operon. In E. medicae strain WSM419, lpiA is essential for enhancing survival in lethal acid conditions. The second copy of the lpiA-acvB operon of strain Mlalz-1 has highest sequence identity (> 96%) with that of E. medicae strains, which suggests genetic recombination between strain Mlalz-1 and E. medicae and the horizontal gene transfer of lpiA-acvB.« less
High-quality draft genome sequence of Ensifer meliloti Mlalz-1, a microsymbiont of Medicago laciniata (L.) miller collected in Lanzarote, Canary Islands, Spain

DOE PAGES

Osman, Wan Adnawani Meor; van Berkum, Peter; León-Barrios, Milagros; ...

2017-09-25

Ensifer meliloti Mlalz-1 (INSDC = ATZD00000000) is an aerobic, motile, Gram-negative, non-spore-forming rod that was isolated from an effective nitrogen-fixing nodule of Medicago laciniata (L.) Miller from a soil sample collected near the town of Guatiza on the island of Lanzarote, the Canary Islands, Spain. This strain nodulates and forms an effective symbiosis with the highly specific host M. laciniata. This rhizobial genome was sequenced as part of the DOE Joint Genome Institute 2010 Genomic Encyclopedia for Bacteria and Archaea-Root Nodule Bacteria (GEBA-RNB) sequencing project. Here in this paper, the features of E. meliloti Mlalz-1 are described, together with high-qualitymore » permanent draft genome sequence information and annotation. The 6,664,116 bp high-quality draft genome is arranged in 99 scaffolds of 100 contigs, containing 6314 protein-coding genes and 74 RNA-only encoding genes. Strain Mlalz-1 is closely related to Ensifer meliloti IAM 12611 T, Ensifer medicae A 321T and Ensifer numidicus ORS 1407 T, based on 16S rRNA gene sequences. gANI values of ≥98.1% support the classification of strain Mlalz-1 as E. meliloti . Nodulation of M. laciniata requires a specific nodC allele, and the nodC gene of strain Mlalz-1 shares ≥98% sequence identity with nodC of M. laciniata-nodulating Ensifer strains, but ≤93% with nodC of Ensifer strains that nodulate other Medicago species. Strain Mlalz-1 is unique among sequenced E. meliloti strains in possessing genes encoding components of a T2SS and in having two versions of the adaptive acid tolerance response lpiA-acvB operon. In E. medicae strain WSM419, lpiA is essential for enhancing survival in lethal acid conditions. The second copy of the lpiA-acvB operon of strain Mlalz-1 has highest sequence identity (> 96%) with that of E. medicae strains, which suggests genetic recombination between strain Mlalz-1 and E. medicae and the horizontal gene transfer of lpiA-acvB.« less
A network approach to analyzing highly recombinant malaria parasite genes.

PubMed

Larremore, Daniel B; Clauset, Aaron; Buckee, Caroline O

2013-01-01

The var genes of the human malaria parasite Plasmodium falciparum present a challenge to population geneticists due to their extreme diversity, which is generated by high rates of recombination. These genes encode a primary antigen protein called PfEMP1, which is expressed on the surface of infected red blood cells and elicits protective immune responses. Var gene sequences are characterized by pronounced mosaicism, precluding the use of traditional phylogenetic tools that require bifurcating tree-like evolutionary relationships. We present a new method that identifies highly variable regions (HVRs), and then maps each HVR to a complex network in which each sequence is a node and two nodes are linked if they share an exact match of significant length. Here, networks of var genes that recombine freely are expected to have a uniformly random structure, but constraints on recombination will produce network communities that we identify using a stochastic block model. We validate this method on synthetic data, showing that it correctly recovers populations of constrained recombination, before applying it to the Duffy Binding Like-α (DBLα) domain of var genes. We find nine HVRs whose network communities map in distinctive ways to known DBLα classifications and clinical phenotypes. We show that the recombinational constraints of some HVRs are correlated, while others are independent. These findings suggest that this micromodular structuring facilitates independent evolutionary trajectories of neighboring mosaic regions, allowing the parasite to retain protein function while generating enormous sequence diversity. Our approach therefore offers a rigorous method for analyzing evolutionary constraints in var genes, and is also flexible enough to be easily applied more generally to any highly recombinant sequences.
A Network Approach to Analyzing Highly Recombinant Malaria Parasite Genes

PubMed Central

Larremore, Daniel B.; Clauset, Aaron; Buckee, Caroline O.

2013-01-01

The var genes of the human malaria parasite Plasmodium falciparum present a challenge to population geneticists due to their extreme diversity, which is generated by high rates of recombination. These genes encode a primary antigen protein called PfEMP1, which is expressed on the surface of infected red blood cells and elicits protective immune responses. Var gene sequences are characterized by pronounced mosaicism, precluding the use of traditional phylogenetic tools that require bifurcating tree-like evolutionary relationships. We present a new method that identifies highly variable regions (HVRs), and then maps each HVR to a complex network in which each sequence is a node and two nodes are linked if they share an exact match of significant length. Here, networks of var genes that recombine freely are expected to have a uniformly random structure, but constraints on recombination will produce network communities that we identify using a stochastic block model. We validate this method on synthetic data, showing that it correctly recovers populations of constrained recombination, before applying it to the Duffy Binding Like-α (DBLα) domain of var genes. We find nine HVRs whose network communities map in distinctive ways to known DBLα classifications and clinical phenotypes. We show that the recombinational constraints of some HVRs are correlated, while others are independent. These findings suggest that this micromodular structuring facilitates independent evolutionary trajectories of neighboring mosaic regions, allowing the parasite to retain protein function while generating enormous sequence diversity. Our approach therefore offers a rigorous method for analyzing evolutionary constraints in var genes, and is also flexible enough to be easily applied more generally to any highly recombinant sequences. PMID:24130474

Znrg, a novel gene expressed mainly in the developing notochord of zebrafish.

PubMed

Zhou, Yaping; Xu, Yan; Li, Jianzhen; Liu, Yao; Zhang, Zhe; Deng, Fengjiao

2010-06-01

The notochord, a defining characteristic of the chordate embryo is a critical midline structure required for axial skeletal formation in vertebrates, and acts as a signaling center throughout embryonic development. We utilized the digital differential display program of the National Center for Biotechnology Information, and identified a contig of expressed sequence tags (no. Dr. 83747) from the zebrafish ovary library in Genbank. Full-length cDNA of the identified gene was cloned by 5'- and 3'- RACE, and the resulting sequence was confirmed by polymerase chain reaction and sequencing. The cDNA clone contains 2,505 base pairs and encodes a novel protein of 707 amino acids that shares no significant homology with any known proteins. This gene was expressed in mature oocytes and at the one-cell stage, and persisted until the 5th day of development, as determined by RT-PCR. Transcripts were detected by whole-mount RNA in situ hybridization from the two-cell stage to 72 h of embryonic development. This gene was uniformly distributed from the cleavage stage up to the blastula stage. During early gastrulation, it was present in the dorsal region, and became restricted to the notochord and pectoral fin at 48 and 72 h of embryonic development. Based on its abundance in the notochord, we hypothesized that the novel gene may play an important role in notochord development in zebrafish; we named this gene, zebrafish notochord-related gene, or znrg.
Molecular characterization of classical and nonclassical MHC class I genes from the golden pheasant (Chrysolophus pictus).

PubMed

Zeng, Q-Q; Zhong, G-H; He, K; Sun, D-D; Wan, Q-H

2016-02-01

Classical major histocompatibility complex (MHC) class I allelic polymorphism is essential for competent antigen presentation. To improve the genotyping efforts in the golden pheasant, it is necessary to differentiate more accurately between classical and nonclassical class I molecules. In our study, all MHC class I genes were isolated from one golden pheasant based on two overlapping PCR amplifications. In total, six full-length class I nucleotide sequences (A-F) were identified, and four were novel. Two (A and C) belonged to the IA1 gene, two (B and D) were alleles derived from the IA2 gene through transgene amplification, and two (E and F) comprised a third novel locus, IA3 that was excluded from the core region of the golden pheasant MHC-B. IA1 and IA2 exhibited the broad expression profiles characteristic of classical loci, while IA3 showed no expression in multiple tissues and was therefore defined as a nonclassical gene. Phylogenetic analysis indicated that the three IA genes in the golden pheasant share a much closer evolutionary relationship than the corresponding sequences in other galliform species. This observation was consistent with high sequence similarity among them, which likely arises from the homogenizing effect of recombination. Our careful distinction between the classical and nonclassical MHC class I genes in the golden pheasant lays the foundation for developing locus-specific genotyping and establishing a good molecular marker system of classical MHC I loci. © 2015 John Wiley & Sons Ltd.
Identification of mediator complex 26 (Crsp7) gametologs on platypus X1 and Y5 sex chromosomes: a candidate testis-determining gene in monotremes?

PubMed

Tsend-Ayush, Enkhjargal; Kortschak, R Daniel; Bernard, Pascal; Lim, Shu Ly; Ryan, Janelle; Rosenkranz, Ruben; Borodina, Tatiana; Dohm, Juliane C; Himmelbauer, Heinz; Harley, Vincent R; Grützner, Frank

2012-01-01

The basal lineage of monotremes features an extraordinarily complex sex chromosome system which has provided novel insights into the evolution of mammalian sex chromosomes. Recently, sequence information from autosomes, X chromosomes, and XY-shared pseudoautosomal regions has become available. However, no gene has so far been described on any of the Y chromosome-specific regions. We analyzed sequences derived from Y-specific BAC clones to identify genes with potentially male-specific function. Here, we report the identification and characterization of the mediator complex protein gametologs on platypus Y5 (Crspy). We also identified the X-chromosomal copy which unexpectedly maps to X1 (Crspx). Sequence comparison shows extensive divergence between the X and Y copy, but we found no significant positive selection on either gametolog. Expression analysis shows widespread expression of Crspx. Crspy is expressed exclusively in males with particularly strong expression in testis and kidney. Reporter gene assays to investigate whether Crspx/y can act on the recently discovered mouse Sox9 testis-specific enhancer element did reveal a modest effect together with mouse Sox9 + Sf1, but showed overall no significant upregulation of the reporter gene. This is the first report of a differentiated functional male-specific gene on platypus Y chromosomes, providing new insights into sex chromosome evolution and a candidate gene for male-specific function in monotremes.
Molecular Evolution and Mosaicism of Leptospiral Outer Membrane Proteins Involves Horizontal DNA Transfer

PubMed Central

Haake, David A.; Suchard, Marc A.; Kelley, Melissa M.; Dundoo, Manjula; Alt, David P.; Zuerner, Richard L.

2004-01-01

Leptospires belong to a genus of parasitic bacterial spirochetes that have adapted to a broad range of mammalian hosts. Mechanisms of leptospiral molecular evolution were explored by sequence analysis of four genes shared by 38 strains belonging to the core group of pathogenic Leptospira species: L. interrogans, L. kirschneri, L. noguchii, L. borgpetersenii, L. santarosai, and L. weilii. The 16S rRNA and lipL32 genes were highly conserved, and the lipL41 and ompL1 genes were significantly more variable. Synonymous substitutions are distributed throughout the ompL1 gene, whereas nonsynonymous substitutions are clustered in four variable regions encoding surface loops. While phylogenetic trees for the 16S, lipL32, and lipL41 genes were relatively stable, 8 of 38 (20%) ompL1 sequences had mosaic compositions consistent with horizontal transfer of DNA between related bacterial species. A novel Bayesian multiple change point model was used to identify the most likely sites of recombination and to determine the phylogenetic relatedness of the segments of the mosaic ompL1 genes. Segments of the mosaic ompL1 genes encoding two of the surface-exposed loops were likely acquired by horizontal transfer from a peregrine allele of unknown ancestry. Identification of the most likely sites of recombination with the Bayesian multiple change point model, an approach which has not previously been applied to prokaryotic gene sequence analysis, serves as a model for future studies of recombination in molecular evolution of genes. PMID:15090524
Porcine transcriptome analysis based on 97 non-normalized cDNA libraries and assembly of 1,021,891 expressed sequence tags

PubMed Central

Gorodkin, Jan; Cirera, Susanna; Hedegaard, Jakob; Gilchrist, Michael J; Panitz, Frank; Jørgensen, Claus; Scheibye-Knudsen, Karsten; Arvin, Troels; Lumholdt, Steen; Sawera, Milena; Green, Trine; Nielsen, Bente J; Havgaard, Jakob H; Rosenkilde, Carina; Wang, Jun; Li, Heng; Li, Ruiqiang; Liu, Bin; Hu, Songnian; Dong, Wei; Li, Wei; Yu, Jun; Wang, Jian; Stærfeldt, Hans-Henrik; Wernersson, Rasmus; Madsen, Lone B; Thomsen, Bo; Hornshøj, Henrik; Bujie, Zhan; Wang, Xuegang; Wang, Xuefei; Bolund, Lars; Brunak, Søren; Yang, Huanming; Bendixen, Christian; Fredholm, Merete

2007-01-01

Background Knowledge of the structure of gene expression is essential for mammalian transcriptomics research. We analyzed a collection of more than one million porcine expressed sequence tags (ESTs), of which two-thirds were generated in the Sino-Danish Pig Genome Project and one-third are from public databases. The Sino-Danish ESTs were generated from one normalized and 97 non-normalized cDNA libraries representing 35 different tissues and three developmental stages. Results Using the Distiller package, the ESTs were assembled to roughly 48,000 contigs and 73,000 singletons, of which approximately 25% have a high confidence match to UniProt. Approximately 6,000 new porcine gene clusters were identified. Expression analysis based on the non-normalized libraries resulted in the following findings. The distribution of cluster sizes is scaling invariant. Brain and testes are among the tissues with the greatest number of different expressed genes, whereas tissues with more specialized function, such as developing liver, have fewer expressed genes. There are at least 65 high confidence housekeeping gene candidates and 876 cDNA library-specific gene candidates. We identified differential expression of genes between different tissues, in particular brain/spinal cord, and found patterns of correlation between genes that share expression in pairs of libraries. Finally, there was remarkable agreement in expression between specialized tissues according to Gene Ontology categories. Conclusion This EST collection, the largest to date in pig, represents an essential resource for annotation, comparative genomics, assembly of the pig genome sequence, and further porcine transcription studies. PMID:17407547
Beta-globin locus activation regions: conservation of organization, structure, and function.

PubMed Central

Li, Q L; Zhou, B; Powers, P; Enver, T; Stamatoyannopoulos, G

1990-01-01

The human beta-globin locus activation region (LAR) comprises four erythroid-specific DNase I hypersensitive sites (I-IV) thought to be largely responsible for activating the beta-globin domain and facilitating high-level erythroid-specific globin gene expression. We identified the goat beta-globin LAR, determined 10.2 kilobases of its sequence, and demonstrated its function in transgenic mice. The human and goat LARs share 6.5 kilobases of homologous sequences that are as highly conserved as the epsilon-globin gene promoters. Furthermore, the overall spatial organization of the two LARs has been conserved. These results suggest that the functionally relevant regions of the LAR are large and that in addition to their primary structure, the spatial relationship of the conserved elements is important for LAR function. Images PMID:2236034
Accuracy of taxonomy prediction for 16S rRNA and fungal ITS sequences

PubMed Central

2018-01-01

Prediction of taxonomy for marker gene sequences such as 16S ribosomal RNA (rRNA) is a fundamental task in microbiology. Most experimentally observed sequences are diverged from reference sequences of authoritatively named organisms, creating a challenge for prediction methods. I assessed the accuracy of several algorithms using cross-validation by identity, a new benchmark strategy which explicitly models the variation in distances between query sequences and the closest entry in a reference database. When the accuracy of genus predictions was averaged over a representative range of identities with the reference database (100%, 99%, 97%, 95% and 90%), all tested methods had ≤50% accuracy on the currently-popular V4 region of 16S rRNA. Accuracy was found to fall rapidly with identity; for example, better methods were found to have V4 genus prediction accuracy of ∼100% at 100% identity but ∼50% at 97% identity. The relationship between identity and taxonomy was quantified as the probability that a rank is the lowest shared by a pair of sequences with a given pair-wise identity. With the V4 region, 95% identity was found to be a twilight zone where taxonomy is highly ambiguous because the probabilities that the lowest shared rank between pairs of sequences is genus, family, order or class are approximately equal. PMID:29682424
Evidence for 5S rDNA Horizontal Transfer in the toadfish Halobatrachus didactylus (Schneider, 1801) based on the analysis of three multigene families

PubMed Central

2012-01-01

Background The Batrachoididae family is a group of marine teleosts that includes several species with more complicated physiological characteristics, such as their excretory, reproductive, cardiovascular and respiratory systems. Previous studies of the 5S rDNA gene family carried out in four species from the Western Atlantic showed two types of this gene in two species but only one in the other two, under processes of concerted evolution and birth-and-death evolution with purifying selection. Here we present results of the 5S rDNA and another two gene families in Halobatrachus didactylus, an Eastern Atlantic species, and draw evolutionary inferences regarding the gene families. In addition we have also mapped the genes on the chromosomes by two-colour fluorescence in situ hybridization (FISH). Results Two types of 5S rDNA were observed, named type α and type β. Molecular analysis of the 5S rDNA indicates that H. didactylus does not share the non-transcribed spacer (NTS) sequences with four other species of the family; therefore, it must have evolved in isolation. Amplification with the type β specific primers amplified a specific band in 9 specimens of H. didactylus and two of Sparus aurata. Both types showed regulatory regions and a secondary structure which mark them as functional genes. However, the U2 snRNA gene and the ITS-1 sequence showed one electrophoretic band and with one type of sequence. The U2 snRNA sequence was the most variable of the three multigene families studied. Results from two-colour FISH showed no co-localization of the gene coding from three multigene families and provided the first map of the chromosomes of the species. Conclusions A highly significant finding was observed in the analysis of the 5S rDNA, since two such distant species as H. didactylus and Sparus aurata share a 5S rDNA type. This 5S rDNA type has been detected in other species belonging to the Batrachoidiformes and Perciformes orders, but not in the Pleuronectiformes and Clupeiformes orders. Two hypotheses have been outlined: one is the possible vertical permanence of the shared type in some fish lineages, and the other is the possibility of a horizontal transference event between ancient species of the Perciformes and Batrachoidiformes orders. This finding opens a new perspective in fish evolution and in the knowledge of the dynamism of the 5S rDNA. Cytogenetic analysis allowed some evolutionary trends to be roughed out, such as the progressive change in the U2 snDNA and the organization of (GATA)n repeats, from dispersed to localized in one locus. The accumulation of (GATA)n repeats in one chromosome pair could be implicated in the evolution of a pair of proto-sex chromosomes. This possibility could situate H. didactylus as the most highly evolved of the Batrachoididae family in terms of sex chromosome biology. PMID:23039906
Determination of the melon chloroplast and mitochondrial genome sequences reveals that the largest reported mitochondrial genome in plants contains a significant amount of DNA having a nuclear origin

PubMed Central

2011-01-01

Background The melon belongs to the Cucurbitaceae family, whose economic importance among vegetable crops is second only to Solanaceae. The melon has a small genome size (454 Mb), which makes it suitable for molecular and genetic studies. Despite similar nuclear and chloroplast genome sizes, cucurbits show great variation when their mitochondrial genomes are compared. The melon possesses the largest plant mitochondrial genome, as much as eight times larger than that of other cucurbits. Results The nucleotide sequences of the melon chloroplast and mitochondrial genomes were determined. The chloroplast genome (156,017 bp) included 132 genes, with 98 single-copy genes dispersed between the small (SSC) and large (LSC) single-copy regions and 17 duplicated genes in the inverted repeat regions (IRa and IRb). A comparison of the cucumber and melon chloroplast genomes showed differences in only approximately 5% of nucleotides, mainly due to short indels and SNPs. Additionally, 2.74 Mb of mitochondrial sequence, accounting for 95% of the estimated mitochondrial genome size, were assembled into five scaffolds and four additional unscaffolded contigs. An 84% of the mitochondrial genome is contained in a single scaffold. The gene-coding region accounted for 1.7% (45,926 bp) of the total sequence, including 51 protein-coding genes, 4 conserved ORFs, 3 rRNA genes and 24 tRNA genes. Despite the differences observed in the mitochondrial genome sizes of cucurbit species, Citrullus lanatus (379 kb), Cucurbita pepo (983 kb) and Cucumis melo (2,740 kb) share 120 kb of sequence, including the predicted protein-coding regions. Nevertheless, melon contained a high number of repetitive sequences and a high content of DNA of nuclear origin, which represented 42% and 47% of the total sequence, respectively. Conclusions Whereas the size and gene organisation of chloroplast genomes are similar among the cucurbit species, mitochondrial genomes show a wide variety of sizes, with a non-conserved structure both in gene number and organisation, as well as in the features of the noncoding DNA. The transfer of nuclear DNA to the melon mitochondrial genome and the high proportion of repetitive DNA appear to explain the size of the largest mitochondrial genome reported so far. PMID:21854637
Proposal of Vespertiliibacter pulmonis gen. nov., sp. nov. and two genomospecies as new members of the family Pasteurellaceae isolated from European bats.

PubMed

Mühldorfer, Kristin; Speck, Stephanie; Wibbelt, Gudrun

2014-07-01

Five bacterial strains isolated from bats of the family Vespertilionidae were characterized by phenotypic tests and multilocus sequence analysis (MLSA) using the 16S rRNA gene and four housekeeping genes (rpoA, rpoB, infB, recN). Phylogenetic analyses of individual and combined datasets indicated that the five strains represent a monophyletic cluster within the family Pasteurellaceae. Comparison of 16S rRNA gene sequences demonstrated a high degree of similarity (98.3-99.9%) among the group of bat-derived strains, while searches in nucleotide databases indicated less than 96% sequence similarity to known members of the Pasteurellaceae. The housekeeping genes rpoA, rpoB, infB and recN provided higher resolution compared with the 16S rRNA gene and subdivided the group according to the bat species from which the strains were isolated. Three strains derived from noctule bats shared 98.6-100% sequence similarity in all four genes investigated, whereas, based on rpoB, infB and recN gene sequences, 91.8-96% similarity was observed with and between the remaining two strains isolated from a serotine bat and a pipistrelle bat, respectively. Genome relatedness as deduced from recN gene sequences correlated well with the results of MLSA and indicated that the five strains represent a new genus. Based on these results, it is proposed to classify the five strains derived from bats within Vespertiliibacter pulmonis gen. nov., sp. nov. (the type species), Vespertiliibacter genomospecies 1 and Vespertiliibacter genomospecies 2. The genus can be distinguished phenotypically from recognized genera of the Pasteurellaceae by at least three characteristics. All strains are nutritionally fastidious and require a chemically defined supplement with NAD for growth. The DNA G+C content of strain E127/08(T) is 38.2 mol%. The type strain of Vespertiliibacter pulmonis gen. nov., sp. nov. is E127/08(T) ( = CCUG 64585(T) = DSM 27238(T)). The reference strains of Vespertiliibacter genomospecies 1 and 2 are E145/08 and E157/08, respectively. © 2014 IUMS.
The complete genome sequences of poxviruses isolated from a penguin and a pigeon in South Africa and comparison to other sequenced avipoxviruses.

PubMed

Offerman, Kristy; Carulei, Olivia; van der Walt, Anelda Philine; Douglass, Nicola; Williamson, Anna-Lise

2014-06-12

Two novel avipoxviruses from South Africa have been sequenced, one from a Feral Pigeon (Columba livia) (FeP2) and the other from an African penguin (Spheniscus demersus) (PEPV). We present a purpose-designed bioinformatics pipeline for analysis of next generation sequence data of avian poxviruses and compare the different avipoxviruses sequenced to date with specific emphasis on their evolution and gene content. The FeP2 (282 kbp) and PEPV (306 kbp) genomes encode 271 and 284 open reading frames respectively and are more closely related to one another (94.4%) than to either fowlpox virus (FWPV) (85.3% and 84.0% respectively) or Canarypox virus (CNPV) (62.0% and 63.4% respectively). Overall, FeP2, PEPV and FWPV have syntenic gene arrangements; however, major differences exist throughout their genomes. The most striking difference between FeP2 and the FWPV-like avipoxviruses is a large deletion of ~16 kbp from the central region of the genome of FeP2 deleting a cc-chemokine-like gene, two Variola virus B22R orthologues, an N1R/p28-like gene and a V-type Ig domain family gene. FeP2 and PEPV both encode orthologues of vaccinia virus C7L and Interleukin 10. PEPV contains a 77 amino acid long orthologue of Ubiquitin sharing 97% amino acid identity to human ubiquitin. The genome sequences of FeP2 and PEPV have greatly added to the limited repository of genomic information available for the Avipoxvirus genus. In the comparison of FeP2 and PEPV to existing sequences, FWPV and CNPV, we have established insights into African avipoxvirus evolution. Our data supports the independent evolution of these South African avipoxviruses from a common ancestral virus to FWPV and CNPV.
A novel mutation in FRMD7 causing X-linked idiopathic congenital nystagmus in a large family

PubMed Central

He, Xiang; Gu, Feng; Wang, Yujing; Yan, Jinting; Zhang, Meng; Huang, Shangzhi

2008-01-01

Purpose To identify the gene responsible for causing an X-linked idiopathic congenital nystagmus (XLICN) in a six-generation Chinese family. Methods Forty-nine members of an XLICN family were recruited and examined after obtaining informed consent. Affected male individuals were genotyped with microsatellite markers around the FRMD7 locus. Mutations were comprehensively screened by direct sequencing using gene specific primers. An X-inactivation pattern was investigated by X chromosome methylation analysis. Results The patients showed phenotypes consistent with XLICN. Genotype analysis showed that male affected individuals in the family shared a common haplotype with the selected markers. Sequencing FRMD7 revealed a G>T transversion (c.812G>T) in exon 9, which caused a conservative substitution of Cys to Phe at codon 271 (p.C271F). This mutation co-segregated with all affected individuals and was present in the obligate, non-penetrant female carriers. However, the mutation was not observed in unaffected familial males or 400 control males. Females with the mutant gene could be affected or carrier and they shared the same inactivated X chromosome harboring the mutation in blood cells, which showed there is no clear causal link between X-inactivation pattern and phenotype. Conclusions We identified a novel mutation in FRMD7 and confirmed the role of this mutation in the pathogenesis of X-linked congenital nystagmus. PMID:18246032
Comparison of taxon-specific versus general locus sets for targeted sequence capture in plant phylogenomics.

PubMed

Chau, John H; Rahfeldt, Wolfgang A; Olmstead, Richard G

2018-03-01

Targeted sequence capture can be used to efficiently gather sequence data for large numbers of loci, such as single-copy nuclear loci. Most published studies in plants have used taxon-specific locus sets developed individually for a clade using multiple genomic and transcriptomic resources. General locus sets can also be developed from loci that have been identified as single-copy and have orthologs in large clades of plants. We identify and compare a taxon-specific locus set and three general locus sets (conserved ortholog set [COSII], shared single-copy nuclear [APVO SSC] genes, and pentatricopeptide repeat [PPR] genes) for targeted sequence capture in Buddleja (Scrophulariaceae) and outgroups. We evaluate their performance in terms of assembly success, sequence variability, and resolution and support of inferred phylogenetic trees. The taxon-specific locus set had the most target loci. Assembly success was high for all locus sets in Buddleja samples. For outgroups, general locus sets had greater assembly success. Taxon-specific and PPR loci had the highest average variability. The taxon-specific data set produced the best-supported tree, but all data sets showed improved resolution over previous non-sequence capture data sets. General locus sets can be a useful source of sequence capture targets, especially if multiple genomic resources are not available for a taxon.
Genome-wide survey of Aux/IAA gene family members in potato (Solanum tuberosum): Identification, expression analysis, and evaluation of their roles in tuber development.

PubMed

Gao, Junpeng; Cao, Xiaoli; Shi, Shandang; Ma, Yuling; Wang, Kai; Liu, Shengjie; Chen, Dan; Chen, Qin; Ma, Haoli

2016-03-04

The Auxin/indole-3-acetic acid (Aux/IAA) genes encode short-lived nuclear proteins that are known to be involved in the primary cellular responses to auxin. To date, systematic analysis of the Aux/IAA genes in potato (Solanum tuberosum) has not been conducted. In this study, a total of 26 potato Aux/IAA genes were identified (designated from StIAA1 to StIAA26), and the distribution of four conserved domains shared by the StIAAs were analyzed based on multiple sequence alignment and a motif-based sequence analysis. A phylogenetic analysis of the Aux/IAA gene families of potato and Arabidopsis was also conducted. In order to assess the roles of StIAA genes in tuber development, the results of RNA-seq studies were reformatted to analyze the expression patterns of StIAA genes, and then verified by quantitative real-time PCR. A large number of StIAA genes (12 genes) were highly expressed in stolon organs and in during the tuber initiation and expansion developmental stages, and most of these genes were responsive to indoleacetic acid treatment. Our results suggested that StIAA genes were involved in the process of tuber development and provided insights into functional roles of potato Aux/IAA genes. Copyright © 2016 Elsevier Inc. All rights reserved.
Bedside Back to Bench: Building Bridges between Basic and Clinical Genomic Research.

PubMed

Manolio, Teri A; Fowler, Douglas M; Starita, Lea M; Haendel, Melissa A; MacArthur, Daniel G; Biesecker, Leslie G; Worthey, Elizabeth; Chisholm, Rex L; Green, Eric D; Jacob, Howard J; McLeod, Howard L; Roden, Dan; Rodriguez, Laura Lyman; Williams, Marc S; Cooper, Gregory M; Cox, Nancy J; Herman, Gail E; Kingsmore, Stephen; Lo, Cecilia; Lutz, Cathleen; MacRae, Calum A; Nussbaum, Robert L; Ordovas, Jose M; Ramos, Erin M; Robinson, Peter N; Rubinstein, Wendy S; Seidman, Christine; Stranger, Barbara E; Wang, Haoyi; Westerfield, Monte; Bult, Carol

2017-03-23

Genome sequencing has revolutionized the diagnosis of genetic diseases. Close collaborations between basic scientists and clinical genomicists are now needed to link genetic variants with disease causation. To facilitate such collaborations, we recommend prioritizing clinically relevant genes for functional studies, developing reference variant-phenotype databases, adopting phenotype description standards, and promoting data sharing. Published by Elsevier Inc.
Bedside Back to Bench: Building Bridges between Basic and Clinical Genomic Research

PubMed Central

Manolio, Teri A.; Fowler, Douglas M.; Starita, Lea M.; Haendel, Melissa A.; MacArthur, Daniel G.; Biesecker, Leslie G.; Worthey, Elizabeth; Chisholm, Rex L.; Green, Eric D.; Jacob, Howard J.; McLeod, Howard L.; Roden, Dan; Rodriguez, Laura Lyman; Williams, Marc S.; Cooper, Gregory M.; Cox, Nancy J.; Herman, Gail E.; Kingsmore, Stephen; Lo, Cecilia; Lutz, Cathleen; MacRae, Calum A.; Nussbaum, Robert L.; Ordovas, Jose M.; Ramos, Erin M.; Robinson, Peter N.; Rubinstein, Wendy S.; Seidman, Christine; Stranger, Barbara E.; Wang, Haoyi; Westerfield, Monte; Bult, Carol

2017-01-01

Summary Genome sequencing has revolutionized the diagnosis of genetic diseases. Close collaborations between basic scientists and clinical genomicists are now needed to link genetic variants with disease causation. To facilitate such collaborations we recommend prioritizing clinically relevant genes for functional studies, developing reference variant-phenotype databases, adopting phenotype description standards, and promoting data sharing. PMID:28340351
A gene family for acidic ribosomal proteins in Schizosaccharomyces pombe: two essential and two nonessential genes.

PubMed Central

Beltrame, M; Bianchi, M E

1990-01-01

We have cloned the genes for small acidic ribosomal proteins (A-proteins) of the fission yeast Schizosaccharomyces pombe. S. pombe contains four transcribed genes for small A-proteins per haploid genome, as is the case for Saccharomyces cerevisiae. In contrast, multicellular eucaryotes contain two transcribed genes per haploid genome. The four proteins of S. pombe, besides sharing a high overall similarity, form two couples of nearly identical sequences. Their corresponding genes have a very conserved structure and are transcribed to a similar level. Surprisingly, of each couple of genes coding for nearly identical proteins, one is essential for cell growth, whereas the other is not. We suggest that the unequal importance of the four small A-proteins for cell survival is related to their physical organization in 60S ribosomal subunits. Images PMID:2325655
Identification and qualification of 500 nuclear, single-copy, orthologous genes for the Eupulmonata (Gastropoda) using transcriptome sequencing and exon capture.

PubMed

Teasdale, Luisa C; Köhler, Frank; Murray, Kevin D; O'Hara, Tim; Moussalli, Adnan

2016-09-01

The qualification of orthology is a significant challenge when developing large, multiloci phylogenetic data sets from assembled transcripts. Transcriptome assemblies have various attributes, such as fragmentation, frameshifts and mis-indexing, which pose problems to automated methods of orthology assessment. Here, we identify a set of orthologous single-copy genes from transcriptome assemblies for the land snails and slugs (Eupulmonata) using a thorough approach to orthology determination involving manual alignment curation, gene tree assessment and sequencing from genomic DNA. We qualified the orthology of 500 nuclear, protein-coding genes from the transcriptome assemblies of 21 eupulmonate species to produce the most complete phylogenetic data matrix for a major molluscan lineage to date, both in terms of taxon and character completeness. Exon capture targeting 490 of the 500 genes (those with at least one exon >120 bp) from 22 species of Australian Camaenidae successfully captured sequences of 2825 exons (representing all targeted genes), with only a 3.7% reduction in the data matrix due to the presence of putative paralogs or pseudogenes. The automated pipeline Agalma retrieved the majority of the manually qualified 500 single-copy gene set and identified a further 375 putative single-copy genes, although it failed to account for fragmented transcripts resulting in lower data matrix completeness when considering the original 500 genes. This could potentially explain the minor inconsistencies we observed in the supported topologies for the 21 eupulmonate species between the manually curated and 'Agalma-equivalent' data set (sharing 458 genes). Overall, our study confirms the utility of the 500 gene set to resolve phylogenetic relationships at a range of evolutionary depths and highlights the importance of addressing fragmentation at the homolog alignment stage for probe design. © 2016 John Wiley & Sons Ltd.
CHD8 regulates neurodevelopmental pathways associated with autism spectrum disorder in neural progenitors

PubMed Central

Sugathan, Aarathi; Biagioli, Marta; Golzio, Christelle; Erdin, Serkan; Blumenthal, Ian; Manavalan, Poornima; Ragavendran, Ashok; Brand, Harrison; Lucente, Diane; Miles, Judith; Sheridan, Steven D.; Stortchevoi, Alexei; Kellis, Manolis; Haggarty, Stephen J.; Katsanis, Nicholas; Gusella, James F.; Talkowski, Michael E.

2014-01-01

Truncating mutations of chromodomain helicase DNA-binding protein 8 (CHD8), and of many other genes with diverse functions, are strong-effect risk factors for autism spectrum disorder (ASD), suggesting multiple mechanisms of pathogenesis. We explored the transcriptional networks that CHD8 regulates in neural progenitor cells (NPCs) by reducing its expression and then integrating transcriptome sequencing (RNA sequencing) with genome-wide CHD8 binding (ChIP sequencing). Suppressing CHD8 to levels comparable with the loss of a single allele caused altered expression of 1,756 genes, 64.9% of which were up-regulated. CHD8 showed widespread binding to chromatin, with 7,324 replicated sites that marked 5,658 genes. Integration of these data suggests that a limited array of direct regulatory effects of CHD8 produced a much larger network of secondary expression changes. Genes indirectly down-regulated (i.e., without CHD8-binding sites) reflect pathways involved in brain development, including synapse formation, neuron differentiation, cell adhesion, and axon guidance, whereas CHD8-bound genes are strongly associated with chromatin modification and transcriptional regulation. Genes associated with ASD were strongly enriched among indirectly down-regulated loci (P < 10−8) and CHD8-bound genes (P = 0.0043), which align with previously identified coexpression modules during fetal development. We also find an intriguing enrichment of cancer-related gene sets among CHD8-bound genes (P < 10−10). In vivo suppression of chd8 in zebrafish produced macrocephaly comparable to that of humans with inactivating mutations. These data indicate that heterozygous disruption of CHD8 precipitates a network of gene-expression changes involved in neurodevelopmental pathways in which many ASD-associated genes may converge on shared mechanisms of pathogenesis. PMID:25294932
Complete sequence of a novel 178-kilobase plasmid carrying bla(NDM-1) in a Providencia stuartii strain isolated in Afghanistan.

PubMed

McGann, Patrick; Hang, Jun; Clifford, Robert J; Yang, Yu; Kwak, Yoon I; Kuschner, Robert A; Lesho, Emil P; Waterman, Paige E

2012-04-01

In response to global concerns over the spread of the New Delhi metallo-β-lactamase gene 1, bla(NDM-1), a monthly surveillance program was initiated in September 2010. All carbapenem-resistant Gram-negative strains forwarded to our facility are screened for this gene. To date, 321 carbapenem-resistant isolates, encompassing 11 bacterial species, have been tested. In February 2011, two strains of Providencia stuartii, submitted from a military hospital in Afghanistan, tested positive for bla(NDM-1). Both strains were identical by pulsed-field gel electrophoresis (PFGE). bla(NDM-1) was carried on a large plasmid, pMR0211, which was sequenced by emulsion PCR and pyrosequencing. pMR0211 is 178,277 bp in size and belongs to incompatibility group A/C. The plasmid consists of a backbone with considerable homology to pAR060302 from Escherichia coli, and it retains many of the antibiotic resistance genes associated with it. The plasmid also shares common elements with the pNDM-HK plasmid, including bla(NDM-1), armA, and sul1. However, gene orientation is reversed, and a 3-kb fragment from this region is absent from pMR0211. pMR0211 also contains additional genes, including the aminoglycoside-modifying enzyme loci aadA and aac(6'), the quinolone resistance gene qnrA, a gene with highest homology to a U32 family peptidase from Shewanella amazonensis, and the bla(OXA-10) gene. The finding of this gene in an intrinsically colistin-resistant species such as Providencia stuartii is especially worrisome, as it renders the organism resistant to nearly every available antibiotic. The presence of multiple insertion sequences and transposons flanking the region containing the bla(NDM-1) gene further highlights the potential mobility associated with this gene.

Comparative genome analysis of Pediococcus damnosus LMG 28219, a strain well-adapted to the beer environment.

PubMed

Snauwaert, Isabel; Stragier, Pieter; De Vuyst, Luc; Vandamme, Peter

2015-04-03

Pediococcus damnosus LMG 28219 is a lactic acid bacterium dominating the maturation phase of Flemish acid beer productions. It proved to be capable of growing in beer, thereby resisting this environment, which is unfavorable for microbial growth. The molecular mechanisms underlying its metabolic capabilities and niche adaptations were unknown up to now. In the present study, whole-genome sequencing and comparative genome analysis were used to investigate this strain's mechanisms to reside in the beer niche, with special focus on not only stress and hop resistances but also folate biosynthesis and exopolysaccharide (EPS) production. The draft genome sequence of P. damnosus LMG 28219 harbored 183 contigs, including an intact prophage region and several coding sequences involved in plasmid replication. The annotation of 2178 coding sequences revealed the presence of many transporters and transcriptional regulators and several genes involved in oxidative stress response, hop resistance, de novo folate biosynthesis, and EPS production. Comparative genome analysis of P. damnosus LMG 28219 with Pediococcus claussenii ATCC BAA-344(T) (beer origin) and Pediococcus pentosaceus ATCC 25745 (plant origin) revealed that various hop resistance genes and genes involved in de novo folate biosynthesis were unique to the strains isolated from beer. This contrasted with the genes related to osmotic stress responses, which were shared between the strains compared. Furthermore, transcriptional regulators were enriched in the genomes of bacteria capable of growth in beer, suggesting that those cause rapid up- or down-regulation of gene expression. Genome sequence analysis of P. damnosus LMG 28219 provided insights into the underlying mechanisms of its adaptation to the beer niche. The results presented will enable analysis of the transcriptome and proteome of P. damnosus LMG 28219, which will result in additional knowledge on its metabolic activities.
Draft genome sequence of type strain HBR26T and description of Rhizobium aethiopicum sp. nov.

DOE PAGES

Aserse, Aregu Amsalu; Woyke, Tanja; Kyrpides, Nikos C.; ...

2017-01-26

Rhizobium aethiopicum sp. nov. is a newly proposed species within the genus Rhizobium. This species includes six rhizobial strains; which were isolated from root nodules of the legume plant Phaseolus vulgaris growing in soils of Ethiopia. The species fixes nitrogen effectively in symbiosis with the host plant P. vulgaris, and is composed of aerobic, Gram-negative staining, rod-shaped bacteria. The genome of type strain HBR26 T of R. aethiopicum sp. nov. was one of the rhizobial genomes sequenced as a part of the DOE JGI 2014 Genomic Encyclopedia project designed for soil and plant-associated and newly described type strains. The genomemore » sequence is arranged in 62 scaffolds and consists of 6,557,588 bp length, with a 61% G + C content and 6221 protein-coding and 86 RNAs genes. The genome of HBR26 T contains repABC genes (plasmid replication genes) homologous to the genes found in five differen t Rhizobium etli CFN42 T plasmids, suggesting that HBR26 T may have five additional replicons other than the chromosome. In the genome of HBR26 T , the nodulation genes nodB, nodC, nodS, nodI, nodJ and nodD are located in the same module, and organized in a similar way as nod genes found in the genome of other known common bean-nodulating rhizobial species. nodA gene is found in a different scaffold, but it is also very similar to nodA genes of other bean-nodulating rhizobial strains. Though HBR26 T is distinct on the phylogenetic tree and based on ANI analysis (the highest value 90.2% ANI with CFN42 T ) from other bean-nodulating species, these nod genes and most nitrogen-fixing genes found in the genome of HBR26 T share high identity with the corresponding genes of known bean-nodulating rhizobial species (96-100% identity). This suggests that symbiotic genes might be shared between bean-nodulating rhizobia through horizontal gene transfer. R. aethiopicum sp. nov. was grouped into the genus Rhizobium but was distinct from all recognized species of that genus by phylogenetic analyses of combined sequences of the housekeeping genes recA and glnII. The closest reference type strains for HBR26 T were R. etli CFN42 T (94% similarity of the combined recA and glnII sequences) and Rhizobium bangladeshense BLR175 T (93%). Genomic ANI calculation based on protein-coding genes also revealed that the closest reference strains were R. bangladeshense BLR175 T and R. etli CFN42 T with ANI values 91.8 and 90.2%, respectively. Nevertheless, the ANI values between HBR26 T and BLR175 T or CFN42 T are far lower than the cutoff value of ANI ( > = 96%) between strains in the same species, confirming that HBR26 T belongs to a novel species. Thus, on the basis of phylogenetic, comparative genomic analyses and ANI results, we formally propose the creation of R. aethiopicum sp. nov. with strain HBR26 T (=HAMBI 3550 T =LMG 29711 T ) as the type strain. The genome assembly and annotation data is deposited in the DOE JGI portal and also available at European Nucleotide Archive under accession numbers FMAJ01000001-FMAJ01000062.« less
Draft genome sequence of type strain HBR26T and description of Rhizobium aethiopicum sp. nov.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aserse, Aregu Amsalu; Woyke, Tanja; Kyrpides, Nikos C.

Rhizobium aethiopicum sp. nov. is a newly proposed species within the genus Rhizobium. This species includes six rhizobial strains; which were isolated from root nodules of the legume plant Phaseolus vulgaris growing in soils of Ethiopia. The species fixes nitrogen effectively in symbiosis with the host plant P. vulgaris, and is composed of aerobic, Gram-negative staining, rod-shaped bacteria. The genome of type strain HBR26 T of R. aethiopicum sp. nov. was one of the rhizobial genomes sequenced as a part of the DOE JGI 2014 Genomic Encyclopedia project designed for soil and plant-associated and newly described type strains. The genomemore » sequence is arranged in 62 scaffolds and consists of 6,557,588 bp length, with a 61% G + C content and 6221 protein-coding and 86 RNAs genes. The genome of HBR26 T contains repABC genes (plasmid replication genes) homologous to the genes found in five differen t Rhizobium etli CFN42 T plasmids, suggesting that HBR26 T may have five additional replicons other than the chromosome. In the genome of HBR26 T , the nodulation genes nodB, nodC, nodS, nodI, nodJ and nodD are located in the same module, and organized in a similar way as nod genes found in the genome of other known common bean-nodulating rhizobial species. nodA gene is found in a different scaffold, but it is also very similar to nodA genes of other bean-nodulating rhizobial strains. Though HBR26 T is distinct on the phylogenetic tree and based on ANI analysis (the highest value 90.2% ANI with CFN42 T ) from other bean-nodulating species, these nod genes and most nitrogen-fixing genes found in the genome of HBR26 T share high identity with the corresponding genes of known bean-nodulating rhizobial species (96-100% identity). This suggests that symbiotic genes might be shared between bean-nodulating rhizobia through horizontal gene transfer. R. aethiopicum sp. nov. was grouped into the genus Rhizobium but was distinct from all recognized species of that genus by phylogenetic analyses of combined sequences of the housekeeping genes recA and glnII. The closest reference type strains for HBR26 T were R. etli CFN42 T (94% similarity of the combined recA and glnII sequences) and Rhizobium bangladeshense BLR175 T (93%). Genomic ANI calculation based on protein-coding genes also revealed that the closest reference strains were R. bangladeshense BLR175 T and R. etli CFN42 T with ANI values 91.8 and 90.2%, respectively. Nevertheless, the ANI values between HBR26 T and BLR175 T or CFN42 T are far lower than the cutoff value of ANI ( > = 96%) between strains in the same species, confirming that HBR26 T belongs to a novel species. Thus, on the basis of phylogenetic, comparative genomic analyses and ANI results, we formally propose the creation of R. aethiopicum sp. nov. with strain HBR26 T (=HAMBI 3550 T =LMG 29711 T ) as the type strain. The genome assembly and annotation data is deposited in the DOE JGI portal and also available at European Nucleotide Archive under accession numbers FMAJ01000001-FMAJ01000062.« less
GWATCH: a web platform for automated gene association discovery analysis.

PubMed

Svitin, Anton; Malov, Sergey; Cherkasov, Nikolay; Geerts, Paul; Rotkevich, Mikhail; Dobrynin, Pavel; Shevchenko, Andrey; Guan, Li; Troyer, Jennifer; Hendrickson, Sher; Dilks, Holli Hutcheson; Oleksyk, Taras K; Donfield, Sharyne; Gomperts, Edward; Jabs, Douglas A; Sezgin, Efe; Van Natta, Mark; Harrigan, P Richard; Brumme, Zabrina L; O'Brien, Stephen J

2014-01-01

As genome-wide sequence analyses for complex human disease determinants are expanding, it is increasingly necessary to develop strategies to promote discovery and validation of potential disease-gene associations. Here we present a dynamic web-based platform - GWATCH - that automates and facilitates four steps in genetic epidemiological discovery: 1) Rapid gene association search and discovery analysis of large genome-wide datasets; 2) Expanded visual display of gene associations for genome-wide variants (SNPs, indels, CNVs), including Manhattan plots, 2D and 3D snapshots of any gene region, and a dynamic genome browser illustrating gene association chromosomal regions; 3) Real-time validation/replication of candidate or putative genes suggested from other sources, limiting Bonferroni genome-wide association study (GWAS) penalties; 4) Open data release and sharing by eliminating privacy constraints (The National Human Genome Research Institute (NHGRI) Institutional Review Board (IRB), informed consent, The Health Insurance Portability and Accountability Act (HIPAA) of 1996 etc.) on unabridged results, which allows for open access comparative and meta-analysis. GWATCH is suitable for both GWAS and whole genome sequence association datasets. We illustrate the utility of GWATCH with three large genome-wide association studies for HIV-AIDS resistance genes screened in large multicenter cohorts; however, association datasets from any study can be uploaded and analyzed by GWATCH.
Mitochondrial gene arrangement of the horseshoe crab Limulus polyphemus L.: conservation of major features among arthropod classes

NASA Technical Reports Server (NTRS)

Staton, J. L.; Daehler, L. L.; Brown, W. M.; Jacobs, D. K. (Principal Investigator)

1997-01-01

Numerous complete mitochondrial DNA sequences have been determined for species within two arthropod groups, insects and crustaceans, but there are none for a third, the chelicerates. Most mitochondrial gene arrangements reported for crustaceans and insect species are identical or nearly identical to that of Drosophila yakuba. Sequences across 36 of the gene boundaries in the mitochondrial DNA (mtDNA) of a representative chelicerate. Limulus polyphemus L., also reveal an arrangement like that of Drosophila yakuba. Only the position of the tRNA(LEU)(UUR) gene differs; in Limulus it is between the genes for tRNA(LEU)(CUN) and ND1. This positioning is also found in onychophorans, mollusks, and annelids, but not in insects and crustaceans, and indicates that tRNA(LEU)(CUN)-tRNA(LEU)(UUR)-ND1 was the ancestral gene arrangement for these groups, as suggested earlier. There are no differences in the relative arrangements of protein-coding and ribosomal RNA genes between Limulus and Drosophila, and none have been observed within arthropods. The high degree of similarity of mitochondrial gene arrangements within arthropods is striking, since some taxa last shared a common ancestor before the Cambrian, and contrasts with the extensive mtDNA rearrangements occasionally observed within some other metazoan phyla (e.g., mollusks and nematodes).
In vitro mapping of Myotonic Dystrophy (DM) gene promoter

DOE Office of Scientific and Technical Information (OSTI.GOV)

Storbeck, C.J.; Sabourin, L.; Baird, S.

1994-09-01

The Myotonic Dystrophy Kinase (DMK) gene has been cloned and shared homology to serine/threonine protein kinases. Overexpression of this gene in stably transfected mouse myoblasts has been shown to inhibit fusion into myotubes while myoblasts stably transfected with an antisense construct show increased fusion potential. These experiments, along with data showing that the DM gene is highly expressed in muscle have highlighted the possibility of DMK being involved in myogenesis. The promoter region of the DM gene lacks a consensus TATA box and CAAT box, but harbours numerous transcription binding sites. Clones containing extended 5{prime} upstream sequences (UPS) of DMKmore » only weakly drive the reporter gene chloramphenicol acetyl transferase (CAT) when transfected into C2C12 mouse myoblasts. However, four E-boxes are present in the first intron of the DM gene and transient assays show increased expression of the CAT gene when the first intron is present downstream of these 5{prime} UPS in an orientation dependent manner. Comparison between mouse and human sequence reveals that the regions in the first intron where the E-boxes are located are highly conserved. The mapping of the promoter and the importance of the first intron in the control of DMK expression will be presented.« less
HomozygosityMapper2012--bridging the gap between homozygosity mapping and deep sequencing.

PubMed

Seelow, Dominik; Schuelke, Markus

2012-07-01

Homozygosity mapping is a common method to map recessive traits in consanguineous families. To facilitate these analyses, we have developed HomozygosityMapper, a web-based approach to homozygosity mapping. HomozygosityMapper allows researchers to directly upload the genotype files produced by the major genotyping platforms as well as deep sequencing data. It detects stretches of homozygosity shared by the affected individuals and displays them graphically. Users can interactively inspect the underlying genotypes, manually refine these regions and eventually submit them to our candidate gene search engine GeneDistiller to identify the most promising candidate genes. Here, we present the new version of HomozygosityMapper. The most striking new feature is the support of Next Generation Sequencing *.vcf files as input. Upon users' requests, we have implemented the analysis of common experimental rodents as well as of important farm animals. Furthermore, we have extended the options for single families and loss of heterozygosity studies. Another new feature is the export of *.bed files for targeted enrichment of the potential disease regions for deep sequencing strategies. HomozygosityMapper also generates files for conventional linkage analyses which are already restricted to the possible disease regions, hence superseding CPU-intensive genome-wide analyses. HomozygosityMapper is freely available at http://www.homozygositymapper.org/.
DNA barcoding of freshwater fishes and the development of a quantitative qPCR assay for the species-specific detection and quantification of fish larvae from plankton samples.

PubMed

Loh, W K W; Bond, P; Ashton, K J; Roberts, D T; Tibbetts, I R

2014-08-01

The barcoding of mitochondrial cytochrome c oxidase subunit 1 (coI) gene was amplified and sequenced from 16 species of freshwater fishes found in Lake Wivenhoe (south-eastern Queensland, Australia) to support monitoring of reservoir fish populations, ecosystem function and water health. In this study, 630-650 bp sequences of the coI barcoding gene from 100 specimens representing 15 genera, 13 families and two subclasses of fishes allowed 14 of the 16 species to be identified and differentiated. The mean ± s.e. Kimura 2 parameter divergence within and between species was 0.52 ± 0.10 and 23.8 ± 2.20% respectively, indicating that barcodes can be used to discriminate most of the fish species accurately. The two terapontids, Amniataba percoides and Leiopotherapon unicolor, however, shared coI DNA sequences and could not be differentiated using this gene. A barcoding database was established and a qPCR assay was developed using coI sequences to identify and quantify proportional abundances of fish species in ichthyoplankton samples from Lake Wivenhoe. These methods provide a viable alternative to the time-consuming process of manually enumerating and identifying ichthyoplankton samples. © 2014 The Fisheries Society of the British Isles.
Identification of Human Lineage-Specific Transcriptional Coregulators Enabled by a Glossary of Binding Modules and Tunable Genomic Backgrounds.

PubMed

Mariani, Luca; Weinand, Kathryn; Vedenko, Anastasia; Barrera, Luis A; Bulyk, Martha L

2017-09-27

Transcription factors (TFs) control cellular processes by binding specific DNA motifs to modulate gene expression. Motif enrichment analysis of regulatory regions can identify direct and indirect TF binding sites. Here, we created a glossary of 108 non-redundant TF-8mer "modules" of shared specificity for 671 metazoan TFs from publicly available and new universal protein binding microarray data. Analysis of 239 ENCODE TF chromatin immunoprecipitation sequencing datasets and associated RNA sequencing profiles suggest the 8mer modules are more precise than position weight matrices in identifying indirect binding motifs and their associated tethering TFs. We also developed GENRE (genomically equivalent negative regions), a tunable tool for construction of matched genomic background sequences for analysis of regulatory regions. GENRE outperformed four state-of-the-art approaches to background sequence construction. We used our TF-8mer glossary and GENRE in the analysis of the indirect binding motifs for the co-occurrence of tethering factors, suggesting novel TF-TF interactions. We anticipate that these tools will aid in elucidating tissue-specific gene-regulatory programs. Copyright © 2017 Elsevier Inc. All rights reserved.
Principles of long noncoding RNA evolution derived from direct comparison of transcriptomes in 17 species.

PubMed

Hezroni, Hadas; Koppstein, David; Schwartz, Matthew G; Avrutin, Alexandra; Bartel, David P; Ulitsky, Igor

2015-05-19

The inability to predict long noncoding RNAs from genomic sequence has impeded the use of comparative genomics for studying their biology. Here, we develop methods that use RNA sequencing (RNA-seq) data to annotate the transcriptomes of 16 vertebrates and the echinoid sea urchin, uncovering thousands of previously unannotated genes, most of which produce long intervening noncoding RNAs (lincRNAs). Although in each species, >70% of lincRNAs cannot be traced to homologs in species that diverged >50 million years ago, thousands of human lincRNAs have homologs with similar expression patterns in other species. These homologs share short, 5'-biased patches of sequence conservation nested in exonic architectures that have been extensively rewired, in part by transposable element exonization. Thus, over a thousand human lincRNAs are likely to have conserved functions in mammals, and hundreds beyond mammals, but those functions require only short patches of specific sequences and can tolerate major changes in gene architecture. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Identification and Characterization of a Pesticide Degrading Flavobacterium Species EMBS0145 by 16S rRNA Gene Sequencing.

PubMed

Nayarisseri, Anuraj; Suppahia, Anjana; Nadh, Anuroopa G; Nair, Achuthsankar S

2015-06-01

Organophosphates like chlorpyrifos, diazinon, or malathion have become most common and indisputably most toxic pest control agents that adversely affects the human nervous system even at low levels of exposure. Because of their relatively low cost and ability to be applied on a wide range of target insects and crop, organophosphorus pesticides account for a large share of all insecticides used in India, and this in turn raises severe health concerns. In this view, the present investigation was aimed to identify novel species of Flavobacterium bacteria which is bestowed with the capacity to degrade pesticides like chlorpyrifos, diazinon, or malathion. The bacterium was isolated from agricultural soil collected from Guntur District, Andhra Pradesh, India. The samples were serially diluted, and the aliquots were incubated for a suitable time following which the suspected colony was subjected to 16S rRNA gene sequencing. The sequence thus obtained was aligned pairwise against Flavobacterium species, which resulted in identification of novel species of Flavobacterium later which was named as EMBS0145 and sequence was deposited in GenBank with Accession Number: JN794045.
Identification and characterization of a pesticide degrading flavobacterium species EMBS0145 by 16S rRNA gene sequencing.

PubMed

Nayarisseri, Anuraj; Suppahia, Anjana; Nadh, Anuroopa G; Nair, Achuthsankar S

2014-08-09

Organophosphates (OPs) like chlorpyrifos, diazinon, or malathion have become most common and indisputably most toxic pest-control agents that adversely affects the human nervous system even at low levels of exposure. Because of their relatively low cost and ability to be applied on a wide range of target insects and crop, organophosphorus pesticides account for a large share of all insecticides used in India, this in turn raises severe health concerns. In this view, the present investigation was aimed to identify novel species of Flavobacterium bacteria which is bestowed with the capacity to degrade pesticides like chlorpyrifos, diazinon or malathion. The bacterium was isolated from agricultural soil collected from Guntur District, Andhra Pradesh, India. The samples were serially diluted and the aliquots were incubated for a suitable time following which the suspected colony was subjected to 16S rRNA gene sequencing. The sequence thus obtained was aligned pairwise against Flavobacterium species, which resulted in identification of novel species of Flavobacterium later which was named as EMBS0145 and sequence was deposited in GenBank with accession number JN794045.
Resequencing Pathogen Microarray (RPM) for prospective detection and identification of emergent pathogen strains and variants

NASA Astrophysics Data System (ADS)

Tibbetts, Clark; Lichanska, Agnieszka M.; Borsuk, Lisa A.; Weslowski, Brian; Morris, Leah M.; Lorence, Matthew C.; Schafer, Klaus O.; Campos, Joseph; Sene, Mohamadou; Myers, Christopher A.; Faix, Dennis; Blair, Patrick J.; Brown, Jason; Metzgar, David

2010-04-01

High-density resequencing microarrays support simultaneous detection and identification of multiple viral and bacterial pathogens. Because detection and identification using RPM is based upon multiple specimen-specific target pathogen gene sequences generated in the individual test, the test results enable both a differential diagnostic analysis and epidemiological tracking of detected pathogen strains and variants from one specimen to the next. The RPM assay enables detection and identification of pathogen sequences that share as little as 80% sequence similarity to prototype target gene sequences represented as detector tiles on the array. This capability enables the RPM to detect and identify previously unknown strains and variants of a detected pathogen, as in sentinel cases associated with an infectious disease outbreak. We illustrate this capability using assay results from testing influenza A virus vaccines configured with strains that were first defined years after the design of the RPM microarray. Results are also presented from RPM-Flu testing of three specimens independently confirmed to the positive for the 2009 Novel H1N1 outbreak strain of influenza virus.
Preferential V beta gene usage and lack of junctional sequence conservation among human T cell receptors specific for a tetanus toxin- derived peptide: evidence for a dominant role of a germline-encoded V region in antigen/major histocompatibility complex recognition

PubMed Central

1992-01-01

To investigate the structural and genetic basis of the T cell response to defined peptide/major histocompatibility (MHC) class II complexes in humans, we established a large panel of T cell clones (61) from donors of different HLA-DR haplotypes and reactive with a tetanus toxin- derived peptide (tt830-844) recognized in association with most DR molecules (universal peptide). By using a bacterial enterotoxin-based proliferation assay and cDNA sequencing, we found preferential use of a particular V beta region gene segment, V beta 2.1, in three of the individuals studied (64%, n = 58), irrespective of whether the peptide was presented by the DR6wcI, DR4w4, or DRw11.1 and DRw11.2 alleles, demonstrating that shared MHC class II antigens are not required for shared V beta gene use by T cell receptors (TCRs) specific for this peptide. V alpha gene use was more heterogeneous, with at least seven different V alpha segments derived from five distinct families encoding alpha chains able to pair with V beta 2.1 chains to form a tt830-844/DR- specific binding site. Several cases were found of clones restricted to different DR alleles that expressed identical V beta and (or very closely related) V alpha gene segments and that differed only in their junctional sequences. Thus, changes in the putative complementary determining region 3 (CDR3) of the TCR may, in certain cases, alter MHC specificity and maintain peptide reactivity. Finally, in contrast to what has been observed in other defined peptide/MHC systems, a striking heterogeneity was found in the junctional regions of both alpha and beta chains, even for TCRs with identical V alpha and/or V beta gene segments and the same restriction. Among 14 anti-tt830-844 clones using the V beta 2.1 gene segment, 14 unique V beta-D-J beta junctions were found, with no evident conservation in length and/or amino acid composition. One interpretation for this apparent lack of coselection of specific junctional sequences in the context of a common V element, V beta 2.1, is that this V region plays a dominant role in the recognition of the tt830-844/DR complex. PMID:1371303
Two Orangutan Species Have Evolved Different KIR Alleles and Haplotypes1

PubMed Central

Guethlein, Lisbeth A.; Norman, Paul J.; Heijmans, Corinne M. C.; de Groot, Natasja G.; Hilton, Hugo G.; Babrzadeh, Farbod; Abi-Rached, Laurent; Bontrop, Ronald E.; Parham, Peter

2017-01-01

The immune and reproductive functions of human Natural Killer (NK) cells are regulated by interactions of the C1 and C2 epitopes of HLA-C with C1-specific and C2-specific lineage III killer cell immunoglobulin-like receptors (KIR). This rapidly evolving and diverse system of ligands and receptors is restricted to humans and great apes. In this context, the orangutan has particular relevance because it represents an evolutionary intermediate, one having the C1 epitope and corresponding KIR, but lacking the C2 epitope. Through a combination of direct sequencing, KIR genotyping and data mining from the Great Ape Genome Project (GAGP) we characterized the KIR alleles and haplotypes for panels of ten Bornean orangutans and 19 Sumatran orangutans. The orangutan KIR haplotypes have between five and ten KIR genes. The seven orangutan lineage III KIR genes all locate to the centromeric region of the KIR locus, whereas their human counterparts also populate the telomeric region. One lineage III KIR gene is Bornean-specific, one is Sumatran-specific and five are shared. Of twelve KIR gene-content haplotypes five are Bornean-specific, five are Sumatran-specific and two are shared. The haplotypes have different combinations of genes encoding activating and inhibitory C1 receptors that can be of higher or lower affinity. All haplotypes encode an inhibitory C1 receptor, but only some haplotypes encode an activating C1 receptor. Of 130 KIR alleles, 55 are Bornean-specific, 65 are Sumatran specific and ten are shared. PMID:28264973
An ergot alkaloid biosynthesis gene and clustered hypothetical genes from Aspergillus fumigatus.

PubMed

Coyle, Christine M; Panaccione, Daniel G

2005-06-01

The ergot alkaloids are a family of indole-derived mycotoxins with a variety of significant biological activities. Aspergillus fumigatus, a common airborne fungus and opportunistic human pathogen, and several fungi in the relatively distant taxon Clavicipitaceae (clavicipitaceous fungi) produce different sets of ergot alkaloids. The ergot alkaloids of these divergent fungi share a four-member ergoline ring but differ in the number, type, and position of the side chains. Several genes required for ergot alkaloid production are known in the clavicipitaceous fungi, and these genes are clustered in the genome of the ergot fungus Claviceps purpurea. We investigated whether the ergot alkaloids of A. fumigatus have a common biosynthetic and genetic origin with those of the clavicipitaceous fungi. A homolog of dmaW, the gene controlling the determinant step in the ergot alkaloid pathway of clavicipitaceous fungi, was identified in the A. fumigatus genome. Knockout of dmaW eliminated all known ergot alkaloids from A. fumigatus, and complementation of the mutation restored ergot alkaloid production. Clustered with dmaW in the A. fumigatus genome are sequences corresponding to five genes previously proposed to encode steps in the ergot alkaloid pathway of C. purpurea, as well as additional sequences whose deduced protein products are consistent with their involvement in the ergot alkaloid pathway. The corresponding genes have similarities in their nucleotide sequences, but the orientations and positions within the cluster of several of these genes differ. The data indicate that the ergot alkaloid biosynthetic capabilities in A. fumigatus and the clavicipitaceous fungi had a common origin.
An Ergot Alkaloid Biosynthesis Gene and Clustered Hypothetical Genes from Aspergillus fumigatus†

PubMed Central

Coyle, Christine M.; Panaccione, Daniel G.

2005-01-01

The ergot alkaloids are a family of indole-derived mycotoxins with a variety of significant biological activities. Aspergillus fumigatus, a common airborne fungus and opportunistic human pathogen, and several fungi in the relatively distant taxon Clavicipitaceae (clavicipitaceous fungi) produce different sets of ergot alkaloids. The ergot alkaloids of these divergent fungi share a four-member ergoline ring but differ in the number, type, and position of the side chains. Several genes required for ergot alkaloid production are known in the clavicipitaceous fungi, and these genes are clustered in the genome of the ergot fungus Claviceps purpurea. We investigated whether the ergot alkaloids of A. fumigatus have a common biosynthetic and genetic origin with those of the clavicipitaceous fungi. A homolog of dmaW, the gene controlling the determinant step in the ergot alkaloid pathway of clavicipitaceous fungi, was identified in the A. fumigatus genome. Knockout of dmaW eliminated all known ergot alkaloids from A. fumigatus, and complementation of the mutation restored ergot alkaloid production. Clustered with dmaW in the A. fumigatus genome are sequences corresponding to five genes previously proposed to encode steps in the ergot alkaloid pathway of C. purpurea, as well as additional sequences whose deduced protein products are consistent with their involvement in the ergot alkaloid pathway. The corresponding genes have similarities in their nucleotide sequences, but the orientations and positions within the cluster of several of these genes differ. The data indicate that the ergot alkaloid biosynthetic capabilities in A. fumigatus and the clavicipitaceous fungi had a common origin. PMID:15933009
Sequence analyses reveal that a TPR–DP module, surrounded by recombinable flanking introns, could be at the origin of eukaryotic Hop and Hip TPR–DP domains and prokaryotic GerD proteins

PubMed Central

Papandreou, Nikolaos; Chomilier, Jacques

2008-01-01

The co-chaperone Hop [heat shock protein (HSP) organising protein] is known to bind both Hsp70 and Hsp90. Hop comprises three repeats of a tetratricopeptide repeat (TPR) domain, each consisting of three TPR motifs. The first and last TPR domains are followed by a domain containing several dipeptide (DP) repeats called the DP domain. These analyses suggest that the hop genes result from successive recombination events of an ancestral TPR–DP module. From a hydrophobic cluster analysis of homologous Hop protein sequences derived from gene families, we can postulate that shifts in the open reading frames are at the origin of the present sequences. Moreover, these shifts can be related to the presence or absence of biological function. We propose to extend the family of Hop co-chaperons into the kingdom of bacteria, as several structurally related genes have been identified by hydrophobic cluster analysis. We also provide evidence of common structural characteristics between hop and hip genes, suggesting a shared precursor of ancestral TPR–DP domains. Electronic supplementary material The online version of this article (doi:10.1007/s12192-008-0083-8) contains supplementary material, which is available to authorized users. PMID:18987995
Complete Chloroplast Genome Sequence of Coptis chinensis Franch. and Its Evolutionary History

PubMed Central

He, Yang; Deng, Cao; Fan, Gang; Qin, Shishang

2017-01-01

The Coptis chinensis Franch. is an important medicinal plant from the Ranunculales. We used next generation sequencing technology to determine the complete chloroplast genome of C. chinensis. This genome is 155,484 bp long with 38.17% GC content. Two 26,758 bp long inverted repeats separated the genome into a typical quadripartite structure. The C. chinensis chloroplast genome consists of 128 gene loci, including eight rRNA gene loci, 28 tRNA gene loci, and 92 protein-coding gene loci. Most of the SSRs in C. chinensis are poly-A/T. The numbers of mononucleotide SSRs in C. chinensis and other Ranunculaceae species are fewer than those in Berberidaceae species, while the number of dinucleotide SSRs is greater than that in the Berberidaceae. C. chinensis diverged from other Ranunculaceae species an estimated 81 million years ago (Mya). The divergence between Ranunculaceae and Berberidaceae was ~111 Mya, while the Ranunculales and Magnoliaceae shared a common ancestor during the Jurassic, ~153 Mya. Position 104 of the C. chinensis ndhG protein was identified as a positively selected site, indicating possible selection for the photosystem-chlororespiration system in C. chinensis. In summary, the complete sequencing and annotation of the C. chinensis chloroplast genome will facilitate future studies on this important medicinal species. PMID:28698879
Next Generation Sequencing and ALS: known genes, different phenotyphes.

PubMed

Campopiano, Rosa; Ryskalin, Larisa; Giardina, Emiliano; Zampatti, Stefania; Busceti, Carla L; Biagioni, Francesca; Ferese, Rosangela; Storto, Marianna; Gambardella, Stefano; Fornai, Francesco

2017-12-01

Amyotrophic lateral sclerosis (ALS) is fatal neurodegenerative disease clinically characterized by upper and lower motor neuron dysfunction resulting in rapidly progressive paralysis and death from respiratory failure. Most cases appear to be sporadic, but 5-10 % of cases have a family history of the disease, and over the last decade, identification of mutations in about 20 genes predisposing to these disorders has provided the means to better understand their pathogenesis. Next Generation sequencing (NGS) is an advanced high-throughput DNA sequencing technology which have rapidly contributed to an acceleration in the discovery of genetic risk factors for both familial and sporadic neurological and neurodegenerative diseases. These strategies allowed to rapidly identify disease-associated variants and genetic risk factors for both familial (fALS) and sporadic ALS (sALS), strongly contributing to the knowledge of the genetic architecture of ALS. Moreover, as the number of ALS genes grows, many of the proteins they encode are in intracellular processes shared with other known diseases, suggesting an overlapping of clinical and phatological features between different diseases. To emphasize this concept, the review focuses on genes coding for Valosin-containing protein (VPC) and two Heterogeneous nuclear RNA-binding proteins (HNRNPA1 and hnRNPA2B1), recently idefied through NGS, where different mutations have been associated in both ALS and other neurological and neurodegenerative diseases.

Somatic mutations in benign breast disease tissue and risk of subsequent invasive breast cancer.

PubMed

Rohan, Thomas E; Miller, Christopher A; Li, Tiandao; Wang, Yihong; Loudig, Olivier; Ginsberg, Mindy; Glass, Andrew; Mardis, Elaine

2018-06-06

Insights into the molecular pathogenesis of breast cancer might come from molecular analysis of tissue from early stages of the disease. We conducted a case-control study nested in a cohort of women who had biopsy-confirmed benign breast disease (BBD) diagnosed between 1971 and 2006 at Kaiser Permanente Northwest and who were followed to mid-2015 to ascertain subsequent invasive breast cancer (IBC); cases (n = 218) were women with BBD who developed subsequent IBC and controls, individually matched (1:1) to cases, were women with BBD who did not develop IBC in the same follow-up interval as that for the corresponding case. Targeted sequence capture and sequencing were performed for 83 genes of importance in breast cancer. There were no significant case-control differences in mutation burden overall, for non-silent mutations, for individual genes, or with respect either to the nature of the gene mutations or to mutational enrichment at the pathway level. For seven subjects with DNA from the BBD and ipsilateral IBC, virtually no mutations were shared. This study, the first to use a targeted multi-gene sequencing approach on early breast cancer precursor lesions to investigate the genomic basis of the disease, showed that somatic mutations detected in BBD tissue were not associated with breast cancer risk.
Complete mitochondrial genome of Porzana fusca and Porzana pusilla and phylogenetic relationship of 16 Rallidae species.

PubMed

Chen, Peng; Han, Yuqing; Zhu, Chaoying; Gao, Bin; Ruan, Luzhang

2017-12-01

The complete mitochondrial genome sequences of Porzana fusca and Porzana pusilla were determined. The two avian species share a high degree of homology in terms of mitochondrial genome organization and gene arrangement. Their corresponding mitochondrial genomes are 16,935 and 16,978 bp and consist of 37 genes and a control region. Their PCGs were both 11,365 bp long and have similar structure. Their tRNA gene sequences could be folded into canonical cloverleaf secondary structure, except for tRNA Ser (AGY) , which lost its "DHU" arm. Based on the concatenated nucleotide sequences of the complete mitochondrial DNA genes of 16 Rallidae species, reconstruction of phylogenetic trees and analysis of the molecular clock of P. fusca and P. pusilla indicated that these species from a sister group, which in turn are sister group to Rallina eurizonoides. The genus Gallirallus is a sister group to genus Lewinia, and these groups in turn are sister groups to genus Porphyrio. Moreover, molecular clock analyses suggested that the basal divergence of Rallidae could be traced back to 40.47 (41.46‒39.45) million years ago (Mya), and the divergence of Porzana occurred approximately 5.80 (15.16‒0.79) Mya.
Chalcone synthase genes from milk thistle (Silybum marianum): isolation and expression analysis.

PubMed

Sanjari, Sepideh; Shobbar, Zahra Sadat; Ebrahimi, Mohsen; Hasanloo, Tahereh; Sadat-Noori, Seyed-Ahmad; Tirnaz, Soodeh

2015-12-01

Silymarin is a flavonoid compound derived from milk thistle (Silybum marianum) seeds which has several pharmacological applications. Chalcone synthase (CHS) is a key enzyme in the biosynthesis of flavonoids; thereby, the identification of CHS encoding genes in milk thistle plant can be of great importance. In the current research, fragments of CHS genes were amplified using degenerate primers based on the conserved parts of Asteraceae CHS genes, and then cloned and sequenced. Analysis of the resultant nucleotide and deduced amino acid sequences led to the identification of two different members of CHS gene family,SmCHS1 and SmCHS2. Third member, full-length cDNA (SmCHS3) was isolated by rapid amplification of cDNA ends (RACE), whose open reading frame contained 1239 bp including exon 1 (190 bp) and exon 2 (1049 bp), encoding 63 and 349 amino acids, respectively. In silico analysis of SmCHS3 sequence contains all the conserved CHS sites and shares high homology with CHS proteins from other plants.Real-time PCR analysis indicated that SmCHS1 and SmCHS3 had the highest transcript level in petals in the early flowering stage and in the stem of five upper leaves, followed by five upper leaves in the mid-flowering stage which are most probably involved in anthocyanin and silymarin biosynthesis.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Howe, Adina; Yang, Fan; Williams, Ryan J.

Despite the central role of soil microbial communities in global carbon (C) cycling, little is known about soil microbial community structure and even less about their metabolic pathways. Efforts to characterize soil communities often focus on identifying differences in gene content across environmental gradients, but an alternative question is what genes are similar in soils. These genes may indicate critical species or potential functions that are required in all soils. Here we identified the “core” set of C cycling sequences widely present in multiple soil metagenomes from a fertilized prairie (FP). Of 226,887 sequences associated with known enzymes involved inmore » the synthesis, metabolism, and transport of carbohydrates, 843 were identified to be consistently prevalent across four replicate soil metagenomes. This core metagenome was functionally and taxonomically diverse, representing five enzyme classes and 99 enzyme families within the CAZy database. Though it only comprised 0.4% of all CAZy-associated genes identified in FP metagenomes, the core was found to be comprised of functions similar to those within cumulative soils. The FP CAZy-associated core sequences were present in multiple publicly available soil metagenomes and most similar to soils sharing geographic proximity. As a result, in soil ecosystems, where high diversity remains a key challenge for metagenomic investigations, these core genes represent a subset of critical functions necessary for carbohydrate metabolism, which can be targeted to evaluate important C fluxes in these and other similar soils.« less
A new polymorphic and multicopy MHC gene family related to nonmammalian class I

DOE Office of Scientific and Technical Information (OSTI.GOV)

Leelayuwat, C.; Degli-Esposti, M.A.; Abraham, L.J.

1994-12-31

The authors have used genomic analysis to characterize a region of the central major histocompatibility complex (MHC) spanning {approximately} 300 kilobases (kb) between TNF and HLA-B. This region has been suggested to carry genetic factors relevant to the development of autoimmune diseases such as myasthenia gravis (MG) and insulin dependent diabetes mellitus (IDDM). Genomic sequence was analyzed for coding potential, using two neural network programs, GRAIL and GeneParser. A genomic probe, JAB, containing putative coding sequences (PERB11) located 60 kb centromeric of HLA-B, was used for northern analysis of human tissues. Multiple transcripts were detected. Southern analysis of genomic DNAmore » and overlapping YAC clones, covering the region from BAT1 to HLA-F, indicated that there are at least five copies of PERB11, four of which are located within this region of the MHC. The partial cDNA sequence of PERB11 was obtained from poly-A RNA derived from skeletal muscle. The putative amino acid sequence of PERB11 shares {approximately} 30% identity to MHC class I molecules from various species, including reptiles, chickens, and frogs, as well as to other MHC class I-like molecules, such as the IgG FcR of the mouse and rat and the human Zn-{alpha}2-glycoprotein. From direct comparison of amino acid sequences, it is concluded that PERB11 is a distinct molecule more closely related to nonmammalian than known mammalian MHC class I molecules. Genomic sequence analysis of PERB11 from five MHC ancestral haplotypes (AH) indicated that the gene is polymorphic at both DNA and protein level. The results suggest that the authors have identified a novel polymorphic gene family with multiple copies within the MHC. 48 refs., 10 figs., 2 tabs.« less
Epidemiology, pathology, and genetic analysis of a canine distemper epidemic in Namibia.

PubMed

Gowtage-Sequeira, Sonya; Banyard, Ashley C; Barrett, Tom; Buczkowski, Hubert; Funk, Stephan M; Cleaveland, Sarah

2009-10-01

Severe population declines have resulted from the spillover of canine distemper virus (CDV) into susceptible wildlife, with both domestic and wild canids being involved in the maintenance and transmission of the virus. This study (March 2001 to October 2003) collated case data, serologic, pathologic, and molecular data to describe the spillover of CDV from domestic dogs (Canis familiaris) to black-backed jackals (Canis mesomelas) during an epidemic on the Namibian coast. Antibody prevalence in jackals peaked at 74.1%, and the clinical signs and histopathologic observations closely resembled those observed in domestic dog cases. Viral RNA was isolated from the brain of a domestic dog from the outbreak area. Sequence data from the phosphoprotein (P) gene and the hemagglutinin (H) genes were used for phylogenetic analyses. The P gene sequence from the domestic dog shared 98% identity with the sequence data available for other CDV isolates of African carnivores. For the H gene, the two sequences available from the outbreak that decimated the lion population in Tanzania in 1994 were the closest match with the Namibian sample, being 94% identical across 1,122 base pairs (bp). Phylogenetic analyses based on this region clustered the Namibian sample with the CDV that is within the morbilliviruses. This is the first description of an epidemic involving black-backed jackals in Namibia, demonstrating that this species has the capacity for rapid and large-scale dissemination of CDV. This work highlights the threat posed to endangered wildlife in Namibia by the spillover of CDV from domestic dog populations. Very few sequence data are currently available for CDV isolates from African carnivores, and this work provides the first sequence data from a Namibian CDV isolate.
Comparative genomics and evolution of the amylase-binding proteins of oral streptococci.

PubMed

Haase, Elaine M; Kou, Yurong; Sabharwal, Amarpreet; Liao, Yu-Chieh; Lan, Tianying; Lindqvist, Charlotte; Scannapieco, Frank A

2017-04-20

Successful commensal bacteria have evolved to maintain colonization in challenging environments. The oral viridans streptococci are pioneer colonizers of dental plaque biofilm. Some of these bacteria have adapted to life in the oral cavity by binding salivary α-amylase, which hydrolyzes dietary starch, thus providing a source of nutrition. Oral streptococcal species bind α-amylase by expressing a variety of amylase-binding proteins (ABPs). Here we determine the genotypic basis of amylase binding where proteins of diverse size and function share a common phenotype. ABPs were detected in culture supernatants of 27 of 59 strains representing 13 oral Streptococcus species screened using the amylase-ligand binding assay. N-terminal sequences from ABPs of diverse size were obtained from 18 strains representing six oral streptococcal species. Genome sequencing and BLAST searches using N-terminal sequences, protein size, and key words identified the gene associated with each ABP. Among the sequenced ABPs, 14 matched amylase-binding protein A (AbpA), 6 matched amylase-binding protein B (AbpB), and 11 unique ABPs were identified as peptidoglycan-binding, glutamine ABC-type transporter, hypothetical, or choline-binding proteins. Alignment and phylogenetic analyses performed to ascertain evolutionary relationships revealed that ABPs cluster into at least six distinct, unrelated families (AbpA, AbpB, and four novel ABPs) with no phylogenetic evidence that one group evolved from another, and no single ancestral gene found within each group. AbpA-like sequences can be divided into five subgroups based on the N-terminal sequences. Comparative genomics focusing on the abpA gene locus provides evidence of horizontal gene transfer. The acquisition of an ABP by oral streptococci provides an interesting example of adaptive evolution.
Polynucleobacter meluiroseus sp. nov., a bacterium isolated from a lake located in the mountains of the Mediterranean island of Corsica.

PubMed

Pitt, Alexandra; Schmidt, Johanna; Lang, Elke; Whitman, William B; Woyke, Tanja; Hahn, Martin W

2018-06-01

Strain AP-Melu-1000-B4 was isolated from a lake located in the mountains of the Mediterranean island of Corsica (France). Phenotypic, chemotaxonomic and genomic traits were investigated. Phylogenetic analyses based on 16S rRNA gene sequencing referred the strain to the cryptic species complex PnecC within the genus Polynucleobacter. The strain encoded genes for biosynthesis of proteorhodopsin and retinal. When pelleted by centrifugation the strain showed an intense rose colouring. Major fatty acids were C16 : 1ω7c, C16 : 0, C18 : 1ω7c and summed feature 2 (C16 : 1 isoI and C14 : 0-3OH). The sequence of the 16S rRNA gene contained an indel which was not present in any previously described Polynucleobacter species. Genome sequencing revealed a genome size of 1.89 Mbp and a G+C content of 46.6 mol%. In order to resolve the phylogenetic position of the new strain within subcluster PnecC, its phylogeny was reconstructed from sequences of 319 shared genes. To represent all currently described Polynucleobacter species by whole genome sequences, three type strains were additionally sequenced. Our phylogenetic analysis revealed that strain AP-Melu-100-B4 occupied a basal position compared with previously described PnecC strains. Pairwise determined whole genome average nucleotide identity (gANI) values suggested that strain AP-Melu-1000-B4 represents a new species, for which we propose the name Polynucleobacter meluiroseus sp. nov. with the type strain AP-Melu-1000-B4 T (=DSM 103591 T =CIP 111329 T ).
Genome-Wide Comparative Analysis Reveals Similar Types of NBS Genes in Hybrid Citrus sinensis Genome and Original Citrus clementine Genome and Provides New Insights into Non-TIR NBS Genes

PubMed Central

Wang, Yunsheng; Zhou, Lijuan; Li, Dazhi; Dai, Liangying; Lawton-Rauh, Amy; Srimani, Pradip K.; Duan, Yongping; Luo, Feng

2015-01-01

In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approximately evenly numbered groups: one group contains the Toll-Interleukin receptor (TIR) domain and two different Non-TIR groups in which most of proteins contain the Coiled Coil (CC) domain. Motif analysis confirmed that the two groups of CC-containing NBS genes are from different evolutionary origins. We partitioned NBS genes into clades using NBS domain sequence distances and found most clades include NBS genes from all three Citrus genomes. This suggests that three Citrus genomes have similar numbers and types of NBS genes. We also mapped the re-sequenced reads of three pomelo and three mandarin genomes onto the C. sinensis genome. We found that most NBS genes of the hybrid C. sinensis genome have corresponding homologous genes in both pomelo and mandarin genomes. The homologous NBS genes in pomelo and mandarin suggest that the parental species of C. sinensis may contain similar types of NBS genes. This explains why the hybrid C. sinensis and original C. clementina have similar types of NBS genes in this study. Furthermore, we found that sequence variation amongst Citrus NBS genes were shaped by multiple independent and shared accelerated mutation accumulation events among different groups of NBS genes and in different Citrus genomes. Our comparative analyses yield valuable insight into the structure, organization and evolution of NBS genes in Citrus genomes. Furthermore, our comprehensive analysis showed that the non-TIR NBS genes can be divided into two groups that come from different evolutionary origins. This provides new insights into non-TIR genes, which have not received much attention. PMID:25811466
Genome-wide comparative analysis reveals similar types of NBS genes in hybrid Citrus sinensis genome and original Citrus clementine genome and provides new insights into non-TIR NBS genes.

PubMed

Wang, Yunsheng; Zhou, Lijuan; Li, Dazhi; Dai, Liangying; Lawton-Rauh, Amy; Srimani, Pradip K; Duan, Yongping; Luo, Feng

2015-01-01

In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approximately evenly numbered groups: one group contains the Toll-Interleukin receptor (TIR) domain and two different Non-TIR groups in which most of proteins contain the Coiled Coil (CC) domain. Motif analysis confirmed that the two groups of CC-containing NBS genes are from different evolutionary origins. We partitioned NBS genes into clades using NBS domain sequence distances and found most clades include NBS genes from all three Citrus genomes. This suggests that three Citrus genomes have similar numbers and types of NBS genes. We also mapped the re-sequenced reads of three pomelo and three mandarin genomes onto the C. sinensis genome. We found that most NBS genes of the hybrid C. sinensis genome have corresponding homologous genes in both pomelo and mandarin genomes. The homologous NBS genes in pomelo and mandarin suggest that the parental species of C. sinensis may contain similar types of NBS genes. This explains why the hybrid C. sinensis and original C. clementina have similar types of NBS genes in this study. Furthermore, we found that sequence variation amongst Citrus NBS genes were shaped by multiple independent and shared accelerated mutation accumulation events among different groups of NBS genes and in different Citrus genomes. Our comparative analyses yield valuable insight into the structure, organization and evolution of NBS genes in Citrus genomes. Furthermore, our comprehensive analysis showed that the non-TIR NBS genes can be divided into two groups that come from different evolutionary origins. This provides new insights into non-TIR genes, which have not received much attention.
Shifts in the evolutionary rate and intensity of purifying selection between two Brassica genomes revealed by analyses of orthologous transposons and relics of a whole genome triplication.

PubMed

Zhao, Meixia; Du, Jianchang; Lin, Feng; Tong, Chaobo; Yu, Jingyin; Huang, Shunmou; Wang, Xiaowu; Liu, Shengyi; Ma, Jianxin

2013-10-01

Recent sequencing of the Brassica rapa and Brassica oleracea genomes revealed extremely contrasting genomic features such as the abundance and distribution of transposable elements between the two genomes. However, whether and how these structural differentiations may have influenced the evolutionary rates of the two genomes since their split from a common ancestor are unknown. Here, we investigated and compared the rates of nucleotide substitution between two long terminal repeats (LTRs) of individual orthologous LTR-retrotransposons, the rates of synonymous and non-synonymous substitution among triplicated genes retained in both genomes from a shared whole genome triplication event, and the rates of genetic recombination estimated/deduced by the comparison of physical and genetic distances along chromosomes and ratios of solo LTRs to intact elements. Overall, LTR sequences and genic sequences showed more rapid nucleotide substitution in B. rapa than in B. oleracea. Synonymous substitution of triplicated genes retained from a shared whole genome triplication was detected at higher rates in B. rapa than in B. oleracea. Interestingly, non-synonymous substitution was observed at lower rates in the former than in the latter, indicating shifted densities of purifying selection between the two genomes. In addition to evolutionary asymmetry, orthologous genes differentially regulated and/or disrupted by transposable elements between the two genomes were also characterized. Our analyses suggest that local genomic and epigenomic features, such as recombination rates and chromatin dynamics reshaped by independent proliferation of transposable elements and elimination between the two genomes, are perhaps partially the causes and partially the outcomes of the observed inter-specific asymmetric evolution. © 2013 Purdue University The Plant Journal © 2013 John Wiley & Sons Ltd.
Distribution and phylogenetic significance of the 71-kb inversion in the plastid genome in Funariidae (Bryophyta).

PubMed

Goffinet, Bernard; Wickett, Norman J; Werner, Olaf; Ros, Rosa Maria; Shaw, A Jonathan; Cox, Cymon J

2007-04-01

The recent assembly of the complete sequence of the plastid genome of the model taxon Physcomitrella patens (Funariaceae, Bryophyta) revealed that a 71-kb fragment, encompassing much of the large single copy region, is inverted. This inversion of 57% of the genome is the largest rearrangement detected in the plastid genomes of plants to date. Although initially considered diagnostic of Physcomitrella patens, the inversion was recently shown to characterize the plastid genome of two species from related genera within Funariaceae, but was lacking in another member of Funariidae. The phylogenetic significance of the inversion has remained ambiguous. Exemplars of all families included in Funariidae were surveyed. DNA sequences spanning the inversion break ends were amplified, using primers that anneal to genes on either side of the putative end points of the inversion. Primer combinations were designed to yield a product for either the inverted or the non-inverted architecture. The survey reveals that exemplars of eight genera of Funariaceae, the sole species of Disceliaceae and three generic representatives of Encalyptales all share the 71-kb inversion in the large single copy of the plastid genome. By contrast, the plastid genome of Gigaspermaceae (Funariales) is characterized by a gene order congruent with that described for other mosses, liverworts and hornworts, and hence it does not possess this inversion. The phylogenetic distribution of the inversion in the gene order supports a hypothesis only weakly supported by inferences from sequence data whereby Funariales are paraphyletic, with Funariaceae and Disceliaceae sharing a common ancestor with Encalyptales, and Gigaspermaceae sister to this combined clade. To reflect these relationships, Gigaspermaceae are excluded from Funariales and accommodated in their own order, Gigaspermales order nov., within Funariideae.
Novel small RNA (sRNA) landscape of the starvation-stress response transcriptome of Salmonella enterica serovar typhimurium.

PubMed

Amin, Shivam V; Roberts, Justin T; Patterson, Dillon G; Coley, Alexander B; Allred, Jonathan A; Denner, Jason M; Johnson, Justin P; Mullen, Genevieve E; O'Neal, Trenton K; Smith, Jason T; Cardin, Sara E; Carr, Hank T; Carr, Stacie L; Cowart, Holly E; DaCosta, David H; Herring, Brendon R; King, Valeria M; Polska, Caroline J; Ward, Erin E; Wise, Alice A; McAllister, Kathleen N; Chevalier, David; Spector, Michael P; Borchert, Glen M

2016-01-01

Small RNAs (sRNAs) are short (∼50-200 nucleotides) noncoding RNAs that regulate cellular activities across bacteria. Salmonella enterica starved of a carbon-energy (C) source experience a host of genetic and physiological changes broadly referred to as the starvation-stress response (SSR). In an attempt to identify novel sRNAs contributing to SSR control, we grew log-phase, 5-h C-starved and 24-h C-starved cultures of the virulent Salmonella enterica subspecies enterica serovar Typhimurium strain SL1344 and comprehensively sequenced their small RNA transcriptomes. Strikingly, after employing a novel strategy for sRNA discovery based on identifying dynamic transcripts arising from "gene-empty" regions, we identify 58 wholly undescribed Salmonella sRNA genes potentially regulating SSR averaging an ∼1,000-fold change in expression between log-phase and C-starved cells. Importantly, the expressions of individual sRNA loci were confirmed by both comprehensive transcriptome analyses and northern blotting of select candidates. Of note, we find 43 candidate sRNAs share significant sequence identity to characterized sRNAs in other bacteria, and ∼70% of our sRNAs likely assume characteristic sRNA structural conformations. In addition, we find 53 of our 58 candidate sRNAs either overlap neighboring mRNA loci or share significant sequence complementarity to mRNAs transcribed elsewhere in the SL1344 genome strongly suggesting they regulate the expression of transcripts via antisense base-pairing. Finally, in addition to this work resulting in the identification of 58 entirely novel Salmonella enterica genes likely participating in the SSR, we also find evidence suggesting that sRNAs are significantly more prevalent than currently appreciated and that Salmonella sRNAs may actually number in the thousands.
Novel small RNA (sRNA) landscape of the starvation-stress response transcriptome of Salmonella enterica serovar typhimurium

PubMed Central

Amin, Shivam V.; Roberts, Justin T.; Patterson, Dillon G.; Coley, Alexander B.; Allred, Jonathan A.; Denner, Jason M.; Johnson, Justin P.; Mullen, Genevieve E.; O'Neal, Trenton K.; Smith, Jason T.; Cardin, Sara E.; Carr, Hank T.; Carr, Stacie L.; Cowart, Holly E.; DaCosta, David H.; Herring, Brendon R.; King, Valeria M.; Polska, Caroline J.; Ward, Erin E.; Wise, Alice A.; McAllister, Kathleen N.; Chevalier, David; Spector, Michael P.; Borchert, Glen M.

2016-01-01

ABSTRACT Small RNAs (sRNAs) are short (∼50–200 nucleotides) noncoding RNAs that regulate cellular activities across bacteria. Salmonella enterica starved of a carbon-energy (C) source experience a host of genetic and physiological changes broadly referred to as the starvation-stress response (SSR). In an attempt to identify novel sRNAs contributing to SSR control, we grew log-phase, 5-h C-starved and 24-h C-starved cultures of the virulent Salmonella enterica subspecies enterica serovar Typhimurium strain SL1344 and comprehensively sequenced their small RNA transcriptomes. Strikingly, after employing a novel strategy for sRNA discovery based on identifying dynamic transcripts arising from “gene-empty” regions, we identify 58 wholly undescribed Salmonella sRNA genes potentially regulating SSR averaging an ∼1,000-fold change in expression between log-phase and C-starved cells. Importantly, the expressions of individual sRNA loci were confirmed by both comprehensive transcriptome analyses and northern blotting of select candidates. Of note, we find 43 candidate sRNAs share significant sequence identity to characterized sRNAs in other bacteria, and ∼70% of our sRNAs likely assume characteristic sRNA structural conformations. In addition, we find 53 of our 58 candidate sRNAs either overlap neighboring mRNA loci or share significant sequence complementarity to mRNAs transcribed elsewhere in the SL1344 genome strongly suggesting they regulate the expression of transcripts via antisense base-pairing. Finally, in addition to this work resulting in the identification of 58 entirely novel Salmonella enterica genes likely participating in the SSR, we also find evidence suggesting that sRNAs are significantly more prevalent than currently appreciated and that Salmonella sRNAs may actually number in the thousands. PMID:26853797
Shared genetic predisposition in rheumatoid arthritis-interstitial lung disease and familial pulmonary fibrosis.

PubMed

Juge, Pierre-Antoine; Borie, Raphaël; Kannengiesser, Caroline; Gazal, Steven; Revy, Patrick; Wemeau-Stervinou, Lidwine; Debray, Marie-Pierre; Ottaviani, Sébastien; Marchand-Adam, Sylvain; Nathan, Nadia; Thabut, Gabriel; Richez, Christophe; Nunes, Hilario; Callebaut, Isabelle; Justet, Aurélien; Leulliot, Nicolas; Bonnefond, Amélie; Salgado, David; Richette, Pascal; Desvignes, Jean-Pierre; Lioté, Huguette; Froguel, Philippe; Allanore, Yannick; Sand, Olivier; Dromer, Claire; Flipo, René-Marc; Clément, Annick; Béroud, Christophe; Sibilia, Jean; Coustet, Baptiste; Cottin, Vincent; Boissier, Marie-Christophe; Wallaert, Benoit; Schaeverbeke, Thierry; Dastot le Moal, Florence; Frazier, Aline; Ménard, Christelle; Soubrier, Martin; Saidenberg, Nathalie; Valeyre, Dominique; Amselem, Serge; Boileau, Catherine; Crestani, Bruno; Dieudé, Philippe

2017-05-01

Despite its high prevalence and mortality, little is known about the pathogenesis of rheumatoid arthritis-associated interstitial lung disease (RA-ILD). Given that familial pulmonary fibrosis (FPF) and RA-ILD frequently share the usual pattern of interstitial pneumonia and common environmental risk factors, we hypothesised that the two diseases might share additional risk factors, including FPF-linked genes. Our aim was to identify coding mutations of FPF-risk genes associated with RA-ILD.We used whole exome sequencing (WES), followed by restricted analysis of a discrete number of FPF-linked genes and performed a burden test to assess the excess number of mutations in RA-ILD patients compared to controls.Among the 101 RA-ILD patients included, 12 (11.9%) had 13 WES-identified heterozygous mutations in the TERT , RTEL1 , PARN or SFTPC coding regions . The burden test, based on 81 RA-ILD patients and 1010 controls of European ancestry, revealed an excess of TERT , RTEL1 , PARN or SFTPC mutations in RA-ILD patients (OR 3.17, 95% CI 1.53-6.12; p=9.45×10 -4 ). Telomeres were shorter in RA-ILD patients with a TERT , RTEL1 or PARN mutation than in controls (p=2.87×10 -2 ).Our results support the contribution of FPF-linked genes to RA-ILD susceptibility. Copyright ©ERS 2017.
Unexpected genetic heterogeneity for primary ciliary dyskinesia in the Irish Traveller population.

PubMed

Casey, Jillian P; McGettigan, Paul A; Healy, Fiona; Hogg, Claire; Reynolds, Alison; Kennedy, Breandan N; Ennis, Sean; Slattery, Dubhfeasa; Lynch, Sally A

2015-02-01

We present a study of five children from three unrelated Irish Traveller families presenting with primary ciliary dyskinesia (PCD). As previously characterized disorders in the Irish Traveller population are caused by common homozygous mutations, we hypothesised that all three PCD families shared the same recessive mutation. However, exome sequencing showed that there was no pathogenic homozygous mutation common to all families. This finding was supported by histology, which showed that each family has a different type of ciliary defect; transposition defect (family A), nude epithelium (family B) and absence of inner and outer dynein arms (family C). Therefore, each family was analysed independently using homozygosity mapping and exome sequencing. The affected siblings in family A share a novel 1 bp duplication in RSPH4A (NM_001161664.1:c.166dup; p.Arg56Profs*11), a radial-spoke head protein involved in ciliary movement. In family B, we identified three candidate genes (CCNO, KCNN3 and CDKN1C), with a 5-bp duplication in CCNO (NM_021147.3:c.258_262dup; p.Gln88Argfs*8) being the most likely cause of ciliary aplasia. This is the first study to implicate CCNO, a DNA repair gene reported to be involved in multiciliogenesis, in PCD. In family C, we identified a ∼3.5-kb deletion in DYX1C1, a neuronal migration gene previously associated with PCD. This is the first report of a disorder in the relatively small Irish Traveller population to be caused by >1 disease gene. Our study identified at least three different PCD genes in the Irish Traveller population, highlighting that one cannot always assume genetic homogeneity, even in small consanguineous populations.
Genome-Wide Identification and Evolution Analysis of Trehalose-6-Phosphate Synthase Gene Family in Nelumbo nucifera

PubMed Central

Jin, Qijiang; Hu, Xin; Li, Xin; Wang, Bei; Wang, Yanjie; Jiang, Hongwei; Mattson, Neil; Xu, Yingchun

2016-01-01

Trehalose-6-phosphate synthase (TPS) plays a key role in plant carbohydrate metabolism and the perception of carbohydrate availability. In the present work, the publicly available Nelumbo nucifera (lotus) genome sequence database was analyzed which led to identification of nine lotus TPS genes (NnTPS). It was found that at least two introns are included in the coding sequences of NnTPS genes. When the motif compositions were analyzed we found that NnTPS generally shared the similar motifs, implying that they have similar functions. The dN/dS ratios were always less than 1 for different domains and regions outside domains, suggesting purifying selection on the lotus TPS gene family. The regions outside TPS domain evolved relatively faster than NnTPS domains. A phylogenetic tree was constructed using all predicted coding sequences of lotus TPS genes, together with those from Arabidopsis, poplar, soybean, and rice. The result indicated that those TPS genes could be clearly divided into two main subfamilies (I-II), where each subfamily could be further divided into 2 (I) and 5 (II) subgroups. Analyses of divergence and adaptive evolution show that purifying selection may have been the main force driving evolution of plant TPS genes. Some of the critical sites that contributed to divergence may have been under positive selection. Transcriptome data analysis revealed that most NnTPS genes were predominantly expressed in sink tissues. Expression pattern of NnTPS genes under copper and submergence stress indicated that NNU_014679 and NNU_022788 might play important roles in lotus energy metabolism and participate in stress response. Our results can facilitate further functional studies of TPS genes in lotus. PMID:27746792
Clusters of ancestrally related genes that show paralogy in whole or in part are a major feature of the genomes of humans and other species.

PubMed

Walker, Michael B; King, Benjamin L; Paigen, Kenneth

2012-01-01

Arrangements of genes along chromosomes are a product of evolutionary processes, and we can expect that preferable arrangements will prevail over the span of evolutionary time, often being reflected in the non-random clustering of structurally and/or functionally related genes. Such non-random arrangements can arise by two distinct evolutionary processes: duplications of DNA sequences that give rise to clusters of genes sharing both sequence similarity and common sequence features and the migration together of genes related by function, but not by common descent. To provide a background for distinguishing between the two, which is important for future efforts to unravel the evolutionary processes involved, we here provide a description of the extent to which ancestrally related genes are found in proximity.Towards this purpose, we combined information from five genomic datasets, InterPro, SCOP, PANTHER, Ensembl protein families, and Ensembl gene paralogs. The results are provided in publicly available datasets (http://cgd.jax.org/datasets/clustering/paraclustering.shtml) describing the extent to which ancestrally related genes are in proximity beyond what is expected by chance (i.e. form paraclusters) in the human and nine other vertebrate genomes, as well as the D. melanogaster, C. elegans, A. thaliana, and S. cerevisiae genomes. With the exception of Saccharomyces, paraclusters are a common feature of the genomes we examined. In the human genome they are estimated to include at least 22% of all protein coding genes. Paraclusters are far more prevalent among some gene families than others, are highly species or clade specific and can evolve rapidly, sometimes in response to environmental cues. Altogether, they account for a large portion of the functional clustering previously reported in several genomes.
Identification of Inherited Retinal Disease-Associated Genetic Variants in 11 Candidate Genes.

PubMed

Astuti, Galuh D N; van den Born, L Ingeborgh; Khan, M Imran; Hamel, Christian P; Bocquet, Béatrice; Manes, Gaël; Quinodoz, Mathieu; Ali, Manir; Toomes, Carmel; McKibbin, Martin; El-Asrag, Mohammed E; Haer-Wigman, Lonneke; Inglehearn, Chris F; Black, Graeme C M; Hoyng, Carel B; Cremers, Frans P M; Roosing, Susanne

2018-01-10

Inherited retinal diseases (IRDs) display an enormous genetic heterogeneity. Whole exome sequencing (WES) recently identified genes that were mutated in a small proportion of IRD cases. Consequently, finding a second case or family carrying pathogenic variants in the same candidate gene often is challenging. In this study, we searched for novel candidate IRD gene-associated variants in isolated IRD families, assessed their causality, and searched for novel genotype-phenotype correlations. Whole exome sequencing was performed in 11 probands affected with IRDs. Homozygosity mapping data was available for five cases. Variants with minor allele frequencies ≤ 0.5% in public databases were selected as candidate disease-causing variants. These variants were ranked based on their: (a) presence in a gene that was previously implicated in IRD; (b) minor allele frequency in the Exome Aggregation Consortium database (ExAC); (c) in silico pathogenicity assessment using the combined annotation dependent depletion (CADD) score; and (d) interaction of the corresponding protein with known IRD-associated proteins. Twelve unique variants were found in 11 different genes in 11 IRD probands. Novel autosomal recessive and dominant inheritance patterns were found for variants in Small Nuclear Ribonucleoprotein U5 Subunit 200 ( SNRNP200 ) and Zinc Finger Protein 513 ( ZNF513 ), respectively. Using our pathogenicity assessment, a variant in DEAH-Box Helicase 32 ( DHX32 ) was the top ranked novel candidate gene to be associated with IRDs, followed by eight medium and lower ranked candidate genes. The identification of candidate disease-associated sequence variants in 11 single families underscores the notion that the previously identified IRD-associated genes collectively carry > 90% of the defects implicated in IRDs. To identify multiple patients or families with variants in the same gene and thereby provide extra proof for pathogenicity, worldwide data sharing is needed.
Identification of the Monooxygenase Gene Clusters Responsible for the Regioselective Oxidation of Phenol to Hydroquinone in Mycobacteria▿

PubMed Central

Furuya, Toshiki; Hirose, Satomi; Osanai, Hisashi; Semba, Hisashi; Kino, Kuniki

2011-01-01

Mycobacterium goodii strain 12523 is an actinomycete that is able to oxidize phenol regioselectively at the para position to produce hydroquinone. In this study, we investigated the genes responsible for this unique regioselective oxidation. On the basis of the fact that the oxidation activity of M. goodii strain 12523 toward phenol is induced in the presence of acetone, we first identified acetone-induced proteins in this microorganism by two-dimensional electrophoretic analysis. The N-terminal amino acid sequence of one of these acetone-induced proteins shares 100% identity with that of the protein encoded by the open reading frame Msmeg_1971 in Mycobacterium smegmatis strain mc2155, whose genome sequence has been determined. Since Msmeg_1971, Msmeg_1972, Msmeg_1973, and Msmeg_1974 constitute a putative binuclear iron monooxygenase gene cluster, we cloned this gene cluster of M. smegmatis strain mc2155 and its homologous gene cluster found in M. goodii strain 12523. Sequence analysis of these binuclear iron monooxygenase gene clusters revealed the presence of four genes designated mimABCD, which encode an oxygenase large subunit, a reductase, an oxygenase small subunit, and a coupling protein, respectively. When the mimA gene (Msmeg_1971) of M. smegmatis strain mc2155, which was also found to be able to oxidize phenol to hydroquinone, was deleted, this mutant lost the oxidation ability. This ability was restored by introduction of the mimA gene of M. smegmatis strain mc2155 or of M. goodii strain 12523 into this mutant. Interestingly, we found that these gene clusters also play essential roles in propane and acetone metabolism in these mycobacteria. PMID:21183637

Genomic and Proteomic Analyses of the Fungus Arthrobotrys oligospora Provide Insights into Nematode-Trap Formation

PubMed Central

Feng, Yun; Li, Xiaomin; Zou, Chenggang; Xu, Jianping; Ren, Yan; Mi, Qili; Wu, Junli; Liu, Shuqun; Liu, Yu; Huang, Xiaowei; Wang, Haiyan; Niu, Xuemei; Li, Juan; Liang, Lianming; Luo, Yanlu; Ji, Kaifang; Zhou, Wei; Yu, Zefen; Li, Guohong; Liu, Yajun; Li, Lei; Qiao, Min; Feng, Lu; Zhang, Ke-Qin

2011-01-01

Nematode-trapping fungi are “carnivorous” and attack their hosts using specialized trapping devices. The morphological development of these traps is the key indicator of their switch from saprophytic to predacious lifestyles. Here, the genome of the nematode-trapping fungus Arthrobotrys oligospora Fres. (ATCC24927) was reported. The genome contains 40.07 Mb assembled sequence with 11,479 predicted genes. Comparative analysis showed that A. oligospora shared many more genes with pathogenic fungi than with non-pathogenic fungi. Specifically, compared to several sequenced ascomycete fungi, the A. oligospora genome has a larger number of pathogenicity-related genes in the subtilisin, cellulase, cellobiohydrolase, and pectinesterase gene families. Searching against the pathogen-host interaction gene database identified 398 homologous genes involved in pathogenicity in other fungi. The analysis of repetitive sequences provided evidence for repeat-induced point mutations in A. oligospora. Proteomic and quantitative PCR (qPCR) analyses revealed that 90 genes were significantly up-regulated at the early stage of trap-formation by nematode extracts and most of these genes were involved in translation, amino acid metabolism, carbohydrate metabolism, cell wall and membrane biogenesis. Based on the combined genomic, proteomic and qPCR data, a model for the formation of nematode trapping device in this fungus was proposed. In this model, multiple fungal signal transduction pathways are activated by its nematode prey to further regulate downstream genes associated with diverse cellular processes such as energy metabolism, biosynthesis of the cell wall and adhesive proteins, cell division, glycerol accumulation and peroxisome biogenesis. This study will facilitate the identification of pathogenicity-related genes and provide a broad foundation for understanding the molecular and evolutionary mechanisms underlying fungi-nematodes interactions. PMID:21909256
Genomic and proteomic analyses of the fungus Arthrobotrys oligospora provide insights into nematode-trap formation.

PubMed

Yang, Jinkui; Wang, Lei; Ji, Xinglai; Feng, Yun; Li, Xiaomin; Zou, Chenggang; Xu, Jianping; Ren, Yan; Mi, Qili; Wu, Junli; Liu, Shuqun; Liu, Yu; Huang, Xiaowei; Wang, Haiyan; Niu, Xuemei; Li, Juan; Liang, Lianming; Luo, Yanlu; Ji, Kaifang; Zhou, Wei; Yu, Zefen; Li, Guohong; Liu, Yajun; Li, Lei; Qiao, Min; Feng, Lu; Zhang, Ke-Qin

2011-09-01

Nematode-trapping fungi are "carnivorous" and attack their hosts using specialized trapping devices. The morphological development of these traps is the key indicator of their switch from saprophytic to predacious lifestyles. Here, the genome of the nematode-trapping fungus Arthrobotrys oligospora Fres. (ATCC24927) was reported. The genome contains 40.07 Mb assembled sequence with 11,479 predicted genes. Comparative analysis showed that A. oligospora shared many more genes with pathogenic fungi than with non-pathogenic fungi. Specifically, compared to several sequenced ascomycete fungi, the A. oligospora genome has a larger number of pathogenicity-related genes in the subtilisin, cellulase, cellobiohydrolase, and pectinesterase gene families. Searching against the pathogen-host interaction gene database identified 398 homologous genes involved in pathogenicity in other fungi. The analysis of repetitive sequences provided evidence for repeat-induced point mutations in A. oligospora. Proteomic and quantitative PCR (qPCR) analyses revealed that 90 genes were significantly up-regulated at the early stage of trap-formation by nematode extracts and most of these genes were involved in translation, amino acid metabolism, carbohydrate metabolism, cell wall and membrane biogenesis. Based on the combined genomic, proteomic and qPCR data, a model for the formation of nematode trapping device in this fungus was proposed. In this model, multiple fungal signal transduction pathways are activated by its nematode prey to further regulate downstream genes associated with diverse cellular processes such as energy metabolism, biosynthesis of the cell wall and adhesive proteins, cell division, glycerol accumulation and peroxisome biogenesis. This study will facilitate the identification of pathogenicity-related genes and provide a broad foundation for understanding the molecular and evolutionary mechanisms underlying fungi-nematodes interactions.
Improving RNA-Seq expression estimation by modeling isoform- and exon-specific read sequencing rate.

PubMed

Liu, Xuejun; Shi, Xinxin; Chen, Chunlin; Zhang, Li

2015-10-16

The high-throughput sequencing technology, RNA-Seq, has been widely used to quantify gene and isoform expression in the study of transcriptome in recent years. Accurate expression measurement from the millions or billions of short generated reads is obstructed by difficulties. One is ambiguous mapping of reads to reference transcriptome caused by alternative splicing. This increases the uncertainty in estimating isoform expression. The other is non-uniformity of read distribution along the reference transcriptome due to positional, sequencing, mappability and other undiscovered sources of biases. This violates the uniform assumption of read distribution for many expression calculation approaches, such as the direct RPKM calculation and Poisson-based models. Many methods have been proposed to address these difficulties. Some approaches employ latent variable models to discover the underlying pattern of read sequencing. However, most of these methods make bias correction based on surrounding sequence contents and share the bias models by all genes. They therefore cannot estimate gene- and isoform-specific biases as revealed by recent studies. We propose a latent variable model, NLDMseq, to estimate gene and isoform expression. Our method adopts latent variables to model the unknown isoforms, from which reads originate, and the underlying percentage of multiple spliced variants. The isoform- and exon-specific read sequencing biases are modeled to account for the non-uniformity of read distribution, and are identified by utilizing the replicate information of multiple lanes of a single library run. We employ simulation and real data to verify the performance of our method in terms of accuracy in the calculation of gene and isoform expression. Results show that NLDMseq obtains competitive gene and isoform expression compared to popular alternatives. Finally, the proposed method is applied to the detection of differential expression (DE) to show its usefulness in the downstream analysis. The proposed NLDMseq method provides an approach to accurately estimate gene and isoform expression from RNA-Seq data by modeling the isoform- and exon-specific read sequencing biases. It makes use of a latent variable model to discover the hidden pattern of read sequencing. We have shown that it works well in both simulations and real datasets, and has competitive performance compared to popular methods. The method has been implemented as a freely available software which can be found at https://github.com/PUGEA/NLDMseq.
Extensive structural variations between mitochondrial genomes of CMS and normal peppers (Capsicum annuum L.) revealed by complete nucleotide sequencing.

PubMed

Jo, Yeong Deuk; Choi, Yoomi; Kim, Dong-Hwan; Kim, Byung-Dong; Kang, Byoung-Cheorl

2014-07-04

Cytoplasmic male sterility (CMS) is an inability to produce functional pollen that is caused by mutation of the mitochondrial genome. Comparative analyses of mitochondrial genomes of lines with and without CMS in several species have revealed structural differences between genomes, including extensive rearrangements caused by recombination. However, the mitochondrial genome structure and the DNA rearrangements that may be related to CMS have not been characterized in Capsicum spp. We obtained the complete mitochondrial genome sequences of the pepper CMS line FS4401 (507,452 bp) and the fertile line Jeju (511,530 bp). Comparative analysis between mitochondrial genomes of peppers and tobacco that are included in Solanaceae revealed extensive DNA rearrangements and poor conservation in non-coding DNA. In comparison between pepper lines, FS4401 and Jeju mitochondrial DNAs contained the same complement of protein coding genes except for one additional copy of an atp6 gene (ψatp6-2) in FS4401. In terms of genome structure, we found eighteen syntenic blocks in the two mitochondrial genomes, which have been rearranged in each genome. By contrast, sequences between syntenic blocks, which were specific to each line, accounted for 30,380 and 17,847 bp in FS4401 and Jeju, respectively. The previously-reported CMS candidate genes, orf507 and ψatp6-2, were located on the edges of the largest sequence segments that were specific to FS4401. In this region, large number of small sequence segments which were absent or found on different locations in Jeju mitochondrial genome were combined together. The incorporation of repeats and overlapping of connected sequence segments by a few nucleotides implied that extensive rearrangements by homologous recombination might be involved in evolution of this region. Further analysis using mtDNA pairs from other plant species revealed common features of DNA regions around CMS-associated genes. Although large portion of sequence context was shared by mitochondrial genomes of CMS and male-fertile pepper lines, extensive genome rearrangements were detected. CMS candidate genes located on the edges of highly-rearranged CMS-specific DNA regions and near to repeat sequences. These characteristics were detected among CMS-associated genes in other species, implying a common mechanism might be involved in the evolution of CMS-associated genes.
Whole Genome Sequencing of Danish Staphylococcus argenteus Reveals a Genetically Diverse Collection with Clear Separation from Staphylococcus aureus.

PubMed

Hansen, Thomas A; Bartels, Mette D; Høgh, Silje V; Dons, Lone E; Pedersen, Michael; Jensen, Thøger G; Kemp, Michael; Skov, Marianne N; Gumpert, Heidi; Worning, Peder; Westh, Henrik

2017-01-01

Staphylococcus argenteus ( S. argenteus ) is a newly identified Staphylococcus species that has been misidentified as Staphylococcus aureus ( S. aureus ) and is clinically relevant. We identified 25 S. argenteus genomes in our collection of whole genome sequenced S. aureus . These genomes were compared to publicly available genomes and a phylogeny revealed seven clusters corresponding to seven clonal complexes. The genome of S. argenteus was found to be different from the genome of S. aureus and a core genome analysis showed that ~33% of the total gene pool was shared between the two species, at 90% homology level. An assessment of mobile elements shows flow of SCC mec cassettes, plasmids, phages, and pathogenicity islands, between S. argenteus and S. aureus . This dataset emphasizes that S. argenteus and S. aureus are two separate species that share genetic material.
Biology and genomics of viruses within the genus Gammabaculovirus.

PubMed

Arif, Basil; Escasa, Shannon; Pavlik, Lillian

2011-11-01

Hymenoptera is a very large and ancient insect order encompassing bees, wasps, ants and sawflies. Fossil records indicate that they existed over 200 million years ago and about 100 million years before the appearance of Lepidoptera. Sawflies have been major pests in many parts of the world and some have caused serious forest defoliation in North America. All baculoviruses isolated from sawflies are of the single nucleocapsids phenotype and appear to replicate in midgut cells only. This group of viruses has been shown to be excellent pest control agents and three have been registered in Canada and Britain for this purpose. Sawfly baculoviruses contain the smallest genome of all baculoviruses sequenced so far. Gene orders among sequenced sawfly baculoviruses are co-linear but this is not shared with the genomes of lepidopteran baculoviruses. One distinguishing feature among all sequenced sawfly viruses is the lack of a gene encoding a membrane fusion protein, which brought into question the role of the budded virus phenotype in Gammabaculovirus biology.
Clades of Adeno-Associated Viruses Are Widely Disseminated in Human Tissues

PubMed Central

Gao, Guangping; Vandenberghe, Luk H.; Alvira, Mauricio R.; Lu, You; Calcedo, Roberto; Zhou, Xiangyang; Wilson, James M.

2004-01-01

The potential for using Adeno-associated virus (AAV) as a vector for human gene therapy has stimulated interest in the Dependovirus genus. Serologic data suggest that AAV infections are prevalent in humans, although analyses of viruses and viral sequences from clinical samples are extremely limited. Molecular techniques were used in this study to successfully detect endogenous AAV sequences in 18% of all human tissues screened, with the liver and bone marrow being the most predominant sites. Sequence characterization of rescued AAV DNAs indicated a diverse array of molecular forms which segregate into clades whose members share functional and serologic similarities. One of the most predominant human clades is a hybrid of two previously described AAV serotypes, while another clade was found in humans and several species of nonhuman primates, suggesting a cross-species transmission of this virus. These data provide important information regarding the biology of parvoviruses in humans and their use as gene therapy vectors. PMID:15163731
Mitochondrial genome of the bullet tuna Auxis rochei from Indo-West Pacific collection provides novel genetic information about two subspecies.

PubMed

Li, Mingming; Guo, Liang; Zhang, Heng; Yang, Sen; Chen, Xinghan; Lin, Haoran; Meng, Zining

2016-09-01

Previously morphological studies supported the division of the bullet tuna into the two subspecies, Auxis rochei rochei and A. rochei eudorax. As a cosmopolitan species, A. rochei rochei ranges in the Indo-West Pacific and Atlantic oceans, while A. rochei eudorax inhabits in eastern Pacific region. Here, we used the HiSeq next-generation sequencing technique to determine the mitochondrial genome (mitogenome) of A. rochei from Indo-West Pacific collection, and then compared our data with mitogenomic sequences of the Atlantic and eastern Pacific retrieved from NCBI database. Results showed the mitogenome of A. rochei from three geographic collections shared the same genes and gene order, similar to typical teleosts. Also, we examined a low level of nucleotide diversity among these mitogenomic sequences. Interestingly, nucleotide diversity of intra-subspecies (Atlantic versus Indo-West) was higher than that of inter-subspecies (Atlantic versus eastern Pacific, Indo-West versus eastern Pacific).
A proposed model for the flowering signaling pathway of sugarcane under photoperiodic control.

PubMed

Coelho, C P; Costa Netto, A P; Colasanti, J; Chalfun-Júnior, A

2013-04-25

Molecular analysis of floral induction in Arabidopsis has identified several flowering time genes related to 4 response networks defined by the autonomous, gibberellin, photoperiod, and vernalization pathways. Although grass flowering processes include ancestral functions shared by both mono- and dicots, they have developed their own mechanisms to transmit floral induction signals. Despite its high production capacity and its important role in biofuel production, almost no information is available about the flowering process in sugarcane. We searched the Sugarcane Expressed Sequence Tags database to look for elements of the flowering signaling pathway under photoperiodic control. Sequences showing significant similarity to flowering time genes of other species were clustered, annotated, and analyzed for conserved domains. Multiple alignments comparing the sequences found in the sugarcane database and those from other species were performed and their phylogenetic relationship assessed using the MEGA 4.0 software. Electronic Northerns were run with Cluster and TreeView programs, allowing us to identify putative members of the photoperiod-controlled flowering pathway of sugarcane.
Genetic diversity and natural selection of Plasmodium knowlesi merozoite surface protein 1 paralog gene in Malaysia.

PubMed

Ahmed, Md Atique; Fauzi, Muh; Han, Eun-Taek

2018-03-14

Human infections due to the monkey malaria parasite Plasmodium knowlesi is on the rise in most Southeast Asian countries specifically Malaysia. The C-terminal 19 kDa domain of PvMSP1P is a potential vaccine candidate, however, no study has been conducted in the orthologous gene of P. knowlesi. This study investigates level of polymorphisms, haplotypes and natural selection of full-length pkmsp1p in clinical samples from Malaysia. A total of 36 full-length pkmsp1p sequences along with the reference H-strain and 40 C-terminal pkmsp1p sequences from clinical isolates of Malaysia were downloaded from published genomes. Genetic diversity, polymorphism, haplotype and natural selection were determined using DnaSP 5.10 and MEGA 5.0 software. Genealogical relationships were determined using haplotype network tree in NETWORK software v5.0. Population genetic differentiation index (F ST ) and population structure of parasite was determined using Arlequin v3.5 and STRUCTURE v2.3.4 software. Comparison of 36 full-length pkmsp1p sequences along with the H-strain identified 339 SNPs (175 non-synonymous and 164 synonymous substitutions). The nucleotide diversity across the full-length gene was low compared to its ortholog pvmsp1p. The nucleotide diversity was higher toward the N-terminal domains (pkmsp1p-83 and 30) compared to the C-terminal domains (pkmsp1p-38, 33 and 19). Phylogenetic analysis of full-length genes identified 2 distinct clusters of P. knowlesi from Malaysian Borneo. The 40 pkmsp1p-19 sequences showed low polymorphisms with 16 polymorphisms leading to 18 haplotypes. In total there were 10 synonymous and 6 non-synonymous substitutions and 12 cysteine residues were intact within the two EGF domains. Evidence of strong purifying selection was observed within the full-length sequences as well in all the domains. Shared haplotypes of 40 pkmsp1p-19 were identified within Malaysian Borneo haplotypes. This study is the first to report on the genetic diversity and natural selection of pkmsp1p. A low level of genetic diversity and strong evidence of negative selection was detected and observed in all the domains of pkmsp1p of P. knowlesi indicating functional constrains. Shared haplotypes were identified within pkmsp1p-19 highlighting further evaluation using larger number of clinical samples from Malaysia.
The gene expression profile of resistant and susceptible Bombyx mori strains reveals cypovirus-associated variations in host gene transcript levels.

PubMed

Guo, Rui; Wang, Simei; Xue, Renyu; Cao, Guangli; Hu, Xiaolong; Huang, Moli; Zhang, Yangqi; Lu, Yahong; Zhu, Liyuan; Chen, Fei; Liang, Zi; Kuang, Sulan; Gong, Chengliang

2015-06-01

High-throughput paired-end RNA sequencing (RNA-Seq) was performed to investigate the gene expression profile of a susceptible Bombyx mori strain, Lan5, and a resistant B. mori strain, Ou17, which were both orally infected with B. mori cypovirus (BmCPV) in the midgut. There were 330 and 218 up-regulated genes, while there were 147 and 260 down-regulated genes in the Lan5 and Ou17 strains, respectively. Gene ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment for differentially expressed genes (DEGs) were carried out. Moreover, gene interaction network (STRING) analyses were performed to analyze the relationships among the shared DEGs. Some of these genes were related and formed a large network, in which the genes for B. mori cuticular protein RR-2 motif 123 (BmCPR123) and the gene for B. mori DNA replication licensing factor Mcm2-like (BmMCM2) were key genes among the common up-regulated DEGs, whereas the gene for B. mori heat shock protein 20.1 (Bmhsp20.1) was the central gene among the shared down-regulated DEGs between Lan5 vs Lan5-CPV and Ou17 vs Ou17-CPV. These findings established a comprehensive database of genes that are differentially expressed in response to BmCPV infection between silkworm strains that differed in resistance to BmCPV and implied that these DEGs might be involved in B. mori immune responses against BmCPV infection.
Whole-genome sequencing reveals clonal expansion of multiresistant Staphylococcus haemolyticus in European hospitals.

PubMed

Cavanagh, Jorunn Pauline; Hjerde, Erik; Holden, Matthew T G; Kahlke, Tim; Klingenberg, Claus; Flægstad, Trond; Parkhill, Julian; Bentley, Stephen D; Sollid, Johanna U Ericson

2014-11-01

Staphylococcus haemolyticus is an emerging cause of nosocomial infections, primarily affecting immunocompromised patients. A comparative genomic analysis was performed on clinical S. haemolyticus isolates to investigate their genetic relationship and explore the coding sequences with respect to antimicrobial resistance determinants and putative hospital adaptation. Whole-genome sequencing was performed on 134 isolates of S. haemolyticus from geographically diverse origins (Belgium, 2; Germany, 10; Japan, 13; Norway, 54; Spain, 2; Switzerland, 43; UK, 9; USA, 1). Each genome was individually assembled. Protein coding sequences (CDSs) were predicted and homologous genes were categorized into three types: Type I, core genes, homologues present in all strains; Type II, unique core genes, homologues shared by only a subgroup of strains; and Type III, unique genes, strain-specific CDSs. The phylogenetic relationship between the isolates was built from variable sites in the form of single nucleotide polymorphisms (SNPs) in the core genome and used to construct a maximum likelihood phylogeny. SNPs in the genome core regions divided the isolates into one major group of 126 isolates and one minor group of isolates with highly diverse genomes. The major group was further subdivided into seven clades (A-G), of which four (A-D) encompassed isolates only from Europe. Antimicrobial multiresistance was observed in 77.7% of the collection. High levels of homologous recombination were detected in genes involved in adherence, staphylococcal host adaptation and bacterial cell communication. The presence of several successful and highly resistant clones underlines the adaptive potential of this opportunistic pathogen. © The Author 2014. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy.
Whole-genome sequencing reveals clonal expansion of multiresistant Staphylococcus haemolyticus in European hospitals

PubMed Central

Cavanagh, Jorunn Pauline; Hjerde, Erik; Holden, Matthew T. G.; Kahlke, Tim; Klingenberg, Claus; Flægstad, Trond; Parkhill, Julian; Bentley, Stephen D.; Sollid, Johanna U. Ericson

2014-01-01

Objectives Staphylococcus haemolyticus is an emerging cause of nosocomial infections, primarily affecting immunocompromised patients. A comparative genomic analysis was performed on clinical S. haemolyticus isolates to investigate their genetic relationship and explore the coding sequences with respect to antimicrobial resistance determinants and putative hospital adaptation. Methods Whole-genome sequencing was performed on 134 isolates of S. haemolyticus from geographically diverse origins (Belgium, 2; Germany, 10; Japan, 13; Norway, 54; Spain, 2; Switzerland, 43; UK, 9; USA, 1). Each genome was individually assembled. Protein coding sequences (CDSs) were predicted and homologous genes were categorized into three types: Type I, core genes, homologues present in all strains; Type II, unique core genes, homologues shared by only a subgroup of strains; and Type III, unique genes, strain-specific CDSs. The phylogenetic relationship between the isolates was built from variable sites in the form of single nucleotide polymorphisms (SNPs) in the core genome and used to construct a maximum likelihood phylogeny. Results SNPs in the genome core regions divided the isolates into one major group of 126 isolates and one minor group of isolates with highly diverse genomes. The major group was further subdivided into seven clades (A–G), of which four (A–D) encompassed isolates only from Europe. Antimicrobial multiresistance was observed in 77.7% of the collection. High levels of homologous recombination were detected in genes involved in adherence, staphylococcal host adaptation and bacterial cell communication. Conclusions The presence of several successful and highly resistant clones underlines the adaptive potential of this opportunistic pathogen. PMID:25038069
Gene relocations within chloroplast genomes of Jasminum and Menodora (Oleaceae) are due to multiple, overlapping inversions.

PubMed

Lee, Hae-Lim; Jansen, Robert K; Chumley, Timothy W; Kim, Ki-Joong

2007-05-01

The chloroplast (cp) DNA sequence of Jasminum nudiflorum (Oleaceae-Jasmineae) is completed and compared with the large single-copy region sequences from 6 related species. The cp genomes of the tribe Jasmineae (Jasminum and Menodora) show several distinctive rearrangements, including inversions, gene duplications, insertions, inverted repeat expansions, and gene and intron losses. The ycf4-psaI region in Jasminum section Primulina was relocated as a result of 2 overlapping inversions of 21,169 and 18,414 bp. The 1st, larger inversion is shared by all members of the Jasmineae indicating that it occurred in the common ancestor of the tribe. Similar rearrangements were also identified in the cp genome of Menodora. In this case, 2 fragments including ycf4 and rps4-trnS-ycf3 genes were moved by 2 additional inversions of 14 and 59 kb that are unique to Menodora. Other rearrangements in the Oleaceae are confined to certain regions of the Jasminum and Menodora cp genomes, including the presence of highly repeated sequences and duplications of coding and noncoding sequences that are inserted into clpP and between rbcL and psaI. These insertions are correlated with the loss of 2 introns in clpP and a serial loss of segments of accD. The loss of the accD gene and clpP introns in both the monocot family Poaceae and the eudicot family Oleaceae are clearly independent evolutionary events. However, their genome organization is surprisingly similar despite the distant relationship of these 2 angiosperm families.
K-shuff: A Novel Algorithm for Characterizing Structural and Compositional Diversity in Gene Libraries

PubMed Central

Jangid, Kamlesh; Kao, Ming-Hung; Lahamge, Aishwarya; Williams, Mark A.; Rathbun, Stephen L.; Whitman, William B.

2016-01-01

K-shuff is a new algorithm for comparing the similarity of gene sequence libraries, providing measures of the structural and compositional diversity as well as the significance of the differences between these measures. Inspired by Ripley’s K-function for spatial point pattern analysis, the Intra K-function or IKF measures the structural diversity, including both the richness and overall similarity of the sequences, within a library. The Cross K-function or CKF measures the compositional diversity between gene libraries, reflecting both the number of OTUs shared as well as the overall similarity in OTUs. A Monte Carlo testing procedure then enables statistical evaluation of both the structural and compositional diversity between gene libraries. For 16S rRNA gene libraries from complex bacterial communities such as those found in seawater, salt marsh sediments, and soils, K-shuff yields reproducible estimates of structural and compositional diversity with libraries greater than 50 sequences. Similarly, for pyrosequencing libraries generated from a glacial retreat chronosequence and Illumina® libraries generated from US homes, K-shuff required >300 and 100 sequences per sample, respectively. Power analyses demonstrated that K-shuff is sensitive to small differences in Sanger or Illumina® libraries. This extra sensitivity of K-shuff enabled examination of compositional differences at much deeper taxonomic levels, such as within abundant OTUs. This is especially useful when comparing communities that are compositionally very similar but functionally different. K-shuff will therefore prove beneficial for conventional microbiome analysis as well as specific hypothesis testing. PMID:27911946
Identification of the Core Set of Carbon-Associated Genes in a Bioenergy Grassland Soil

DOE PAGES

Howe, Adina; Yang, Fan; Williams, Ryan J.; ...

2016-11-17

Despite the central role of soil microbial communities in global carbon (C) cycling, little is known about soil microbial community structure and even less about their metabolic pathways. Efforts to characterize soil communities often focus on identifying differences in gene content across environmental gradients, but an alternative question is what genes are similar in soils. These genes may indicate critical species or potential functions that are required in all soils. Here we identified the “core” set of C cycling sequences widely present in multiple soil metagenomes from a fertilized prairie (FP). Of 226,887 sequences associated with known enzymes involved inmore » the synthesis, metabolism, and transport of carbohydrates, 843 were identified to be consistently prevalent across four replicate soil metagenomes. This core metagenome was functionally and taxonomically diverse, representing five enzyme classes and 99 enzyme families within the CAZy database. Though it only comprised 0.4% of all CAZy-associated genes identified in FP metagenomes, the core was found to be comprised of functions similar to those within cumulative soils. The FP CAZy-associated core sequences were present in multiple publicly available soil metagenomes and most similar to soils sharing geographic proximity. As a result, in soil ecosystems, where high diversity remains a key challenge for metagenomic investigations, these core genes represent a subset of critical functions necessary for carbohydrate metabolism, which can be targeted to evaluate important C fluxes in these and other similar soils.« less
Comparative Mitogenomic Analyses of Praying Mantises (Dictyoptera, Mantodea): Origin and Evolution of Unusual Intergenic Gaps

PubMed Central

Zhang, Hong-Li; Ye, Fei

2017-01-01

Praying mantises are a diverse group of predatory insects. Although some Mantodea mitogenomes have been reported, a comprehensive comparative and evolutionary genomic study is lacking for this group. In the present study, four new mitogenomes were sequenced, annotated, and compared to the previously published mitogenomes of other Mantodea species. Most Mantodea mitogenomes share a typical set of mitochondrial genes and a putative control region (CR). Additionally, and most intriguingly, another large non-coding region (LNC) was detected between trnM and ND2 in all six Paramantini mitogenomes examined. The main section in this common region of Paramantini may have initially originated from the corresponding control region for each species, whereas sequence differences between the LNCs and CRs and phylogenetic analyses indicate that LNC and CR are largely independently evolving. Namely, the LNC (the duplicated CR) may have subsequently degenerated during evolution. Furthermore, evidence suggests that special intergenic gaps have been introduced in some species through gene rearrangement and duplication. These gaps are actually the original abutting sequences of migrated or duplicated genes. Some gaps (G5 and G6) are homologous to the 5' and 3' surrounding regions of the duplicated gene in the original gene order, and another specific gap (G7) has tandem repeats. We analysed the phylogenetic relationships of fifteen Mantodea species using 37 concatenated mitochondrial genes and detected several synapomorphies unique to species in some clades. PMID:28367101
Cloning and characterization of full length of a novel zebrafish gene Zsrg abundantly expressed in the germline stem cells.

PubMed

Lv, Daoyuan; Song, Ping; Chen, Yungui; Gong, Wuming; Mo, Saijun

2005-04-08

Using the digital differential display program of the National Center for Biotechnology Information, we identified a contig of expression sequence tags (ESTs) (Accession No. BM316936), which came from zebrafish ovary and testis libraries. The full-length cDNA of this transcript was cloned and further confirmed by polymerase chain reaction and sequencing. The full-length cDNA of the novel gene is 807bp and encodes a novel protein of 187 amino acids, which shares no significant homology with any other known proteins. Characterization of genomic sequences of the gene revealed that it spans 6kb on the linkage group 3 and is composed of five exons and four introns. RT-PCR analysis showed that it was expressed in mature oocytes and one-cell stage, and persisted until 24h of development. RT-PCR also revealed that it is expressed in gonad and kidney, with the highest level of expression in the testis. The expression sites of the novel gene in adult gonad were further localized by in situ hybridization to oogonia and growing oocytes in ovary and to spermatogonia, spermatocytes but not to spermatids in testis. Based on its abundance in testis and the germline stem cell-spermatogonia and oogonia, we hypothesize that it may function as a testicular development and gametogenesis related gene that plays important roles in spermatogenesis, and named it Zsrg (zebrafish testis spermatogenesis related gene, Zsrg).
Detailed transcriptome description of the neglected cestode Taenia multiceps.

PubMed

Wu, Xuhang; Fu, Yan; Yang, Deying; Zhang, Runhui; Zheng, Wanpeng; Nie, Huaming; Xie, Yue; Yan, Ning; Hao, Guiying; Gu, Xiaobin; Wang, Shuxian; Peng, Xuerong; Yang, Guangyou

2012-01-01

The larval stage of Taenia multiceps, a global cestode, encysts in the central nervous system (CNS) of sheep and other livestock. This frequently leads to their death and huge socioeconomic losses, especially in developing countries. This parasite can also cause zoonotic infections in humans, but has been largely neglected due to a lack of diagnostic techniques and studies. Recent developments in next-generation sequencing provide an opportunity to explore the transcriptome of T. multiceps. We obtained a total of 31,282 unigenes (mean length 920 bp) using Illumina paired-end sequencing technology and a new Trinity de novo assembler without a referenced genome. Individual transcription molecules were determined by sequence-based annotations and/or domain-based annotations against public databases (Nr, UniprotKB/Swiss-Prot, COG, KEGG, UniProtKB/TrEMBL, InterPro and Pfam). We identified 26,110 (83.47%) unigenes and inferred 20,896 (66.8%) coding sequences (CDS). Further comparative transcripts analysis with other cestodes (Taenia pisiformis, Taenia solium, Echincoccus granulosus and Echincoccus multilocularis) and intestinal parasites (Trichinella spiralis, Ancylostoma caninum and Ascaris suum) showed that 5,100 common genes were shared among three Taenia tapeworms, 261 conserved genes were detected among five Taeniidae cestodes, and 109 common genes were found in four zoonotic intestinal parasites. Some of the common genes were genes required for parasite survival, involved in parasite-host interactions. In addition, we amplified two full-length CDS of unigenes from the common genes using RT-PCR. This study provides an extensive transcriptome of the adult stage of T. multiceps, and demonstrates that comparative transcriptomic investigations deserve to be further studied. This transcriptome dataset forms a substantial public information platform to achieve a fundamental understanding of the biology of T. multiceps, and helps in the identification of drug targets and parasite-host interaction studies.
Sequence and expression analyses of porcine ISG15 and ISG43 genes.

PubMed

Huang, Jiangnan; Zhao, Shuhong; Zhu, Mengjin; Wu, Zhenfang; Yu, Mei

2009-08-01

The coding sequences of porcine interferon-stimulated gene 15 (ISG15) and the interferon-stimulated gene (ISG43) were cloned from swine spleen mRNA. The amino acid sequences deduced from porcine ISG15 and ISG43 genes coding sequence shared 24-75% and 29-83% similarity with ISG15s and ISG43s from other vertebrates, respectively. Structural analyses revealed that porcine ISG15 comprises two ubiquitin homologues motifs (UBQ) domain and a conserved C-terminal LRLRGG conjugating motif. Porcine ISG43 contains an ubiquitin-processing proteases-like domain. Phylogenetic analyses showed that porcine ISG15 and ISG43 were mostly related to rat ISG15 and cattle ISG43, respectively. Using quantitative real-time PCR assay, significant increased expression levels of porcine ISG15 and ISG43 genes were detected in porcine kidney endothelial cells (PK15) cells treated with poly I:C. We also observed the enhanced mRNA expression of three members of dsRNA pattern-recognition receptors (PRR), TLR3, DDX58 and IFIH1, which have been reported to act as critical receptors in inducing the mRNA expression of ISG15 and ISG43 genes. However, we did not detect any induced mRNA expression of IFNalpha and IFNbeta, suggesting that transcriptional activations of ISG15 and ISG43 were mediated through IFN-independent signaling pathway in the poly I:C treated PK15 cells. Association analyses in a Landrace pig population revealed that ISG15 c.347T>C (BstUI) polymorphism and the ISG43 c.953T>G (BccI) polymorphism were significantly associated with hematological parameters and immune-related traits.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.