common gene sequence: Topics by Science.gov

Sample records for common gene sequence

In silico comparison of genomic regions containing genes coding for enzymes and transcription factors for the phenylpropanoid pathway in Phaseolus vulgaris L. and Glycine max L. Merr

PubMed Central

Reinprecht, Yarmilla; Yadegari, Zeinab; Perry, Gregory E.; Siddiqua, Mahbuba; Wright, Lori C.; McClean, Phillip E.; Pauls, K. Peter

2013-01-01

Legumes contain a variety of phytochemicals derived from the phenylpropanoid pathway that have important effects on human health as well as seed coat color, plant disease resistance and nodulation. However, the information about the genes involved in this important pathway is fragmentary in common bean (Phaseolus vulgaris L.). The objectives of this research were to isolate genes that function in and control the phenylpropanoid pathway in common bean, determine their genomic locations in silico in common bean and soybean, and analyze sequences of the 4CL gene family in two common bean genotypes. Sequences of phenylpropanoid pathway genes available for common bean or other plant species were aligned, and the conserved regions were used to design sequence-specific primers. The PCR products were cloned and sequenced and the gene sequences along with common bean gene-based (g) markers were BLASTed against the Glycine max v.1.0 genome and the P. vulgaris v.1.0 (Andean) early release genome. In addition, gene sequences were BLASTed against the OAC Rex (Mesoamerican) genome sequence assembly. In total, fragments of 46 structural and regulatory phenylpropanoid pathway genes were characterized in this way and placed in silico on common bean and soybean sequence maps. The maps contain over 250 common bean g and SSR (simple sequence repeat) markers and identify the positions of more than 60 additional phenylpropanoid pathway gene sequences, plus the putative locations of seed coat color genes. The majority of cloned phenylpropanoid pathway gene sequences were mapped to one location in the common bean genome but had two positions in soybean. The comparison of the genomic maps confirmed previous studies, which show that common bean and soybean share genomic regions, including those containing phenylpropanoid pathway gene sequences, with conserved synteny. Indels identified in the comparison of Andean and Mesoamerican common bean 4CL gene sequences might be used to develop inter-pool phenylpropanoid pathway gene-based markers. We anticipate that the information obtained by this study will simplify and accelerate selections of common bean with specific phenylpropanoid pathway alleles to increase the contents of beneficial phenylpropanoids in common bean and other legumes. PMID:24046770
Construction and EST sequencing of full-length, drought stress cDNA libraries for common beans (Phaseolus vulgaris L.)

PubMed Central

2011-01-01

Background Common bean is an important legume crop with only a moderate number of short expressed sequence tags (ESTs) made with traditional methods. The goal of this research was to use full-length cDNA technology to develop ESTs that would overlap with the beginning of open reading frames and therefore be useful for gene annotation of genomic sequences. The library was also constructed to represent genes expressed under drought, low soil phosphorus and high soil aluminum toxicity. We also undertook comparisons of the full-length cDNA library to two previous non-full clone EST sets for common bean. Results Two full-length cDNA libraries were constructed: one for the drought tolerant Mesoamerican genotype BAT477 and the other one for the acid-soil tolerant Andean genotype G19833 which has been selected for genome sequencing. Plants were grown in three soil types using deep rooting cylinders subjected to drought and non-drought stress and tissues were collected from both roots and above ground parts. A total of 20,000 clones were selected robotically, half from each library. Then, nearly 10,000 clones from the G19833 library were sequenced with an average read length of 850 nucleotides. A total of 4,219 unigenes were identified consisting of 2,981 contigs and 1,238 singletons. These were functionally annotated with gene ontology terms and placed into KEGG pathways. Compared to other EST sequencing efforts in common bean, about half of the sequences were novel or represented the 5' ends of known genes. Conclusions The present full-length cDNA libraries add to the technological toolbox available for common bean and our sequencing of these clones substantially increases the number of unique EST sequences available for the common bean genome. All of this should be useful for both functional gene annotation, analysis of splice site variants and intron/exon boundary determination by comparison to soybean genes or with common bean whole-genome sequences. In addition the library has a large number of transcription factors and will be interesting for discovery and validation of drought or abiotic stress related genes in common bean. PMID:22118559
Sequencing and comparing whole mitochondrial genomes ofanimals

DOE Office of Scientific and Technical Information (OSTI.GOV)

Boore, Jeffrey L.; Macey, J. Robert; Medina, Monica

2005-04-22

Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, that can be especially powerful. We describe here the protocols commonly used for physically isolating mtDNA, for amplifying these by PCR or RCA, for cloning,sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based onmore » our experiences to date with determining and comparing complete mtDNA sequences.« less
De novo Transcriptome Assembly of Common Wild Rice (Oryza rufipogon Griff.) and Discovery of Drought-Response Genes in Root Tissue Based on Transcriptomic Data.

PubMed

Tian, Xin-Jie; Long, Yan; Wang, Jiao; Zhang, Jing-Wen; Wang, Yan-Yan; Li, Wei-Min; Peng, Yu-Fa; Yuan, Qian-Hua; Pei, Xin-Wu

2015-01-01

The perennial O. rufipogon (common wild rice), which is considered to be the ancestor of Asian cultivated rice species, contains many useful genetic resources, including drought resistance genes. However, few studies have identified the drought resistance and tissue-specific genes in common wild rice. In this study, transcriptome sequencing libraries were constructed, including drought-treated roots (DR) and control leaves (CL) and roots (CR). Using Illumina sequencing technology, we generated 16.75 million bases of high-quality sequence data for common wild rice and conducted de novo assembly and annotation of genes without prior genome information. These reads were assembled into 119,332 unigenes with an average length of 715 bp. A total of 88,813 distinct sequences (74.42% of unigenes) significantly matched known genes in the NCBI NT database. Differentially expressed gene (DEG) analysis showed that 3617 genes were up-regulated and 4171 genes were down-regulated in the CR library compared with the CL library. Among the DEGs, 535 genes were expressed in roots but not in shoots. A similar comparison between the DR and CR libraries showed that 1393 genes were up-regulated and 315 genes were down-regulated in the DR library compared with the CR library. Finally, 37 genes that were specifically expressed in roots were screened after comparing the DEGs identified in the above-described analyses. This study provides a transcriptome sequence resource for common wild rice plants and establishes a digital gene expression profile of wild rice plants under drought conditions using the assembled transcriptome data as a reference. Several tissue-specific and drought-stress-related candidate genes were identified, representing a fully characterized transcriptome and providing a valuable resource for genetic and genomic studies in plants.
Isolation, cloning, and characterization of a partial novel aro A gene in common reed (Phragmites australis).

PubMed

Taravat, Elham; Zebarjadi, Alireza; Kahrizi, Danial; Yari, Kheirollah

2015-05-01

Among the essential amino acids, phenylalanine, tryptophan, and tyrosine are aromatic amino acids which are synthesized by the shikimate pathway in plants and bacteria. Herbicide glyphosate can inhibit the biosynthesis of these amino acids. So, identification of the gene tolerant to glyphosate is very important. It has been shown that the common reed or Phragmites australis Cav. (Poaceae) is relatively tolerant to glyphosate. The aim of the current research is identification, cloning, sequencing, and registering of partial aro A gene of the common reed P. australis. The partial aro A gene of common reed (P. australis) was cloned in Escherichia coli and the amino acid sequence was identified/determined for the first time. This is the first report for isolation, cloning, and sequencing of a part of aro A gene from the common reed. A 670 bp fragment including two introns (86 bp and 289 bp) was obtained. The open reading frame (ORF) region in part of gene was encoded for 98 amino acids. Alignment showed high similarity among this region with Zea mays (L.) (Poaceae) (94.6%), Eleusine indica L. Gaertn (Poaceae) (94.2%), and Zoysia japonica Steud. (Poaceae) (94.2%). The alignment of amino acid sequence of the investigated part of the gene showed a homology with aro A from several other plants. This conserved region forms the enzyme active site. The alignment results of nucleotide and amino acid residues with related sequences showed that there are some differences among them. The relative glyphosate tolerance in the common reed may be related to these differences.
Evidence for Horizontal Gene Transfer in Evolution of Elongation Factor Tu in Enterococci

PubMed Central

Ke, Danbing; Boissinot, Maurice; Huletsky, Ann; Picard, François J.; Frenette, Johanne; Ouellette, Marc; Roy, Paul H.; Bergeron, Michel G.

2000-01-01

The elongation factor Tu, encoded by tuf genes, is a GTP binding protein that plays a central role in protein synthesis. One to three tuf genes per genome are present, depending on the bacterial species. Most low-G+C-content gram-positive bacteria carry only one tuf gene. We have designed degenerate PCR primers derived from consensus sequences of the tuf gene to amplify partial tuf sequences from 17 enterococcal species and other phylogenetically related species. The amplified DNA fragments were sequenced either by direct sequencing or by sequencing cloned inserts containing putative amplicons. Two different tuf genes (tufA and tufB) were found in 11 enterococcal species, including Enterococcus avium, Enterococcus casseliflavus, Enterococcus dispar, Enterococcus durans, Enterococcus faecium, Enterococcus gallinarum, Enterococcus hirae, Enterococcus malodoratus, Enterococcus mundtii, Enterococcus pseudoavium, and Enterococcus raffinosus. For the other six enterococcal species (Enterococcus cecorum, Enterococcus columbae, Enterococcus faecalis, Enterococcus sulfureus, Enterococcus saccharolyticus, and Enterococcus solitarius), only the tufA gene was present. Based on 16S rRNA gene sequence analysis, the 11 species having two tuf genes all have a common ancestor, while the six species having only one copy diverged from the enterococcal lineage before that common ancestor. The presence of one or two copies of the tuf gene in enterococci was confirmed by Southern hybridization. Phylogenetic analysis of tuf sequences demonstrated that the enterococcal tufA gene branches with the Bacillus, Listeria, and Staphylococcus genera, while the enterococcal tufB gene clusters with the genera Streptococcus and Lactococcus. Primary structure analysis showed that four amino acid residues encoded within the sequenced regions are conserved and unique to the enterococcal tufB genes and the tuf genes of streptococci and Lactococcus lactis. The data suggest that an ancestral streptococcus or a streptococcus-related species may have horizontally transferred a tuf gene to the common ancestor of the 11 enterococcal species which now carry two tuf genes. PMID:11092850
FunGene: the functional gene pipeline and repository.

PubMed

Fish, Jordan A; Chai, Benli; Wang, Qiong; Sun, Yanni; Brown, C Titus; Tiedje, James M; Cole, James R

2013-01-01

Ribosomal RNA genes have become the standard molecular markers for microbial community analysis for good reasons, including universal occurrence in cellular organisms, availability of large databases, and ease of rRNA gene region amplification and analysis. As markers, however, rRNA genes have some significant limitations. The rRNA genes are often present in multiple copies, unlike most protein-coding genes. The slow rate of change in rRNA genes means that multiple species sometimes share identical 16S rRNA gene sequences, while many more species share identical sequences in the short 16S rRNA regions commonly analyzed. In addition, the genes involved in many important processes are not distributed in a phylogenetically coherent manner, potentially due to gene loss or horizontal gene transfer. While rRNA genes remain the most commonly used markers, key genes in ecologically important pathways, e.g., those involved in carbon and nitrogen cycling, can provide important insights into community composition and function not obtainable through rRNA analysis. However, working with ecofunctional gene data requires some tools beyond those required for rRNA analysis. To address this, our Functional Gene Pipeline and Repository (FunGene; http://fungene.cme.msu.edu/) offers databases of many common ecofunctional genes and proteins, as well as integrated tools that allow researchers to browse these collections and choose subsets for further analysis, build phylogenetic trees, test primers and probes for coverage, and download aligned sequences. Additional FunGene tools are specialized to process coding gene amplicon data. For example, FrameBot produces frameshift-corrected protein and DNA sequences from raw reads while finding the most closely related protein reference sequence. These tools can help provide better insight into microbial communities by directly studying key genes involved in important ecological processes.
Vibrio chromosomes share common history.

PubMed

Kirkup, Benjamin C; Chang, LeeAnn; Chang, Sarah; Gevers, Dirk; Polz, Martin F

2010-05-10

While most gamma proteobacteria have a single circular chromosome, Vibrionales have two circular chromosomes. Horizontal gene transfer is common among Vibrios, and in light of this genetic mobility, it is an open question to what extent the two chromosomes themselves share a common history since their formation. Single copy genes from each chromosome (142 genes from chromosome I and 42 genes from chromosome II) were identified from 19 sequenced Vibrionales genomes and their phylogenetic comparison suggests consistent phylogenies for each chromosome. Additionally, study of the gene organization and phylogeny of the respective origins of replication confirmed the shared history. Thus, while elements within the chromosomes may have experienced significant genetic mobility, the backbones share a common history. This allows conclusions based on multilocus sequence analysis (MLSA) for one chromosome to be applied equally to both chromosomes.
Analyses of Methylomes Derived from Meso-American Common Bean (Phaseolus vulgaris L.) Using MeDIP-Seq and Whole Genome Sodium Bisulfite-Sequencing.

PubMed

Crampton, Mollee; Sripathi, Venkateswara R; Hossain, Khwaja; Kalavacharla, Venu

2016-01-01

Common bean (Phaseolus vulgaris L.) is economically important for its high protein, fiber, and micronutrient contents, with a relatively small genome size of ∼587 Mb. Common bean is genetically diverse with two major gene pools, Meso-American and Andean. The phenotypic variability within common bean is partly attributed to the genetic diversity and epigenetic changes that are largely influenced by environmental factors. It is well established that an important epigenetic regulator of gene expression is DNA methylation. Here, we present results generated from two high-throughput sequencing technologies, methylated DNA immunoprecipitation-sequencing (MeDIP-seq) and whole genome bisulfite-sequencing (BS-Seq). Our analyses revealed that this Meso-American common bean displays similar methylation patterns as other previously published plant methylomes, with CG ∼50%, CHG ∼30%, and CHH ∼2.7% methylation, however, these differ from the common bean reference methylome of Andean origin. We identified higher CG methylation levels in both promoter and genic regions than CHG and CHH contexts. Moreover, we found relatively higher CG methylation levels in genes than in promoters. Conversely, the CHG and CHH methylation levels were highest in promoters than in genes. This is the first genome-wide DNA methylation profiling study in a Meso-American common bean cultivar ("Sierra") using NGS approaches. Our long-term goal is to generate genome-wide epigenomic maps in common bean focusing on chromatin accessibility, histone modifications, and DNA methylation.
Analyses of Methylomes Derived from Meso-American Common Bean (Phaseolus vulgaris L.) Using MeDIP-Seq and Whole Genome Sodium Bisulfite-Sequencing

PubMed Central

Crampton, Mollee; Sripathi, Venkateswara R.; Hossain, Khwaja; Kalavacharla, Venu

2016-01-01

Common bean (Phaseolus vulgaris L.) is economically important for its high protein, fiber, and micronutrient contents, with a relatively small genome size of ∼587 Mb. Common bean is genetically diverse with two major gene pools, Meso-American and Andean. The phenotypic variability within common bean is partly attributed to the genetic diversity and epigenetic changes that are largely influenced by environmental factors. It is well established that an important epigenetic regulator of gene expression is DNA methylation. Here, we present results generated from two high-throughput sequencing technologies, methylated DNA immunoprecipitation-sequencing (MeDIP-seq) and whole genome bisulfite-sequencing (BS-Seq). Our analyses revealed that this Meso-American common bean displays similar methylation patterns as other previously published plant methylomes, with CG ∼50%, CHG ∼30%, and CHH ∼2.7% methylation, however, these differ from the common bean reference methylome of Andean origin. We identified higher CG methylation levels in both promoter and genic regions than CHG and CHH contexts. Moreover, we found relatively higher CG methylation levels in genes than in promoters. Conversely, the CHG and CHH methylation levels were highest in promoters than in genes. This is the first genome-wide DNA methylation profiling study in a Meso-American common bean cultivar (“Sierra”) using NGS approaches. Our long-term goal is to generate genome-wide epigenomic maps in common bean focusing on chromatin accessibility, histone modifications, and DNA methylation. PMID:27199997
Transcript Profiling of Common Bean (Phaseolus vulgaris L.) Using the GeneChip(R) Soybean Genome Array: Optimizing Analysis by Masking Biased Probes

USDA-ARS?s Scientific Manuscript database

Common bean (Phaseolus vulgaris) and soybean (Glycine max) both belong to the Phaseoleae tribe and share significant coding sequence homology. This suggests that the GeneChip(R) Soybean Genome Array (soybean GeneChip) may be used for gene expression studies using common bean. To evaluate the utility...
Resequencing of the common marmoset genome improves genome assemblies and gene-coding sequence analysis.

PubMed

Sato, Kengo; Kuroki, Yoko; Kumita, Wakako; Fujiyama, Asao; Toyoda, Atsushi; Kawai, Jun; Iriki, Atsushi; Sasaki, Erika; Okano, Hideyuki; Sakakibara, Yasubumi

2015-11-20

The first draft of the common marmoset (Callithrix jacchus) genome was published by the Marmoset Genome Sequencing and Analysis Consortium. The draft was based on whole-genome shotgun sequencing, and the current assembly version is Callithrix_jacches-3.2.1, but there still exist 187,214 undetermined gap regions and supercontigs and relatively short contigs that are unmapped to chromosomes in the draft genome. We performed resequencing and assembly of the genome of common marmoset by deep sequencing with high-throughput sequencing technology. Several different sequence runs using Illumina sequencing platforms were executed, and 181 Gbp of high-quality bases including mate-pairs with long insert lengths of 3, 8, 20, and 40 Kbp were obtained, that is, approximately 60× coverage. The resequencing significantly improved the MGSAC draft genome sequence. The N50 of the contigs, which is a statistical measure used to evaluate assembly quality, doubled. As a result, 51% of the contigs (total length: 299 Mbp) that were unmapped to chromosomes in the MGSAC draft were merged with chromosomal contigs, and the improved genome sequence helped to detect 5,288 new genes that are homologous to human cDNAs and the gaps in 5,187 transcripts of the Ensembl gene annotations were completely filled.
Origins of Genes: "Big Bang" or Continuous Creation?

NASA Astrophysics Data System (ADS)

Kesse, Paul K.; Gibbs, Adrian

1992-10-01

Many protein families are common to all cellular organisms, indicating that many genes have ancient origins. Genetic variation is mostly attributed to processes such as mutation, duplication, and rearrangement of ancient modules. Thus it is widely assumed that much of present-day genetic diversity can be traced by common ancestry to a molecular "big bang." A rarely considered alternative is that proteins may arise continuously de novo. One mechanism of generating different coding sequences is by "overprinting," in which an existing nucleotide sequence is translated de novo in a different reading frame or from noncoding open reading frames. The clearest evidence for overprinting is provided when the original gene function is retained, as in overlapping genes. Analysis of their phylogenies indicates which are the original genes and which are their informationally novel partners. We report here the phylogenetic relationships of overlapping coding sequences from steroid-related receptor genes and from tymovirus, luteovirus, and lentivirus genomes. For each pair of overlapping coding sequences, one is confined to a single lineage, whereas the other is more widespread. This suggests that the phylogenetically restricted coding sequence arose only in the progenitor of that lineage by translating an out-of-frame sequence to yield the new polypeptide. The production of novel exons by alternative splicing in thyroid receptor and lentivirus genes suggests that introns can be a valuable evolutionary source for overprinting. New genes and their products may drive major evolutionary changes.
ExprAlign - the identification of ESTs in non-model species by alignment of cDNA microarray expression profiles

PubMed Central

2009-01-01

Background Sequence identification of ESTs from non-model species offers distinct challenges particularly when these species have duplicated genomes and when they are phylogenetically distant from sequenced model organisms. For the common carp, an environmental model of aquacultural interest, large numbers of ESTs remained unidentified using BLAST sequence alignment. We have used the expression profiles from large-scale microarray experiments to suggest gene identities. Results Expression profiles from ~700 cDNA microarrays describing responses of 7 major tissues to multiple environmental stressors were used to define a co-expression landscape. This was based on the Pearsons correlation coefficient relating each gene with all other genes, from which a network description provided clusters of highly correlated genes as 'mountains'. We show that these contain genes with known identities and genes with unknown identities, and that the correlation constitutes evidence of identity in the latter. This procedure has suggested identities to 522 of 2701 unknown carp ESTs sequences. We also discriminate several common carp genes and gene isoforms that were not discriminated by BLAST sequence alignment alone. Precision in identification was substantially improved by use of data from multiple tissues and treatments. Conclusion The detailed analysis of co-expression landscapes is a sensitive technique for suggesting an identity for the large number of BLAST unidentified cDNAs generated in EST projects. It is capable of detecting even subtle changes in expression profiles, and thereby of distinguishing genes with a common BLAST identity into different identities. It benefits from the use of multiple treatments or contrasts, and from the large-scale microarray data. PMID:19939286
Exome-wide Sequencing Shows Low Mutation Rates and Identifies Novel Mutated Genes in Seminomas.

PubMed

Cutcutache, Ioana; Suzuki, Yuka; Tan, Iain Beehuat; Ramgopal, Subhashini; Zhang, Shenli; Ramnarayanan, Kalpana; Gan, Anna; Lee, Heng Hong; Tay, Su Ting; Ooi, Aikseng; Ong, Choon Kiat; Bolthouse, Jonathan T; Lane, Brian R; Anema, John G; Kahnoski, Richard J; Tan, Patrick; Teh, Bin Tean; Rozen, Steven G

2015-07-01

Testicular germ cell tumors are the most common cancer diagnosed in young men, and seminomas are the most common type of these cancers. There have been no exome-wide examinations of genes mutated in seminomas or of overall rates of nonsilent somatic mutations in these tumors. The objective was to analyze somatic mutations in seminomas to determine which genes are affected and to determine rates of nonsilent mutations. Eight seminomas and matched normal samples were surgically obtained from eight patients. DNA was extracted from tissue samples and exome sequenced on massively parallel Illumina DNA sequencers. Single-nucleotide polymorphism chip-based copy number analysis was also performed to assess copy number alterations. The DNA sequencing read data were analyzed to detect somatic mutations including single-nucleotide substitutions and short insertions and deletions. The detected mutations were validated by independent sequencing and further checked for subclonality. The rate of nonsynonymous somatic mutations averaged 0.31 mutations/Mb. We detected nonsilent somatic mutations in 96 genes that were not previously known to be mutated in seminomas, of which some may be driver mutations. Many of the mutations appear to have been present in subclonal populations. In addition, two genes, KIT and KRAS, were affected in two tumors each with mutations that were previously observed in other cancers and are presumably oncogenic. Our study, the first report on exome sequencing of seminomas, detected somatic mutations in 96 new genes, several of which may be targetable drivers. Furthermore, our results show that seminoma mutation rates are five times higher than previously thought, but are nevertheless low compared to other common cancers. Similar low rates are seen in other cancers that also have excellent rates of remission achieved with chemotherapy. We examined the DNA sequences of seminomas, the most common type of testicular germ cell cancer. Our study identified 96 new genes in which mutations occurred during seminoma development, some of which might contribute to cancer development or progression. The study also showed that the rates of DNA mutations during seminoma development are higher than previously thought, but still lower than for other common solid-organ cancers. Such low rates are also observed among other cancers that, like seminomas, show excellent rates of disease remission after chemotherapy. Copyright © 2015 European Association of Urology. Published by Elsevier B.V. All rights reserved.
Complete chloroplast genome sequence of common bermudagrass (Cynodon dactylon (L.) Pers.) and comparative analysis within the family Poaceae

PubMed Central

Huang, Ya-Yi; Cho, Shu-Ting; Haryono, Mindia; Kuo, Chih-Horng

2017-01-01

Common bermudagrass (Cynodon dactylon (L.) Pers.) belongs to the subfamily Chloridoideae of the Poaceae family, one of the most important plant families ecologically and economically. This grass has a long connection with human culture but its systematics is relatively understudied. In this study, we sequenced and investigated the chloroplast genome of common bermudagrass, which is 134,297 bp in length with two single copy regions (LSC: 79,732 bp; SSC: 12,521 bp) and a pair of inverted repeat (IR) regions (21,022 bp). The annotation contains a total of 128 predicted genes, including 82 protein-coding, 38 tRNA, and 8 rRNA genes. Additionally, our in silico analyses identified 10 sets of repeats longer than 20 bp and predicted the presence of 36 RNA editing sites. Overall, the chloroplast genome of common bermudagrass resembles those from other Poaceae lineages. Compared to most angiosperms, the accD gene and the introns of both clpP and rpoC1 genes are missing. Additionally, the ycf1, ycf2, ycf15, and ycf68 genes are pseudogenized and two genome rearrangements exist. Our phylogenetic analysis based on 47 chloroplast protein-coding genes supported the placement of common bermudagrass within Chloridoideae. Our phylogenetic character mapping based on the parsimony principle further indicated that the loss of the accD gene and clpP introns, the pseudogenization of four ycf genes, and the two rearrangements occurred only once after the most recent common ancestor of the Poaceae diverged from other monocots, which could explain the unusual long branch leading to the Poaceae when phylogeny is inferred based on chloroplast sequences. PMID:28617867
Complete chloroplast genome sequence of common bermudagrass (Cynodon dactylon (L.) Pers.) and comparative analysis within the family Poaceae.

PubMed

Huang, Ya-Yi; Cho, Shu-Ting; Haryono, Mindia; Kuo, Chih-Horng

2017-01-01

Common bermudagrass (Cynodon dactylon (L.) Pers.) belongs to the subfamily Chloridoideae of the Poaceae family, one of the most important plant families ecologically and economically. This grass has a long connection with human culture but its systematics is relatively understudied. In this study, we sequenced and investigated the chloroplast genome of common bermudagrass, which is 134,297 bp in length with two single copy regions (LSC: 79,732 bp; SSC: 12,521 bp) and a pair of inverted repeat (IR) regions (21,022 bp). The annotation contains a total of 128 predicted genes, including 82 protein-coding, 38 tRNA, and 8 rRNA genes. Additionally, our in silico analyses identified 10 sets of repeats longer than 20 bp and predicted the presence of 36 RNA editing sites. Overall, the chloroplast genome of common bermudagrass resembles those from other Poaceae lineages. Compared to most angiosperms, the accD gene and the introns of both clpP and rpoC1 genes are missing. Additionally, the ycf1, ycf2, ycf15, and ycf68 genes are pseudogenized and two genome rearrangements exist. Our phylogenetic analysis based on 47 chloroplast protein-coding genes supported the placement of common bermudagrass within Chloridoideae. Our phylogenetic character mapping based on the parsimony principle further indicated that the loss of the accD gene and clpP introns, the pseudogenization of four ycf genes, and the two rearrangements occurred only once after the most recent common ancestor of the Poaceae diverged from other monocots, which could explain the unusual long branch leading to the Poaceae when phylogeny is inferred based on chloroplast sequences.
Common Viral Integration Sites Identified in Avian Leukosis Virus-Induced B-Cell Lymphomas

PubMed Central

Justice, James F.; Morgan, Robin W.

2015-01-01

ABSTRACT Avian leukosis virus (ALV) induces B-cell lymphoma and other neoplasms in chickens by integrating within or near cancer genes and perturbing their expression. Four genes—MYC, MYB, Mir-155, and TERT—have previously been identified as common integration sites in these virus-induced lymphomas and are thought to play a causal role in tumorigenesis. In this study, we employ high-throughput sequencing to identify additional genes driving tumorigenesis in ALV-induced B-cell lymphomas. In addition to the four genes implicated previously, we identify other genes as common integration sites, including TNFRSF1A, MEF2C, CTDSPL, TAB2, RUNX1, MLL5, CXorf57, and BACH2. We also analyze the genome-wide ALV integration landscape in vivo and find increased frequency of ALV integration near transcriptional start sites and within transcripts. Previous work has shown ALV prefers a weak consensus sequence for integration in cultured human cells. We confirm this consensus sequence for ALV integration in vivo in the chicken genome. PMID:26670384
Origins of genes: "big bang" or continuous creation?

PubMed Central

Keese, P K; Gibbs, A

1992-01-01

Many protein families are common to all cellular organisms, indicating that many genes have ancient origins. Genetic variation is mostly attributed to processes such as mutation, duplication, and rearrangement of ancient modules. Thus it is widely assumed that much of present-day genetic diversity can be traced by common ancestry to a molecular "big bang." A rarely considered alternative is that proteins may arise continuously de novo. One mechanism of generating different coding sequences is by "overprinting," in which an existing nucleotide sequence is translated de novo in a different reading frame or from noncoding open reading frames. The clearest evidence for overprinting is provided when the original gene function is retained, as in overlapping genes. Analysis of their phylogenies indicates which are the original genes and which are their informationally novel partners. We report here the phylogenetic relationships of overlapping coding sequences from steroid-related receptor genes and from tymovirus, luteovirus, and lentivirus genomes. For each pair of overlapping coding sequences, one is confined to a single lineage, whereas the other is more widespread. This suggests that the phylogenetically restricted coding sequence arose only in the progenitor of that lineage by translating an out-of-frame sequence to yield the new polypeptide. The production of novel exons by alternative splicing in thyroid receptor and lentivirus genes suggests that introns can be a valuable evolutionary source for overprinting. New genes and their products may drive major evolutionary changes. PMID:1329098
Comparative Analysis of Vertebrate Dystrophin Loci Indicate Intron Gigantism as a Common Feature

PubMed Central

Pozzoli, Uberto; Elgar, Greg; Cagliani, Rachele; Riva, Laura; Comi, Giacomo P.; Bresolin, Nereo; Bardoni, Alessandra; Sironi, Manuela

2003-01-01

The human DMD gene is the largest known to date, spanning > 2000 kb on the X chromosome. The gene size is mainly accounted for by huge intronic regions. We sequenced 190 kb of Fugu rubripes (pufferfish) genomic DNA corresponding to the complete dystrophin gene (FrDMD) and provide the first report of gene structure and sequence comparison among dystrophin genomic sequences from different vertebrate organisms. Almost all intron positions and phases are conserved between FrDMD and its mammalian counterparts, and the predicted protein product of the Fugu gene displays 55% identity and 71% similarity to human dystrophin. In analogy to the human gene, FrDMD presents several-fold longer than average intronic regions. Analysis of intron sequences of the human and murine genes revealed that they are extremely conserved in size and that a similar fraction of total intron length is represented by repetitive elements; moreover, our data indicate that intron expansion through repeat accumulation in the two orthologs is the result of independent insertional events. The hypothesis that intron length might be functionally relevant to the DMD gene regulation is proposed and substantiated by the finding that dystrophin intron gigantism is common to the three vertebrate genes. [Supplemental material is available online at www.genome.org.] PMID:12727896

Spectrum of mutations in leiomyosarcomas identified by clinical targeted next-generation sequencing.

PubMed

Lee, Paul J; Yoo, Naomi S; Hagemann, Ian S; Pfeifer, John D; Cottrell, Catherine E; Abel, Haley J; Duncavage, Eric J

2017-02-01

Recurrent genomic mutations in uterine and non-uterine leiomyosarcomas have not been well established. Using a next generation sequencing (NGS) panel of common cancer-associated genes, 25 leiomyosarcomas arising from multiple sites were examined to explore genetic alterations, including single nucleotide variants (SNV), small insertions/deletions (indels), and copy number alterations (CNA). Sequencing showed 86 non-synonymous, coding region somatic variants within 151 gene targets in 21 cases, with a mean of 4.1 variants per case; 4 cases had no putative mutations in the panel of genes assayed. The most frequently altered genes were TP53 (36%), ATM and ATRX (16%), and EGFR and RB1 (12%). CNA were identified in 85% of cases, with the most frequent copy number losses observed in chromosomes 10 and 13 including PTEN and RB1; the most frequent gains were seen in chromosomes 7 and 17. Our data show that deletions in canonical cancer-related genes are common in leiomyosarcomas. Further, the spectrum of gene mutations observed shows that defects in DNA repair and chromosomal maintenance are central to the biology of leiomyosarcomas, and that activating mutations observed in other common cancer types are rare in leiomyosarcomas. Copyright © 2017 Elsevier Inc. All rights reserved.
Genome Wide Search for Biomarkers to Diagnose Yersinia Infections.

PubMed

Kalia, Vipin Chandra; Kumar, Prasun

2015-12-01

Bacterial identification on the basis of the highly conserved 16S rRNA (rrs) gene is limited by its presence in multiple copies and a very high level of similarity among them. The need is to look for other genes with unique characteristics to be used as biomarkers. Fifty-one sequenced genomes belonging to 10 different Yersinia species were used for searching genes common to all the genomes. Out of 304 common genes, 34 genes of sizes varying from 0.11 to 4.42 kb, were selected and subjected to in silico digestion with 10 different Restriction endonucleases (RE) (4-6 base cutters). Yersinia species have 6-7 copies of rrs per genome, which are difficult to distinguish by multiple sequence alignments or their RE digestion patterns. However, certain unique combinations of other common gene sequences-carB, fadJ, gluM, gltX, ileS, malE, nusA, ribD, and rlmL and their RE digestion patterns can be used as markers for identifying 21 strains belonging to 10 Yersinia species: Y. aldovae, Y. enterocolitica, Y. frederiksenii, Y. intermedia, Y. kristensenii, Y. pestis, Y. pseudotuberculosis, Y. rohdei, Y. ruckeri, and Y. similis. This approach can be applied for rapid diagnostic applications.
[High gene conversion frequency between genes encoding 2-deoxyglucose-6-phosphate phosphatase in 3 Saccharomyces species].

PubMed

Piscopo, Sara-Pier; Drouin, Guy

2014-05-01

Gene conversions are nonreciprocal sequence exchanges between genes. They are relatively common in Saccharomyces cerevisiae, but few studies have investigated the evolutionary fate of gene conversions or their functional impacts. Here, we analyze the evolution and impact of gene conversions between the two genes encoding 2-deoxyglucose-6-phosphate phosphatase in S. cerevisiae, Saccharomyces paradoxus and Saccharomyces mikatae. Our results demonstrate that the last half of these genes are subject to gene conversions among these three species. The greater similarity and the greater percentage of GC nucleotides in the converted regions, as well as the absence of long regions of adjacent common converted sites, suggest that these gene conversions are frequent and occur independently in all three species. The high frequency of these conversions probably result from the fact that they have little impact on the protein sequences encoded by these genes.
Complete Chloroplast Genome Sequence of Tartary Buckwheat (Fagopyrum tataricum) and Comparative Analysis with Common Buckwheat (F. esculentum)

PubMed Central

Cho, Kwang-Soo; Yun, Bong-Kyoung; Yoon, Young-Ho; Hong, Su-Young; Mekapogu, Manjulatha; Kim, Kyung-Hee; Yang, Tae-Jin

2015-01-01

We report the chloroplast (cp) genome sequence of tartary buckwheat (Fagopyrum tataricum) obtained by next-generation sequencing technology and compared this with the previously reported common buckwheat (F. esculentum ssp. ancestrale) cp genome. The cp genome of F. tataricum has a total sequence length of 159,272 bp, which is 327 bp shorter than the common buckwheat cp genome. The cp gene content, order, and orientation are similar to those of common buckwheat, but with some structural variation at tandem and palindromic repeat frequencies and junction areas. A total of seven InDels (around 100 bp) were found within the intergenic sequences and the ycf1 gene. Copy number variation of the 21-bp tandem repeat varied in F. tataricum (four repeats) and F. esculentum (one repeat), and the InDel of the ycf1 gene was 63 bp long. Nucleotide and amino acid have highly conserved coding sequence with about 98% homology and four genes—rpoC2, ycf3, accD, and clpP—have high synonymous (Ks) value. PCR based InDel markers were applied to diverse genetic resources of F. tataricum and F. esculentum, and the amplicon size was identical to that expected in silico. Therefore, these InDel markers are informative biomarkers to practically distinguish raw or processed buckwheat products derived from F. tataricum and F. esculentum. PMID:25966355
Using the TIGR gene index databases for biological discovery.

PubMed

Lee, Yuandan; Quackenbush, John

2003-11-01

The TIGR Gene Index web pages provide access to analyses of ESTs and gene sequences for nearly 60 species, as well as a number of resources derived from these. Each species-specific database is presented using a common format with a homepage. A variety of methods exist that allow users to search each species-specific database. Methods implemented currently include nucleotide or protein sequence queries using WU-BLAST, text-based searches using various sequence identifiers, searches by gene, tissue and library name, and searches using functional classes through Gene Ontology assignments. This protocol provides guidance for using the Gene Index Databases to extract information.
Epigenetics in Prostate Cancer

PubMed Central

Albany, Costantine; Alva, Ajjai S.; Aparicio, Ana M.; Singal, Rakesh; Yellapragada, Sarvari; Sonpavde, Guru; Hahn, Noah M.

2011-01-01

Prostate cancer (PC) is the most commonly diagnosed nonskin malignancy and the second most common cause of cancer death among men in the United States. Epigenetics is the study of heritable changes in gene expression caused by mechanisms other than changes in the underlying DNA sequences. Two common epigenetic mechanisms, DNA methylation and histone modification, have demonstrated critical roles in prostate cancer growth and metastasis. DNA hypermethylation of cytosine-guanine (CpG) rich sequence islands within gene promoter regions is widespread during neoplastic transformation of prostate cells, suggesting that treatment-induced restoration of a “normal” epigenome could be clinically beneficial. Histone modification leads to altered tumor gene function by changing chromosome structure and the level of gene transcription. The reversibility of epigenetic aberrations and restoration of tumor suppression gene function have made them attractive targets for prostate cancer treatment with modulators that demethylate DNA and inhibit histone deacetylases. PMID:22191037
Epigenetics in prostate cancer.

PubMed

Albany, Costantine; Alva, Ajjai S; Aparicio, Ana M; Singal, Rakesh; Yellapragada, Sarvari; Sonpavde, Guru; Hahn, Noah M

2011-01-01

Prostate cancer (PC) is the most commonly diagnosed nonskin malignancy and the second most common cause of cancer death among men in the United States. Epigenetics is the study of heritable changes in gene expression caused by mechanisms other than changes in the underlying DNA sequences. Two common epigenetic mechanisms, DNA methylation and histone modification, have demonstrated critical roles in prostate cancer growth and metastasis. DNA hypermethylation of cytosine-guanine (CpG) rich sequence islands within gene promoter regions is widespread during neoplastic transformation of prostate cells, suggesting that treatment-induced restoration of a "normal" epigenome could be clinically beneficial. Histone modification leads to altered tumor gene function by changing chromosome structure and the level of gene transcription. The reversibility of epigenetic aberrations and restoration of tumor suppression gene function have made them attractive targets for prostate cancer treatment with modulators that demethylate DNA and inhibit histone deacetylases.
Molecular cloning, characterization and expression analysis of TLR9, MyD88 and TRAF6 genes in common carp (Cyprinus carpio).

PubMed

Kongchum, Pawapol; Hallerman, Eric M; Hulata, Gideon; David, Lior; Palti, Yniv

2011-01-01

Induction of innate immune pathways is critical for early host defense, but there is limited understanding of how teleost fishes recognize pathogen molecules and activate these pathways. In mammals, cells of the innate immune system detect pathogenic molecular structures using pattern recognition receptors (PRRs). TLR9 functions as a PRR that recognizes CpG motifs in bacterial and viral DNA and requires adaptor molecules MyD88 and TRAF6 for signal transduction. Here we report full-length cDNA isolation, structural characterization and tissue mRNA expression analysis of the common carp (cc) TLR9, MyD88 and TRAF6 gene orthologs. The ccTLR9 open-reading frame (ORF) is predicted to encode a 1064-amino acid (aa) protein. We found that MyD88 and TRAF6 genes are duplicated in common carp. This is the first report of TRAF6 duplication in a vertebrate genome and stronger evidence in support of MyD88 duplication is provided. The ccMyD88a and b ORFs are predicted to encode 288-aa and 284-aa peptides, respectively. They share 91% aa sequence identity between paralogs. The ccTRAF6a and b ORFs are both predicted to encode 543-aa peptides sharing 95% aa sequence identity between paralogs. The ccTLR9 gene is contained in a single large exon. The ccMyD88a and ccMyD88b coding sequences span five exons. The TRAF6b gene spans six exons. PCR amplification to obtain the entire coding sequence of ccTRAF6a gene was not successful. The 2104-bp fragment amplified covers the 3' end of the gene and it contains a partial sequence of one exon and three complete exons. The predicated protein domains of the ccTLR9, ccMyD88 and ccTRAF6 are conserved and resemble orthologs from other vertebrates. Real-time quantitative PCR assays of the ccTLR9, MyD88a and b, and TRAF6a and b gene transcripts in healthy common carp indicated that mRNA expression varied between tissues. Differential expression of duplicate copies were found for ccMyD88 and ccTRAF6 in white and red muscle tissues, suggesting that paralogs may have evolved and attained a new function. The genomic information we describe in this paper provides evidence of sequence and structural conservation of immune response genes in common carp. Published by Elsevier Ltd.
Xanthomonas adaptation to common bean is associated with horizontal transfers of genes encoding TAL effectors.

PubMed

Ruh, Mylène; Briand, Martial; Bonneau, Sophie; Jacques, Marie-Agnès; Chen, Nicolas W G

2017-08-30

Common bacterial blight is a devastating bacterial disease of common bean (Phaseolus vulgaris) caused by Xanthomonas citri pv. fuscans and Xanthomonas phaseoli pv. phaseoli. These phylogenetically distant strains are able to cause similar symptoms on common bean, suggesting that they have acquired common genetic determinants of adaptation to common bean. Transcription Activator-Like (TAL) effectors are bacterial type III effectors that are able to induce the expression of host genes to promote infection or resistance. Their capacity to bind to a specific host DNA sequence suggests that they are potential candidates for host adaption. To study the diversity of tal genes from Xanthomonas strains responsible for common bacterial blight of bean, whole genome sequences of 17 strains representing the diversity of X. citri pv. fuscans and X. phaseoli pv. phaseoli were obtained by single molecule real time sequencing. Analysis of these genomes revealed the existence of four tal genes named tal23A, tal20F, tal18G and tal18H, respectively. While tal20F and tal18G were chromosomic, tal23A and tal18H were carried on plasmids and shared between phylogenetically distant strains, therefore suggesting recent horizontal transfers of these genes between X. citri pv. fuscans and X. phaseoli pv. phaseoli strains. Strikingly, tal23A was present in all strains studied, suggesting that it played an important role in adaptation to common bean. In silico predictions of TAL effectors targets in the common bean genome suggested that TAL effectors shared by X. citri pv. fuscans and X. phaseoli pv. phaseoli strains target the promoters of genes of similar functions. This could be a trace of convergent evolution among TAL effectors from different phylogenetic groups, and comforts the hypothesis that TAL effectors have been implied in the adaptation to common bean. Altogether, our results favour a model where plasmidic TAL effectors are able to contribute to host adaptation by being horizontally transferred between distant lineages.
Exome Sequencing of 18 Chinese Families with Congenital Cataracts: A New Sight of the NHS Gene

PubMed Central

Sun, Wenmin; Xiao, Xueshan; Li, Shiqiang; Guo, Xiangming; Zhang, Qingjiong

2014-01-01

Purpose The aim of this study was to investigate the mutation spectrum and frequency of 34 known genes in 18 Chinese families with congenital cataracts. Methods Genomic DNA and clinical data was collected from 18 families with congenital cataracts. Variations in 34 cataract-associated genes were screened by whole exome sequencing and then validated by Sanger sequencing. Results Eleven candidate variants in seven of the 34 genes were detected by exome sequencing and then confirmed by Sanger sequencing, including two variants predicted to be benign and the other pathogenic mutations. The nine mutations were present in 9 of the 18 (50%) families with congenital cataracts. Of the four families with mutations in the X-linked NHS gene, no other abnormalities were recorded except for cataract, in which a pseudo-dominant inheritance form was suggested, as female carriers also had different forms of cataracts. Conclusion This study expands the mutation spectrum and frequency of genes responsible for congenital cataract. Mutation in NHS is a common cause of nonsyndromic congenital cataract with pseudo-autosomal dominant inheritance. Combined with our previous studies, a genetic basis could be identified in 67.6% of families with congenital cataracts in our case series, in which mutations in genes encoding crystallins, genes encoding connexins, and NHS are responsible for 29.4%, 14.7%, and 11.8% of families, respectively. Our results suggest that mutations in NHS are the common cause of congenital cataract, both syndromic and nonsyndromic. PMID:24968223
Transcript Profiling of Common Bean (Phaseolus vulgaris) Using the GeneChip Soybean Genome Array: Optimizing Analysis by Masking Biased Probes

USDA-ARS?s Scientific Manuscript database

Common bean (Phaseolus vulgaris) and soybean (Glycine max) both belong to the Phaseoleae tribe and share significant coding sequence homology. To evaluate the utility of the soybean GeneChip for transcript profiling of common bean, we hybridized cRNAs purified from nodule, leaf, and root of common b...
Speciation of polyploid Cyprinidae fish of common carp, crucian carp, and silver crucian carp derived from duplicated Hox genes.

PubMed

Yuan, Jian; He, Zhuzi; Yuan, Xiangnan; Jiang, Xiayun; Sun, Xiaowen; Zou, Shuming

2010-09-15

Recent studies on comparative genomics have suggested that a round of fish-specific whole genome duplication (3R) in ray-finned fishes might have occurred around 226-316 Mya. Additional genome duplication, specifically in cyprinids, may have occurred more recently after the divergence of the teleosts. The timing of this event, however, is unknown. To address this question, we sequenced four Hox genes from taxa representing the polyploid Cyprinidae fish, common carp (Cyprinus carpio, 2n=100), crucian carp (Carassius auratus auratus, 2n=100), and silver crucian carp (C. auratus gibelio, 2n=156), and then compared them with known sequences from the diploid Cyprinidae fish, blunt snout bream (Megalobrama amblycephala, 2n=48). Our results showed the presence of two distinct Hox duplicates in the genomes of common and crucian carp. Three distinct Hox sequences, one of them orthologous to a Hox gene in common carp and the other two orthologous to a Hox gene in crucian carp, were isolated in silver crucian carp, indicating a possible hybrid origin of silver crucian carp from crucian and common carp. The gene duplication resulting in the origin of the common ancestor of common and crucian carp likely occurred around 10.9-13.2 Mya. The speciations of common vs. crucian carp and silver crucian vs. crucian carp likely occurred around 8.1-11.4 and 2.3-3.0 Mya, respectively. Finally, nonfunctionalization resulting from point mutations in the coding region is a probable fate for some Hox duplicates. Taken together, these results suggested an evolutionary model for polyploidization in speciation and diversification of polyploid fish. (c) 2010 Wiley-Liss, Inc.
Genome wide re-sequencing of newly developed Rice Lines from common wild rice (Oryza rufipogon Griff.) for the identification of NBS-LRR genes.

PubMed

Liu, Wen; Ghouri, Fozia; Yu, Hang; Li, Xiang; Yu, Shuhong; Shahid, Muhammad Qasim; Liu, Xiangdong

2017-01-01

Common wild rice (Oryza rufipogon Griff.) is an important germplasm for rice breeding, which contains many resistance genes. Re-sequencing provides an unprecedented opportunity to explore the abundant useful genes at whole genome level. Here, we identified the nucleotide-binding site leucine-rich repeat (NBS-LRR) encoding genes by re-sequencing of two wild rice lines (i.e. Huaye 1 and Huaye 2) that were developed from common wild rice. We obtained 128 to 147 million reads with approximately 32.5-fold coverage depth, and uniquely covered more than 89.6% (> = 1 fold) of reference genomes. Two wild rice lines showed high SNP (single-nucleotide polymorphisms) variation rate in 12 chromosomes against the reference genomes of Nipponbare (japonica cultivar) and 93-11 (indica cultivar). InDels (insertion/deletion polymorphisms) count-length distribution exhibited normal distribution in the two lines, and most of the InDels were ranged from -5 to 5 bp. With reference to the Nipponbare genome sequence, we detected a total of 1,209,308 SNPs, 161,117 InDels and 4,192 SVs (structural variations) in Huaye 1, and 1,387,959 SNPs, 180,226 InDels and 5,305 SVs in Huaye 2. A total of 44.9% and 46.9% genes exhibited sequence variations in two wild rice lines compared to the Nipponbare and 93-11 reference genomes, respectively. Analysis of NBS-LRR mutant candidate genes showed that they were mainly distributed on chromosome 11, and NBS domain was more conserved than LRR domain in both wild rice lines. NBS genes depicted higher levels of genetic diversity in Huaye 1 than that found in Huaye 2. Furthermore, protein-protein interaction analysis showed that NBS genes mostly interacted with the cytochrome C protein (Os05g0420600, Os01g0885000 and BGIOSGA038922), while some NBS genes interacted with heat shock protein, DNA-binding activity, Phosphoinositide 3-kinase and a coiled coil region. We explored abundant NBS-LRR encoding genes in two common wild rice lines through genome wide re-sequencing, which proved to be a useful tool to exploit elite NBS-LRR genes in wild rice. The data here provide a foundation for future work aimed at dissecting the genetic basis of disease resistance in rice, and the two wild rice lines will be useful germplasm for the molecular improvement of cultivated rice.
Genome wide re-sequencing of newly developed Rice Lines from common wild rice (Oryza rufipogon Griff.) for the identification of NBS-LRR genes

PubMed Central

Yu, Hang; Li, Xiang; Yu, Shuhong; Shahid, Muhammad Qasim

2017-01-01

Common wild rice (Oryza rufipogon Griff.) is an important germplasm for rice breeding, which contains many resistance genes. Re-sequencing provides an unprecedented opportunity to explore the abundant useful genes at whole genome level. Here, we identified the nucleotide-binding site leucine-rich repeat (NBS-LRR) encoding genes by re-sequencing of two wild rice lines (i.e. Huaye 1 and Huaye 2) that were developed from common wild rice. We obtained 128 to 147 million reads with approximately 32.5-fold coverage depth, and uniquely covered more than 89.6% (> = 1 fold) of reference genomes. Two wild rice lines showed high SNP (single-nucleotide polymorphisms) variation rate in 12 chromosomes against the reference genomes of Nipponbare (japonica cultivar) and 93–11 (indica cultivar). InDels (insertion/deletion polymorphisms) count-length distribution exhibited normal distribution in the two lines, and most of the InDels were ranged from -5 to 5 bp. With reference to the Nipponbare genome sequence, we detected a total of 1,209,308 SNPs, 161,117 InDels and 4,192 SVs (structural variations) in Huaye 1, and 1,387,959 SNPs, 180,226 InDels and 5,305 SVs in Huaye 2. A total of 44.9% and 46.9% genes exhibited sequence variations in two wild rice lines compared to the Nipponbare and 93–11 reference genomes, respectively. Analysis of NBS-LRR mutant candidate genes showed that they were mainly distributed on chromosome 11, and NBS domain was more conserved than LRR domain in both wild rice lines. NBS genes depicted higher levels of genetic diversity in Huaye 1 than that found in Huaye 2. Furthermore, protein-protein interaction analysis showed that NBS genes mostly interacted with the cytochrome C protein (Os05g0420600, Os01g0885000 and BGIOSGA038922), while some NBS genes interacted with heat shock protein, DNA-binding activity, Phosphoinositide 3-kinase and a coiled coil region. We explored abundant NBS-LRR encoding genes in two common wild rice lines through genome wide re-sequencing, which proved to be a useful tool to exploit elite NBS-LRR genes in wild rice. The data here provide a foundation for future work aimed at dissecting the genetic basis of disease resistance in rice, and the two wild rice lines will be useful germplasm for the molecular improvement of cultivated rice. PMID:28700714
Enlightenment of Yeast Mitochondrial Homoplasmy: Diversified Roles of Gene Conversion

PubMed Central

Ling, Feng; Mikawa, Tsutomu; Shibata, Takehiko

2011-01-01

Mitochondria have their own genomic DNA. Unlike the nuclear genome, each cell contains hundreds to thousands of copies of mitochondrial DNA (mtDNA). The copies of mtDNA tend to have heterogeneous sequences, due to the high frequency of mutagenesis, but are quickly homogenized within a cell (“homoplasmy”) during vegetative cell growth or through a few sexual generations. Heteroplasmy is strongly associated with mitochondrial diseases, diabetes and aging. Recent studies revealed that the yeast cell has the machinery to homogenize mtDNA, using a common DNA processing pathway with gene conversion; i.e., both genetic events are initiated by a double-stranded break, which is processed into 3′ single-stranded tails. One of the tails is base-paired with the complementary sequence of the recipient double-stranded DNA to form a D-loop (homologous pairing), in which repair DNA synthesis is initiated to restore the sequence lost by the breakage. Gene conversion generates sequence diversity, depending on the divergence between the donor and recipient sequences, especially when it occurs among a number of copies of a DNA sequence family with some sequence variations, such as in immunoglobulin diversification in chicken. MtDNA can be regarded as a sequence family, in which the members tend to be diversified by a high frequency of spontaneous mutagenesis. Thus, it would be interesting to determine why and how double-stranded breakage and D-loop formation induce sequence homogenization in mitochondria and sequence diversification in nuclear DNA. We will review the mechanisms and roles of mtDNA homoplasmy, in contrast to nuclear gene conversion, which diversifies gene and genome sequences, to provide clues toward understanding how the common DNA processing pathway results in such divergent outcomes. PMID:24710143
Function-Based Algorithms for Biological Sequences

ERIC Educational Resources Information Center

Mohanty, Pragyan Sheela P.

2015-01-01

Two problems at two different abstraction levels of computational biology are studied. At the molecular level, efficient pattern matching algorithms in DNA sequences are presented. For gene order data, an efficient data structure is presented capable of storing all gene re-orderings in a systematic manner. A common characteristic of presented…
Detailed transcriptome description of the neglected cestode Taenia multiceps.

PubMed

Wu, Xuhang; Fu, Yan; Yang, Deying; Zhang, Runhui; Zheng, Wanpeng; Nie, Huaming; Xie, Yue; Yan, Ning; Hao, Guiying; Gu, Xiaobin; Wang, Shuxian; Peng, Xuerong; Yang, Guangyou

2012-01-01

The larval stage of Taenia multiceps, a global cestode, encysts in the central nervous system (CNS) of sheep and other livestock. This frequently leads to their death and huge socioeconomic losses, especially in developing countries. This parasite can also cause zoonotic infections in humans, but has been largely neglected due to a lack of diagnostic techniques and studies. Recent developments in next-generation sequencing provide an opportunity to explore the transcriptome of T. multiceps. We obtained a total of 31,282 unigenes (mean length 920 bp) using Illumina paired-end sequencing technology and a new Trinity de novo assembler without a referenced genome. Individual transcription molecules were determined by sequence-based annotations and/or domain-based annotations against public databases (Nr, UniprotKB/Swiss-Prot, COG, KEGG, UniProtKB/TrEMBL, InterPro and Pfam). We identified 26,110 (83.47%) unigenes and inferred 20,896 (66.8%) coding sequences (CDS). Further comparative transcripts analysis with other cestodes (Taenia pisiformis, Taenia solium, Echincoccus granulosus and Echincoccus multilocularis) and intestinal parasites (Trichinella spiralis, Ancylostoma caninum and Ascaris suum) showed that 5,100 common genes were shared among three Taenia tapeworms, 261 conserved genes were detected among five Taeniidae cestodes, and 109 common genes were found in four zoonotic intestinal parasites. Some of the common genes were genes required for parasite survival, involved in parasite-host interactions. In addition, we amplified two full-length CDS of unigenes from the common genes using RT-PCR. This study provides an extensive transcriptome of the adult stage of T. multiceps, and demonstrates that comparative transcriptomic investigations deserve to be further studied. This transcriptome dataset forms a substantial public information platform to achieve a fundamental understanding of the biology of T. multiceps, and helps in the identification of drug targets and parasite-host interaction studies.
Identification of presumed ancestral DNA sequences of phaseolin in Phaseolus vulgaris.

PubMed Central

Kami, J; Velásquez, V B; Debouck, D G; Gepts, P

1995-01-01

Common bean (Phaseolus vulgaris) consists of two major geographic gene pools, one distributed in Mexico, Central America, and Colombia and the other in the southern Andes (southern Peru, Bolivia, and Argentina). Amplification and sequencing of members of the multigene family coding for phaseolin, the major seed storage protein of the common bean, provide evidence for accumulation of tandem direct repeats in both introns and exons during evolution of the multigene family in this species. The presumed ancestral phaseolin sequences, without tandem repeats, were found in recently discovered but nearly extinct wild common bean populations of Ecuador and northern Peru that are intermediate between the two major gene pools of the species based on geographical and molecular arguments. Our results illustrate the usefulness of tandem direct repeats in establishing the polarity of DNA sequence divergence and therefore in proposing phylogenies. Images Fig. 1 Fig. 3 PMID:7862642
LinkFinder: An expert system that constructs phylogenic trees

NASA Technical Reports Server (NTRS)

Inglehart, James; Nelson, Peter C.

1991-01-01

An expert system has been developed using the C Language Integrated Production System (CLIPS) that automates the process of constructing DNA sequence based phylogenies (trees or lineages) that indicate evolutionary relationships. LinkFinder takes as input homologous DNA sequences from distinct individual organisms. It measures variations between the sequences, selects appropriate proportionality constants, and estimates the time that has passed since each pair of organisms diverged from a common ancestor. It then designs and outputs a phylogenic map summarizing these results. LinkFinder can find genetic relationships between different species, and between individuals of the same species, including humans. It was designed to take advantage of the vast amount of sequence data being produced by the Genome Project, and should be of value to evolution theorists who wish to utilize this data, but who have no formal training in molecular genetics. Evolutionary theory holds that distinct organisms carrying a common gene inherited that gene from a common ancestor. Homologous genes vary from individual to individual and species to species, and the amount of variation is now believed to be directly proportional to the time that has passed since divergence from a common ancestor. The proportionality constant must be determined experimentally; it varies considerably with the types of organisms and DNA molecules under study. Given an appropriate constant, and the variation between two DNA sequences, a simple linear equation gives the divergence time.
Small tandemly repeated DNA sequences of higher plants likely originate from a tRNA gene ancestor.

PubMed Central

Benslimane, A A; Dron, M; Hartmann, C; Rode, A

1986-01-01

Several monomers (177 bp) of a tandemly arranged repetitive nuclear DNA sequence of Brassica oleracea have been cloned and sequenced. They share up to 95% homology between one another and up to 80% with other satellite DNA sequences of Cruciferae, suggesting a common ancestor. Both strands of these monomers show more than 50% homology with many tRNA genes; the best homologies have been obtained with Lys and His yeast mitochondrial tRNA genes (respectively 64% and 60%). These results suggest that small tandemly repeated DNA sequences of plants may have evolved from a tRNA gene ancestor. These tandem repeats have probably arisen via a process involving reverse transcription of polymerase III RNA intermediates, as is the case for interspersed DNA sequences of mammalians. A model is proposed to explain the formation of such small tandemly repeated DNA sequences. Images PMID:3774553

Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2's q2-feature-classifier plugin.

PubMed

Bokulich, Nicholas A; Kaehler, Benjamin D; Rideout, Jai Ram; Dillon, Matthew; Bolyen, Evan; Knight, Rob; Huttley, Gavin A; Gregory Caporaso, J

2018-05-17

Taxonomic classification of marker-gene sequences is an important step in microbiome analysis. We present q2-feature-classifier ( https://github.com/qiime2/q2-feature-classifier ), a QIIME 2 plugin containing several novel machine-learning and alignment-based methods for taxonomy classification. We evaluated and optimized several commonly used classification methods implemented in QIIME 1 (RDP, BLAST, UCLUST, and SortMeRNA) and several new methods implemented in QIIME 2 (a scikit-learn naive Bayes machine-learning classifier, and alignment-based taxonomy consensus methods based on VSEARCH, and BLAST+) for classification of bacterial 16S rRNA and fungal ITS marker-gene amplicon sequence data. The naive-Bayes, BLAST+-based, and VSEARCH-based classifiers implemented in QIIME 2 meet or exceed the species-level accuracy of other commonly used methods designed for classification of marker gene sequences that were evaluated in this work. These evaluations, based on 19 mock communities and error-free sequence simulations, including classification of simulated "novel" marker-gene sequences, are available in our extensible benchmarking framework, tax-credit ( https://github.com/caporaso-lab/tax-credit-data ). Our results illustrate the importance of parameter tuning for optimizing classifier performance, and we make recommendations regarding parameter choices for these classifiers under a range of standard operating conditions. q2-feature-classifier and tax-credit are both free, open-source, BSD-licensed packages available on GitHub.
Exome Sequencing in Suspected Monogenic Dyslipidemias

PubMed Central

Stitziel, Nathan O.; Peloso, Gina M.; Abifadel, Marianne; Cefalu, Angelo B.; Fouchier, Sigrid; Motazacker, M. Mahdi; Tada, Hayato; Larach, Daniel B.; Awan, Zuhier; Haller, Jorge F.; Pullinger, Clive R.; Varret, Mathilde; Rabès, Jean-Pierre; Noto, Davide; Tarugi, Patrizia; Kawashiri, Masa-aki; Nohara, Atsushi; Yamagishi, Masakazu; Risman, Marjorie; Deo, Rahul; Ruel, Isabelle; Shendure, Jay; Nickerson, Deborah A.; Wilson, James G.; Rich, Stephen S.; Gupta, Namrata; Farlow, Deborah N.; Neale, Benjamin M.; Daly, Mark J.; Kane, John P.; Freeman, Mason W.; Genest, Jacques; Rader, Daniel J.; Mabuchi, Hiroshi; Kastelein, John J.P.; Hovingh, G. Kees; Averna, Maurizio R.; Gabriel, Stacey; Boileau, Catherine; Kathiresan, Sekar

2015-01-01

Background Exome sequencing is a promising tool for gene mapping in Mendelian disorders. We utilized this technique in an attempt to identify novel genes underlying monogenic dyslipidemias. Methods and Results We performed exome sequencing on 213 selected family members from 41 kindreds with suspected Mendelian inheritance of extreme levels of low-density lipoprotein (LDL) cholesterol (after candidate gene sequencing excluded known genetic causes for high LDL cholesterol families) or high-density lipoprotein (HDL) cholesterol. We used standard analytic approaches to identify candidate variants and also assigned a polygenic score to each individual in order to account for their burden of common genetic variants known to influence lipid levels. In nine families, we identified likely pathogenic variants in known lipid genes (ABCA1, APOB, APOE, LDLR, LIPA, and PCSK9); however, we were unable to identify obvious genetic etiologies in the remaining 32 families despite follow-up analyses. We identified three factors that limited novel gene discovery: (1) imperfect sequencing coverage across the exome hid potentially causal variants; (2) large numbers of shared rare alleles within families obfuscated causal variant identification; and (3) individuals from 15% of families carried a significant burden of common lipid-related alleles, suggesting complex inheritance can masquerade as monogenic disease. Conclusions We identified the genetic basis of disease in nine of 41 families; however, none of these represented novel gene discoveries. Our results highlight the promise and limitations of exome sequencing as a discovery technique in suspected monogenic dyslipidemias. Considering the confounders identified may inform the design of future exome sequencing studies. PMID:25632026
Gene-based SSR markers for common bean (Phaseolus vulgaris L.) derived from root and leaf tissue ESTs: an integration of the BMc series.

PubMed

Blair, Matthew W; Hurtado, Natalia; Chavarro, Carolina M; Muñoz-Torres, Monica C; Giraldo, Martha C; Pedraza, Fabio; Tomkins, Jeff; Wing, Rod

2011-03-22

Sequencing of cDNA libraries for the development of expressed sequence tags (ESTs) as well as for the discovery of simple sequence repeats (SSRs) has been a common method of developing microsatellites or SSR-based markers. In this research, our objective was to further sequence and develop common bean microsatellites from leaf and root cDNA libraries derived from the Andean gene pool accession G19833 and the Mesoamerican gene pool accession DOR364, mapping parents of a commonly used reference map. The root libraries were made from high and low phosphorus treated plants. A total of 3,123 EST sequences from leaf and root cDNA libraries were screened and used for direct simple sequence repeat discovery. From these EST sequences we found 184 microsatellites; the majority containing tri-nucleotide motifs, many of which were GC rich (ACC, AGC and AGG in particular). Di-nucleotide motif microsatellites were about half as common as the tri-nucleotide motif microsatellites but most of these were AGn microsatellites with a moderate number of ATn microsatellites in root ESTs followed by few ACn and no GCn microsatellites. Out of the 184 new SSR loci, 120 new microsatellite markers were developed in the BMc (Bean Microsatellites from cDNAs) series and these were evaluated for their capacity to distinguish bean diversity in a germplasm panel of 18 genotypes. We developed a database with images of the microsatellites and their polymorphism information content (PIC), which averaged 0.310 for polymorphic markers. The present study produced information about microsatellite frequency in root and leaf tissues of two important genotypes for common bean genomics: namely G19833, the Andean genotype selected for whole genome shotgun sequencing from race Peru, and DOR364 a race Mesoamerica subgroup 2 genotype that is a small-red seeded, released variety in Central America. Both race Peru and Mesoamerica subgroup 2 (small red beans) have been understudied in comparison to race Nueva Granada and Mesoamerica subgroup 1 (black beans) both with regards to gene expression and as sources of markers. However, we found few differences between SSR type and frequency between the G19833 leaf and DOR364 root tissue-derived ESTs. Overall, our work adds to the analysis of microsatellite frequency evaluation for common bean and provides a new set of 120 BMc markers which combined with the 248 previously developed BMc markers brings the total in this series to 368 markers. Once we include BMd markers, which are derived from GenBank sequences, the current total of gene-based markers from our laboratory surpasses 500 markers. These markers are basic for studies of the transcriptome of common bean and can form anchor points for genetic mapping studies in the future.
Chloroplast and nuclear gene sequences indicate late Pennsylvanian time for the last common ancestor of extant seed plants.

PubMed Central

Savard, L; Li, P; Strauss, S H; Chase, M W; Michaud, M; Bousquet, J

1994-01-01

We have estimated the time for the last common ancestor of extant seed plants by using molecular clocks constructed from the sequences of the chloroplastic gene coding for the large subunit of ribulose-1,5-bisphosphate carboxylase/oxygenase (rbcL) and the nuclear gene coding for the small subunit of rRNA (Rrn18). Phylogenetic analyses of nucleotide sequences indicated that the earliest divergence of extant seed plants is likely represented by a split between conifer-cycad and angiosperm lineages. Relative-rate tests were used to assess homogeneity of substitution rates among lineages, and annual angiosperms were found to evolve at a faster rate than other taxa for rbcL and, thus, these sequences were excluded from construction of molecular clocks. Five distinct molecular clocks were calibrated using substitution rates for the two genes and four divergence times based on fossil and published molecular clock estimates. The five estimated times for the last common ancestor of extant seed plants were in agreement with one another, with an average of 285 million years and a range of 275-290 million years. This implies a substantially more recent ancestor of all extant seed plants than suggested by some theories of plant evolution. PMID:8197201
Emergence of a Novel Chimeric Gene Underlying Grain Number in Rice

PubMed Central

Chen, Hao; Tang, Yanyan; Liu, Jianfeng; Tan, Lubin; Jiang, Jiahuan; Wang, Mumu; Zhu, Zuofeng; Sun, Xianyou; Sun, Chuanqing

2017-01-01

Grain number is an important factor in determining grain production of rice (Oryza sativa L.). The molecular genetic basis for grain number is complex. Discovering new genes involved in regulating rice grain number increases our knowledge regarding its molecular mechanisms and aids breeding programs. Here, we identified GRAINS NUMBER 2 (GN2), a novel gene that is responsible for rice grain number, from “Yuanjiang” common wild rice (O. rufipogon Griff.). Transgenic plants overexpressing GN2 showed less grain number, reduced plant height, and later heading date than control plants. Interestingly, GN2 arose through the insertion of a 1094-bp sequence from LOC_Os02g45150 into the third exon of LOC_Os02g56630, and the inserted sequence recruited its nearby sequence to generate the chimeric GN2. The gene structure and expression pattern of GN2 were distinct from those of LOC_Os02g45150 and LOC_Os02g56630. Sequence analysis showed that GN2 may be generated in the natural population of Yuanjiang common wild rice. In this study, we identified a novel functional chimeric gene and also provided information regarding the molecular mechanisms regulating rice grain number. PMID:27986805
htsint: a Python library for sequencing pipelines that combines data through gene set generation.

PubMed

Richards, Adam J; Herrel, Anthony; Bonneaud, Camille

2015-09-24

Sequencing technologies provide a wealth of details in terms of genes, expression, splice variants, polymorphisms, and other features. A standard for sequencing analysis pipelines is to put genomic or transcriptomic features into a context of known functional information, but the relationships between ontology terms are often ignored. For RNA-Seq, considering genes and their genetic variants at the group level enables a convenient way to both integrate annotation data and detect small coordinated changes between experimental conditions, a known caveat of gene level analyses. We introduce the high throughput data integration tool, htsint, as an extension to the commonly used gene set enrichment frameworks. The central aim of htsint is to compile annotation information from one or more taxa in order to calculate functional distances among all genes in a specified gene space. Spectral clustering is then used to partition the genes, thereby generating functional modules. The gene space can range from a targeted list of genes, like a specific pathway, all the way to an ensemble of genomes. Given a collection of gene sets and a count matrix of transcriptomic features (e.g. expression, polymorphisms), the gene sets produced by htsint can be tested for 'enrichment' or conditional differences using one of a number of commonly available packages. The database and bundled tools to generate functional modules were designed with sequencing pipelines in mind, but the toolkit nature of htsint allows it to also be used in other areas of genomics. The software is freely available as a Python library through GitHub at https://github.com/ajrichards/htsint.
A Recombinant of Bean common mosaic virus Induces Temperature-Insensitive Necrosis in an I Gene-Bearing Line of Common Bean.

PubMed

Feng, Xue; Poplawsky, Alan R; Karasev, Alexander V

2014-11-01

The I gene is a single, dominant gene conferring temperature-sensitive resistance to all known strains of Bean common mosaic virus (BCMV) in common bean (Phaseolus vulgaris). However, the closely related Bean common mosaic necrosis virus (BCMNV) induces whole plant necrosis in I-bearing genotypes of common bean, and the presence of additional, recessive genes is required to prevent this severe whole plant necrotic reaction caused by BCMNV. Almost all known BCMNV isolates have so far been classified as having pathotype VI based on their interactions with the five BCMV resistance genes, and all have a distinct serotype A. Here, we describe a new isolate of BCMV, RU1M, capable of inducing whole plant necrosis in the presence of the I gene, that appears to belong to pathotype VII and exhibits B-serotype. Unlike other isolates of BCMV, RU1M was able to induce severe whole plant necrosis below 30°C in bean cultivar Jubila that carries the I gene and a protective recessive gene bc-1. The whole genome of RU1M was cloned and sequenced and determined to be 9,953 nucleotides long excluding poly(A), coding for a single polyprotein of 3,186 amino acids. Most of the genome was found almost identical (>98%) to the BCMV isolate RU1-OR (also pathotype VII) that did not induce necrotic symptoms in 'Jubila'. Inspection of the nucleotide sequences for BCMV isolates RU1-OR, RU1M, and US10 (all pathotype VII) and three closely related sequences of BCMV isolates RU1P, RU1D, and RU1W (all pathotype VI) revealed that RU1M is a product of recombination between RU1-OR and a yet unknown potyvirus. A 0.8-kb fragment of an unknown origin in the RU1M genome may have led to its ability to induce necrosis regardless of temperature in beans carrying the I gene. This is the first report of a BCMV isolate inducing temperature-insensitive necrosis in an I gene containing bean genotype.
Genome-Wide Association Study Identifies NBS-LRR-Encoding Genes Related with Anthracnose and Common Bacterial Blight in the Common Bean.

PubMed

Wu, Jing; Zhu, Jifeng; Wang, Lanfen; Wang, Shumin

2017-01-01

Nucleotide-binding site and leucine-rich repeat (NBS-LRR) genes represent the largest and most important disease resistance genes in plants. The genome sequence of the common bean ( Phaseolus vulgaris L.) provides valuable data for determining the genomic organization of NBS-LRR genes. However, data on the NBS-LRR genes in the common bean are limited. In total, 178 NBS-LRR-type genes and 145 partial genes (with or without a NBS) located on 11 common bean chromosomes were identified from genome sequences database. Furthermore, 30 NBS-LRR genes were classified into Toll/interleukin-1 receptor (TIR)-NBS-LRR (TNL) types, and 148 NBS-LRR genes were classified into coiled-coil (CC)-NBS-LRR (CNL) types. Moreover, the phylogenetic tree supported the division of these PvNBS genes into two obvious groups, TNL types and CNL types. We also built expression profiles of NBS genes in response to anthracnose and common bacterial blight using qRT-PCR. Finally, we detected nine disease resistance loci for anthracnose (ANT) and seven for common bacterial blight (CBB) using the developed NBS-SSR markers. Among these loci, NSSR24, NSSR73, and NSSR265 may be located at new regions for ANT resistance, while NSSR65 and NSSR260 may be located at new regions for CBB resistance. Furthermore, we validated NSSR24, NSSR65, NSSR73, NSSR260, and NSSR265 using a new natural population. Our results provide useful information regarding the function of the NBS-LRR proteins and will accelerate the functional genomics and evolutionary studies of NBS-LRR genes in food legumes. NBS-SSR markers represent a wide-reaching resource for molecular breeding in the common bean and other food legumes. Collectively, our results should be of broad interest to bean scientists and breeders.
Genome-Wide Association Study Identifies NBS-LRR-Encoding Genes Related with Anthracnose and Common Bacterial Blight in the Common Bean

PubMed Central

Wu, Jing; Zhu, Jifeng; Wang, Lanfen; Wang, Shumin

2017-01-01

Nucleotide-binding site and leucine-rich repeat (NBS-LRR) genes represent the largest and most important disease resistance genes in plants. The genome sequence of the common bean (Phaseolus vulgaris L.) provides valuable data for determining the genomic organization of NBS-LRR genes. However, data on the NBS-LRR genes in the common bean are limited. In total, 178 NBS-LRR-type genes and 145 partial genes (with or without a NBS) located on 11 common bean chromosomes were identified from genome sequences database. Furthermore, 30 NBS-LRR genes were classified into Toll/interleukin-1 receptor (TIR)-NBS-LRR (TNL) types, and 148 NBS-LRR genes were classified into coiled-coil (CC)-NBS-LRR (CNL) types. Moreover, the phylogenetic tree supported the division of these PvNBS genes into two obvious groups, TNL types and CNL types. We also built expression profiles of NBS genes in response to anthracnose and common bacterial blight using qRT-PCR. Finally, we detected nine disease resistance loci for anthracnose (ANT) and seven for common bacterial blight (CBB) using the developed NBS-SSR markers. Among these loci, NSSR24, NSSR73, and NSSR265 may be located at new regions for ANT resistance, while NSSR65 and NSSR260 may be located at new regions for CBB resistance. Furthermore, we validated NSSR24, NSSR65, NSSR73, NSSR260, and NSSR265 using a new natural population. Our results provide useful information regarding the function of the NBS-LRR proteins and will accelerate the functional genomics and evolutionary studies of NBS-LRR genes in food legumes. NBS-SSR markers represent a wide-reaching resource for molecular breeding in the common bean and other food legumes. Collectively, our results should be of broad interest to bean scientists and breeders. PMID:28848595
Unique core genomes of the bacterial family vibrionaceae: insights into niche adaptation and speciation.

PubMed

Kahlke, Tim; Goesmann, Alexander; Hjerde, Erik; Willassen, Nils Peder; Haugen, Peik

2012-05-10

The criteria for defining bacterial species and even the concept of bacterial species itself are under debate, and the discussion is apparently intensifying as more genome sequence data is becoming available. However, it is still unclear how the new advances in genomics should be used most efficiently to address this question. In this study we identify genes that are common to any group of genomes in our dataset, to determine whether genes specific to a particular taxon exist and to investigate their potential role in adaptation of bacteria to their specific niche. These genes were named unique core genes. Additionally, we investigate the existence and importance of unique core genes that are found in isolates of phylogenetically non-coherent groups. These groups of isolates, that share a genetic feature without sharing a closest common ancestor, are termed genophyletic groups. The bacterial family Vibrionaceae was used as the model, and we compiled and compared genome sequences of 64 different isolates. Using the software orthoMCL we determined clusters of homologous genes among the investigated genome sequences. We used multilocus sequence analysis to build a host phylogeny and mapped the numbers of unique core genes of all distinct groups of isolates onto the tree. The results show that unique core genes are more likely to be found in monophyletic groups of isolates. Genophyletic groups of isolates, in contrast, are less common especially for large groups of isolate. The subsequent annotation of unique core genes that are present in genophyletic groups indicate a high degree of horizontally transferred genes. Finally, the annotation of the unique core genes of Vibrio cholerae revealed genes involved in aerotaxis and biosynthesis of the iron-chelator vibriobactin. The presented work indicates that genes specific for any taxon inside the bacterial family Vibrionaceae exist. These unique core genes encode conserved metabolic functions that can shed light on the adaptation of a species to its ecological niche. Additionally, our study suggests that unique core genes can be used to aid classification of bacteria and contribute to a bacterial species definition on a genomic level. Furthermore, these genes may be of importance in clinical diagnostics and drug development.
New genes from old: asymmetric divergence of gene duplicates and the evolution of development.

PubMed

Holland, Peter W H; Marlétaz, Ferdinand; Maeso, Ignacio; Dunwell, Thomas L; Paps, Jordi

2017-02-05

Gene duplications and gene losses have been frequent events in the evolution of animal genomes, with the balance between these two dynamic processes contributing to major differences in gene number between species. After gene duplication, it is common for both daughter genes to accumulate sequence change at approximately equal rates. In some cases, however, the accumulation of sequence change is highly uneven with one copy radically diverging from its paralogue. Such 'asymmetric evolution' seems commoner after tandem gene duplication than after whole-genome duplication, and can generate substantially novel genes. We describe examples of asymmetric evolution in duplicated homeobox genes of moths, molluscs and mammals, in each case generating new homeobox genes that were recruited to novel developmental roles. The prevalence of asymmetric divergence of gene duplicates has been underappreciated, in part, because the origin of highly divergent genes can be difficult to resolve using standard phylogenetic methods.This article is part of the themed issue 'Evo-devo in the genomics era, and the origins of morphological diversity'. © 2016 The Author(s).
Genomic insight into the common carp (Cyprinus carpio) genome by sequencing analysis of BAC-end sequences

PubMed Central

2011-01-01

Background Common carp is one of the most important aquaculture teleost fish in the world. Common carp and other closely related Cyprinidae species provide over 30% aquaculture production in the world. However, common carp genomic resources are still relatively underdeveloped. BAC end sequences (BES) are important resources for genome research on BAC-anchored genetic marker development, linkage map and physical map integration, and whole genome sequence assembling and scaffolding. Result To develop such valuable resources in common carp (Cyprinus carpio), a total of 40,224 BAC clones were sequenced on both ends, generating 65,720 clean BES with an average read length of 647 bp after sequence processing, representing 42,522,168 bp or 2.5% of common carp genome. The first survey of common carp genome was conducted with various bioinformatics tools. The common carp genome contains over 17.3% of repetitive elements with GC content of 36.8% and 518 transposon ORFs. To identify and develop BAC-anchored microsatellite markers, a total of 13,581 microsatellites were detected from 10,355 BES. The coding region of 7,127 genes were recognized from 9,443 BES on 7,453 BACs, with 1,990 BACs have genes on both ends. To evaluate the similarity to the genome of closely related zebrafish, BES of common carp were aligned against zebrafish genome. A total of 39,335 BES of common carp have conserved homologs on zebrafish genome which demonstrated the high similarity between zebrafish and common carp genomes, indicating the feasibility of comparative mapping between zebrafish and common carp once we have physical map of common carp. Conclusion BAC end sequences are great resources for the first genome wide survey of common carp. The repetitive DNA was estimated to be approximate 28% of common carp genome, indicating the higher complexity of the genome. Comparative analysis had mapped around 40,000 BES to zebrafish genome and established over 3,100 microsyntenies, covering over 50% of the zebrafish genome. BES of common carp are tremendous tools for comparative mapping between the two closely related species, zebrafish and common carp, which should facilitate both structural and functional genome analysis in common carp. PMID:21492448
The cytochrome oxidase subunit I and subunit III genes in Oenothera mitochondria are transcribed from identical promoter sequences

PubMed Central

Hiesel, Rudolf; Schobel, Werner; Schuster, Wolfgang; Brennicke, Axel

1987-01-01

Two loci encoding subunit III of the cytochrome oxidase (COX) in Oenothera mitochondria have been identified from a cDNA library of mitochondrial transcripts. A 657-bp sequence block upstream from the open reading frame is also present in the two copies of the COX subunit I gene and is presumably involved in homologous sequence rearrangement. The proximal points of sequence rearrangements are located 3 bp upstream from the COX I and 1139 bp upstream from the COX III initiation codons. The 5'-termini of both COX I and COX III mRNAs have been mapped in this common sequence confining the promoter region for the Oenothera mitochondrial COX I and COX III genes to the homologous sequence block. ImagesFig. 5. PMID:15981332
Sequence Analysis of Mitochondrial Genome of Toxascaris leonina from a South China Tiger.

PubMed

Li, Kangxin; Yang, Fang; Abdullahi, A Y; Song, Meiran; Shi, Xianli; Wang, Minwei; Fu, Yeqi; Pan, Weida; Shan, Fang; Chen, Wu; Li, Guoqing

2016-12-01

Toxascaris leonina is a common parasitic nematode of wild mammals and has significant impacts on the protection of rare wild animals. To analyze population genetic characteristics of T. leonina from South China tiger, its mitochondrial (mt) genome was sequenced. Its complete circular mt genome was 14,277 bp in length, including 12 protein-coding genes, 22 tRNA genes, 2 rRNA genes, and 2 non-coding regions. The nucleotide composition was biased toward A and T. The most common start codon and stop codon were TTG and TAG, and 4 genes ended with an incomplete stop codon. There were 13 intergenic regions ranging 1 to 10 bp in size. Phylogenetically, T. leonina from a South China tiger was close to canine T. leonina . This study reports for the first time a complete mt genome sequence of T. leonina from the South China tiger, and provides a scientific basis for studying the genetic diversity of nematodes between different hosts.
SNP discovery in common bean by restriction-associated DNA (RAD) sequencing for genetic diversity and population structure analysis.

PubMed

Valdisser, Paula Arielle M R; Pappas, Georgios J; de Menezes, Ivandilson P P; Müller, Bárbara S F; Pereira, Wendell J; Narciso, Marcelo G; Brondani, Claudio; Souza, Thiago L P O; Borba, Tereza C O; Vianello, Rosana P

2016-06-01

Researchers have made great advances into the development and application of genomic approaches for common beans, creating opportunities to driving more real and applicable strategies for sustainable management of the genetic resource towards plant breeding. This work provides useful polymorphic single-nucleotide polymorphisms (SNPs) for high-throughput common bean genotyping developed by RAD (restriction site-associated DNA) sequencing. The RAD tags were generated from DNA pooled from 12 common bean genotypes, including breeding lines of different gene pools and market classes. The aligned sequences identified 23,748 putative RAD-SNPs, of which 3357 were adequate for genotyping; 1032 RAD-SNPs with the highest ADT (assay design tool) score are presented in this article. The RAD-SNPs were structurally annotated in different coding (47.00 %) and non-coding (53.00 %) sequence components of genes. A subset of 384 RAD-SNPs with broad genome distribution was used to genotype a diverse panel of 95 common bean germplasms and revealed a successful amplification rate of 96.6 %, showing 73 % of polymorphic SNPs within the Andean group and 83 % in the Mesoamerican group. A slightly increased He (0.161, n = 21) value was estimated for the Andean gene pool, compared to the Mesoamerican group (0.156, n = 74). For the linkage disequilibrium (LD) analysis, from a group of 580 SNPs (289 RAD-SNPs and 291 BARC-SNPs) genotyped for the same set of genotypes, 70.2 % were in LD, decreasing to 0.10 %in the Andean group and 0.77 % in the Mesoamerican group. Haplotype patterns spanning 310 Mb of the genome (60 %) were characterized in samples from different origins. However, the haplotype frameworks were under-represented for the Andean (7.85 %) and Mesoamerican (5.55 %) gene pools separately. In conclusion, RAD sequencing allowed the discovery of hundreds of useful SNPs for broad genetic analysis of common bean germplasm. From now, this approach provides an excellent panel of molecular tools for whole genome analysis, allowing integrating and better exploring the common bean breeding practices.
Wheat-specific gene, ribosomal protein l21, used as the endogenous reference gene for qualitative and real-time quantitative polymerase chain reaction detection of transgenes.

PubMed

Liu, Yi-Ke; Li, He-Ping; Huang, Tao; Cheng, Wei; Gao, Chun-Sheng; Zuo, Dong-Yun; Zhao, Zheng-Xi; Liao, Yu-Cai

2014-10-29

Wheat-specific ribosomal protein L21 (RPL21) is an endogenous reference gene suitable for genetically modified (GM) wheat identification. This taxon-specific RPL21 sequence displayed high homogeneity in different wheat varieties. Southern blots revealed 1 or 3 copies, and sequence analyses showed one amplicon in common wheat. Combined analyses with sequences from common wheat (AABBDD) and three diploid ancestral species, Triticum urartu (AA), Aegilops speltoides (BB), and Aegilops tauschii (DD), demonstrated the presence of this amplicon in the AA genome. Using conventional qualitative polymerase chain reaction (PCR), the limit of detection was 2 copies of wheat haploid genome per reaction. In the quantitative real-time PCR assay, limits of detection and quantification were about 2 and 8 haploid genome copies, respectively, the latter of which is 2.5-4-fold lower than other reported wheat endogenous reference genes. Construct-specific PCR assays were developed using RPL21 as an endogenous reference gene, and as little as 0.5% of GM wheat contents containing Arabidopsis NPR1 were properly quantified.
Cancer genes mutation profiling in calcifying epithelial odontogenic tumour.

PubMed

de Sousa, Sílvia Ferreira; Diniz, Marina Gonçalves; França, Josiane Alves; Fontes Pereira, Thaís Dos Santos; Moreira, Rennan Garcias; Santos, Jean Nunes Dos; Gomez, Ricardo Santiago; Gomes, Carolina Cavalieri

2018-03-01

To identify calcifying epithelial odontogenic tumour (CEOT) mutations in oncogenes and tumour suppressor genes. A panel of 50 genes commonly mutated in cancer was sequenced in CEOT by next-generation sequencing. Sanger sequencing was used to cover the region of the frameshift deletion identified in one sample. Missense single nucleotide variants (SNVs) with minor allele frequency (MAF) <1% were detected in PTEN , MET and JAK3 . A frameshift deletion in CDKN2A occurred in association with a missense mutation in the same gene region, suggesting a second hit in the inactivation of this gene. APC, KDR, KIT, PIK3CA and TP53 missense SNVs were identified; however, these are common SNVs, showing MAF >1%. CEOT harbours mutations in the tumour suppressor PTEN and CDKN2A and in the oncogenes JAK3 and MET . As these mutations occurred in only one case each, they are probably not driver mutations for these tumours. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Sequencing artifacts in the type A influenza databases and attempts to correct them.

PubMed

Suarez, David L; Chester, Nikki; Hatfield, Jason

2014-07-01

There are over 276 000 influenza gene sequences in public databases, with the quality of the sequences determined by the contributor. As part of a high school class project, influenza sequences with possible errors were identified in the public databases based on the size of the gene being longer than expected, with the hypothesis that these sequences would have an error. Students contacted sequence submitters alerting them of the possible sequence issue(s) and requested they the suspect sequence(s) be correct as appropriate. Type A influenza viruses were screened, and gene segments longer than the accepted size were identified for further analysis. Attention was placed on sequences with additional nucleotides upstream or downstream of the highly conserved non-coding ends of the viral segments. A total of 1081 sequences were identified that met this criterion. Three types of errors were commonly observed: non-influenza primer sequence wasn't removed from the sequence; PCR product was cloned and plasmid sequence was included in the sequence; and Taq polymerase added an adenine at the end of the PCR product. Internal insertions of nucleotide sequence were also commonly observed, but in many cases it was unclear if the sequence was correct or actually contained an error. A total of 215 sequences, or 22.8% of the suspect sequences, were corrected in the public databases in the first year of the student project. Unfortunately 138 additional sequences with possible errors were added to the databases in the second year. Additional awareness of the need for data integrity of sequences submitted to public databases is needed to fully reap the benefits of these large data sets. © 2014 The Authors. Influenza and Other Respiratory Viruses Published by John Wiley & Sons Ltd.
New genes emerging for colorectal cancer predisposition.

PubMed

Esteban-Jurado, Clara; Garre, Pilar; Vila, Maria; Lozano, Juan José; Pristoupilova, Anna; Beltrán, Sergi; Abulí, Anna; Muñoz, Jenifer; Balaguer, Francesc; Ocaña, Teresa; Castells, Antoni; Piqué, Josep M; Carracedo, Angel; Ruiz-Ponte, Clara; Bessa, Xavier; Andreu, Montserrat; Bujanda, Luis; Caldés, Trinidad; Castellví-Bel, Sergi

2014-02-28

Colorectal cancer (CRC) is one of the most frequent neoplasms and an important cause of mortality in the developed world. This cancer is caused by both genetic and environmental factors although 35% of the variation in CRC susceptibility involves inherited genetic differences. Mendelian syndromes account for about 5% of the total burden of CRC, with Lynch syndrome and familial adenomatous polyposis the most common forms. Excluding hereditary forms, there is an important fraction of CRC cases that present familial aggregation for the disease with an unknown germline genetic cause. CRC can be also considered as a complex disease taking into account the common disease-commom variant hypothesis with a polygenic model of inheritance where the genetic components of common complex diseases correspond mostly to variants of low/moderate effect. So far, 30 common, low-penetrance susceptibility variants have been identified for CRC. Recently, new sequencing technologies including exome- and whole-genome sequencing have permitted to add a new approach to facilitate the identification of new genes responsible for human disease predisposition. By using whole-genome sequencing, germline mutations in the POLE and POLD1 genes have been found to be responsible for a new form of CRC genetic predisposition called polymerase proofreading-associated polyposis.
Horizontal gene transfer in Histophilus somni and its role in the evolution of pathogenic strain 2336, as determined by comparative genomic analyses

PubMed Central

2011-01-01

Background Pneumonia and myocarditis are the most commonly reported diseases due to Histophilus somni, an opportunistic pathogen of the reproductive and respiratory tracts of cattle. Thus far only a few genes involved in metabolic and virulence functions have been identified and characterized in H. somni using traditional methods. Analyses of the genome sequences of several Pasteurellaceae species have provided insights into their biology and evolution. In view of the economic and ecological importance of H. somni, the genome sequence of pneumonia strain 2336 has been determined and compared to that of commensal strain 129Pt and other members of the Pasteurellaceae. Results The chromosome of strain 2336 (2,263,857 bp) contained 1,980 protein coding genes, whereas the chromosome of strain 129Pt (2,007,700 bp) contained only 1,792 protein coding genes. Although the chromosomes of the two strains differ in size, their average GC content, gene density (total number of genes predicted on the chromosome), and percentage of sequence (number of genes) that encodes proteins were similar. The chromosomes of these strains also contained a number of discrete prophage regions and genomic islands. One of the genomic islands in strain 2336 contained genes putatively involved in copper, zinc, and tetracycline resistance. Using the genome sequence data and comparative analyses with other members of the Pasteurellaceae, several H. somni genes that may encode proteins involved in virulence (e.g., filamentous haemaggutinins, adhesins, and polysaccharide biosynthesis/modification enzymes) were identified. The two strains contained a total of 17 ORFs that encode putative glycosyltransferases and some of these ORFs had characteristic simple sequence repeats within them. Most of the genes/loci common to both the strains were located in different regions of the two chromosomes and occurred in opposite orientations, indicating genome rearrangement since their divergence from a common ancestor. Conclusions Since the genome of strain 129Pt was ~256,000 bp smaller than that of strain 2336, these genomes provide yet another paradigm for studying evolutionary gene loss and/or gain in regard to virulence repertoire and pathogenic ability. Analyses of the complete genome sequences revealed that bacteriophage- and transposon-mediated horizontal gene transfer had occurred at several loci in the chromosomes of strains 2336 and 129Pt. It appears that these mobile genetic elements have played a major role in creating genomic diversity and phenotypic variability among the two H. somni strains. PMID:22111657

Horizontal gene transfer in Histophilus somni and its role in the evolution of pathogenic strain 2336, as determined by comparative genomic analyses.

PubMed

Siddaramappa, Shivakumara; Challacombe, Jean F; Duncan, Alison J; Gillaspy, Allison F; Carson, Matthew; Gipson, Jenny; Orvis, Joshua; Zaitshik, Jeremy; Barnes, Gentry; Bruce, David; Chertkov, Olga; Detter, J Chris; Han, Cliff S; Tapia, Roxanne; Thompson, Linda S; Dyer, David W; Inzana, Thomas J

2011-11-23

Pneumonia and myocarditis are the most commonly reported diseases due to Histophilus somni, an opportunistic pathogen of the reproductive and respiratory tracts of cattle. Thus far only a few genes involved in metabolic and virulence functions have been identified and characterized in H. somni using traditional methods. Analyses of the genome sequences of several Pasteurellaceae species have provided insights into their biology and evolution. In view of the economic and ecological importance of H. somni, the genome sequence of pneumonia strain 2336 has been determined and compared to that of commensal strain 129Pt and other members of the Pasteurellaceae. The chromosome of strain 2336 (2,263,857 bp) contained 1,980 protein coding genes, whereas the chromosome of strain 129Pt (2,007,700 bp) contained only 1,792 protein coding genes. Although the chromosomes of the two strains differ in size, their average GC content, gene density (total number of genes predicted on the chromosome), and percentage of sequence (number of genes) that encodes proteins were similar. The chromosomes of these strains also contained a number of discrete prophage regions and genomic islands. One of the genomic islands in strain 2336 contained genes putatively involved in copper, zinc, and tetracycline resistance. Using the genome sequence data and comparative analyses with other members of the Pasteurellaceae, several H. somni genes that may encode proteins involved in virulence (e.g., filamentous haemaggutinins, adhesins, and polysaccharide biosynthesis/modification enzymes) were identified. The two strains contained a total of 17 ORFs that encode putative glycosyltransferases and some of these ORFs had characteristic simple sequence repeats within them. Most of the genes/loci common to both the strains were located in different regions of the two chromosomes and occurred in opposite orientations, indicating genome rearrangement since their divergence from a common ancestor. Since the genome of strain 129Pt was ~256,000 bp smaller than that of strain 2336, these genomes provide yet another paradigm for studying evolutionary gene loss and/or gain in regard to virulence repertoire and pathogenic ability. Analyses of the complete genome sequences revealed that bacteriophage- and transposon-mediated horizontal gene transfer had occurred at several loci in the chromosomes of strains 2336 and 129Pt. It appears that these mobile genetic elements have played a major role in creating genomic diversity and phenotypic variability among the two H. somni strains.
Palaeosymbiosis Revealed by Genomic Fossils of Wolbachia in a Strongyloidean Nematode

PubMed Central

Koutsovoulos, Georgios; Makepeace, Benjamin; Tanya, Vincent N.; Blaxter, Mark

2014-01-01

Wolbachia are common endosymbionts of terrestrial arthropods, and are also found in nematodes: the animal-parasitic filaria, and the plant-parasite Radopholus similis. Lateral transfer of Wolbachia DNA to the host genome is common. We generated a draft genome sequence for the strongyloidean nematode parasite Dictyocaulus viviparus, the cattle lungworm. In the assembly, we identified nearly 1 Mb of sequence with similarity to Wolbachia. The fragments were unlikely to derive from a live Wolbachia infection: most were short, and the genes were disabled through inactivating mutations. Many fragments were co-assembled with definitively nematode-derived sequence. We found limited evidence of expression of the Wolbachia-derived genes. The D. viviparus Wolbachia genes were most similar to filarial strains and strains from the host-promiscuous clade F. We conclude that D. viviparus was infected by Wolbachia in the past, and that clade F-like symbionts may have been the source of filarial Wolbachia infections. PMID:24901418
How Much Do rRNA Gene Surveys Underestimate Extant Bacterial Diversity?

PubMed

Rodriguez-R, Luis M; Castro, Juan C; Kyrpides, Nikos C; Cole, James R; Tiedje, James M; Konstantinidis, Konstantinos T

2018-03-15

The most common practice in studying and cataloguing prokaryotic diversity involves the grouping of sequences into operational taxonomic units (OTUs) at the 97% 16S rRNA gene sequence identity level, often using partial gene sequences, such as PCR-generated amplicons. Due to the high sequence conservation of rRNA genes, organisms belonging to closely related yet distinct species may be grouped under the same OTU. However, it remains unclear how much diversity has been underestimated by this practice. To address this question, we compared the OTUs of genomes defined at the 97% or 98.5% 16S rRNA gene identity level against OTUs of the same genomes defined at the 95% whole-genome average nucleotide identity (ANI), which is a much more accurate proxy for species. Our results show that OTUs resulting from a 98.5% 16S rRNA gene identity cutoff are more accurate than 97% compared to 95% ANI (90.5% versus 89.9% accuracy) but indistinguishable from any other threshold in the 98.29 to 98.78% range. Even with the more stringent thresholds, however, the 16S rRNA gene-based approach commonly underestimates the number of OTUs by ∼12%, on average, compared to the ANI-based approach (∼14% underestimation when using the 97% identity threshold). More importantly, the degree of underestimation can become 50% or more for certain taxa, such as the genera Pseudomonas , Burkholderia , Escherichia , Campylobacter , and Citrobacter These results provide a quantitative view of the degree of underestimation of extant prokaryotic diversity by 16S rRNA gene-defined OTUs and suggest that genomic resolution is often necessary. IMPORTANCE Species diversity is one of the most fundamental pieces of information for community ecology and conservational biology. Therefore, employing accurate proxies for what a species or the unit of diversity is are cornerstones for a large set of microbial ecology and diversity studies. The most common proxies currently used rely on the clustering of 16S rRNA gene sequences at some threshold of nucleotide identity, typically 97% or 98.5%. Here, we explore how well this strategy reflects the more accurate whole-genome-based proxies and determine the frequency with which the high conservation of 16S rRNA sequences masks substantial species-level diversity. Copyright © 2018 American Society for Microbiology.
Quantitative statistical analysis of cis-regulatory sequences in ABA/VP1- and CBF/DREB1-regulated genes of Arabidopsis.

PubMed

Suzuki, Masaharu; Ketterling, Matthew G; McCarty, Donald R

2005-09-01

We have developed a simple quantitative computational approach for objective analysis of cis-regulatory sequences in promoters of coregulated genes. The program, designated MotifFinder, identifies oligo sequences that are overrepresented in promoters of coregulated genes. We used this approach to analyze promoter sequences of Viviparous1 (VP1)/abscisic acid (ABA)-regulated genes and cold-regulated genes, respectively, of Arabidopsis (Arabidopsis thaliana). We detected significantly enriched sequences in up-regulated genes but not in down-regulated genes. This result suggests that gene activation but not repression is mediated by specific and common sequence elements in promoters. The enriched motifs include several known cis-regulatory sequences as well as previously unidentified motifs. With respect to known cis-elements, we dissected the flanking nucleotides of the core sequences of Sph element, ABA response elements (ABREs), and the C repeat/dehydration-responsive element. This analysis identified the motif variants that may correlate with qualitative and quantitative differences in gene expression. While both VP1 and cold responses are mediated in part by ABA signaling via ABREs, these responses correlate with unique ABRE variants distinguished by nucleotides flanking the ACGT core. ABRE and Sph motifs are tightly associated uniquely in the coregulated set of genes showing a strict dependence on VP1 and ABA signaling. Finally, analysis of distribution of the enriched sequences revealed a striking concentration of enriched motifs in a proximal 200-base region of VP1/ABA and cold-regulated promoters. Overall, each class of coregulated genes possesses a discrete set of the enriched motifs with unique distributions in their promoters that may account for the specificity of gene regulation.
rpoB gene mutations among Mycobacterium tuberculosis isolates from extrapulmonary sites.

PubMed

Khosravi, Azar Dokht; Meghdadi, Hossein; Ghadiri, Ata A; Alami, Ameneh; Sina, Amir Hossein; Mirsaeidi, Mehdi

2018-03-01

The aim of this study was to analyze mutations occurring in the rpoB gene of Mycobacterium tuberculosis (MTB) isolates from clinical samples of extrapulmonary tuberculosis (EPTB). Seventy formalin-fixed, paraffin-embedded samples and fresh tissue samples from confirmed EPTB cases were analyzed. Nested PCR based on the rpoB gene was performed on the extracted DNAs, combined with cloning and subsequent sequencing. Sixty-seven (95.7%) samples were positive for nester PCR. Sequence analysis of the 81 bp region of the rpoB gene demonstrated mutations in 41 (61.2%) of 67 sequenced samples. Several point mutations including deletion mutations at codons 510, 512, 513 and 515, with 45% and 51% of the mutations in codons 512 and 513 respectively were seen, along with 26% replacement mutations at codons 509, 513, 514, 518, 520, 524 and 531. The most common alteration was Gln → His, at codon 513, presented in 30 (75.6%) isolates. This study demonstrated sequence alterations in codon 513 of the 81 bp region of the rpoB gene as the most common mutation occurred in 75.6% of molecularly confirmed rifampin-resistant strains. In addition, simultaneous mutation at codons 512 and 513 was demonstrated in 34.3% of the isolates. © 2018 APMIS. Published by John Wiley & Sons Ltd.
Clusters of ancestrally related genes that show paralogy in whole or in part are a major feature of the genomes of humans and other species.

PubMed

Walker, Michael B; King, Benjamin L; Paigen, Kenneth

2012-01-01

Arrangements of genes along chromosomes are a product of evolutionary processes, and we can expect that preferable arrangements will prevail over the span of evolutionary time, often being reflected in the non-random clustering of structurally and/or functionally related genes. Such non-random arrangements can arise by two distinct evolutionary processes: duplications of DNA sequences that give rise to clusters of genes sharing both sequence similarity and common sequence features and the migration together of genes related by function, but not by common descent. To provide a background for distinguishing between the two, which is important for future efforts to unravel the evolutionary processes involved, we here provide a description of the extent to which ancestrally related genes are found in proximity.Towards this purpose, we combined information from five genomic datasets, InterPro, SCOP, PANTHER, Ensembl protein families, and Ensembl gene paralogs. The results are provided in publicly available datasets (http://cgd.jax.org/datasets/clustering/paraclustering.shtml) describing the extent to which ancestrally related genes are in proximity beyond what is expected by chance (i.e. form paraclusters) in the human and nine other vertebrate genomes, as well as the D. melanogaster, C. elegans, A. thaliana, and S. cerevisiae genomes. With the exception of Saccharomyces, paraclusters are a common feature of the genomes we examined. In the human genome they are estimated to include at least 22% of all protein coding genes. Paraclusters are far more prevalent among some gene families than others, are highly species or clade specific and can evolve rapidly, sometimes in response to environmental cues. Altogether, they account for a large portion of the functional clustering previously reported in several genomes.
Transcriptome Sequencing of Lima Bean (Phaseolus lunatus) to Identify Putative Positive Selection in Phaseolus and Legumes

PubMed Central

Li, Fengqi; Cao, Depan; Liu, Yang; Yang, Ting; Wang, Guirong

2015-01-01

The identification of genes under positive selection is a central goal of evolutionary biology. Many legume species, including Phaseolus vulgaris (common bean) and Phaseolus lunatus (lima bean), have important ecological and economic value. In this study, we sequenced and assembled the transcriptome of one Phaseolus species, lima bean. A comparison with the genomes of six other legume species, including the common bean, Medicago, lotus, soybean, chickpea, and pigeonpea, revealed 15 and 4 orthologous groups with signatures of positive selection among the two Phaseolus species and among the seven legume species, respectively. Characterization of these positively selected genes using Non redundant (nr) annotation, gene ontology (GO) classification, GO term enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses revealed that these genes are mostly involved in thylakoids, photosynthesis and metabolism. This study identified genes that may be related to the divergence of the Phaseolus and legume species. These detected genes are particularly good candidates for subsequent functional studies. PMID:26151849
An ergot alkaloid biosynthesis gene and clustered hypothetical genes from Aspergillus fumigatus.

PubMed

Coyle, Christine M; Panaccione, Daniel G

2005-06-01

The ergot alkaloids are a family of indole-derived mycotoxins with a variety of significant biological activities. Aspergillus fumigatus, a common airborne fungus and opportunistic human pathogen, and several fungi in the relatively distant taxon Clavicipitaceae (clavicipitaceous fungi) produce different sets of ergot alkaloids. The ergot alkaloids of these divergent fungi share a four-member ergoline ring but differ in the number, type, and position of the side chains. Several genes required for ergot alkaloid production are known in the clavicipitaceous fungi, and these genes are clustered in the genome of the ergot fungus Claviceps purpurea. We investigated whether the ergot alkaloids of A. fumigatus have a common biosynthetic and genetic origin with those of the clavicipitaceous fungi. A homolog of dmaW, the gene controlling the determinant step in the ergot alkaloid pathway of clavicipitaceous fungi, was identified in the A. fumigatus genome. Knockout of dmaW eliminated all known ergot alkaloids from A. fumigatus, and complementation of the mutation restored ergot alkaloid production. Clustered with dmaW in the A. fumigatus genome are sequences corresponding to five genes previously proposed to encode steps in the ergot alkaloid pathway of C. purpurea, as well as additional sequences whose deduced protein products are consistent with their involvement in the ergot alkaloid pathway. The corresponding genes have similarities in their nucleotide sequences, but the orientations and positions within the cluster of several of these genes differ. The data indicate that the ergot alkaloid biosynthetic capabilities in A. fumigatus and the clavicipitaceous fungi had a common origin.
An Ergot Alkaloid Biosynthesis Gene and Clustered Hypothetical Genes from Aspergillus fumigatus†

PubMed Central

Coyle, Christine M.; Panaccione, Daniel G.

2005-01-01

The ergot alkaloids are a family of indole-derived mycotoxins with a variety of significant biological activities. Aspergillus fumigatus, a common airborne fungus and opportunistic human pathogen, and several fungi in the relatively distant taxon Clavicipitaceae (clavicipitaceous fungi) produce different sets of ergot alkaloids. The ergot alkaloids of these divergent fungi share a four-member ergoline ring but differ in the number, type, and position of the side chains. Several genes required for ergot alkaloid production are known in the clavicipitaceous fungi, and these genes are clustered in the genome of the ergot fungus Claviceps purpurea. We investigated whether the ergot alkaloids of A. fumigatus have a common biosynthetic and genetic origin with those of the clavicipitaceous fungi. A homolog of dmaW, the gene controlling the determinant step in the ergot alkaloid pathway of clavicipitaceous fungi, was identified in the A. fumigatus genome. Knockout of dmaW eliminated all known ergot alkaloids from A. fumigatus, and complementation of the mutation restored ergot alkaloid production. Clustered with dmaW in the A. fumigatus genome are sequences corresponding to five genes previously proposed to encode steps in the ergot alkaloid pathway of C. purpurea, as well as additional sequences whose deduced protein products are consistent with their involvement in the ergot alkaloid pathway. The corresponding genes have similarities in their nucleotide sequences, but the orientations and positions within the cluster of several of these genes differ. The data indicate that the ergot alkaloid biosynthetic capabilities in A. fumigatus and the clavicipitaceous fungi had a common origin. PMID:15933009
Transcriptome analysis reveals the time of the fourth round of genome duplication in common carp (Cyprinus carpio)

PubMed Central

2012-01-01

Background Common carp (Cyprinus carpio) is thought to have undergone one extra round of genome duplication compared to zebrafish. Transcriptome analysis has been used to study the existence and timing of genome duplication in species for which genome sequences are incomplete. Large-scale transcriptome data for the common carp genome should help reveal the timing of the additional duplication event. Results We have sequenced the transcriptome of common carp using 454 pyrosequencing. After assembling the 454 contigs and the published common carp sequences together, we obtained 49,669 contigs and identified genes using homology searches and an ab initio method. We identified 4,651 orthologous pairs between common carp and zebrafish and found 129,984 paralogous pairs within the common carp. An estimation of the synonymous substitution rate in the orthologous pairs indicated that common carp and zebrafish diverged 120 million years ago (MYA). We identified one round of genome duplication in common carp and estimated that it had occurred 5.6 to 11.3 MYA. In zebrafish, no genome duplication event after speciation was observed, suggesting that, compared to zebrafish, common carp had undergone an additional genome duplication event. We annotated the common carp contigs with Gene Ontology terms and KEGG pathways. Compared with zebrafish gene annotations, we found that a set of biological processes and pathways were enriched in common carp. Conclusions The assembled contigs helped us to estimate the time of the fourth-round of genome duplication in common carp. The resource that we have built as part of this study will help advance functional genomics and genome annotation studies in the future. PMID:22424280
Detailed Transcriptome Description of the Neglected Cestode Taenia multiceps

PubMed Central

Wu, Xuhang; Fu, Yan; Yang, Deying; Zhang, Runhui; Zheng, Wanpeng; Nie, Huaming; Xie, Yue; Yan, Ning; Hao, Guiying; Gu, Xiaobin; Wang, Shuxian; Peng, Xuerong; Yang, Guangyou

2012-01-01

Background The larval stage of Taenia multiceps, a global cestode, encysts in the central nervous system (CNS) of sheep and other livestock. This frequently leads to their death and huge socioeconomic losses, especially in developing countries. This parasite can also cause zoonotic infections in humans, but has been largely neglected due to a lack of diagnostic techniques and studies. Recent developments in next-generation sequencing provide an opportunity to explore the transcriptome of T. multiceps. Methodology/Principal Findings We obtained a total of 31,282 unigenes (mean length 920 bp) using Illumina paired-end sequencing technology and a new Trinity de novo assembler without a referenced genome. Individual transcription molecules were determined by sequence-based annotations and/or domain-based annotations against public databases (Nr, UniprotKB/Swiss-Prot, COG, KEGG, UniProtKB/TrEMBL, InterPro and Pfam). We identified 26,110 (83.47%) unigenes and inferred 20,896 (66.8%) coding sequences (CDS). Further comparative transcripts analysis with other cestodes (Taenia pisiformis, Taenia solium, Echincoccus granulosus and Echincoccus multilocularis) and intestinal parasites (Trichinella spiralis, Ancylostoma caninum and Ascaris suum) showed that 5,100 common genes were shared among three Taenia tapeworms, 261 conserved genes were detected among five Taeniidae cestodes, and 109 common genes were found in four zoonotic intestinal parasites. Some of the common genes were genes required for parasite survival, involved in parasite-host interactions. In addition, we amplified two full-length CDS of unigenes from the common genes using RT-PCR. Conclusions/Significance This study provides an extensive transcriptome of the adult stage of T. multiceps, and demonstrates that comparative transcriptomic investigations deserve to be further studied. This transcriptome dataset forms a substantial public information platform to achieve a fundamental understanding of the biology of T. multiceps, and helps in the identification of drug targets and parasite-host interaction studies. PMID:23049872
Conserved features of eukaryotic hsp70 genes revealed by comparison with the nucleotide sequence of human hsp70.

PubMed Central

Hunt, C; Morimoto, R I

1985-01-01

We have determined the nucleotide sequence of the human hsp70 gene and 5' flanking region. The hsp70 gene is transcribed as an uninterrupted primary transcript of 2440 nucleotides composed of a 5' noncoding leader sequence of 212 nucleotides, a 3' noncoding region of 242 nucleotides, and a continuous open reading frame of 1986 nucleotides that encodes a protein with predicted molecular mass of 69,800 daltons. Upstream of the 5' terminus are the canonical TATAAA box, the sequence ATTGG that corresponds in the inverted orientation to the CCAAT motif, and the dyad sequence CTGGAAT/ATTCCCG that shares homology in 12 of 14 positions with the consensus transcription regulatory sequence common to Drosophila heat shock genes. Comparison of the predicted amino acid sequences of human hsp70 with the published sequences of Drosophila hsp70 and Escherichia coli dnaK reveals that human hsp70 is 73% identical to Drosophila hsp70 and 47% identical to E. coli dnaK. Surprisingly, the nucleotide sequences of the human and Drosophila genes are 72% identical and human and E. coli genes are 50% identical, which is more highly conserved than necessary given the degeneracy of the genetic code. The lack of accumulated silent nucleotide substitutions leads us to propose that there may be additional information in the nucleotide sequence of the hsp70 gene or the corresponding mRNA that precludes the maximum divergence allowed in the silent codon positions. PMID:3931075
The Chapel Hill hemophilia A dog colony exhibits a factor VIII gene inversion

PubMed Central

Lozier, Jay N.; Dutra, Amalia; Pak, Evgenia; Zhou, Nan; Zheng, Zhili; Nichols, Timothy C.; Bellinger, Dwight A.; Read, Marjorie; Morgan, Richard A.

2002-01-01

In the Chapel Hill colony of factor VIII-deficient dogs, abnormal sequence (ch8, for canine hemophilia 8, GenBank no. AF361485) follows exons 1–22 in the factor VIII transcript in place of exons 23–26. The canine hemophilia 8 locus (ch8) sequence was found in a 140-kb normal dog genomic DNA bacterial artificial chromosome (BAC) clone that was completely outside the factor VIII gene, but not in BAC clones containing the factor VIII gene. The BAC clone that contained ch8 also contained a homologue of F8A (factor 8 associated) sequence, which participates in a common inversion that causes severe hemophilia A in humans. Fluorescence in situ hybridization analysis indicated that exons 1–26 normally proceed sequentially from telomere to centromere at Xq28, and ch8 is telomeric to the factor VIII gene. The appearance of an “upstream” genomic sequence element (ch8) at the end of the aberrant factor VIII transcript suggested that an inversion of genomic DNA replaced factor VIII exons 22–26 with ch8. The F8A sequence appeared also in overlapping normal BAC clones containing factor VIII sequence. We hypothesized that homologous recombination between copies of canine F8A inside and outside the factor VIII gene had occurred, as in human hemophilia A. High-resolution fluorescent in situ hybridization on hemophilia A dog DNA revealed a pattern consistent with this inversion mechanism. We also identified a HindIII restriction fragment length polymorphism of F8A fragments that distinguished hemophilia A, carrier, and normal dogs' DNA. The Chapel Hill hemophilia A dog colony therefore replicates the factor VIII gene inversion commonly seen in humans with severe hemophilia A. PMID:12242334
Origin-of-transfer sequences facilitate mobilisation of non-conjugative antimicrobial-resistance plasmids in Staphylococcus aureus

PubMed Central

O'Brien, Frances G.; Yui Eto, Karina; Murphy, Riley J. T.; Fairhurst, Heather M.; Coombs, Geoffrey W.; Grubb, Warren B.; Ramsay, Joshua P.

2015-01-01

Staphylococcus aureus is a common cause of hospital, community and livestock-associated infections and is increasingly resistant to multiple antimicrobials. A significant proportion of antimicrobial-resistance genes are plasmid-borne, but only a minority of S. aureus plasmids encode proteins required for conjugative transfer or Mob relaxase proteins required for mobilisation. The pWBG749 family of S. aureus conjugative plasmids can facilitate the horizontal transfer of diverse antimicrobial-resistance plasmids that lack Mob genes. Here we reveal that these mobilisable plasmids carry copies of the pWBG749 origin-of-transfer (oriT) sequence and that these oriT sequences facilitate mobilisation by pWBG749. Sequences resembling the pWBG749 oriT were identified on half of all sequenced S. aureus plasmids, including the most prevalent large antimicrobial-resistance/virulence-gene plasmids, pIB485, pMW2 and pUSA300HOUMR. oriT sequences formed five subfamilies with distinct inverted-repeat-2 (IR2) sequences. pWBG749-family plasmids encoding each IR2 were identified and pWBG749 mobilisation was found to be specific for plasmids carrying matching IR2 sequences. Specificity of mobilisation was conferred by a putative ribbon-helix-helix-protein gene smpO. Several plasmids carried 2–3 oriT variants and pWBG749-mediated recombination occurred between distinct oriT sites during mobilisation. These observations suggest this relaxase-in trans mechanism of mobilisation by pWBG749-family plasmids is a common mechanism of plasmid dissemination in S. aureus. PMID:26243776
Molecular characterization of dihydroneopterin aldolase and aminodeoxychorismate synthase in common bean-genes coding for enzymes in the folate synthesis pathway.

PubMed

Xie, Weilong; Perry, Gregory; Martin, C Joe; Shim, Youn-Seb; Navabi, Alireza; Pauls, K Peter

2017-07-01

Common beans (Phaseolus vulgaris) are excellent sources of dietary folates, but different varieties contain different amounts of these compounds. Genes coding for dihydroneopterin aldolase (DHNA) and aminodeoxychorismate synthase (ADCS) of the folate synthesis pathway were characterized by PCR amplification, BAC clone sequencing, and whole genome sequencing. All DHNA and ADCS genes in the Mesoamerican cultivar OAC Rex were isolated and compared with those genes in the genome of Andean genotype G19833. Both genotypes have two functional DHNA genes and one pseudo gene. PvDHNA1 and PvDHNA2 proteins have similar secondary structures and conserved residues as DHNA homologs in Staphylococcus aureus and Arabidopsis. Sequence analysis and synteny mapping indicated that PvDHNA1 might be a duplicated and transposed copy of PvDHNA2. There is only one ADCS gene (PvADCS) identified in the bean genome and it is identical in OAC Rex and G19833. PvADCS has the conserved motifs required for catalytic activity similar to other plant ADCS homologs. DHNA and ADCS gene-specific markers were developed, mapped, and compared to their physical locations on chromosomes 1 and 7, respectively. The gene-specific markers developed in this study should be useful for detection and selection of varieties with enhanced folate contents in bean breeding programs.
Comparative Transcriptome Analysis Reveals the Genetic Basis of Skin Color Variation in Common Carp

PubMed Central

Jiang, Yanliang; Zhang, Songhao; Xu, Jian; Feng, Jianxin; Mahboob, Shahid; Al-Ghanim, Khalid A.; Sun, Xiaowen; Xu, Peng

2014-01-01

Background The common carp is an important aquaculture species that is widely distributed across the world. During the long history of carp domestication, numerous carp strains with diverse skin colors have been established. Skin color is used as a visual criterion to determine the market value of carp. However, the genetic basis of common carp skin color has not been extensively studied. Methodology/Principal Findings In this study, we performed Illumina sequencing on two common carp strains: the reddish Xingguo red carp and the brownish-black Yellow River carp. A total of 435,348,868 reads were generated, resulting in 198,781 assembled contigs that were used as reference sequences. Comparisons of skin transcriptome files revealed 2,012 unigenes with significantly different expression in the two common carp strains, including 874 genes that were up-regulated in Xingguo red carp and 1,138 genes that were up-regulated in Yellow River carp. The expression patterns of 20 randomly selected differentially expressed genes were validated using quantitative RT-PCR. Gene pathway analysis of the differentially expressed genes indicated that melanin biosynthesis, along with the Wnt and MAPK signaling pathways, is highly likely to affect the skin pigmentation process. Several key genes involved in the skin pigmentation process, including TYRP1, SILV, ASIP and xCT, showed significant differences in their expression patterns between the two strains. Conclusions In this study, we conducted a comparative transcriptome analysis of Xingguo red carp and Yellow River carp skins, and we detected key genes involved in the common carp skin pigmentation process. We propose that common carp skin pigmentation depends upon at least three pathways. Understanding fish skin color genetics will facilitate future molecular selection of the fish skin colors with high market values. PMID:25255374
Comparative transcriptome analysis reveals the genetic basis of skin color variation in common carp.

PubMed

Jiang, Yanliang; Zhang, Songhao; Xu, Jian; Feng, Jianxin; Mahboob, Shahid; Al-Ghanim, Khalid A; Sun, Xiaowen; Xu, Peng

2014-01-01

The common carp is an important aquaculture species that is widely distributed across the world. During the long history of carp domestication, numerous carp strains with diverse skin colors have been established. Skin color is used as a visual criterion to determine the market value of carp. However, the genetic basis of common carp skin color has not been extensively studied. In this study, we performed Illumina sequencing on two common carp strains: the reddish Xingguo red carp and the brownish-black Yellow River carp. A total of 435,348,868 reads were generated, resulting in 198,781 assembled contigs that were used as reference sequences. Comparisons of skin transcriptome files revealed 2,012 unigenes with significantly different expression in the two common carp strains, including 874 genes that were up-regulated in Xingguo red carp and 1,138 genes that were up-regulated in Yellow River carp. The expression patterns of 20 randomly selected differentially expressed genes were validated using quantitative RT-PCR. Gene pathway analysis of the differentially expressed genes indicated that melanin biosynthesis, along with the Wnt and MAPK signaling pathways, is highly likely to affect the skin pigmentation process. Several key genes involved in the skin pigmentation process, including TYRP1, SILV, ASIP and xCT, showed significant differences in their expression patterns between the two strains. In this study, we conducted a comparative transcriptome analysis of Xingguo red carp and Yellow River carp skins, and we detected key genes involved in the common carp skin pigmentation process. We propose that common carp skin pigmentation depends upon at least three pathways. Understanding fish skin color genetics will facilitate future molecular selection of the fish skin colors with high market values.
Isolation of expressed sequences from the region commonly deleted in Velo-cardio-facial syndrome

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sirotkin, H.; Morrow, B.; DasGupta, R.

Velo-cardio-facial syndrome (VCFS) is a relatively common autosomal dominant genetic disorder characterized by cleft palate, cardiac abnormalities, learning disabilities and a characteristic facial dysmorphology. Most VCFS patients have interstitial deletions of 22q11 of 1-2 mb. In an effort to isolate the gene(s) responsible for VCFS we have utilized a hybrid selection protocol to recover expressed sequences from three non-overlapping YACs comprising almost 1 mb of the commonly deleted region. Total yeast genomic DNA or isolated YAC DNA was immobilized on Hybond-N filters, blocked with yeast and human ribosomal and human repetitive sequences and hybridized with a mixture of random primedmore » short fragment cDNA libraries. Six human short fragment libraries derived from total fetus, fetal brain, adult brain, testes, thymus and spleen have been used for the selections. Short fragment cDNAs retained on the filter were passed through a second round of selection and cloned into lambda gt10. cDNAs shown to originate from the YACs and from chromosome 22 are being used to isolate full length cDNAs. Three genes known to be present on these YACs, catechol-O-methyltransferase, tuple 1 and clathrin heavy chain have been recovered. Additionally, a gene related to the murine p120 gene and a number of novel short cDNAs have been isolated. The role of these genes in VCFS is being investigated.« less
New FeFe-hydrogenase genes identified in a metagenomic fosmid library from a municipal wastewater treatment plant as revealed by high-throughput sequencing.

PubMed

Tomazetto, Geizecler; Wibberg, Daniel; Schlüter, Andreas; Oliveira, Valéria M

2015-01-01

A fosmid metagenomic library was constructed with total community DNA obtained from a municipal wastewater treatment plant (MWWTP), with the aim of identifying new FeFe-hydrogenase genes encoding the enzymes most important for hydrogen metabolism. The dataset generated by pyrosequencing of a fosmid library was mined to identify environmental gene tags (EGTs) assigned to FeFe-hydrogenase. The majority of EGTs representing FeFe-hydrogenase genes were affiliated with the class Clostridia, suggesting that this group is the main hydrogen producer in the MWWTP analyzed. Based on assembled sequences, three FeFe-hydrogenase genes were predicted based on detection of the L2 motif (MPCxxKxxE) in the encoded gene product, confirming true FeFe-hydrogenase sequences. These sequences were used to design specific primers to detect fosmids encoding FeFe-hydrogenase genes predicted from the dataset. Three identified fosmids were completely sequenced. The cloned genomic fragments within these fosmids are closely related to members of the Spirochaetaceae, Bacteroidales and Firmicutes, and their FeFe-hydrogenase sequences are characterized by the structure type M3, which is common to clostridial enzymes. FeFe-hydrogenase sequences found in this study represent hitherto undetected sequences, indicating the high genetic diversity regarding these enzymes in MWWTP. Results suggest that MWWTP have to be considered as reservoirs for new FeFe-hydrogenase genes. Copyright © 2014 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.
The Genetic Background of Neonatal Disease.

PubMed

Göpel, Wolfgang; Westermann, Eva; Pagel, Friederike

2018-01-01

More than 27,000 human genes have been sequenced and described. Only a few of these genes are relevant for common human diseases with regard to diagnostic or therapeutic purposes. This review describes the genetics of common traits and diseases with a particular focus on perspectives for drug discovery and drug therapy in neonates. © 2018 S. Karger AG, Basel.

Increased taxon sampling reveals thousands of hidden orthologs in flatworms

PubMed Central

2017-01-01

Gains and losses shape the gene complement of animal lineages and are a fundamental aspect of genomic evolution. Acquiring a comprehensive view of the evolution of gene repertoires is limited by the intrinsic limitations of common sequence similarity searches and available databases. Thus, a subset of the gene complement of an organism consists of hidden orthologs, i.e., those with no apparent homology to sequenced animal lineages—mistakenly considered new genes—but actually representing rapidly evolving orthologs or undetected paralogs. Here, we describe Leapfrog, a simple automated BLAST pipeline that leverages increased taxon sampling to overcome long evolutionary distances and identify putative hidden orthologs in large transcriptomic databases by transitive homology. As a case study, we used 35 transcriptomes of 29 flatworm lineages to recover 3427 putative hidden orthologs, some unidentified by OrthoFinder and HaMStR, two common orthogroup inference algorithms. Unexpectedly, we do not observe a correlation between the number of putative hidden orthologs in a lineage and its “average” evolutionary rate. Hidden orthologs do not show unusual sequence composition biases that might account for systematic errors in sequence similarity searches. Instead, gene duplication with divergence of one paralog and weak positive selection appear to underlie hidden orthology in Platyhelminthes. By using Leapfrog, we identify key centrosome-related genes and homeodomain classes previously reported as absent in free-living flatworms, e.g., planarians. Altogether, our findings demonstrate that hidden orthologs comprise a significant proportion of the gene repertoire in flatworms, qualifying the impact of gene losses and gains in gene complement evolution. PMID:28400424
Polymorphism at Expressed DQ and DR Loci in Five Common Equine MHC Haplotypes

PubMed Central

Miller, Donald; Tallmadge, Rebecca L.; Binns, Matthew; Zhu, Baoli; Mohamoud, Yasmin Ali; Ahmed, Ayeda; Brooks, Samantha A.; Antczak, Douglas F.

2016-01-01

The polymorphism of Major Histocompatibility Complex (MHC) class II DQ and DR genes in five common Equine Leukocyte Antigen (ELA) haplotypes was determined through sequencing of mRNA transcripts isolated from lymphocytes of eight ELA homozygous horses. Ten expressed MHC class II genes were detected in horses of the ELA-A3 haplotype carried by the donor horses of the equine Bacterial Artificial Chromosome (BAC) library and the reference genome sequence: four DR genes and six DQ genes. The other four ELA haplotypes contained at least eight expressed polymorphic MHC class II loci. Next Generation Sequencing (NGS) of genomic DNA of these four MHC haplotypes revealed stop codons in the DQA3 gene in the ELA-A2, ELA-A5, and ELA-A9 haplotypes. Few NGS reads were obtained for the other MHC class II genes that were not amplified in these horses. The amino acid sequences across haplotypes contained locus-specific residues, and the locus clusters produced by phylogenetic analysis were well supported. The MHC class II alleles within the five tested haplotypes were largely non-overlapping between haplotypes. The complement of equine MHC class II DQ and DR genes appears to be well conserved between haplotypes, in contrast to the recently described variation in class I gene loci between equine MHC haplotypes. The identification of allelic series of equine MHC class II loci will aid comparative studies of mammalian MHC conservation and evolution and may also help to interpret associations between the equine MHC class II region and diseases of the horse. PMID:27889800
The complete genome sequence of Plodia interpunctella granulovirus: Discovery of an unusual inhibitor-of-apoptosis gene

USDA-ARS?s Scientific Manuscript database

The Indianmeal moth, Plodia interpunctella (Lepidoptera: Pyralidae), is a common pest of stored goods with a worldwide distribution. The complete genome sequence for a larval pathogen of this moth, the baculovirus Plodia interpunctella granulovirus (PiGV), was determined by next-generation sequenci...
Microbial evolution of sulphate reduction when lateral gene transfer is geographically restricted.

PubMed

Chi Fru, E

2011-07-01

Lateral gene transfer (LGT) is an important mechanism by which micro-organisms acquire new functions. This process has been suggested to be central to prokaryotic evolution in various environments. However, the influence of geographical constraints on the evolution of laterally acquired genes in microbial metabolic evolution is not yet well understood. In this study, the influence of geographical isolation on the evolution of laterally acquired dissimilatory sulphite reductase (dsr) gene sequences in the sulphate-reducing micro-organisms (SRM) was investigated. Sequences on four continental blocks related to SRM known to have received dsr by LGT were analysed using standard phylogenetic and multidimensional statistical methods. Sequences related to lineages with large genetic diversity correlated positively with habitat divergence. Those affiliated to Thermodesulfobacterium indicated strong biogeographical delineation; hydrothermal-vent sequences clustered independently from hot-spring sequences. Some of the hydrothermal-vent and hot-spring sequences suggested to have been acquired from a common ancestral source may have diverged upon isolation within distinct habitats. In contrast, analysis of some Desulfotomaculum sequences indicated they could have been transferred from different ancestral sources but converged upon isolation within the same niche. These results hint that, after lateral acquisition of dsr genes, barriers to gene flow probably play a strong role in their subsequent evolution.
Genomic structure of two ras family genes in the slime mold Physarum polycephalum.

PubMed

Trzcińska-Danielewicz, Joanna; Kozlowski, Piotr; Gierdal, Katarzyna; Wiejak, Jolanta; Jagielski, Adam; Toczko, Kazimierz; Fronk, Jan

2002-08-01

Genomic structure of two Physarum polycephalum ras family genes, Ppras2 and Pprap1, has been determined, including the upstream region of the latter. The genes are interrupted by three and four introns, respectively. The first intron of Ppras2 has the same location within the coding sequence as the first intron in another ras homolog from this organism, Ppras1 [Trzcińska-Danielewicz, J., Kozlowski, P., and Toczko, K. (1996). "Cloning and genomic sequence of the Physarum polycephalum Ppras1 gene, a homologue of the ras protooncogene", Gene 169, pp. 143-144]. All introns, ranging from 53 to ca. 460 base pairs, have the canonical 5' and 3' ends, are greatly enriched in pyrimidines in the coding strand and have frequent pyrimidines-only tracts. These latter features seem to be responsible for the difficulties in cloning and sequencing of parts of these genes. Short sequences shared with P. polycephalum transposon-like repeats are common in the introns, indicating a possible role of transposition in intron evolution. In all three ras family genes phase zero introns are located mostly between sequences coding for regular protein secondary structure elements.
Digital transcriptome analysis of putative sex-determination genes in papaya (Carica papaya).

PubMed

Urasaki, Naoya; Tarora, Kazuhiko; Shudo, Ayano; Ueno, Hiroki; Tamaki, Moritoshi; Miyagi, Norimichi; Adaniya, Shinichi; Matsumura, Hideo

2012-01-01

Papaya (Carica papaya) is a trioecious plant species that has male, female and hermaphrodite flowers on different plants. The primitive sex chromosomes genetically determine the sex of the papaya. Although draft sequences of the papaya genome are already available, the genes for sex determination have not been identified, likely due to the complicated structure of its sex-chromosome sequences. To identify the candidate genes for sex determination, we conducted a transcriptome analysis of flower samples from male, female and hermaphrodite plants using high-throughput SuperSAGE for digital gene expression analysis. Among the short sequence tags obtained from the transcripts, 312 unique tags were specifically mapped to the primitive sex chromosome (X or Y(h)) sequences. An annotation analysis revealed that retroelements are the most abundant sequences observed in the genes corresponding to these tags. The majority of tags on the sex chromosomes were located on the X chromosome, and only 30 tags were commonly mapped to both the X and Y(h) chromosome, implying a loss of many genes on the Y(h) chromosome. Nevertheless, candidate Y(h) chromosome-specific female determination genes, including a MADS-box gene, were identified. Information on these sex chromosome-specific expressed genes will help elucidating sex determination in the papaya.
Digital Transcriptome Analysis of Putative Sex-Determination Genes in Papaya (Carica papaya)

PubMed Central

Urasaki, Naoya; Tarora, Kazuhiko; Shudo, Ayano; Ueno, Hiroki; Tamaki, Moritoshi; Miyagi, Norimichi; Adaniya, Shinichi; Matsumura, Hideo

2012-01-01

Papaya (Carica papaya) is a trioecious plant species that has male, female and hermaphrodite flowers on different plants. The primitive sex chromosomes genetically determine the sex of the papaya. Although draft sequences of the papaya genome are already available, the genes for sex determination have not been identified, likely due to the complicated structure of its sex-chromosome sequences. To identify the candidate genes for sex determination, we conducted a transcriptome analysis of flower samples from male, female and hermaphrodite plants using high-throughput SuperSAGE for digital gene expression analysis. Among the short sequence tags obtained from the transcripts, 312 unique tags were specifically mapped to the primitive sex chromosome (X or Yh) sequences. An annotation analysis revealed that retroelements are the most abundant sequences observed in the genes corresponding to these tags. The majority of tags on the sex chromosomes were located on the X chromosome, and only 30 tags were commonly mapped to both the X and Yh chromosome, implying a loss of many genes on the Yh chromosome. Nevertheless, candidate Yh chromosome-specific female determination genes, including a MADS-box gene, were identified. Information on these sex chromosome-specific expressed genes will help elucidating sex determination in the papaya. PMID:22815863
Interspecific and intraspecific gene variability in a 1-Mb region containing the highest density of NBS-LRR genes found in the melon genome.

PubMed

González, Víctor M; Aventín, Núria; Centeno, Emilio; Puigdomènech, Pere

2014-12-17

Plant NBS-LRR -resistance genes tend to be found in clusters, which have been shown to be hot spots of genome variability. In melon, half of the 81 predicted NBS-LRR genes group in nine clusters, and a 1 Mb region on linkage group V contains the highest density of R-genes and presence/absence gene polymorphisms found in the melon genome. This region is known to contain the locus of Vat, an agronomically important gene that confers resistance to aphids. However, the presence of duplications makes the sequencing and annotation of R-gene clusters difficult, usually resulting in multi-gapped sequences with higher than average errors. A 1-Mb sequence that contains the largest NBS-LRR gene cluster found in melon was improved using a strategy that combines Illumina paired-end mapping and PCR-based gap closing. Unknown sequence was decreased by 70% while about 3,000 SNPs and small indels were corrected. As a result, the annotations of 18 of a total of 23 NBS-LRR genes found in this region were modified, including additional coding sequences, amino acid changes, correction of splicing boundaries, or fussion of ORFs in common transcription units. A phylogeny analysis of the R-genes and their comparison with syntenic sequences in other cucurbits point to a pattern of local gene amplifications since the diversification of cucurbits from other families, and through speciation within the family. A candidate Vat gene is proposed based on the sequence similarity between a reported Vat gene from a Korean melon cultivar and a sequence fragment previously absent in the unrefined sequence. A sequence refinement strategy allowed substantial improvement of a 1 Mb fragment of the melon genome and the re-annotation of the largest cluster of NBS-LRR gene homologues found in melon. Analysis of the cluster revealed that resistance genes have been produced by sequence duplication in adjacent genome locations since the divergence of cucurbits from other close families, and through the process of speciation within the family a candidate Vat gene was also identified using sequence previously unavailable, which demonstrates the advantages of genome assembly refinements when analyzing complex regions such as those containing clusters of highly similar genes.
[Sequence analysis of LEAFY homologous gene from Dendrobium moniliforme and application for identification of medicinal Dendrobium].

PubMed

Xing, Wen-Rui; Hou, Bei-Wei; Guan, Jing-Jiao; Luo, Jing; Ding, Xiao-Yu

2013-04-01

The LEAFY (LFY) homologous gene of Dendrobium moniliforme (L.) Sw. was cloned by new primers which were designed based on the conservative region of known sequences of orchid LEAFY gene. Partial LFY homologous gene was cloned by common PCR, then we got the complete LFY homologous gene Den LFY by Tail-PCR. The complete sequence of DenLFY gene was 3 575 bp which contained three exons and two introns. Using BLAST method, comparison analysis among the exon of LFY homologous gene indicted that the DenLFY gene had high identity with orchids LFY homologous, including the related fragment of PhalLFY (84%) in Phalaenopsis hybrid cultivar, LFY homologous gene in Oncidium (90%) and in other orchid (over 80%). Using MP analysis, Dendrobium is found to be the sister to Oncidium and Phalaenopsis. Homologous analysis demonstrated that the C-terminal amino acids were highly conserved. When the exons and introns were separately considered, exons and the sequence of amino acid were good markers for the function research of DenLFY gene. The second intron can be used in authentication research of Dendrobium based on the length polymorphism between Dendrobium moniliforme and Dendrobium officinale.
Common developmental genome deprogramming in schizophrenia - Role of Integrative Nuclear FGFR1 Signaling (INFS).

PubMed

Narla, S T; Lee, Y-W; Benson, C A; Sarder, P; Brennand, K J; Stachowiak, E K; Stachowiak, M K

2017-07-01

The watershed-hypothesis of schizophrenia asserts that over 200 different mutations dysregulate distinct pathways that converge on an unspecified common mechanism(s) that controls disease ontogeny. Consistent with this hypothesis, our RNA-sequencing of neuron committed cells (NCCs) differentiated from established iPSCs of 4 schizophrenia patients and 4 control subjects uncovered a dysregulated transcriptome of 1349 mRNAs common to all patients. Data reveals a global dysregulation of developmental genome, deconstruction of coordinated mRNA networks, and the formation of aberrant, new coordinated mRNA networks indicating a concerted action of the responsible factor(s). Sequencing of miRNA transcriptomes demonstrated an overexpression of 16 miRNAs and deconstruction of interactive miRNA-mRNA networks in schizophrenia NCCs. ChiPseq revealed that the nuclear (n) form of FGFR1, a pan-ontogenic regulator, is overexpressed in schizophrenia NCCs and overtargets dysregulated mRNA and miRNA genes. The nFGFR1 targeted 54% of all human gene promoters and 84.4% of schizophrenia dysregulated genes. The upregulated genes reside within major developmental pathways that control neurogenesis and neuron formation, whereas downregulated genes are involved in oligodendrogenesis. Our results indicate (i) an early (preneuronal) genomic etiology of schizophrenia, (ii) dysregulated genes and new coordinated gene networks are common to unrelated cases of schizophrenia, (iii) gene dysregulations are accompanied by increased nFGFR1-genome interactions, and (iv) modeling of increased nFGFR1 by an overexpression of a nFGFR1 lead to up or downregulation of selected genes as observed in schizophrenia NCCs. Together our results designate nFGFR1 signaling as a potential common dysregulated mechanism in investigated patients and potential therapeutic target in schizophrenia. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
De novo assembly and characterization of the spleen transcriptome of common carp (Cyprinus carpio) using Illumina paired-end sequencing.

PubMed

Li, Guoxi; Zhao, Yinli; Liu, Zhonghu; Gao, Chunsheng; Yan, Fengbin; Liu, Bianzhi; Feng, Jianxin

2015-06-01

Common carp (Cyprinus carpio) is one of the most important aquacultured species of the family Cyprinidae, and breeding this species for disease resistance is becoming more and more important. However, at the genome or transcriptome levels, study of the immunogenetics of disease resistance in the common carp is lacking. In this study, 60,316,906 and 75,200,328 paired-end clean reads were obtained from two cDNA libraries of the common carp spleen by Illumina paired-end sequencing technology. Totally, 130,293 unique transcript fragments (unigenes) were assembled, with an average length of 1400.57 bp. Approximately 105,612 (81.06%) unigenes could be annotated according to their homology with matches in the Nr, Nt, Swiss-Prot, COG, GO, or KEGG databases, and they were found to represent 46,747 non-redundant genes. Comparative analysis showed that 59.82% of the unigenes have significant similarity to zebrafish Refseq proteins. Gene expression comparison revealed that 10,432 and 6889 annotated unigenes were, respectively, up- and down-regulated with at least twofold changes between two developmental stages of the common carp spleen. Gene ontology and KEGG analysis were performed to classify all unigenes into functional categories for understanding gene functions and regulation pathways. In addition, 46,847 simple sequence repeats (SSRs) were detected from 35,618 unigenes, and a large number of single nucleotide polymorphism (SNP) and insertion/deletion (INDEL) sites were identified in the spleen transcriptome of common carp. This study has characterized the spleen transcriptome of the common carp for the first time, providing a valuable resource for a better understanding of the common carp immune system and defense mechanisms. This knowledge will also facilitate future functional studies on common carp immunogenetics that may eventually be applied in breeding programs. Copyright © 2015 Elsevier Ltd. All rights reserved.
A comprehensive aligned nifH gene database: a multipurpose tool for studies of nitrogen-fixing bacteria.

PubMed

Gaby, John Christian; Buckley, Daniel H

2014-01-01

We describe a nitrogenase gene sequence database that facilitates analysis of the evolution and ecology of nitrogen-fixing organisms. The database contains 32 954 aligned nitrogenase nifH sequences linked to phylogenetic trees and associated sequence metadata. The database includes 185 linked multigene entries including full-length nifH, nifD, nifK and 16S ribosomal RNA (rRNA) gene sequences. Evolutionary analyses enabled by the multigene entries support an ancient horizontal transfer of nitrogenase genes between Archaea and Bacteria and provide evidence that nifH has a different history of horizontal gene transfer from the nifDK enzyme core. Further analyses show that lineages in nitrogenase cluster I and cluster III have different rates of substitution within nifD, suggesting that nifD is under different selection pressure in these two lineages. Finally, we find that that the genetic divergence of nifH and 16S rRNA genes does not correlate well at sequence dissimilarity values used commonly to define microbial species, as stains having <3% sequence dissimilarity in their 16S rRNA genes can have up to 23% dissimilarity in nifH. The nifH database has a number of uses including phylogenetic and evolutionary analyses, the design and assessment of primers/probes and the evaluation of nitrogenase sequence diversity. Database URL: http://www.css.cornell.edu/faculty/buckley/nifh.htm.
A comprehensive aligned nifH gene database: a multipurpose tool for studies of nitrogen-fixing bacteria

PubMed Central

Gaby, John Christian; Buckley, Daniel H.

2014-01-01

We describe a nitrogenase gene sequence database that facilitates analysis of the evolution and ecology of nitrogen-fixing organisms. The database contains 32 954 aligned nitrogenase nifH sequences linked to phylogenetic trees and associated sequence metadata. The database includes 185 linked multigene entries including full-length nifH, nifD, nifK and 16S ribosomal RNA (rRNA) gene sequences. Evolutionary analyses enabled by the multigene entries support an ancient horizontal transfer of nitrogenase genes between Archaea and Bacteria and provide evidence that nifH has a different history of horizontal gene transfer from the nifDK enzyme core. Further analyses show that lineages in nitrogenase cluster I and cluster III have different rates of substitution within nifD, suggesting that nifD is under different selection pressure in these two lineages. Finally, we find that that the genetic divergence of nifH and 16S rRNA genes does not correlate well at sequence dissimilarity values used commonly to define microbial species, as stains having <3% sequence dissimilarity in their 16S rRNA genes can have up to 23% dissimilarity in nifH. The nifH database has a number of uses including phylogenetic and evolutionary analyses, the design and assessment of primers/probes and the evaluation of nitrogenase sequence diversity. Database URL: http://www.css.cornell.edu/faculty/buckley/nifh.htm PMID:24501396
New splicing mutation in the choline kinase beta (CHKB) gene causing a muscular dystrophy detected by whole-exome sequencing.

PubMed

Oliveira, Jorge; Negrão, Luís; Fineza, Isabel; Taipa, Ricardo; Melo-Pires, Manuel; Fortuna, Ana Maria; Gonçalves, Ana Rita; Froufe, Hugo; Egas, Conceição; Santos, Rosário; Sousa, Mário

2015-06-01

Muscular dystrophies (MDs) are a group of hereditary muscle disorders that include two particularly heterogeneous subgroups: limb-girdle MD and congenital MD, linked to 52 different genes (seven common to both subgroups). Massive parallel sequencing technology may avoid the usual stepwise gene-by-gene analysis. We report the whole-exome sequencing (WES) analysis of a patient with childhood-onset progressive MD, also presenting mental retardation and dilated cardiomyopathy. Conventional sequencing had excluded eight candidate genes. WES of the trio (patient and parents) was performed using the ion proton sequencing system. Data analysis resorted to filtering steps using the GEMINI software revealed a novel silent variant in the choline kinase beta (CHKB) gene. Inspection of sequence alignments ultimately identified the causal variant (CHKB:c.1031+3G>C). This splice site mutation was confirmed using Sanger sequencing and its effect was further evaluated with gene expression analysis. On reassessment of the muscle biopsy, typical abnormal mitochondrial oxidative changes were observed. Mutations in CHKB have been shown to cause phosphatidylcholine deficiency in myofibers, causing a rare form of CMD (only 21 patients reported). Notwithstanding interpretative difficulties that need to be overcome before the integration of WES in the diagnostic workflow, this work corroborates its utility in solving cases from highly heterogeneous groups of diseases, in which conventional diagnostic approaches fail to provide a definitive diagnosis.
Characterisation of betalain biosynthesis in Parakeelya flowers identifies the key biosynthetic gene DOD as belonging to an expanded LigB gene family that is conserved in betalain-producing species

PubMed Central

Chung, Hsiao-Hang; Schwinn, Kathy E.; Ngo, Hanh M.; Lewis, David H.; Massey, Baxter; Calcott, Kate E.; Crowhurst, Ross; Joyce, Daryl C.; Gould, Kevin S.; Davies, Kevin M.; Harrison, Dion K.

2015-01-01

Plant betalain pigments are intriguing because they are restricted to the Caryophyllales and are mutually exclusive with the more common anthocyanins. However, betalain biosynthesis is poorly understood compared to that of anthocyanins. In this study, betalain production and betalain-related genes were characterized in Parakeelya mirabilis (Montiaceae). RT-PCR and transcriptomics identified three sequences related to the key biosynthetic enzyme Dopa 4,5-dioxgenase (DOD). In addition to a LigB gene similar to that of non-Caryophyllales species (Class I genes), two other P. mirabilis LigB genes were found (DOD and DOD-like, termed Class II). PmDOD and PmDOD-like had 70% amino acid identity. Only PmDOD was implicated in betalain synthesis based on transient assays of enzyme activity and correlation of transcript abundance to spatio-temporal betalain accumulation. The role of PmDOD-like remains unknown. The striking pigment patterning of the flowers was due to distinct zones of red betacyanin and yellow betaxanthin production. The major betacyanin was the unglycosylated betanidin rather than the commonly found glycosides, an occurrence for which there are a few previous reports. The white petal zones lacked pigment but had DOD activity suggesting alternate regulation of the pathway in this tissue. DOD and DOD-like sequences were also identified in other betalain-producing species but not in examples of anthocyanin-producing Caryophyllales or non-Caryophyllales species. A Class I LigB sequence from the anthocyanin-producing Caryophyllaceae species Dianthus superbus and two DOD-like sequences from the Amaranthaceae species Beta vulgaris and Ptilotus spp. did not show DOD activity in the transient assay. The additional sequences suggests that DOD is part of a larger LigB gene family in betalain-producing Caryophyllales taxa, and the tandem genomic arrangement of two of the three B. vulgaris LigB genes suggests the involvement of duplication in the gene family evolution. PMID:26217353
Candidate Genes Expressed in Tolerant Common Wheat With Resistant to English Grain Aphid.

PubMed

Luo, Kun; Zhang, Gaisheng; Wang, Chunping; Ouellet, Thérèse; Wu, Jingjing; Zhu, Qidi; Zhao, Huiyan

2014-10-01

The English grain aphid, Sitobion avenae (F.) (Hemiptera: Aphididae), is a common worldwide pest of wheat (Triticum aestivum L.). The use of improved resistant cultivars by the farmers is the most effective and environmentally friendly method to control this aphid in the field. The winter wheat genotypes 98-10-35 and Amigo are resistant to S. avenae. To identify genes responsible for resistance to S. avenae in these genotypes, differential-display reverse transcription-polymerase chain reaction was used to identify the corresponding differentially expressed sequences in current study. Two backcross progenies were obtained by crossing the two resistant genotypes with the susceptible genotype 1376. Six potential expected-differential bands were sequenced. Lengths of the expressed sequence tags ranged from 128 to 532 bp. Although these expressed sequences were likely associated with S. avenae resistance, there was one expressed sequence tag located on 7DL chromosome, and its potential function may associate with the ability to maintain photosynthesis in wheat. That serves as an active way for tolerant common wheat with resistant to S. avenae. Cloning the full length of these sequences would help us thoroughly understand the mechanism of wheat resistance to S. avenae and be valuable for breeding cultivars with S. avenae resistance. © 2014 Entomological Society of America.
Identification of upstream and intragenic regulatory elements that confer cell-type-restricted and differentiation-specific expression on the muscle creatine kinase gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sternberg, E.A.; Spizz, G.; Perry, W.M.

1988-07-01

Terminal differentiation of skeletal myobalsts is accompanied by induction of a series of tissue-specific gene products, which includes the muscle isoenzymte of creatine kinase (MCK). To begin to define the sequences and signals involved in MCK regulation in developing muscle cells, the mouse MCK gene has been isolated. Sequence analysis of 4,147 bases of DNA surrounding the transcription initiation site revealed several interesting structural features, some of which are common to other muscle-specific genes and to cellular and viral enhancers.
Alternative RNA processing events in human calcitonin/calcitonin gene-related peptide gene expression.

PubMed Central

Jonas, V; Lin, C R; Kawashima, E; Semon, D; Swanson, L W; Mermod, J J; Evans, R M; Rosenfeld, M G

1985-01-01

Two mRNAs generated as a consequence of alternative RNA processing events in expression of the human calcitonin gene encode the protein precursors of either calcitonin or calcitonin gene-related peptide (CGRP). Both calcitonin and CGRP RNAs and their encoded peptide products are expressed in the human pituitary and in medullary thyroid tumors. On the basis of sequence comparison, it is suggested that both the calcitonin and CGRP exons arose from a common primordial sequence, suggesting that duplication and rearrangement events are responsible for the generation of this complex transcription unit. Images PMID:3872459
The complete mitochondrial genome of the dwarf tapeworm Hymenolepis nana--a neglected zoonotic helminth.

PubMed

Cheng, Tian; Liu, Guo-Hua; Song, Hui-Qun; Lin, Rui-Qing; Zhu, Xing-Quan

2016-03-01

Hymenolepis nana, commonly known as the dwarf tapeworm, is one of the most common tapeworms of humans and rodents and can cause hymenolepiasis. Although this zoonotic tapeworm is of socio-economic significance in many countries of the world, its genetics, systematics, epidemiology, and biology are poorly understood. In the present study, we sequenced and characterized the complete mitochondrial (mt) genome of H. nana. The mt genome is 13,764 bp in size and encodes 36 genes, including 12 protein-coding genes, 2 ribosomal RNA, and 22 transfer RNA genes. All genes are transcribed in the same direction. The gene order and genome content are completely identical with their congener Hymenolepis diminuta. Phylogenetic analyses based on concatenated amino acid sequences of 12 protein-coding genes by Bayesian inference, Maximum likelihood, and Maximum parsimony showed the division of class Cestoda into two orders, supported the monophylies of both the orders Cyclophyllidea and Pseudophyllidea. Analyses of mt genome sequences also support the monophylies of the three families Taeniidae, Hymenolepididae, and Diphyllobothriidae. This novel mt genome provides a useful genetic marker for studying the molecular epidemiology, systematics, and population genetics of the dwarf tapeworm and should have implications for the diagnosis, prevention, and control of hymenolepiasis in humans.
Genetic diversity and epidemiology of infectious hematopoietic necrosis virus in Alaska

USGS Publications Warehouse

Emmenegger, E.G; Meyers, T.R.; Burton, T.O.; Kurath, G.

2000-01-01

Forty-two infectious hematopoietic necrosis virus (IHNV) isolates from Alaska were analyzed using the ribonuclease protection assay (RPA) and nucleotide sequencing. RPA analyses, utilizing 4 probes, N5, N3 (N gene), GF (G gene), and NV (NV gene), determined that the haplotypes of all 3 genes demonstrated a consistent spatial pattern. Virus isolates belonging to the most common haplotype groups were distributed throughout Alaska, whereas isolates in small haplotype groups were obtained from only 1 site (hatchery, lake, etc.). The temporal pattern of the GF haplotypes suggested a 'genetic acclimation' of the G gene, possibly due to positive selection on the glycoprotein. A pairwise comparison of the sequence data determined that the maximum nucleotide diversity of the isolates was 2.75% (10 mismatches) for the NV gene, and 1.99% (6 mismatches) for a 301 base pair region of the G gene, indicating that the genetic diversity of IHNV within Alaska is notably lower than in the more southern portions of the IHNV North American range. Phylogenetic analysis of representative Alaskan sequences and sequences of 12 previously characterized IHNV strains from Washington, Oregon, Idaho, California (USA) and British Columbia (Canada) distinguished the isolates into clusters that correlated with geographic origin and indicated that the Alaskan and British Columbia isolates may have a common viral ancestral lineage. Comparisons of multiple isolates from the same site provided epidemiological insights into viral transmission patterns and indicated that viral evolution, viral introduction, and genetic stasis were the mechanisms involved with IHN virus population dynamics in Alaska. The examples of genetic stasis and the overall low sequence heterogeneity of the Alaskan isolates suggested that they are evolutionarily constrained. This study establishes a baseline of genetic fingerprint patterns and sequence groups representing the genetic diversity of Alaskan IHNV isolates. This information could be used to determine the source of an IHN outbreak and to facilitate decisions in fisheries management of Alaskan salmonid stocks.

Lessons learned from whole exome sequencing in multiplex families affected by a complex genetic disorder, intracranial aneurysm.

PubMed

Farlow, Janice L; Lin, Hai; Sauerbeck, Laura; Lai, Dongbing; Koller, Daniel L; Pugh, Elizabeth; Hetrick, Kurt; Ling, Hua; Kleinloog, Rachel; van der Vlies, Pieter; Deelen, Patrick; Swertz, Morris A; Verweij, Bon H; Regli, Luca; Rinkel, Gabriel J E; Ruigrok, Ynte M; Doheny, Kimberly; Liu, Yunlong; Broderick, Joseph; Foroud, Tatiana

2015-01-01

Genetic risk factors for intracranial aneurysm (IA) are not yet fully understood. Genomewide association studies have been successful at identifying common variants; however, the role of rare variation in IA susceptibility has not been fully explored. In this study, we report the use of whole exome sequencing (WES) in seven densely-affected families (45 individuals) recruited as part of the Familial Intracranial Aneurysm study. WES variants were prioritized by functional prediction, frequency, predicted pathogenicity, and segregation within families. Using these criteria, 68 variants in 68 genes were prioritized across the seven families. Of the genes that were expressed in IA tissue, one gene (TMEM132B) was differentially expressed in aneurysmal samples (n=44) as compared to control samples (n=16) (false discovery rate adjusted p-value=0.023). We demonstrate that sequencing of densely affected families permits exploration of the role of rare variants in a relatively common disease such as IA, although there are important study design considerations for applying sequencing to complex disorders. In this study, we explore methods of WES variant prioritization, including the incorporation of unaffected individuals, multipoint linkage analysis, biological pathway information, and transcriptome profiling. Further studies are needed to validate and characterize the set of variants and genes identified in this study.
Sequence verification of synthetic DNA by assembly of sequencing reads

PubMed Central

Wilson, Mandy L.; Cai, Yizhi; Hanlon, Regina; Taylor, Samantha; Chevreux, Bastien; Setubal, João C.; Tyler, Brett M.; Peccoud, Jean

2013-01-01

Gene synthesis attempts to assemble user-defined DNA sequences with base-level precision. Verifying the sequences of construction intermediates and the final product of a gene synthesis project is a critical part of the workflow, yet one that has received the least attention. Sequence validation is equally important for other kinds of curated clone collections. Ensuring that the physical sequence of a clone matches its published sequence is a common quality control step performed at least once over the course of a research project. GenoREAD is a web-based application that breaks the sequence verification process into two steps: the assembly of sequencing reads and the alignment of the resulting contig with a reference sequence. GenoREAD can determine if a clone matches its reference sequence. Its sophisticated reporting features help identify and troubleshoot problems that arise during the sequence verification process. GenoREAD has been experimentally validated on thousands of gene-sized constructs from an ORFeome project, and on longer sequences including whole plasmids and synthetic chromosomes. Comparing GenoREAD results with those from manual analysis of the sequencing data demonstrates that GenoREAD tends to be conservative in its diagnostic. GenoREAD is available at www.genoread.org. PMID:23042248
Deep sequencing of the viral phoH gene reveals temporal variation, depth-specific composition, and persistent dominance of the same viral phoH genes in the Sargasso Sea

PubMed Central

Goldsmith, Dawn B.; Parsons, Rachel J.; Beyene, Damitu; Salamon, Peter

2015-01-01

Deep sequencing of the viral phoH gene, a host-derived auxiliary metabolic gene, was used to track viral diversity throughout the water column at the Bermuda Atlantic Time-series Study (BATS) site in the summer (September) and winter (March) of three years. Viral phoH sequences reveal differences in the viral communities throughout a depth profile and between seasons in the same year. Variation was also detected between the same seasons in subsequent years, though these differences were not as great as the summer/winter distinctions. Over 3,600 phoH operational taxonomic units (OTUs; 97% sequence identity) were identified. Despite high richness, most phoH sequences belong to a few large, common OTUs whereas the majority of the OTUs are small and rare. While many OTUs make sporadic appearances at just a few times or depths, a small number of OTUs dominate the community throughout the seasons, depths, and years. PMID:26157645
Cloning and sequence analysis demonstrate the chromate reduction ability of a novel chromate reductase gene from Serratia sp.

PubMed

Deng, Peng; Tan, Xiaoqing; Wu, Ying; Bai, Qunhua; Jia, Yan; Xiao, Hong

2015-03-01

The ChrT gene encodes a chromate reductase enzyme which catalyzes the reduction of Cr(VI). The chromate reductase is also known as flavin mononucleotide (FMN) reductase (FMN_red). The aim of the present study was to clone the full-length ChrT DNA from Serratia sp. CQMUS2 and analyze the deduced amino acid sequence and three-dimensional structure. The putative ChrT gene fragment of Serratia sp. CQMUS2 was isolated by polymerase chain reaction (PCR), according to the known FMN_red gene sequence from Serratia sp. AS13. The flanking sequences of the ChrT gene were obtained by high efficiency TAIL-PCR, while the full-length gene of ChrT was cloned in Escherichia coli for subsequent sequencing. The nucleotide sequence of ChrT was submitted onto GenBank under the accession number, KF211434. Sequence analysis of the gene and amino acids was conducted using the Basic Local Alignment Search Tool, and open reading frame (ORF) analysis was performed using ORF Finder software. The ChrT gene was found to be an ORF of 567 bp that encodes a 188-amino acid enzyme with a calculated molecular weight of 20.4 kDa. In addition, the ChrT protein was hypothesized to be an NADPH-dependent FMN_red and a member of the flavodoxin-2 superfamily. The amino acid sequence of ChrT showed high sequence similarity to the FMN reductase genes of Klebsiella pneumonia and Raoultella ornithinolytica , which belong to the flavodoxin-2 superfamily. Furthermore, ChrT was shown to have a 85.6% similarity to the three-dimensional structure of Escherichia coli ChrR, sharing four common enzyme active sites for chromate reduction. Therefore, ChrT gene cloning and protein structure determination demonstrated the ability of the gene for chromate reduction. The results of the present study provide a basis for further studies on ChrT gene expression and protein function.
Cloning and sequence analysis demonstrate the chromate reduction ability of a novel chromate reductase gene from Serratia sp

PubMed Central

DENG, PENG; TAN, XIAOQING; WU, YING; BAI, QUNHUA; JIA, YAN; XIAO, HONG

2015-01-01

The ChrT gene encodes a chromate reductase enzyme which catalyzes the reduction of Cr(VI). The chromate reductase is also known as flavin mononucleotide (FMN) reductase (FMN_red). The aim of the present study was to clone the full-length ChrT DNA from Serratia sp. CQMUS2 and analyze the deduced amino acid sequence and three-dimensional structure. The putative ChrT gene fragment of Serratia sp. CQMUS2 was isolated by polymerase chain reaction (PCR), according to the known FMN_red gene sequence from Serratia sp. AS13. The flanking sequences of the ChrT gene were obtained by high efficiency TAIL-PCR, while the full-length gene of ChrT was cloned in Escherichia coli for subsequent sequencing. The nucleotide sequence of ChrT was submitted onto GenBank under the accession number, KF211434. Sequence analysis of the gene and amino acids was conducted using the Basic Local Alignment Search Tool, and open reading frame (ORF) analysis was performed using ORF Finder software. The ChrT gene was found to be an ORF of 567 bp that encodes a 188-amino acid enzyme with a calculated molecular weight of 20.4 kDa. In addition, the ChrT protein was hypothesized to be an NADPH-dependent FMN_red and a member of the flavodoxin-2 superfamily. The amino acid sequence of ChrT showed high sequence similarity to the FMN reductase genes of Klebsiella pneumonia and Raoultella ornithinolytica, which belong to the flavodoxin-2 superfamily. Furthermore, ChrT was shown to have a 85.6% similarity to the three-dimensional structure of Escherichia coli ChrR, sharing four common enzyme active sites for chromate reduction. Therefore, ChrT gene cloning and protein structure determination demonstrated the ability of the gene for chromate reduction. The results of the present study provide a basis for further studies on ChrT gene expression and protein function. PMID:25667630
Cloning of Giardia lamblia heat shock protein HSP70 homologs: implications regarding origin of eukaryotic cells and of endoplasmic reticulum.

PubMed Central

Gupta, R S; Aitken, K; Falah, M; Singh, B

1994-01-01

The genes for two different 70-kDa heat shock protein (HSP70) homologs have been cloned and sequenced from the protozoan Giardia lamblia. On the basis of their sequence features, one of these genes corresponds to the cytoplasmic form of HSP70. The second gene, on the basis of its characteristic N-terminal hydrophobic signal sequence and C-terminal endoplasmic reticulum (ER) retention sequence (Lys-Asp-Glu-Leu), is the equivalent of ER-resident GRP78 or the Bip family of proteins. Phylogenetic trees based on HSP70 sequences show that G. lamblia homologs show the deepest divergence among eukaryotic species. The identification of a GRP78 or Bip homolog in G. lamblia strongly suggests the existence of ER in this ancient eukaryote. Detailed phylogenetic analyses of HSP70 sequences by boot-strap neighbor-joining and maximum-parsimony methods show that the cytoplasmic and ER homologs form distinct subfamilies that evolved from a common eukaryotic ancestor by gene duplication that occurred very early in the evolution of eukaryotic cells. It is postulated that because of the essential "molecular chaperone" function of these proteins in translocation of other proteins across membranes, duplication of their genes accompanied the evolution of ER or nucleus in the eukaryotic cell ancestor. The presence in all eukaryotic cytoplasmic HSP70 homologs (including the cognate, heat-induced, and ER forms) of a number of autapomorphic sequence signatures that are not present in any prokaryotic or organellar homologs provides strong evidence regarding the monophyletic nature of eukaryotic lineage. Further, all eukaryotic HSP70 homologs share in common with the Gram-negative group of eubacteria a number of sequence features that are not present in any archaebacterium or Gram-positive bacterium, indicating their evolution from this group of organisms. Some implications of these findings regarding the evolution of eukaryotic cells and ER are discussed. Images PMID:8159675
The Eucalyptus terpene synthase gene family.

PubMed

Külheim, Carsten; Padovan, Amanda; Hefer, Charles; Krause, Sandra T; Köllner, Tobias G; Myburg, Alexander A; Degenhardt, Jörg; Foley, William J

2015-06-11

Terpenoids are abundant in the foliage of Eucalyptus, providing the characteristic smell as well as being valuable economically and influencing ecological interactions. Quantitative and qualitative inter- and intra- specific variation of terpenes is common in eucalypts. The genome sequences of Eucalyptus grandis and E. globulus were mined for terpene synthase genes (TPS) and compared to other plant species. We investigated the relative expression of TPS in seven plant tissues and functionally characterized five TPS genes from E. grandis. Compared to other sequenced plant genomes, Eucalyptus grandis has the largest number of putative functional TPS genes of any sequenced plant. We discovered 113 and 106 putative functional TPS genes in E. grandis and E. globulus, respectively. All but one TPS from E. grandis were expressed in at least one of seven plant tissues examined. Genomic clusters of up to 20 genes were identified. Many TPS are expressed in tissues other than leaves which invites a re-evaluation of the function of terpenes in Eucalyptus. Our data indicate that terpenes in Eucalyptus may play a wider role in biotic and abiotic interactions than previously thought. Tissue specific expression is common and the possibility of stress induction needs further investigation. Phylogenetic comparison of the two investigated Eucalyptus species gives insight about recent evolution of different clades within the TPS gene family. While the majority of TPS genes occur in orthologous pairs some clades show evidence of recent gene duplication, as well as loss of function.
A Nomadic Subtelomeric Disease Resistance Gene Cluster in Common Bean1[W

PubMed Central

David, Perrine; Chen, Nicolas W.G.; Pedrosa-Harand, Andrea; Thareau, Vincent; Sévignac, Mireille; Cannon, Steven B.; Debouck, Daniel; Langin, Thierry; Geffroy, Valérie

2009-01-01

The B4 resistance (R) gene cluster is one of the largest clusters known in common bean (Phaseolus vulgaris [Pv]). It is located in a peculiar genomic environment in the subtelomeric region of the short arm of chromosome 4, adjacent to two heterochromatic blocks (knobs). We sequenced 650 kb spanning this locus and annotated 97 genes, 26 of which correspond to Coiled-Coil-Nucleotide-Binding-Site-Leucine-Rich-Repeat (CNL). Conserved microsynteny was observed between the Pv B4 locus and corresponding regions of Medicago truncatula and Lotus japonicus in chromosomes Mt6 and Lj2, respectively. The notable exception was the CNL sequences, which were completely absent in these regions. The origin of the Pv B4-CNL sequences was investigated through phylogenetic analysis, which reveals that, in the Pv genome, paralogous CNL genes are shared among nonhomologous chromosomes (4 and 11). Together, our results suggest that Pv B4-CNL was derived from CNL sequences from another cluster, the Co-2 cluster, through an ectopic recombination event. Integration of the soybean (Glycine max) genome data enables us to date more precisely this event and also to infer that a single CNL moved from the Co-2 to the B4 cluster. Moreover, we identified a new 528-bp satellite repeat, referred to as khipu, specific to the Phaseolus genus, present both between B4-CNL sequences and in the two knobs identified at the B4 R gene cluster. The khipu repeat is present on most chromosomal termini, indicating the existence of frequent ectopic recombination events in Pv subtelomeric regions. Our results highlight the importance of ectopic recombination in R gene evolution. PMID:19776165
Adenovirus EIIA early promoter: transcriptional control elements and induction by the viral pre-early EIA gene, which appears to be sequence independent.

PubMed Central

Murthy, S C; Bhat, G P; Thimmappaya, B

1985-01-01

A molecular dissection of the adenovirus EIIA early (E) promoter was undertaken to study the sequence elements required for transcription and to examine the nucleotide sequences, if any, specific for its trans-activation by the viral pre-early EIA gene product. A chimeric gene in which the EIIA-E promoter region fused to the coding sequences of the bacterial chloramphenicol acetyltransferase (CAT) gene was used in transient assays to identify the transcriptional control regions. Deletion mapping studies revealed that the upstream DNA sequences up to -86 were sufficient for the optimal basal level transcription in HeLa cells and also for the EIA-induced transcription. A series of linker-scanning (LS) mutants were constructed to precisely identify the nucleotide sequences that control transcription. Analysis of these LS mutants allowed us to identify two regions of the promoter that are critical for the EIIA-E transcription. These regions are located between -29 and -21 (region I) and between -82 and -66 (region II). Mutations in region I affected initiation and appeared functionally similar to the "TATA" sequence of the commonly studied promoters. To examine whether or not the EIIA-E promoter contained DNA sequences specific for the trans-activation by the EIA, the LS mutants were analyzed in a cotransfection assay containing a plasmid carrying the EIA gene. CAT activity of all of the LS mutants was induced by the EIA gene in this assay, suggesting that the induction of transcription of the EIIA-E promoter by the EIA gene is not sequence-specific. Images PMID:3857577
Tracing common origins of Genomic Islands in prokaryotes based on genome signature analyses.

PubMed

van Passel, Mark Wj

2011-09-01

Horizontal gene transfer constitutes a powerful and innovative force in evolution, but often little is known about the actual origins of transferred genes. Sequence alignments are generally of limited use in tracking the original donor, since still only a small fraction of the total genetic diversity is thought to be uncovered. Alternatively, approaches based on similarities in the genome specific relative oligonucleotide frequencies do not require alignments. Even though the exact origins of horizontally transferred genes may still not be established using these compositional analyses, it does suggest that compositionally very similar regions are likely to have had a common origin. These analyses have shown that up to a third of large acquired gene clusters that reside in the same genome are compositionally very similar, indicative of a shared origin. This brings us closer to uncovering the original donors of horizontally transferred genes, and could help in elucidating possible regulatory interactions between previously unlinked sequences.
A multiplex method for detection of glucose-6-phosphate dehydrogenase (G6PD) gene mutations.

PubMed

Zhang, L; Yang, Y; Liu, R; Li, Q; Yang, F; Ma, L; Liu, H; Chen, X; Yang, Z; Cui, L; He, Y

2015-12-01

Glucose-6-phosphate dehydrogenase (G6PD) deficiency is the most common human enzyme defect caused by G6PD gene mutations. This study aimed to develop a cost-effective, multiplex, genotyping method for detecting common mutations in the G6PD gene. We used a SNaPshot approach to genotype multiple G6PD mutations that are common to human populations in South-East Asia. This assay is based on multiplex PCR coupled with primer extension reactions. Different G6PD gene mutations were determined by peak retention time and colors of the primer extension products. We designed PCR primers for multiplex amplification of the G6PD gene fragments and for primer extension reactions to genotype 11 G6PD mutations. DNA samples from a total of 120 unrelated G6PD-deficient individuals from the China-Myanmar border area were used to establish and validate this method. Direct sequencing of the PCR products demonstrated 100% concordance between the SNaPshot and the sequencing results. The SNaPshot method offers a specific and sensitive alternative for simultaneously interrogating multiple G6PD mutations. © 2015 John Wiley & Sons Ltd.
Nucleotide sequence of the gag gene and gag-pol junction of feline leukemia virus.

PubMed Central

Laprevotte, I; Hampe, A; Sherr, C J; Galibert, F

1984-01-01

The nucleotide sequence of the gag gene of feline leukemia virus and its flanking sequences were determined and compared with the corresponding sequences of two strains of feline sarcoma virus and with that of the Moloney strain of murine leukemia virus. A high degree of nucleotide sequence homology between the feline leukemia virus and murine leukemia virus gag genes was observed, suggesting that retroviruses of domestic cats and laboratory mice have a common, proximal evolutionary progenitor. The predicted structure of the complete feline leukemia virus gag gene precursor suggests that the translation of nonglycosylated and glycosylated gag gene polypeptides is initiated at two different AUG codons. These initiator codons fall in the same reading frame and are separated by a 222-base-pair segment which encodes an amino terminal signal peptide. The nucleotide sequence predicts the order of amino acids in each of the individual gag-coded proteins (p15, p12, p30, p10), all of which derive from the gag gene precursor. Stable stem-and-loop secondary structures are proposed for two regions of viral RNA. The first falls within sequences at the 5' end of the viral genome, together with adjacent palindromic sequences which may play a role in dimer linkage of RNA subunits. The second includes coding sequences at the gag-pol junction and is proposed to be involved in translation of the pol gene product. Sequence analysis of the latter region shows that the gag and pol genes are translated in different reading frames. Classical consensus splice donor and acceptor sequences could not be localized to regions which would permit synthesis of the expected gag-pol precursor protein. Alternatively, we suggest that the pol gene product (RNA-dependent DNA polymerase) could be translated by a frameshift suppressing mechanism which could involve cleavage modification of stems and loops in a manner similar to that observed in tRNA processing. PMID:6328019
How the Sequence of a Gene Specifies Structural Symmetry in Proteins

PubMed Central

Shen, Xiaojuan; Huang, Tongcheng; Wang, Guanyu; Li, Guanglin

2015-01-01

Internal symmetry is commonly observed in the majority of fundamental protein folds. Meanwhile, sufficient evidence suggests that nascent polypeptide chains of proteins have the potential to start the co-translational folding process and this process allows mRNA to contain additional information on protein structure. In this paper, we study the relationship between gene sequences and protein structures from the viewpoint of symmetry to explore how gene sequences code for structural symmetry in proteins. We found that, for a set of two-fold symmetric proteins from left-handed beta-helix fold, intragenic symmetry always exists in their corresponding gene sequences. Meanwhile, codon usage bias and local mRNA structure might be involved in modulating translation speed for the formation of structural symmetry: a major decrease of local codon usage bias in the middle of the codon sequence can be identified as a common feature; and major or consecutive decreases in local mRNA folding energy near the boundaries of the symmetric substructures can also be observed. The results suggest that gene duplication and fusion may be an evolutionarily conserved process for this protein fold. In addition, the usage of rare codons and the formation of higher order of secondary structure near the boundaries of symmetric substructures might have coevolved as conserved mechanisms to slow down translation elongation and to facilitate effective folding of symmetric substructures. These findings provide valuable insights into our understanding of the mechanisms of translation and its evolution, as well as the design of proteins via symmetric modules. PMID:26641668
Phylogenetic analysis of Fusobacterium prausnitzii based upon the 16S rRNA gene sequence and PCR confirmation.

PubMed

Wang, R F; Cao, W W; Cerniglia, C E

1996-01-01

In order to develop a PCR method to detect Fusobacterium prausnitzii in human feces and to clarify the phylogenetic position of this species, its 16S rRNA gene sequence was determined. The sequence described in this paper is different from the 16S rRNA gene sequence is specific for F. prausnitzii, and the results of this assay confirmed that F. prausnitzii is the most common species in human feces. However, a PCR assay based on the original GenBank sequence was negative when it was performed with two strains of F. prausnitzii obtained from the American Type Culture Collection. A phylogenetic tree based on the new 16S rRNA gene sequence was constructed. On this tree F. prausnitzii was not a member of the Fusobacterium group but was closer to some Eubacterium spp. and located between Clostridium "clusters III and IV" (M.D. Collins, P.A. Lawson, A. Willems, J.J. Cordoba, J. Fernandez-Garayzabal, P. Garcia, J. Cai, H. Hippe, and J.A.E. Farrow, Int. J. Syst. Bacteriol. 44:812-826, 1994).
De novo sequencing and characterization of floral transcriptome in two species of buckwheat (Fagopyrum)

PubMed Central

2011-01-01

Background Transcriptome sequencing data has become an integral component of modern genetics, genomics and evolutionary biology. However, despite advances in the technologies of DNA sequencing, such data are lacking for many groups of living organisms, in particular, many plant taxa. We present here the results of transcriptome sequencing for two closely related plant species. These species, Fagopyrum esculentum and F. tataricum, belong to the order Caryophyllales - a large group of flowering plants with uncertain evolutionary relationships. F. esculentum (common buckwheat) is also an important food crop. Despite these practical and evolutionary considerations Fagopyrum species have not been the subject of large-scale sequencing projects. Results Normalized cDNA corresponding to genes expressed in flowers and inflorescences of F. esculentum and F. tataricum was sequenced using the 454 pyrosequencing technology. This resulted in 267 (for F. esculentum) and 229 (F. tataricum) thousands of reads with average length of 341-349 nucleotides. De novo assembly of the reads produced about 25 thousands of contigs for each species, with 7.5-8.2× coverage. Comparative analysis of two transcriptomes demonstrated their overall similarity but also revealed genes that are presumably differentially expressed. Among them are retrotransposon genes and genes involved in sugar biosynthesis and metabolism. Thirteen single-copy genes were used for phylogenetic analysis; the resulting trees are largely consistent with those inferred from multigenic plastid datasets. The sister relationships of the Caryophyllales and asterids now gained high support from nuclear gene sequences. Conclusions 454 transcriptome sequencing and de novo assembly was performed for two congeneric flowering plant species, F. esculentum and F. tataricum. As a result, a large set of cDNA sequences that represent orthologs of known plant genes as well as potential new genes was generated. PMID:21232141
Pneumocystis jirovecii multilocus genotyping profiles in patients from Portugal and Spain.

PubMed

Esteves, F; Montes-Cano, M A; de la Horra, C; Costa, M C; Calderón, E J; Antunes, F; Matos, O

2008-04-01

Pneumonia caused by the opportunistic organism Pneumocystis jirovecii is a clinically important infection affecting AIDS and other immunocompromised patients. The present study aimed to compare and characterise the frequency pattern of DNA sequences from the P. jirovecii mitochondrial large-subunit rRNA (mtLSU rRNA) gene, the dihydropteroate synthase (DHPS) gene and the internal transcribed spacer (ITS) regions of the nuclear rRNA operon in specimens from Lisbon (Portugal) and Seville (Spain). Total DNA was extracted and used for specific molecular sequence analysis of the three loci. In both populations, mtLSU rRNA gene analysis revealed an overall prevalence of genotype 1. In the Portuguese population, genotype 2 was the second most common, followed by genotype 3. Inversely, in the Spanish population, genotype 3 was the second most common, followed by genotype 2. The DHPS wild-type sequence was the genotype observed most frequently in both populations, and the DHPS genotype frequency pattern was identical to distribution patterns revealed in other European studies. ITS types showed a significant diversity in both populations because of the high sequence variability in these genomic regions. The most prevalent ITS type in the Portuguese population was Eg, followed by Cg. In contrast to other European studies, Bi was the most common ITS type in the Spanish samples, followed by Eg. A statistically significant association between mtLSU rRNA genotype 1 and ITS type Eg was revealed.
Designing oligo libraries taking alternative splicing into account

NASA Astrophysics Data System (ADS)

Shoshan, Avi; Grebinskiy, Vladimir; Magen, Avner; Scolnicov, Ariel; Fink, Eyal; Lehavi, David; Wasserman, Alon

2001-06-01

We have designed sequences for DNA microarrays and oligo libraries, taking alternative splicing into account. Alternative splicing is a common phenomenon, occurring in more than 25% of the human genes. In many cases, different splice variants have different functions, are expressed in different tissues or may indicate different stages of disease. When designing sequences for DNA microarrays or oligo libraries, it is very important to take into account the sequence information of all the mRNA transcripts. Therefore, when a gene has more than one transcript (as a result of alternative splicing, alternative promoter sites or alternative poly-adenylation sites), it is very important to take all of them into account in the design. We have used the LEADS transcriptome prediction system to cluster and assemble the human sequences in GenBank and design optimal oligonucleotides for all the human genes with a known mRNA sequence based on the LEADS predictions.
Designing universal primers for the isolation of DNA sequences encoding Proanthocyanidins biosynthetic enzymes in Crataegus aronia

PubMed Central

2012-01-01

Background Hawthorn is the common name of all plant species in the genus Crataegus, which belongs to the Rosaceae family. Crataegus are considered useful medicinal plants because of their high content of proanthocyanidins (PAs) and other related compounds. To improve PAs production in Crataegus tissues, the sequences of genes encoding PAs biosynthetic enzymes are required. Findings Different bioinformatics tools, including BLAST, multiple sequence alignment and alignment PCR analysis were used to design primers suitable for the amplification of DNA fragments from 10 candidate genes encoding enzymes involved in PAs biosynthesis in C. aronia. DNA sequencing results proved the utility of the designed primers. The primers were used successfully to amplify DNA fragments of different PAs biosynthesis genes in different Rosaceae plants. Conclusion To the best of our knowledge, this is the first use of the alignment PCR approach to isolate DNA sequences encoding PAs biosynthetic enzymes in Rosaceae plants. PMID:22883984
Designing universal primers for the isolation of DNA sequences encoding Proanthocyanidins biosynthetic enzymes in Crataegus aronia.

PubMed

Zuiter, Afnan Saeid; Sawwan, Jammal; Al Abdallat, Ayed

2012-08-10

Hawthorn is the common name of all plant species in the genus Crataegus, which belongs to the Rosaceae family. Crataegus are considered useful medicinal plants because of their high content of proanthocyanidins (PAs) and other related compounds. To improve PAs production in Crataegus tissues, the sequences of genes encoding PAs biosynthetic enzymes are required. Different bioinformatics tools, including BLAST, multiple sequence alignment and alignment PCR analysis were used to design primers suitable for the amplification of DNA fragments from 10 candidate genes encoding enzymes involved in PAs biosynthesis in C. aronia. DNA sequencing results proved the utility of the designed primers. The primers were used successfully to amplify DNA fragments of different PAs biosynthesis genes in different Rosaceae plants. To the best of our knowledge, this is the first use of the alignment PCR approach to isolate DNA sequences encoding PAs biosynthetic enzymes in Rosaceae plants.
The LAM-PCR Method to Sequence LV Integration Sites.

PubMed

Wang, Wei; Bartholomae, Cynthia C; Gabriel, Richard; Deichmann, Annette; Schmidt, Manfred

2016-01-01

Integrating viral gene transfer vectors are commonly used gene delivery tools in clinical gene therapy trials providing stable integration and continuous gene expression of the transgene in the treated host cell. However, integration of the reverse-transcribed vector DNA into the host genome is a potentially mutagenic event that may directly contribute to unwanted side effects. A comprehensive and accurate analysis of the integration site (IS) repertoire is indispensable to study clonality in transduced cells obtained from patients undergoing gene therapy and to identify potential in vivo selection of affected cell clones. To date, next-generation sequencing (NGS) of vector-genome junctions allows sophisticated studies on the integration repertoire in vitro and in vivo. We have explored the use of the Illumina MiSeq Personal Sequencer platform to sequence vector ISs amplified by non-restrictive linear amplification-mediated PCR (nrLAM-PCR) and LAM-PCR. MiSeq-based high-quality IS sequence retrieval is accomplished by the introduction of a double-barcode strategy that substantially minimizes the frequency of IS sequence collisions compared to the conventionally used single-barcode protocol. Here, we present an updated protocol of (nr)LAM-PCR for the analysis of lentiviral IS using a double-barcode system and followed by deep sequencing using the MiSeq device.

In Silico Comparative Transcriptome Analysis of Two Color Morphs of the Common Coral Trout (Plectropomus Leopardus)

PubMed Central

Wang, Le; Yu, Cuiping; Guo, Liang; Lin, Haoran; Meng, Zining

2015-01-01

The common coral trout is one species of major importance in commercial fisheries and aquaculture. Recently, two different color morphs of Plectropomus leopardus were discovered and the biological importance of the color difference is unknown. Since coral trout species are poorly characterized at the molecular level, we undertook the transcriptomic characterization of the two color morphs, one black and one red coral trout, using Illumina next generation sequencing technologies. The study produced 55162966 and 54588952 paired-end reads, for black and red trout, respectively. De novo transcriptome assembly generated 95367 and 99424 unique sequences in black and red trout, respectively, with 88813 sequences shared between them. Approximately 50% of both trancriptomes were functionally annotated by BLAST searches against protein databases. The two trancriptomes were enriched into 25 functional categories and showed similar profiles of Gene Ontology category compositions. 34110 unigenes were grouped into 259 KEGG pathways. Moreover, we identified 14649 simple sequence repeats (SSRs) and designed primers for potential application. We also discovered 130524 putative single nucleotide polymorphisms (SNPs) in the two transcriptomes, supplying potential genomic resources for the coral trout species. In addition, we identified 936 fast-evolving genes and 165 candidate genes under positive selection between the two color morphs. Finally, 38 candidate genes underlying the mechanism of color and pigmentation were also isolated. This study presents the first transcriptome resources for the common coral trout and provides basic information for the development of genomic tools for the identification, conservation, and understanding of the speciation and local adaptation of coral reef fish species. PMID:26713756
Apophysomyces variabilis: draft genome sequence and comparison of predictive virulence determinants with other medically important Mucorales.

PubMed

Prakash, Hariprasath; Rudramurthy, Shivaprakash Mandya; Gandham, Prasad S; Ghosh, Anup Kumar; Kumar, Milner M; Badapanda, Chandan; Chakrabarti, Arunaloke

2017-09-18

Apophysomyces species are prevalent in tropical countries and A. variabilis is the second most frequent agent causing mucormycosis in India. Among Apophysomyces species, A. elegans, A. trapeziformis and A. variabilis are commonly incriminated in human infections. The genome sequences of A. elegans and A. trapeziformis are available in public database, but not A. variabilis. We, therefore, performed the whole genome sequence of A. variabilis to explore its genomic structure and possible genes determining the virulence of the organism. The whole genome of A. variabilis NCCPF 102052 was sequenced and the genomic structure of A. variabilis was compared with already available genome structures of A. elegans, A. trapeziformis and other medically important Mucorales. The total size of genome assembly of A. variabilis was 39.38 Mb with 12,764 protein-coding genes. The transposable elements (TEs) were low in Apophysomyces genome and the retrotransposon Ty3-gypsy was the common TE. Phylogenetically, Apophysomyces species were grouped closely with Phycomyces blakesleeanus. OrthoMCL analysis revealed 3025 orthologues proteins, which were common in those three pathogenic Apophysomyces species. Expansion of multiple gene families/duplication was observed in Apophysomyces genomes. Approximately 6% of Apophysomyces genes were predicted to be associated with virulence on PHIbase analysis. The virulence determinants included the protein families of CotH proteins (invasins), proteases, iron utilisation pathways, siderophores and signal transduction pathways. Serine proteases were the major group of proteases found in all Apophysomyces genomes. The carbohydrate active enzymes (CAZymes) constitute the majority of the secretory proteins. The present study is the maiden attempt to sequence and analyze the genomic structure of A. variabilis. Together with available genome sequence of A. elegans and A. trapeziformis, the study helped to indicate the possible virulence determinants of pathogenic Apophysomyces species. The presence of unique CAZymes in cell wall might be exploited in future for antifungal drug development.
The structure of the human interferon alpha/beta receptor gene.

PubMed

Lutfalla, G; Gardiner, K; Proudhon, D; Vielh, E; Uzé, G

1992-02-05

Using the cDNA coding for the human interferon alpha/beta receptor (IFNAR), the IFNAR gene has been physically mapped relative to the other loci of the chromosome 21q22.1 region. 32,906 base pairs covering the IFNAR gene have been cloned and sequenced. Primer extension and solution hybridization-ribonuclease protection have been used to determine that the transcription of the gene is initiated in a broad region of 20 base pairs. Some aspects of the polymorphism of the gene, including noncoding sequences, have been analyzed; some are allelic differences in the coding sequence that induce amino acid variations in the resulting protein. The exon structure of the IFNAR gene and of that of the available genes for the receptors of the cytokine/growth hormone/prolactin/interferon receptor family have been compared with the predictions for the secondary structure of those receptors. From this analysis, we postulate a common origin and propose an hypothesis for the divergence from the immunoglobulin superfamily.
Comparative sequence analysis of a region on human chromosome 13q14, frequently deleted in B-cell chronic lymphocytic leukemia, and its homologous region on mouse chromosome 14.

PubMed

Kapanadze, B; Makeeva, N; Corcoran, M; Jareborg, N; Hammarsund, M; Baranova, A; Zabarovsky, E; Vorontsova, O; Merup, M; Gahrton, G; Jansson, M; Yankovsky, N; Einhorn, S; Oscier, D; Grandér, D; Sangfelt, O

2000-12-15

Previous studies have indicated the presence of a putative tumor suppressor gene on human chromosome 13q14, commonly deleted in patients with B-cell chronic lymphocytic leukemia (B-CLL). We have recently identified a minimally deleted region encompassing parts of two adjacent genes, termed LEU1 and LEU2 (leukemia-associated genes 1 and 2), and several additional transcripts. In addition, 50 kb centromeric to this region we have identified another gene, LEU5/RFP2. To elucidate further the complex genomic organization of this region, we have identified, mapped, and sequenced the homologous region in the mouse. Fluorescence in situ hybridization analysis demonstrated that the region maps to mouse chromosome 14. The overall organization and gene order in this region were found to be highly conserved in the mouse. Sequence comparison between the human deletion hotspot region and its homologous mouse region revealed a high degree of sequence conservation with an overall score of 74%. However, our data also show that in terms of transcribed sequences, only two of those, human LEU2 and LEU5/RFP2, are clearly conserved, strengthening the case for these genes as putative candidate B-CLL tumor suppressor genes.
Genomics in Cardiovascular Disease

PubMed Central

Roberts, Robert; Marian, A.J.; Dandona, Sonny; Stewart, Alexandre F.R.

2013-01-01

A paradigm shift towards biology occurred in the 1990’s subsequently catalyzed by the sequencing of the human genome in 2000. The cost of DNA sequencing has gone from millions to thousands of dollars with sequencing of one’s entire genome costing only $1,000. Rapid DNA sequencing is being embraced for single gene disorders, particularly for sporadic cases and those from small families. Transmission of lethal genes such as associated with Huntington’s disease can, through in-vitro fertilization, avoid passing it on to one’s offspring. DNA sequencing will meet the challenge of elucidating the genetic predisposition for common polygenic diseases, especially in determining the function of the novel common genetic risk variants and identifying the rare variants, which may also partially ascertain the source of the missing heritability. The challenge for DNA sequencing remains great, despite human genome sequences being 99.5% identical, the 3 million single nucleotide polymorphisms (SNPs) responsible for most of the unique features add up to 60 new mutations per person which, for 7 billion people, is 420 billion mutations. It is claimed that DNA sequencing has increased 10,000 fold while information storage and retrieval only 16 fold. The physician and health user will be challenged by the convergence of two major trends, whole genome sequencing and the storage/retrieval and integration of the data. PMID:23524054
Uneven recombination rate and linkage disequilibrium across a reference SNP map for common bean (Phaseolus vulgaris L.)

PubMed Central

Farmer, Andrew D.; Huang, Wei; Ambachew, Daniel; Penmetsa, R. Varma; Carrasquilla-Garcia, Noelia; Assefa, Teshale; Cannon, Steven B.

2018-01-01

Recombination (R) rate and linkage disequilibrium (LD) analyses are the basis for plant breeding. These vary by breeding system, by generation of inbreeding or outcrossing and by region in the chromosome. Common bean (Phaseolus vulgaris L.) is a favored food legume with a small sequenced genome (514 Mb) and n = 11 chromosomes. The goal of this study was to describe R and LD in the common bean genome using a 768-marker array of single nucleotide polymorphisms (SNP) based on Trans-legume Orthologous Group (TOG) genes along with an advanced-generation Recombinant Inbred Line reference mapping population (BAT93 x Jalo EEP558) and an internationally available diversity panel. A whole genome genetic map was created that covered all eleven linkage groups (LG). The LGs were linked to the physical map by sequence data of the TOGs compared to each chromosome sequence of common bean. The genetic map length in total was smaller than for previous maps reflecting the precision of allele calling and mapping with SNP technology as well as the use of gene-based markers. A total of 91.4% of TOG markers had singleton hits with annotated Pv genes and all mapped outside of regions of resistance gene clusters. LD levels were found to be stronger within the Mesoamerican genepool and decay more rapidly within the Andean genepool. The recombination rate across the genome was 2.13 cM / Mb but R was found to be highly repressed around centromeres and frequent outside peri-centromeric regions. These results have important implications for association and genetic mapping or crop improvement in common bean. PMID:29522524
Analysis of MHC class I genes across horse MHC haplotypes

PubMed Central

Tallmadge, Rebecca L.; Campbell, Julie A.; Miller, Donald C.; Antczak, Douglas F.

2010-01-01

The genomic sequences of 15 horse Major Histocompatibility Complex (MHC) class I genes and a collection of MHC class I homozygous horses of five different haplotypes were used to investigate the genomic structure and polymorphism of the equine MHC. A combination of conserved and locus-specific primers was used to amplify horse MHC class I genes with classical and non-classical characteristics. Multiple clones from each haplotype identified three to five classical sequences per homozygous animal, and two to three non-classical sequences. Phylogenetic analysis was applied to these sequences and groups were identified which appear to be allelic series, but some sequences were left ungrouped. Sequences determined from MHC class I heterozygous horses and previously described MHC class I sequences were then added, representing a total of ten horse MHC haplotypes. These results were consistent with those obtained from the MHC homozygous horses alone, and 30 classical sequences were assigned to four previously confirmed loci and three new provisional loci. The non-classical genes had few alleles and the classical genes had higher levels of allelic polymorphism. Alleles for two classical loci with the expected pattern of polymorphism were found in the majority of haplotypes tested, but alleles at two other commonly detected loci had more variation outside of the hypervariable region than within. Our data indicate that the equine Major Histocompatibility Complex is characterized by variation in the complement of class I genes expressed in different haplotypes in addition to the expected allelic polymorphism within loci. PMID:20099063
Genetic characterization of the non-structural protein-3 gene of bluetongue virus serotype-2 isolate from India.

PubMed

Pudupakam, Raghavendra Sumanth; Raghunath, Shobana; Pudupakam, Meghanath; Daggupati, Sreenivasulu

2017-03-01

Sequence analysis and phylogenetic studies based on non-structural protein-3 (NS3) gene are important in understanding the evolution and epidemiology of bluetongue virus (BTV). This study was aimed at characterizing the NS3 gene sequence of Indian BTV serotype-2 (BTV2) to elucidate its genetic relationship to global BTV isolates. The NS3 gene of BTV2 was amplified from infected BHK-21 cell cultures, cloned and subjected to sequence analysis. The generated NS3 gene sequence was compared with the corresponding sequences of different BTV serotypes across the world, and a phylogenetic relationship was established. The NS3 gene of BTV2 showed moderate levels of variability in comparison to different BTV serotypes, with nucleotide sequence identities ranging from 81% to 98%. The region showed high sequence homology of 93-99% at amino acid level with various BTV serotypes. The PPXY/PTAP late domain motifs, glycosylation sites, hydrophobic domains, and the amino acid residues critical for virus-host interactions were conserved in NS3 protein. Phylogenetic analysis revealed that BTV isolates segregate into four topotypes and that the Indian BTV2 in subclade IA is closely related to Asian and Australian origin strains. Analysis of the NS3 gene indicated that Indian BTV2 isolate is closely related to strains from Asia and Australia, suggesting a common origin of infection. Although the pattern of evolution of BTV2 isolate is different from other global isolates, the deduced amino acid sequence of NS3 protein demonstrated high molecular stability.
Genetic characterization of the non-structural protein-3 gene of bluetongue virus serotype-2 isolate from India

PubMed Central

Pudupakam, Raghavendra Sumanth; Raghunath, Shobana; Pudupakam, Meghanath; Daggupati, Sreenivasulu

2017-01-01

Aim: Sequence analysis and phylogenetic studies based on non-structural protein-3 (NS3) gene are important in understanding the evolution and epidemiology of bluetongue virus (BTV). This study was aimed at characterizing the NS3 gene sequence of Indian BTV serotype-2 (BTV2) to elucidate its genetic relationship to global BTV isolates. Materials and Methods: The NS3 gene of BTV2 was amplified from infected BHK-21 cell cultures, cloned and subjected to sequence analysis. The generated NS3 gene sequence was compared with the corresponding sequences of different BTV serotypes across the world, and a phylogenetic relationship was established. Results: The NS3 gene of BTV2 showed moderate levels of variability in comparison to different BTV serotypes, with nucleotide sequence identities ranging from 81% to 98%. The region showed high sequence homology of 93-99% at amino acid level with various BTV serotypes. The PPXY/PTAP late domain motifs, glycosylation sites, hydrophobic domains, and the amino acid residues critical for virus-host interactions were conserved in NS3 protein. Phylogenetic analysis revealed that BTV isolates segregate into four topotypes and that the Indian BTV2 in subclade IA is closely related to Asian and Australian origin strains. Conclusion: Analysis of the NS3 gene indicated that Indian BTV2 isolate is closely related to strains from Asia and Australia, suggesting a common origin of infection. Although the pattern of evolution of BTV2 isolate is different from other global isolates, the deduced amino acid sequence of NS3 protein demonstrated high molecular stability. PMID:28435199
Analysis of the skin transcriptome in two oujiang color varieties of common carp.

PubMed

Wang, Chenghui; Wachholtz, Michael; Wang, Jun; Liao, Xiaolin; Lu, Guoqing

2014-01-01

Body color and coloration patterns are important phenotypic traits to maintain survival and reproduction activities. The Oujiang color varieties of common carp (Cyprinus carpio var. color), with a narrow distribution in Zhejiang Province of China and a history of aquaculture for over 1,200 years, consistently exhibit a variety of body color patterns. The molecular mechanism underlying diverse color patterns in these variants is unknown. To the practical end, it is essential to develop molecular markers that can distinguish different phenotypes and assist selective breeding. In this exploratory study, we conducted Roche 454 transcriptome sequencing of two pooled skin tissue samples of Oujiang common carp, which correspond to distinct color patterns, red with big black spots (RB) and whole white (WW), and a total of 737,525 sequence reads were generated. The reads obtained in this study were co-assembled jointly with common carp Roche 454 sequencing reads downloaded from NCBI SRA database, resulting in 43,923 isotigs and 546,676 singletons. Over 31 thousand (31,445; 71.6%) isotigs were found with significant BLAST matches (E<1e-10) to the nr protein database, which corresponds to 12,597 annotated zebrafish genes. A total of 70,947 isotigs and singletons (transcripts) were annotated with Gene Ontology, and 60,221 transcripts were found with corresponding EC numbers. Out of 145 zebrafish pigmentation genes, orthologs for 117 were recovered in Oujiang color carp transcriptome, including 18 found only among singletons. Our transcriptome analysis revealed over 52,902 SNPs in Oujiang common carp, and identified 63 SNP markers that are putatively unique either for RB or WW. The transcriptome of Oujiang color varieties of common carp obtained through this study, along with the pigmentation genes recovered and the color pattern-specific molecular markers developed, will facilitate future research on the molecular mechanism of color patterns and promote aquaculture of Oujiang color varieties of common carp through molecular marker assisted-selective breeding.
Comprehensive Molecular Characterization of Urothelial Bladder Carcinoma

PubMed Central

2014-01-01

Urothelial carcinoma of the bladder is a common malignancy that causes approximately 150,000 deaths per year worldwide. To date, no molecularly targeted agents have been approved for the disease. As part of The Cancer Genome Atlas project, we report here an integrated analysis of 131 urothelial carcinomas to provide a comprehensive landscape of molecular alterations. There were statistically significant recurrent mutations in 32 genes, including multiple genes involved in cell cycle regulation, chromatin regulation, and kinase signaling pathways, as well as 9 genes not previously reported as significantly mutated in any cancer. RNA sequencing revealed four expression subtypes, two of which (papillary-like and basal/squamous-like) were also evident in miRNA sequencing and protein data. Whole-genome and RNA sequencing identified recurrent in-frame activating FGFR3-TACC3 fusions and expression or integration of several viruses (including HPV16) that are associated with gene inactivation. Our analyses identified potential therapeutic targets in 69% of the tumours, including 42% with targets in the PI3K/AKT/mTOR pathway and 45% with targets (including ERBB2) in the RTK/MAPK pathway. Chromatin regulatory genes were more frequently mutated in urothelial carcinoma than in any common cancer studied to date, suggesting the future possibility of targeted therapy for chromatin abnormalities. PMID:24476821
A combination of PhP typing and β-d-glucuronidase gene sequence variation analysis for differentiation of Escherichia coli from humans and animals.

PubMed

Masters, N; Christie, M; Katouli, M; Stratton, H

2015-06-01

We investigated the usefulness of the β-d-glucuronidase gene variance in Escherichia coli as a microbial source tracking tool using a novel algorithm for comparison of sequences from a prescreened set of host-specific isolates using a high-resolution PhP typing method. A total of 65 common biochemical phenotypes belonging to 318 E. coli strains isolated from humans and domestic and wild animals were analysed for nucleotide variations at 10 loci along a 518 bp fragment of the 1812 bp β-d-glucuronidase gene. Neighbour-joining analysis of loci variations revealed 86 (76.8%) human isolates and 91.2% of animal isolates were correctly identified. Pairwise hierarchical clustering improved assignment; where 92 (82.1%) human and 204 (99%) animal strains were assigned to their respective cluster. Our data show that initial typing of isolates and selection of common types from different hosts prior to analysis of the β-d-glucuronidase gene sequence improves source identification. We also concluded that numerical profiling of the nucleotide variations can be used as a valuable approach to differentiate human from animal E. coli. This study signifies the usefulness of the β-d-glucuronidase gene as a marker for differentiating human faecal pollution from animal sources.
Evaluation of four endogenous reference genes and their real-time PCR assays for common wheat quantification in GMOs detection.

PubMed

Huang, Huali; Cheng, Fang; Wang, Ruoan; Zhang, Dabing; Yang, Litao

2013-01-01

Proper selection of endogenous reference genes and their real-time PCR assays is quite important in genetically modified organisms (GMOs) detection. To find a suitable endogenous reference gene and its real-time PCR assay for common wheat (Triticum aestivum L.) DNA content or copy number quantification, four previously reported wheat endogenous reference genes and their real-time PCR assays were comprehensively evaluated for the target gene sequence variation and their real-time PCR performance among 37 common wheat lines. Three SNPs were observed in the PKABA1 and ALMT1 genes, and these SNPs significantly decreased the efficiency of real-time PCR amplification. GeNorm analysis of the real-time PCR performance of each gene among common wheat lines showed that the Waxy-D1 assay had the lowest M values with the best stability among all tested lines. All results indicated that the Waxy-D1 gene and its real-time PCR assay were most suitable to be used as an endogenous reference gene for common wheat DNA content quantification. The validated Waxy-D1 gene assay will be useful in establishing accurate and creditable qualitative and quantitative PCR analysis of GM wheat.
Evaluation of Four Endogenous Reference Genes and Their Real-Time PCR Assays for Common Wheat Quantification in GMOs Detection

PubMed Central

Huang, Huali; Cheng, Fang; Wang, Ruoan; Zhang, Dabing; Yang, Litao

2013-01-01

Proper selection of endogenous reference genes and their real-time PCR assays is quite important in genetically modified organisms (GMOs) detection. To find a suitable endogenous reference gene and its real-time PCR assay for common wheat (Triticum aestivum L.) DNA content or copy number quantification, four previously reported wheat endogenous reference genes and their real-time PCR assays were comprehensively evaluated for the target gene sequence variation and their real-time PCR performance among 37 common wheat lines. Three SNPs were observed in the PKABA1 and ALMT1 genes, and these SNPs significantly decreased the efficiency of real-time PCR amplification. GeNorm analysis of the real-time PCR performance of each gene among common wheat lines showed that the Waxy-D1 assay had the lowest M values with the best stability among all tested lines. All results indicated that the Waxy-D1 gene and its real-time PCR assay were most suitable to be used as an endogenous reference gene for common wheat DNA content quantification. The validated Waxy-D1 gene assay will be useful in establishing accurate and creditable qualitative and quantitative PCR analysis of GM wheat. PMID:24098735
Molecular evolution and diversification of snake toxin genes, revealed by analysis of intron sequences.

PubMed

Fujimi, T J; Nakajyo, T; Nishimura, E; Ogura, E; Tsuchiya, T; Tamiya, T

2003-08-14

The genes encoding erabutoxin (short chain neurotoxin) isoforms (Ea, Eb, and Ec), LsIII (long chain neurotoxin) and a novel long chain neurotoxin pseudogene were cloned from a Laticauda semifasciata genomic library. Short and long chain neurotoxin genes were also cloned from the genome of Laticauda laticaudata, a closely related species of L. semifasciata, by PCR. A putative matrix attached region (MAR) sequence was found in the intron I of the LsIII gene. Comparative analysis of 11 structurally relevant snake toxin genes (three-finger-structure toxins) revealed the molecular evolution of these toxins. Three-finger-structure toxin genes diverged from a common ancestor through two types of evolutionary pathways (long and short types), early in the course of evolution. At a later stage of evolution in each gene, the accumulation of mutations in the exons, especially exon II, by accelerated evolution may have caused the increased diversification in their functions. It was also revealed that the putative MAR sequence found in the LsIII gene was integrated into the gene after the species-level divergence.
Seasonal and regional diversity of maple sap microbiota revealed using community PCR fingerprinting and 16S rRNA gene clone libraries.

PubMed

Filteau, Marie; Lagacé, Luc; LaPointe, Gisèle; Roy, Denis

2010-04-01

An arbitrary primed community PCR fingerprinting technique based on capillary electrophoresis was developed to study maple sap microbial community characteristics among 19 production sites in Québec over the tapping season. Presumptive fragment identification was made with corresponding fingerprint profiles of bacterial isolate cultures. Maple sap microbial communities were subsequently compared using a representative subset of 13 16S rRNA gene clone libraries followed by gene sequence analysis. Results from both methods indicated that all maple sap production sites and flow periods shared common microbiota members, but distinctive features also existed. Changes over the season in relative abundance of predominant populations showed evidence of a common pattern. Pseudomonas (64%) and Rahnella (8%) were the most abundantly and frequently represented genera of the 2239 sequences analyzed. Janthinobacterium, Leuconostoc, Lactococcus, Weissella, Epilithonimonas and Sphingomonas were revealed as occasional contaminants in maple sap. Maple sap microbiota showed a low level of deep diversity along with a high variation of similar 16S rRNA gene sequences within the Pseudomonas genus. Predominance of Pseudomonas is suggested as a typical feature of maple sap microbiota across geographical regions, production sites, and sap flow periods.
TaEDS1 genes positively regulate resistance to powdery mildew in wheat.

PubMed

Chen, Guiping; Wei, Bo; Li, Guoliang; Gong, Caiyan; Fan, Renchun; Zhang, Xiangqi

2018-04-01

Three EDS1 genes were cloned from common wheat and were demonstrated to positively regulate resistance to powdery mildew in wheat. The EDS1 proteins play important roles in plant basal resistance and TIR-NB-LRR protein-triggered resistance in dicots. Until now, there have been very few studies on EDS1 in monocots, and none in wheat. Here, we report on three common wheat orthologous genes of EDS1 family (TaEDS1-5A, 5B and 5D) and their function in powdery mildew resistance. Comparisons of these genes with their orthologs in diploid ancestors revealed that EDS1 is a conserved gene family in Triticeae. The cDNA sequence similarity among the three TaEDS1 genes was greater than 96.5%, and they shared sequence similarities of more than 99.6% with the respective orthologs from diploid ancestors. The phylogenetic analysis revealed that the EDS1 family originated prior to the differentiation of monocots and dicots, and EDS1 members have since undergone clear structural differentiation. The transcriptional levels of TaEDS1 genes in the leaves were obviously higher than those of the other organs, and they were induced by Blumeria graminis f. sp. tritici (Bgt) infection and salicylic acid (SA) treatment. The BSMV-VIGS experiments indicated that knock-down the transcriptional levels of the TaEDS1 genes in a powdery mildew-resistant variety of common wheat compromised resistance. Contrarily, transient overexpression of TaEDS1 genes in a susceptible common wheat variety significantly reduced the haustorium index and attenuated the growth of Bgt. Furthermore, the expression of TaEDS1 genes in the Arabidopsis mutant eds1-1 complemented its susceptible phenotype to powdery mildew. The above evidences strongly suggest that TaEDS1 acts as a positive regulator and confers resistance against powdery mildew in common wheat.
The Complete Mitochondrial Genome of Gossypium hirsutum and Evolutionary Analysis of Higher Plant Mitochondrial Genomes

PubMed Central

Su, Aiguo; Geng, Jianing; Grover, Corrinne E.; Hu, Songnian; Hua, Jinping

2013-01-01

Background Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains large number of foreign DNA and repeated sequences undergone frequently intramolecular recombination. Upland Cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant in the world. Sequencing of the cotton mitochondrial (mt) genome could be helpful for the evolution research of plant mt genomes. Methodology/Principal Findings We utilized 454 technology for sequencing and combined with Fosmid library of the Gossypium hirsutum mt genome screening and positive clones sequencing and conducted a series of evolutionary analysis on Cycas taitungensis and 24 angiosperms mt genomes. After data assembling and contigs joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G.hirsutum mt genome is 621,884 bp in length, and contained 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are found conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes and species closely related share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes. Conclusion The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes suggest that evolution of mt genomes is consistent with plant taxonomy but independent among different species. PMID:23940520
The complete mitochondrial genome of Gossypium hirsutum and evolutionary analysis of higher plant mitochondrial genomes.

PubMed

Liu, Guozheng; Cao, Dandan; Li, Shuangshuang; Su, Aiguo; Geng, Jianing; Grover, Corrinne E; Hu, Songnian; Hua, Jinping

2013-01-01

Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains large number of foreign DNA and repeated sequences undergone frequently intramolecular recombination. Upland Cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant in the world. Sequencing of the cotton mitochondrial (mt) genome could be helpful for the evolution research of plant mt genomes. We utilized 454 technology for sequencing and combined with Fosmid library of the Gossypium hirsutum mt genome screening and positive clones sequencing and conducted a series of evolutionary analysis on Cycas taitungensis and 24 angiosperms mt genomes. After data assembling and contigs joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G.hirsutum mt genome is 621,884 bp in length, and contained 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are found conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes and species closely related share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes. The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes suggest that evolution of mt genomes is consistent with plant taxonomy but independent among different species.
Phosphate transporters in marine phytoplankton and their viruses: cross-domain commonalities in viral-host gene exchanges.

PubMed

Monier, Adam; Welsh, Rory M; Gentemann, Chelle; Weinstock, George; Sodergren, Erica; Armbrust, E Virginia; Eisen, Jonathan A; Worden, Alexandra Z

2012-01-01

Phosphate (PO(4)) is an important limiting nutrient in marine environments. Marine cyanobacteria scavenge PO(4) using the high-affinity periplasmic phosphate binding protein PstS. The pstS gene has recently been identified in genomes of cyanobacterial viruses as well. Here, we analyse genes encoding transporters in genomes from viruses that infect eukaryotic phytoplankton. We identified inorganic PO(4) transporter-encoding genes from the PHO4 superfamily in several virus genomes, along with other transporter-encoding genes. Homologues of the viral pho4 genes were also identified in genome sequences from the genera that these viruses infect. Genome sequences were available from host genera of all the phytoplankton viruses analysed except the host genus Bathycoccus. Pho4 was recovered from Bathycoccus by sequencing a targeted metagenome from an uncultured Atlantic Ocean population. Phylogenetic reconstruction showed that pho4 genes from pelagophytes, haptophytes and infecting viruses were more closely related to homologues in prasinophytes than to those in what, at the species level, are considered to be closer relatives (e.g. diatoms). We also identified PHO4 superfamily members in ocean metagenomes, including new metagenomes from the Pacific Ocean. The environmental sequences grouped with pelagophytes, haptophytes, prasinophytes and viruses as well as bacteria. The analyses suggest that multiple independent pho4 gene transfer events have occurred between marine viruses and both eukaryotic and bacterial hosts. Additionally, pho4 genes were identified in available genomes from viruses that infect marine eukaryotes but not those that infect terrestrial hosts. Commonalities in marine host-virus gene exchanges indicate that manipulation of host-PO(4) uptake is an important adaptation for viral proliferation in marine systems. Our findings suggest that PO(4) -availability may not serve as a simple bottom-up control of marine phytoplankton. © 2011 Society for Applied Microbiology and Blackwell Publishing Ltd.

ICO amplicon NGS data analysis: a Web tool for variant detection in common high-risk hereditary cancer genes analyzed by amplicon GS Junior next-generation sequencing.

PubMed

Lopez-Doriga, Adriana; Feliubadaló, Lídia; Menéndez, Mireia; Lopez-Doriga, Sergio; Morón-Duran, Francisco D; del Valle, Jesús; Tornero, Eva; Montes, Eva; Cuesta, Raquel; Campos, Olga; Gómez, Carolina; Pineda, Marta; González, Sara; Moreno, Victor; Capellá, Gabriel; Lázaro, Conxi

2014-03-01

Next-generation sequencing (NGS) has revolutionized genomic research and is set to have a major impact on genetic diagnostics thanks to the advent of benchtop sequencers and flexible kits for targeted libraries. Among the main hurdles in NGS are the difficulty of performing bioinformatic analysis of the huge volume of data generated and the high number of false positive calls that could be obtained, depending on the NGS technology and the analysis pipeline. Here, we present the development of a free and user-friendly Web data analysis tool that detects and filters sequence variants, provides coverage information, and allows the user to customize some basic parameters. The tool has been developed to provide accurate genetic analysis of targeted sequencing of common high-risk hereditary cancer genes using amplicon libraries run in a GS Junior System. The Web resource is linked to our own mutation database, to assist in the clinical classification of identified variants. We believe that this tool will greatly facilitate the use of the NGS approach in routine laboratories.
Extensive structural variations between mitochondrial genomes of CMS and normal peppers (Capsicum annuum L.) revealed by complete nucleotide sequencing.

PubMed

Jo, Yeong Deuk; Choi, Yoomi; Kim, Dong-Hwan; Kim, Byung-Dong; Kang, Byoung-Cheorl

2014-07-04

Cytoplasmic male sterility (CMS) is an inability to produce functional pollen that is caused by mutation of the mitochondrial genome. Comparative analyses of mitochondrial genomes of lines with and without CMS in several species have revealed structural differences between genomes, including extensive rearrangements caused by recombination. However, the mitochondrial genome structure and the DNA rearrangements that may be related to CMS have not been characterized in Capsicum spp. We obtained the complete mitochondrial genome sequences of the pepper CMS line FS4401 (507,452 bp) and the fertile line Jeju (511,530 bp). Comparative analysis between mitochondrial genomes of peppers and tobacco that are included in Solanaceae revealed extensive DNA rearrangements and poor conservation in non-coding DNA. In comparison between pepper lines, FS4401 and Jeju mitochondrial DNAs contained the same complement of protein coding genes except for one additional copy of an atp6 gene (ψatp6-2) in FS4401. In terms of genome structure, we found eighteen syntenic blocks in the two mitochondrial genomes, which have been rearranged in each genome. By contrast, sequences between syntenic blocks, which were specific to each line, accounted for 30,380 and 17,847 bp in FS4401 and Jeju, respectively. The previously-reported CMS candidate genes, orf507 and ψatp6-2, were located on the edges of the largest sequence segments that were specific to FS4401. In this region, large number of small sequence segments which were absent or found on different locations in Jeju mitochondrial genome were combined together. The incorporation of repeats and overlapping of connected sequence segments by a few nucleotides implied that extensive rearrangements by homologous recombination might be involved in evolution of this region. Further analysis using mtDNA pairs from other plant species revealed common features of DNA regions around CMS-associated genes. Although large portion of sequence context was shared by mitochondrial genomes of CMS and male-fertile pepper lines, extensive genome rearrangements were detected. CMS candidate genes located on the edges of highly-rearranged CMS-specific DNA regions and near to repeat sequences. These characteristics were detected among CMS-associated genes in other species, implying a common mechanism might be involved in the evolution of CMS-associated genes.
Autosomal-dominant Leber Congenital Amaurosis Caused by a Heterozygous CRX Mutation in a Father and Son.

PubMed

Arcot Sadagopan, Karthikeyan; Battista, Robert; Keep, Rosanne B; Capasso, Jenina E; Levin, Alex V

2015-06-01

Leber congenital amaurosis (LCA) is most often an autosomal recessive disorder. We report a father and son with autosomal dominant LCA due to a mutation in the CRX gene. DNA screening using an allele specific assay of 90 of the most common LCA-causing variations in the coding sequences of AIPL1, CEP290, CRB1, CRX, GUCY2D, RDH12 and RPE65 was performed on the father. Automated DNA sequencing of his son examining exon 3 of the CRX gene was subsequently performed. Both father and son have a heterozygous single base pair deletion of an adenine at codon 153 in the coding sequence of the CRX gene resulting in a frameshift mutation. Mutations involving the CRX gene may demonstrate an autosomal dominant inheritance pattern for LCA.
Computational analysis of gene-gene interactions using multifactor dimensionality reduction.

PubMed

Moore, Jason H

2004-11-01

Understanding the relationship between DNA sequence variations and biologic traits is expected to improve the diagnosis, prevention and treatment of common human diseases. Success in characterizing genetic architecture will depend on our ability to address nonlinearities in the genotype-to-phenotype mapping relationship as a result of gene-gene interactions, or epistasis. This review addresses the challenges associated with the detection and characterization of epistasis. A novel strategy known as multifactor dimensionality reduction that was specifically designed for the identification of multilocus genetic effects is presented. Several case studies that demonstrate the detection of gene-gene interactions in common diseases such as atrial fibrillation, Type II diabetes and essential hypertension are also discussed.
Characterizing differential gene expression in polyploid grasses lacking a reference transcriptome

USDA-ARS?s Scientific Manuscript database

Basal transcriptome characterization and differential gene expression in response to varying conditions are often addressed through next generation sequencing (NGS) and data analysis techniques. While these strategies are commonly used, there are countless tools, pipelines, data analysis methods an...
Implication of common and disease specific variants in CLU, CR1, and PICALM.

PubMed

Ferrari, Raffaele; Moreno, Jorge H; Minhajuddin, Abu T; O'Bryant, Sid E; Reisch, Joan S; Barber, Robert C; Momeni, Parastoo

2012-08-01

Two recent genome-wide association studies (GWAS) for late onset Alzheimer's disease (LOAD) revealed 3 new genes: clusterin (CLU), phosphatidylinositol binding clathrin assembly protein (PICALM), and complement receptor 1 (CR1). In order to evaluate association with these genome-wide association study-identified genes and to isolate the variants contributing to the pathogenesis of LOAD, we genotyped the top single nucleotide polymorphisms (SNPs), rs11136000 (CLU), rs3818361 (CR1), and rs3851179 (PICALM), and sequenced the entire coding regions of these genes in our cohort of 342 LOAD patients and 277 control subjects. We confirmed the association of rs3851179 (PICALM) (p = 7.4 × 10(-3)) with the disease status. Through sequencing we identified 18 variants in CLU, 3 of which were found exclusively in patients; 8 variants (out of 65) in CR1 gene were only found in patients and the 16 variants identified in PICALM gene were present in both patients and controls. In silico analysis of the variants in PICALM did not predict any damaging effect on the protein. The haplotype analysis of the variants in each gene predicted a common haplotype when the 3 single nucleotide polymorphisms rs11136000 (CLU), rs3818361 (CR1), and rs3851179 (PICALM), respectively, were included. For each gene the haplotype structure and size differed between patients and controls. In conclusion, we confirmed association of CLU, CR1, and PICALM genes with the disease status in our cohort through identification of a number of disease-specific variants among patients through the sequencing of the coding region of these genes. Published by Elsevier Inc.
DNA mutation motifs in the genes associated with inherited diseases.

PubMed

Růžička, Michal; Kulhánek, Petr; Radová, Lenka; Čechová, Andrea; Špačková, Naďa; Fajkusová, Lenka; Réblová, Kamila

2017-01-01

Mutations in human genes can be responsible for inherited genetic disorders and cancer. Mutations can arise due to environmental factors or spontaneously. It has been shown that certain DNA sequences are more prone to mutate. These sites are termed hotspots and exhibit a higher mutation frequency than expected by chance. In contrast, DNA sequences with lower mutation frequencies than expected by chance are termed coldspots. Mutation hotspots are usually derived from a mutation spectrum, which reflects particular population where an effect of a common ancestor plays a role. To detect coldspots/hotspots unaffected by population bias, we analysed the presence of germline mutations obtained from HGMD database in the 5-nucleotide segments repeatedly occurring in genes associated with common inherited disorders, in particular, the PAH, LDLR, CFTR, F8, and F9 genes. Statistically significant sequences (mutational motifs) rarely associated with mutations (coldspots) and frequently associated with mutations (hotspots) exhibited characteristic sequence patterns, e.g. coldspots contained purine tract while hotspots showed alternating purine-pyrimidine bases, often with the presence of CpG dinucleotide. Using molecular dynamics simulations and free energy calculations, we analysed the global bending properties of two selected coldspots and two hotspots with a G/T mismatch. We observed that the coldspots were inherently more flexible than the hotspots. We assume that this property might be critical for effective mismatch repair as DNA with a mutation recognized by MutSα protein is noticeably bent.
A candidate gene for choanal atresia in alpaca.

PubMed

Reed, Kent M; Bauer, Miranda M; Mendoza, Kristelle M; Armién, Aníbal G

2010-03-01

Choanal atresia (CA) is a common nasal craniofacial malformation in New World domestic camelids (alpaca and llama). CA results from abnormal development of the nasal passages and is especially debilitating to newborn crias. CA in camelids shares many of the clinical manifestations of a similar condition in humans (CHARGE syndrome). Herein we report on the regulatory gene CHD7 of alpaca, whose homologue in humans is most frequently associated with CHARGE. Sequence of the CHD7 coding region was obtained from a non-affected cria. The complete coding region was 9003 bp, corresponding to a translated amino acid sequence of 3000 aa. Additional genomic sequences corresponding to a significant portion of the CHD7 gene were identified and assembled from the 2x alpaca whole genome sequence, providing confirmatory sequence for much of the CHD7 coding region. The alpaca CHD7 mRNA sequence was 97.9% similar to the human sequence, with the greatest sequence difference being an insertion in exon 38 that results in a polyalanine repeat (A12). Polymorphism in this repeat was tested for association with CA in alpaca by cloning and sequencing the repeat from both affected and non-affected individuals. Variation in length of the poly-A repeat was not associated with CA. Complete sequencing of the CHD7 gene will be necessary to determine whether other mutations in CHD7 are the cause of CA in camelids.
Gene discovery by chemical mutagenesis and whole-genome sequencing in Dictyostelium.

PubMed

Li, Cheng-Lin Frank; Santhanam, Balaji; Webb, Amanda Nicole; Zupan, Blaž; Shaulsky, Gad

2016-09-01

Whole-genome sequencing is a useful approach for identification of chemical-induced lesions, but previous applications involved tedious genetic mapping to pinpoint the causative mutations. We propose that saturation mutagenesis under low mutagenic loads, followed by whole-genome sequencing, should allow direct implication of genes by identifying multiple independent alleles of each relevant gene. We tested the hypothesis by performing three genetic screens with chemical mutagenesis in the social soil amoeba Dictyostelium discoideum Through genome sequencing, we successfully identified mutant genes with multiple alleles in near-saturation screens, including resistance to intense illumination and strong suppressors of defects in an allorecognition pathway. We tested the causality of the mutations by comparison to published data and by direct complementation tests, finding both dominant and recessive causative mutations. Therefore, our strategy provides a cost- and time-efficient approach to gene discovery by integrating chemical mutagenesis and whole-genome sequencing. The method should be applicable to many microbial systems, and it is expected to revolutionize the field of functional genomics in Dictyostelium by greatly expanding the mutation spectrum relative to other common mutagenesis methods. © 2016 Li et al.; Published by Cold Spring Harbor Laboratory Press.
Multi-targeted priming for genome-wide gene expression assays.

PubMed

Adomas, Aleksandra B; Lopez-Giraldez, Francesc; Clark, Travis A; Wang, Zheng; Townsend, Jeffrey P

2010-08-17

Complementary approaches to assaying global gene expression are needed to assess gene expression in regions that are poorly assayed by current methodologies. A key component of nearly all gene expression assays is the reverse transcription of transcribed sequences that has traditionally been performed by priming the poly-A tails on many of the transcribed genes in eukaryotes with oligo-dT, or by priming RNA indiscriminately with random hexamers. We designed an algorithm to find common sequence motifs that were present within most protein-coding genes of Saccharomyces cerevisiae and of Neurospora crassa, but that were not present within their ribosomal RNA or transfer RNA genes. We then experimentally tested whether degenerately priming these motifs with multi-targeted primers improved the accuracy and completeness of transcriptomic assays. We discovered two multi-targeted primers that would prime a preponderance of genes in the genomes of Saccharomyces cerevisiae and Neurospora crassa while avoiding priming ribosomal RNA or transfer RNA. Examining the response of Saccharomyces cerevisiae to nitrogen deficiency and profiling Neurospora crassa early sexual development, we demonstrated that using multi-targeted primers in reverse transcription led to superior performance of microarray profiling and next-generation RNA tag sequencing. Priming with multi-targeted primers in addition to oligo-dT resulted in higher sensitivity, a larger number of well-measured genes and greater power to detect differences in gene expression. Our results provide the most complete and detailed expression profiles of the yeast nitrogen starvation response and N. crassa early sexual development to date. Furthermore, our multi-targeting priming methodology for genome-wide gene expression assays provides selective targeting of multiple sequences and counter-selection against undesirable sequences, facilitating a more complete and precise assay of the transcribed sequences within the genome.
Chromosomal arrangement of leghemoglobin genes in soybean.

PubMed Central

Lee, J S; Brown, G G; Verma, D P

1983-01-01

A cluster of four different leghemoglobin (Lb) genes was isolated from AluI-HaeIII and EcoRI genomic libraries of soybean in a set of overlapping clones which together include 45 kilobases (kb) of contiguous DNA. These four genes, including a pseudogene, are present in the same orientation and are arranged in the order: 5'-Lba-Lbc1-Lb psi-Lbc3-3'. The intergenic regions average 2.5 kb. In addition to this main Lb locus, there are other Lb genes which do not appear to be contiguous to this locus. A sequence probably common to the 3' region of Lb loci was found flanking the Lbc3 gene. The 3' flanking region of the main Lb locus also contains a sequence that appears to be expressed more abundantly in root tissue. Another sequence which is primarily expressed in root and leaf is found 5' to two Lb loci. Overall, the main leghemoglobin locus is similar in structure to the mammalian globin gene loci. Images PMID:6310504
Retroviral insertions in the VISION database identify molecular pathways in mouse lymphoid leukemia and lymphoma

PubMed Central

Weiser, Keith C.; Liu, Bin; Hansen, Gwenn M.; Skapura, Darlene; Hentges, Kathryn E.; Yarlagadda, Sujatha; Morse III, Herbert C.

2007-01-01

AKXD recombinant inbred (RI) strains develop a variety of leukemias and lymphomas due to somatically acquired insertions of retroviral DNA into the genome of hematopoetic cells that can mutate cellular proto-oncogenes and tumor suppressor genes. We generated a new set of tumors from nine AKXD RI strains selected for their propensity to develop B-cell tumors, the most common type of human hematopoietic cancers. We employed a PCR technique called viral insertion site amplification (VISA) to rapidly isolate genomic sequence at the site of provirus insertion. Here we describe 550 VISA sequence tags (VSTs) that identify 74 common insertion sites (CISs), of which 21 have not been identified previously. Several suspected proto-oncogenes and tumor suppressor genes lie near CISs, providing supportive evidence for their roles in cancer. Furthermore, numerous previously uncharacterized genes lie near CISs, providing a pool of candidate disease genes for future research. Pathway analysis of candidate genes identified several signaling pathways as common and powerful routes to blood cancer, including Notch, E-protein, NFκB, and Ras signaling. Misregulation of several Notch signaling genes was confirmed by quantitative RT-PCR. Our data suggest that analyses of insertional mutagenesis on a single genetic background are biased toward the identification of cooperating mutations. This tumor collection represents the most comprehensive study of the genetics of B-cell leukemia and lymphoma development in mice. We have deposited the VST sequences, CISs in a genome viewer, histopathology, and molecular tumor typing data in a public web database called VISION (Viral Insertion Sites Identifying Oncogenes), which is located at http://www.mouse-genome.bcm.tmc.edu/vision. PMID:17926094
Retroviral insertions in the VISION database identify molecular pathways in mouse lymphoid leukemia and lymphoma.

PubMed

Weiser, Keith C; Liu, Bin; Hansen, Gwenn M; Skapura, Darlene; Hentges, Kathryn E; Yarlagadda, Sujatha; Morse Iii, Herbert C; Justice, Monica J

2007-10-01

AKXD recombinant inbred (RI) strains develop a variety of leukemias and lymphomas due to somatically acquired insertions of retroviral DNA into the genome of hematopoetic cells that can mutate cellular proto-oncogenes and tumor suppressor genes. We generated a new set of tumors from nine AKXD RI strains selected for their propensity to develop B-cell tumors, the most common type of human hematopoietic cancers. We employed a PCR technique called viral insertion site amplification (VISA) to rapidly isolate genomic sequence at the site of provirus insertion. Here we describe 550 VISA sequence tags (VSTs) that identify 74 common insertion sites (CISs), of which 21 have not been identified previously. Several suspected proto-oncogenes and tumor suppressor genes lie near CISs, providing supportive evidence for their roles in cancer. Furthermore, numerous previously uncharacterized genes lie near CISs, providing a pool of candidate disease genes for future research. Pathway analysis of candidate genes identified several signaling pathways as common and powerful routes to blood cancer, including Notch, E-protein, NFkappaB, and Ras signaling. Misregulation of several Notch signaling genes was confirmed by quantitative RT-PCR. Our data suggest that analyses of insertional mutagenesis on a single genetic background are biased toward the identification of cooperating mutations. This tumor collection represents the most comprehensive study of the genetics of B-cell leukemia and lymphoma development in mice. We have deposited the VST sequences, CISs in a genome viewer, histopathology, and molecular tumor typing data in a public web database called VISION (Viral Insertion Sites Identifying Oncogenes), which is located at http://www.mouse-genome.bcm.tmc.edu/vision .
SNP discovery and development of genetic markers for mapping innate immune response genes in common carp (Cyprinus carpio).

PubMed

Kongchum, Pawapol; Palti, Yniv; Hallerman, Eric M; Hulata, Gideon; David, Lior

2010-08-01

Single nucleotide polymorphisms (SNPs) in immune response genes have been reported as markers for susceptibility to infectious diseases in human and livestock. A disease caused by cyprinid herpesvirus 3 (CyHV-3) is highly contagious and virulent in common carp (Cyprinus carpio). With the aim to develop molecular tools for breeding CyHV-3-resistant carp, we have amplified and sequenced 11 candidate genes for viral disease resistance including TLR2, TLR3, TLR4ba, TLR7, TLR9, TLR21, TLR22, MyD88, TRAF6, type I IFN and IL-1beta. For each gene, we initially cloned and sequenced PCR amplicons from 8 to 12 fish (2-3 fish per strain) from the SNP discovery panel. We then identified and evaluated putative SNPs for their polymorphisms in the SNP discovery panel and validated their usefulness for linkage analysis in a full-sib family using the SNaPshot method. Our sequencing results and phylogenetic analyses suggested that TLR3, TLR7 and MyD88 genes are duplicated in the common carp genome. We, therefore, developed locus-specific PCR primers and SNP genotyping assays for the duplicated loci. A total of 48 SNP markers were developed from PCR fragments of the 13 loci (7 single-locus and 3 duplicated genes). Thirty-nine markers were polymorphic with estimated minor allele frequencies of more than 0.1. The utility of the SNP markers was evaluated in one full-sib family and revealed that 20 markers from 9 loci segregated in a disomic and Mendelian pattern and would be useful for linkage analysis. Published by Elsevier Ltd.
Rapid bursts of androgen-binding protein (Abp) gene duplication occurred independently in diverse mammals

PubMed Central

2008-01-01

Background The draft mouse (Mus musculus) genome sequence revealed an unexpected proliferation of gene duplicates encoding a family of secretoglobin proteins including the androgen-binding protein (ABP) α, β and γ subunits. Further investigation of 14 α-like (Abpa) and 13 β- or γ-like (Abpbg) undisrupted gene sequences revealed a rich diversity of developmental stage-, sex- and tissue-specific expression. Despite these studies, our understanding of the evolution of this gene family remains incomplete. Questions arise from imperfections in the initial mouse genome assembly and a dearth of information about the gene family structure in other rodents and mammals. Results Here, we interrogate the latest 'finished' mouse (Mus musculus) genome sequence assembly to show that the Abp gene repertoire is, in fact, twice as large as reported previously, with 30 Abpa and 34 Abpbg genes and pseudogenes. All of these have arisen since the last common ancestor with rat (Rattus norvegicus). We then demonstrate, by sequencing homologs from species within the Mus genus, that this burst of gene duplication occurred very recently, within the past seven million years. Finally, we survey Abp orthologs in genomes from across the mammalian clade and show that bursts of Abp gene duplications are not specific to the murid rodents; they also occurred recently in the lagomorph (rabbit, Oryctolagus cuniculus) and ruminant (cattle, Bos taurus) lineages, although not in other mammalian taxa. Conclusion We conclude that Abp genes have undergone repeated bursts of gene duplication and adaptive sequence diversification driven by these genes' participation in chemosensation and/or sexual identification. PMID:18269759
Rapid bursts of androgen-binding protein (Abp) gene duplication occurred independently in diverse mammals.

PubMed

Laukaitis, Christina M; Heger, Andreas; Blakley, Tyler D; Munclinger, Pavel; Ponting, Chris P; Karn, Robert C

2008-02-12

The draft mouse (Mus musculus) genome sequence revealed an unexpected proliferation of gene duplicates encoding a family of secretoglobin proteins including the androgen-binding protein (ABP) alpha, beta and gamma subunits. Further investigation of 14 alpha-like (Abpa) and 13 beta- or gamma-like (Abpbg) undisrupted gene sequences revealed a rich diversity of developmental stage-, sex- and tissue-specific expression. Despite these studies, our understanding of the evolution of this gene family remains incomplete. Questions arise from imperfections in the initial mouse genome assembly and a dearth of information about the gene family structure in other rodents and mammals. Here, we interrogate the latest 'finished' mouse (Mus musculus) genome sequence assembly to show that the Abp gene repertoire is, in fact, twice as large as reported previously, with 30 Abpa and 34 Abpbg genes and pseudogenes. All of these have arisen since the last common ancestor with rat (Rattus norvegicus). We then demonstrate, by sequencing homologs from species within the Mus genus, that this burst of gene duplication occurred very recently, within the past seven million years. Finally, we survey Abp orthologs in genomes from across the mammalian clade and show that bursts of Abp gene duplications are not specific to the murid rodents; they also occurred recently in the lagomorph (rabbit, Oryctolagus cuniculus) and ruminant (cattle, Bos taurus) lineages, although not in other mammalian taxa. We conclude that Abp genes have undergone repeated bursts of gene duplication and adaptive sequence diversification driven by these genes' participation in chemosensation and/or sexual identification.
Localization of migraine susceptibility genes in human brain by single-cell RNA sequencing.

PubMed

Renthal, William

2018-01-01

Background Migraine is a debilitating disorder characterized by severe headaches and associated neurological symptoms. A key challenge to understanding migraine has been the cellular complexity of the human brain and the multiple cell types implicated in its pathophysiology. The present study leverages recent advances in single-cell transcriptomics to localize the specific human brain cell types in which putative migraine susceptibility genes are expressed. Methods The cell-type specific expression of both familial and common migraine-associated genes was determined bioinformatically using data from 2,039 individual human brain cells across two published single-cell RNA sequencing datasets. Enrichment of migraine-associated genes was determined for each brain cell type. Results Analysis of single-brain cell RNA sequencing data from five major subtypes of cells in the human cortex (neurons, oligodendrocytes, astrocytes, microglia, and endothelial cells) indicates that over 40% of known migraine-associated genes are enriched in the expression profiles of a specific brain cell type. Further analysis of neuronal migraine-associated genes demonstrated that approximately 70% were significantly enriched in inhibitory neurons and 30% in excitatory neurons. Conclusions This study takes the next step in understanding the human brain cell types in which putative migraine susceptibility genes are expressed. Both familial and common migraine may arise from dysfunction of discrete cell types within the neurovascular unit, and localization of the affected cell type(s) in an individual patient may provide insight into to their susceptibility to migraine.
The genome sequence of the model ascomycete fungus Podospora anserina.

PubMed

Espagne, Eric; Lespinet, Olivier; Malagnac, Fabienne; Da Silva, Corinne; Jaillon, Olivier; Porcel, Betina M; Couloux, Arnaud; Aury, Jean-Marc; Ségurens, Béatrice; Poulain, Julie; Anthouard, Véronique; Grossetete, Sandrine; Khalili, Hamid; Coppin, Evelyne; Déquard-Chablat, Michelle; Picard, Marguerite; Contamine, Véronique; Arnaise, Sylvie; Bourdais, Anne; Berteaux-Lecellier, Véronique; Gautheret, Daniel; de Vries, Ronald P; Battaglia, Evy; Coutinho, Pedro M; Danchin, Etienne Gj; Henrissat, Bernard; Khoury, Riyad El; Sainsard-Chanet, Annie; Boivin, Antoine; Pinan-Lucarré, Bérangère; Sellem, Carole H; Debuchy, Robert; Wincker, Patrick; Weissenbach, Jean; Silar, Philippe

2008-01-01

The dung-inhabiting ascomycete fungus Podospora anserina is a model used to study various aspects of eukaryotic and fungal biology, such as ageing, prions and sexual development. We present a 10X draft sequence of P. anserina genome, linked to the sequences of a large expressed sequence tag collection. Similar to higher eukaryotes, the P. anserina transcription/splicing machinery generates numerous non-conventional transcripts. Comparison of the P. anserina genome and orthologous gene set with the one of its close relatives, Neurospora crassa, shows that synteny is poorly conserved, the main result of evolution being gene shuffling in the same chromosome. The P. anserina genome contains fewer repeated sequences and has evolved new genes by duplication since its separation from N. crassa, despite the presence of the repeat induced point mutation mechanism that mutates duplicated sequences. We also provide evidence that frequent gene loss took place in the lineages leading to P. anserina and N. crassa. P. anserina contains a large and highly specialized set of genes involved in utilization of natural carbon sources commonly found in its natural biotope. It includes genes potentially involved in lignin degradation and efficient cellulose breakdown. The features of the P. anserina genome indicate a highly dynamic evolution since the divergence of P. anserina and N. crassa, leading to the ability of the former to use specific complex carbon sources that match its needs in its natural biotope.
Identification and subspecific differentiation of Mycobacterium scrofulaceum by automated sequencing of a region of the gene (hsp65) encoding a 65-kilodalton heat shock protein.

PubMed Central

Swanson, D S; Pan, X; Musser, J M

1996-01-01

Mycobacterium scrofulaceum is most commonly recovered from children with cervical lymphadenitis, although it also accounts for approximately 2% of the mycobacterial infections in AIDS patients. Species assignment of M. scrofulaceum isolated by conventional techniques can be difficult and time-consuming. To develop a strategy for rapid species assignment of these organisms, a 360-bp region of the gene (hsp65) encoding a 65-kDa heat shock protein in 37 isolates from diverse sources was sequenced. Eight hsp65 alleles were identified, and these sequences formed phylogenetic clusters and lineages largely distinct from other Mycobacterium species. There was incomplete correlation between serovar designation and hsp65 allele assignment. The hsp65 data correlated strongly with the results of sequence analysis of the gene coding for 16S rRNA. Automated DNA sequencing of a 360-bp region of the hsp65 gene provides a rapid and unambiguous method for species assignment of these acid-fast organisms for diagnostic purposes. PMID:8940463
Isolation and characterization of the chicken trypsinogen gene family.

PubMed Central

Wang, K; Gan, L; Lee, I; Hood, L

1995-01-01

Based on genomic Southern hybridizations and cDNA sequence analyses, the chicken trypsinogen gene family can be divided into two multi-member subfamilies, a six-member trypsinogen I subfamily which encodes the cationic trypsin isoenzymes and a three-member trypsinogen II subfamily which encodes the anionic trypsin isoenzymes. The chicken cDNA and genomic clones containing these two subfamilies were isolated and characterized by DNA sequence analysis. The results indicated that the chicken trypsinogen genes encoded a signal peptide of 15 to 16 amino acid residues, an activation peptide of 9 to 10 residues and a trypsin of 223 amino acid residues. The chicken trypsinogens contain all the common catalytic and structural features for trypsins, including the catalytic triad His, Asp and Ser and the six disulphide bonds. The trypsinogen I and II subfamilies share approximately 70% sequence identity at the nucleotide and amino acid level. The sequence comparison among chicken trypsinogen subfamily members and trypsin sequences from other species suggested that the chicken trypsinogen genes may have evolved in coincidental or concerted fashion. Images Figure 6 Figure 7 PMID:7733885

GIGA: a simple, efficient algorithm for gene tree inference in the genomic age

PubMed Central

2010-01-01

Background Phylogenetic relationships between genes are not only of theoretical interest: they enable us to learn about human genes through the experimental work on their relatives in numerous model organisms from bacteria to fruit flies and mice. Yet the most commonly used computational algorithms for reconstructing gene trees can be inaccurate for numerous reasons, both algorithmic and biological. Additional information beyond gene sequence data has been shown to improve the accuracy of reconstructions, though at great computational cost. Results We describe a simple, fast algorithm for inferring gene phylogenies, which makes use of information that was not available prior to the genomic age: namely, a reliable species tree spanning much of the tree of life, and knowledge of the complete complement of genes in a species' genome. The algorithm, called GIGA, constructs trees agglomeratively from a distance matrix representation of sequences, using simple rules to incorporate this genomic age information. GIGA makes use of a novel conceptualization of gene trees as being composed of orthologous subtrees (containing only speciation events), which are joined by other evolutionary events such as gene duplication or horizontal gene transfer. An important innovation in GIGA is that, at every step in the agglomeration process, the tree is interpreted/reinterpreted in terms of the evolutionary events that created it. Remarkably, GIGA performs well even when using a very simple distance metric (pairwise sequence differences) and no distance averaging over clades during the tree construction process. Conclusions GIGA is efficient, allowing phylogenetic reconstruction of very large gene families and determination of orthologs on a large scale. It is exceptionally robust to adding more gene sequences, opening up the possibility of creating stable identifiers for referring to not only extant genes, but also their common ancestors. We compared trees produced by GIGA to those in the TreeFam database, and they were very similar in general, with most differences likely due to poor alignment quality. However, some remaining differences are algorithmic, and can be explained by the fact that GIGA tends to put a larger emphasis on minimizing gene duplication and deletion events. PMID:20534164
GIGA: a simple, efficient algorithm for gene tree inference in the genomic age.

PubMed

Thomas, Paul D

2010-06-09

Phylogenetic relationships between genes are not only of theoretical interest: they enable us to learn about human genes through the experimental work on their relatives in numerous model organisms from bacteria to fruit flies and mice. Yet the most commonly used computational algorithms for reconstructing gene trees can be inaccurate for numerous reasons, both algorithmic and biological. Additional information beyond gene sequence data has been shown to improve the accuracy of reconstructions, though at great computational cost. We describe a simple, fast algorithm for inferring gene phylogenies, which makes use of information that was not available prior to the genomic age: namely, a reliable species tree spanning much of the tree of life, and knowledge of the complete complement of genes in a species' genome. The algorithm, called GIGA, constructs trees agglomeratively from a distance matrix representation of sequences, using simple rules to incorporate this genomic age information. GIGA makes use of a novel conceptualization of gene trees as being composed of orthologous subtrees (containing only speciation events), which are joined by other evolutionary events such as gene duplication or horizontal gene transfer. An important innovation in GIGA is that, at every step in the agglomeration process, the tree is interpreted/reinterpreted in terms of the evolutionary events that created it. Remarkably, GIGA performs well even when using a very simple distance metric (pairwise sequence differences) and no distance averaging over clades during the tree construction process. GIGA is efficient, allowing phylogenetic reconstruction of very large gene families and determination of orthologs on a large scale. It is exceptionally robust to adding more gene sequences, opening up the possibility of creating stable identifiers for referring to not only extant genes, but also their common ancestors. We compared trees produced by GIGA to those in the TreeFam database, and they were very similar in general, with most differences likely due to poor alignment quality. However, some remaining differences are algorithmic, and can be explained by the fact that GIGA tends to put a larger emphasis on minimizing gene duplication and deletion events.
Clostridium perfringens type A–E toxin plasmids

PubMed Central

Freedman, John C.; Theoret, James R.; Wisniewski, Jessica A.; Uzal, Francisco A.; Rood, Julian I.; McClane, Bruce A.

2014-01-01

Clostridium perfringens relies upon plasmid-encoded toxin genes to cause intestinal infections. These toxin genes are associated with insertion sequences that may facilitate their mobilization and transfer, giving rise to new toxin plasmids with common backbones. Most toxin plasmids carry a transfer of clostridial plasmids locus mediating conjugation, which likely explains the presence of similar toxin plasmids in otherwise unrelated C. perfringens strains. The association of many toxin genes with insertion sequences and conjugative plasmids provides virulence flexibility when causing intestinal infections. However, incompatibility issues apparently limit the number of toxin plasmids maintained by a single cell. PMID:25283728
Nucleotide and amino acid variations of tannase gene from different Aspergillus strains.

PubMed

Borrego-Terrazas, J A; Lara-Victoriano, F; Flores-Gallegos, A C; Veana, F; Aguilar, C N; Rodríguez-Herrera, R

2014-08-01

Tannase is an enzyme that catalyses the hydrolysis of ester bonds present in tannins. Most of the scientific reports about this biocatalysis focus on aspects related to tannase production and its recovery; on the other hand, reports assessing the molecular aspects of the tannase gene or protein are scarce. In the present study, a tannase gene fragment from several Aspergillus strains isolated from the Mexican semidesert was sequenced and compared with tannase amino acid sequences reported in NCBI database using bioinformatics tools. The genetic relationship among the different tannase sequences was also determined. A conserved region of 7 amino acids was found with the conserved motif GXSXG common to esterases, in which the active-site serine residue is located. In addition, in Aspergillus niger strains GH1 and PSH, we found an extra codon in the tannase sequences encoding glycine. The tannase gene belonging to semidesert fungal strains followed a neutral evolution path with the formation of 10 haplotypes, of which A. niger GH1 and PSH haplotypes are the oldest.
Impact of Lateral Transfers on the Genomes of Lepidoptera

PubMed Central

Drezen, Jean-Michel; Josse, Thibaut; Bézier, Annie; Gauthier, Jérémy; Huguet, Elisabeth

2017-01-01

Transfer of DNA sequences between species regardless of their evolutionary distance is very common in bacteria, but evidence that horizontal gene transfer (HGT) also occurs in multicellular organisms has been accumulating in the past few years. The actual extent of this phenomenon is underestimated due to frequent sequence filtering of “alien” DNA before genome assembly. However, recent studies based on genome sequencing have revealed, and experimentally verified, the presence of foreign DNA sequences in the genetic material of several species of Lepidoptera. Large DNA viruses, such as baculoviruses and the symbiotic viruses of parasitic wasps (bracoviruses), have the potential to mediate these transfers in Lepidoptera. In particular, using ultra-deep sequencing, newly integrated transposons have been identified within baculovirus genomes. Bacterial genes have also been acquired by genomes of Lepidoptera, as in other insects and nematodes. In addition, insertions of bracovirus sequences were present in the genomes of certain moth and butterfly lineages, that were likely corresponding to rearrangements of ancient integrations. The viral genes present in these sequences, sometimes of hymenopteran origin, have been co-opted by lepidopteran species to confer some protection against pathogens. PMID:29120392
Differentially expressed genes of Coptotermes formosanus (Isoptera: Rhinotermitidae) challenged by chemical insecticides.

PubMed

Zhang, Yi; Zhao, Yuanyuan; Qiu, Xuehong; Han, Richou

2013-08-01

Coptotermes formosanus Shiraki (Isoptera: Rhinotermitidae) termites are harmful social insects to wood constructions. The current control methods heavily depend on the chemical insecticides with increasing resistance. Analysis of the differentially expressed genes mediated by chemical insecticides will contribute to the understanding of the termite resistance to chemicals and to the establishment of alternative control measures. In the present article, a full-length cDNA library was constructed from the termites induced by a mixture of commonly used insecticides (0.01% sulfluramid and 0.01% triflumuron) for 24 h, by using the RNA ligase-mediated Rapid Amplification cDNA End method. Fifty-eight differentially expressed clones were obtained by polymerase chain reaction and confirmed by dot-blot hybridization. Forty-six known sequences were obtained, which clustered into 33 unique sequences grouped in 6 contigs and 27 singlets. Sixty-seven percent (22) of the sequences had counterpart genes from other organisms, whereas 33% (11) were undescribed. A Gene Ontology analysis classified 33 unique sequences into different functional categories. In general, most of the differential expression genes were involved in binding and catalytic activity.
Sorting by Cuts, Joins, and Whole Chromosome Duplications.

PubMed

Zeira, Ron; Shamir, Ron

2017-02-01

Genome rearrangement problems have been extensively studied due to their importance in biology. Most studied models assumed a single copy per gene. However, in reality, duplicated genes are common, most notably in cancer. In this study, we make a step toward handling duplicated genes by considering a model that allows the atomic operations of cut, join, and whole chromosome duplication. Given two linear genomes, [Formula: see text] with one copy per gene and [Formula: see text] with two copies per gene, we give a linear time algorithm for computing a shortest sequence of operations transforming [Formula: see text] into [Formula: see text] such that all intermediate genomes are linear. We also show that computing an optimal sequence with fewest duplications is NP-hard.
A chromosome inversion near the KIT gene and the Tobiano spotting pattern in horses.

PubMed

Brooks, S A; Lear, T L; Adelson, D L; Bailey, E

2007-01-01

Tobiano is a white spotting pattern in horses caused by a dominant gene, Tobiano(TO). Here, we report TO associated with a large paracentric chromosome inversion on horse chromosome 3. DNA sequences flanking the inversion were identified and a PCR test was developed to detect the inversion. The inversion was only found in horses with the tobiano pattern, including horses with diverse genetic backgrounds, which indicated a common genetic origin thousands of years ago. The inversion does not interrupt any annotated genes, but begins approximately 100 kb downstream of the KIT gene. This inversion may disrupt regulatory sequences for the KIT gene and cause the white spotting pattern. Copyright (c) 2008 S. Karger AG, Basel.
Identification of Alternative Splicing and Fusion Transcripts in Non-Small Cell Lung Cancer by RNA Sequencing.

PubMed

Hong, Yoonki; Kim, Woo Jin; Bang, Chi Young; Lee, Jae Cheol; Oh, Yeon-Mok

2016-04-01

Lung cancer is the most common cause of cancer related death. Alterations in gene sequence, structure, and expression have an important role in the pathogenesis of lung cancer. Fusion genes and alternative splicing of cancer-related genes have the potential to be oncogenic. In the current study, we performed RNA-sequencing (RNA-seq) to investigate potential fusion genes and alternative splicing in non-small cell lung cancer. RNA was isolated from lung tissues obtained from 86 subjects with lung cancer. The RNA samples from lung cancer and normal tissues were processed with RNA-seq using the HiSeq 2000 system. Fusion genes were evaluated using Defuse and ChimeraScan. Candidate fusion transcripts were validated by Sanger sequencing. Alternative splicing was analyzed using multivariate analysis of transcript sequencing and validated using quantitative real time polymerase chain reaction. RNA-seq data identified oncogenic fusion genes EML4-ALK and SLC34A2-ROS1 in three of 86 normal-cancer paired samples. Nine distinct fusion transcripts were selected using DeFuse and ChimeraScan; of which, four fusion transcripts were validated by Sanger sequencing. In 33 squamous cell carcinoma, 29 tumor specific skipped exon events and six mutually exclusive exon events were identified. ITGB4 and PYCR1 were top genes that showed significant tumor specific splice variants. In conclusion, RNA-seq data identified novel potential fusion transcripts and splice variants. Further evaluation of their functional significance in the pathogenesis of lung cancer is required.
Genetic variations in merozoite surface antigen genes of Babesia bovis detected in Vietnamese cattle and water buffaloes.

PubMed

Yokoyama, Naoaki; Sivakumar, Thillaiampalam; Tuvshintulga, Bumduuren; Hayashida, Kyoko; Igarashi, Ikuo; Inoue, Noboru; Long, Phung Thang; Lan, Dinh Thi Bich

2015-03-01

The genes that encode merozoite surface antigens (MSAs) in Babesia bovis are genetically diverse. In this study, we analyzed the genetic diversity of B. bovis MSA-1, MSA-2b, and MSA-2c genes in Vietnamese cattle and water buffaloes. Blood DNA samples from 258 cattle and 49 water buffaloes reared in the Thua Thien Hue province of Vietnam were screened with a B. bovis-specific diagnostic PCR assay. The B. bovis-positive DNA samples (23 cattle and 16 water buffaloes) were then subjected to PCR assays to amplify the MSA-1, MSA-2b, and MSA-2c genes. Sequencing analyses showed that the Vietnamese MSA-1 and MSA-2b sequences are genetically diverse, whereas MSA-2c is relatively conserved. The nucleotide identity values for these MSA gene sequences were similar in the cattle and water buffaloes. Consistent with the sequencing data, the Vietnamese MSA-1 and MSA-2b sequences were dispersed across several clades in the corresponding phylogenetic trees, whereas the MSA-2c sequences occurred in a single clade. Cattle- and water-buffalo-derived sequences also often clustered together on the phylogenetic trees. The Vietnamese MSA-1, MSA-2b, and MSA-2c sequences were then screened for recombination with automated methods. Of the seven recombination events detected, five and two were associated with the MSA-2b and MSA-2c recombinant sequences, respectively, whereas no MSA-1 recombinants were detected among the sequences analyzed. Recombination between the sequences derived from cattle and water buffaloes was very common, and the resultant recombinant sequences were found in both host animals. These data indicate that the genetic diversity of the MSA sequences does not differ between cattle and water buffaloes in Vietnam. They also suggest that recombination between the B. bovis MSA sequences in both cattle and water buffaloes might contribute to the genetic variation in these genes in Vietnam. Copyright © 2015 Elsevier B.V. All rights reserved.
Visualizing conserved gene location across microbe genomes

NASA Astrophysics Data System (ADS)

Shaw, Chris D.

2009-01-01

This paper introduces an analysis-based zoomable visualization technique for displaying the location of genes across many related species of microbes. The purpose of this visualizatiuon is to enable a biologist to examine the layout of genes in the organism of interest with respect to the gene organization of related organisms. During the genomic annotation process, the ability to observe gene organization in common with previously annotated genomes can help a biologist better confirm the structure and function of newly analyzed microbe DNA sequences. We have developed a visualization and analysis tool that enables the biologist to observe and examine gene organization among genomes, in the context of the primary sequence of interest. This paper describes the visualization and analysis steps, and presents a case study using a number of Rickettsia genomes.
Comparative Analysis of Gene Expression for Convergent Evolution of Camera Eye Between Octopus and Human

PubMed Central

Ogura, Atsushi; Ikeo, Kazuho; Gojobori, Takashi

2004-01-01

Although the camera eye of the octopus is very similar to that of humans, phylogenetic and embryological analyses have suggested that their camera eyes have been acquired independently. It has been known as a typical example of convergent evolution. To study the molecular basis of convergent evolution of camera eyes, we conducted a comparative analysis of gene expression in octopus and human camera eyes. We sequenced 16,432 ESTs of the octopus eye, leading to 1052 nonredundant genes that have matches in the protein database. Comparing these 1052 genes with 13,303 already-known ESTs of the human eye, 729 (69.3%) genes were commonly expressed between the human and octopus eyes. On the contrary, when we compared octopus eye ESTs with human connective tissue ESTs, the expression similarity was quite low. To trace the evolutionary changes that are potentially responsible for camera eye formation, we also compared octopus-eye ESTs with the completed genome sequences of other organisms. We found that 1019 out of the 1052 genes had already existed at the common ancestor of bilateria, and 875 genes were conserved between humans and octopuses. It suggests that a larger number of conserved genes and their similar gene expression may be responsible for the convergent evolution of the camera eye. PMID:15289475
No novel, high penetrant gene might remain to be found in Japanese patients with unknown MODY.

PubMed

Horikawa, Yukio; Hosomichi, Kazuyoshi; Enya, Mayumi; Ishiura, Hiroyuki; Suzuki, Yutaka; Tsuji, Shoji; Sugano, Sumio; Inoue, Ituro; Takeda, Jun

2018-07-01

MODY 5 and 6 have been shown to be low-penetrant MODYs. As the genetic background of unknown MODY is assumed to be similar, a new analytical strategy is applied here to elucidate genetic predispositions to unknown MODY. We examined to find whether there are major MODY gene loci remaining to be identified using SNP linkage analysis in Japanese. Whole-exome sequencing was performed with seven families with typical MODY. Candidates for novel MODY genes were examined combined with in silico network analysis. Some peaks were found only in either parametric or non-parametric analysis; however, none of these peaks showed a LOD score greater than 3.7, which is approved to be the significance threshold of evidence for linkage. Exome sequencing revealed that three mutated genes were common among 3 families and 42 mutated genes were common in two families. Only one of these genes, MYO5A, having rare amino acid mutations p.R849Q and p.V1601G, was involved in the biological network of known MODY genes through the intermediary of the INS. Although only one promising candidate gene, MYO5A, was identified, no novel, high penetrant MODY genes might remain to be found in Japanese MODY.
[Etiological and molecular characteristics of diarrhea caused Proteus mirabilis].

PubMed

Shi, Xiaolu; Hu, Qinghua; Lin, Yiman; Qiu, Yaqun; Li, Yinghui; Jiang, Min; Chen, Qiongcheng

2014-06-01

To analyze the etiological characteristics, virulence genes and plasmids that carrying diarrhea-causing Proteus mirabilis and to assess their relationship with drug resistance and pathogenicity. Proteus mirabilis coming from six different sources (food poisoning, external environment and healthy people) were analyzed biochemically, on related susceptibility and pulsed-field gel electrophoresis (PFGE). Virulence genes were detected by PCR. Plasmids were extracted and sequenced after gel electrophoresis purification. The biochemical characteristics of Proteus mirabilis from different sources seemed basically the same, and each of them showed having common virulence genes, as ureC, rsmA, hpmA and zapA. However, the PFGE patterns and susceptibility of these strains were different, so as the plasmids that they carried. Plasmid that presented in the sequenced strain showed that the 2 683 bp length plasmid encodes qnrD gene was associated with the quinolone resistance. Etiological characteristics and molecular characteristics of Proteus mirabilis gathered from different sources, were analyzed. Results indicated that traditional biochemical analysis and common virulence gene identification might be able to distinguish the strains with different sources. However, PFGE and plasmids analysis could distinguish the sources of strains and to identify those plasmids that commonly carried by the drug-resistant strains. These findings also provided theoretical basis for further study on the nature of resistance and pathogenicity in Proteus mirabilis.
Structure and expression of the attacin genes in Hyalophora cecropia.

PubMed

Sun, S C; Lindström, I; Lee, J Y; Faye, I

1991-02-26

To study the regulation of the immune genes in insects, we have cloned and sequenced the attacin gene locus of the giant silk moth Hyalophora cecropia. The locus contains one acidic and one basic attacin gene as well as two pseudogenes, which are remnants of basic attacin genes. A small insertion element was found within the locus. The two functional attacin genes are transcribed in opposite directions and have two introns inserted at homologous positions. A common sequence, GGGGATTCCT, is found at nucleotide position -48 in the acidic gene and at nucleotide position -58 in the basic gene. Interestingly, this decanucleotide is similar to the consensus of the NF-k B-binding site. Expression studies revealed that both attacins are strongly induced by phorbol 12-myristate 13-acetate, lipopolysaccharide and bacteria. However, only the acidic attacin gene showed a clear response to injury.
UniGene Tabulator: a full parser for the UniGene format.

PubMed

Lenzi, Luca; Frabetti, Flavia; Facchin, Federica; Casadei, Raffaella; Vitale, Lorenza; Canaider, Silvia; Carinci, Paolo; Zannotti, Maria; Strippoli, Pierluigi

2006-10-15

UniGene Tabulator 1.0 provides a solution for full parsing of UniGene flat file format; it implements a structured graphical representation of each data field present in UniGene following import into a common database managing system usable in a personal computer. This database includes related tables for sequence, protein similarity, sequence-tagged site (STS) and transcript map interval (TXMAP) data, plus a summary table where each record represents a UniGene cluster. UniGene Tabulator enables full local management of UniGene data, allowing parsing, querying, indexing, retrieving, exporting and analysis of UniGene data in a relational database form, usable on Macintosh (OS X 10.3.9 or later) and Windows (2000, with service pack 4, XP, with service pack 2 or later) operating systems-based computers. The current release, including both the FileMaker runtime applications, is freely available at http://apollo11.isto.unibo.it/software/
The mitochondrial genome of Frankliniella intonsa: insights into the evolution of mitochondrial genomes at lower taxonomic levels in Thysanoptera.

PubMed

Yan, Dankan; Tang, Yunxia; Hu, Min; Liu, Fengquan; Zhang, Dongfang; Fan, Jiaqin

2014-10-01

Thrips is an ideal group for studying the evolution of mitochondrial (mt) genomes in the genus and family due to independent rearrangements within this order. The complete sequence of the mitochondrial DNA (mtDNA) of the flower thrips Frankliniella intonsa has been completed and annotated in this study. The circular genome is 15,215bp in length with an A+T content of 75.9% and contains the typical 37 genes and it has triplicate putative control regions. Nucleotide composition is A+T biased, and the majority of the protein-coding genes present opposite CG skew which is reflected by the nucleotide composition, codon and amino acid usage. Although the known thrips have massive gene rearrangements, it showed no reversal of strand asymmetry. Gene rearrangements have been found in the lower taxonomic levels of thrips. Three tRNA genes were translocated in the genus Frankliniella and eight tRNA genes in the family Thripidae. Although the gene arrangements of mt genomes of all three thrips species differ massively from the ancestral insect, they are all very similar to each other, indicating that there was a large rearrangement somewhere before the most recent common ancestor of these three species and very little genomic evolution or rearrangements after then. The extremely similar sequences among the CRs suggest that they are ongoing concerted evolution. Analyses of the up and downstream sequence of CRs reveal that the CR2 is actually the ancestral CR. The three CRs are in the same spot in each of the three thrips mt genomes which have the identical inverted genes. These characteristics might be obtained from the most recent common ancestor of this three thrips. Above observations suggest that the mt genomes of the three thrips keep a single massive rearrangement from the common ancestor and have low evolutionary rates among them. Copyright © 2014 Elsevier Inc. All rights reserved.
Common fragile sites (CFS) and extremely large CFS genes are targets for human papillomavirus integrations and chromosome rearrangements in oropharyngeal squamous cell carcinoma.

PubMed

Gao, Ge; Johnson, Sarah H; Vasmatzis, George; Pauley, Christina E; Tombers, Nicole M; Kasperbauer, Jan L; Smith, David I

2017-01-01

Common fragile sites (CFS) are chromosome regions that are prone to form gaps or breaks in response to DNA replication stress. They are often found as hotspots for sister chromatid exchanges, deletions, and amplifications in different cancers. Many of the CFS regions are found to span genes whose genomic sequence is greater than 1 Mb, some of which have been demonstrated to function as important tumor suppressors. CFS regions are also hotspots for human papillomavirus (HPV) integrations in cervical cancer. We used mate-pair sequencing to examine HPV integration events and chromosomal structural variations in 34 oropharyngeal squamous cell carcinoma (OPSCC). We used endpoint PCR and Sanger sequencing to validate each HPV integration event and found HPV integrations preferentially occurred within CFS regions similar to what is observed in cervical cancer. We also found that many of the chromosomal alterations detected also occurred at or near the cytogenetic location of CFSs. Several large genes were also found to be recurrent targets of rearrangements, independent of HPV integrations, including CSMD1 (2.1Mb), LRP1B (1.9Mb), and LARGE1 (0.7Mb). Sanger sequencing revealed that the nucleotide sequences near to identified junction sites contained repetitive and AT-rich sequences that were shown to have the potential to form stem-loop DNA secondary structures that might stall DNA replication fork progression during replication stress. This could then cause increased instability in these regions which could lead to cancer development in human cells. Our findings suggest that CFSs and some specific large genes appear to play important roles in OPSCC. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.
Rapid evolutionary change of common bean (Phaseolus vulgaris L) plastome, and the genomic diversification of legume chloroplasts

PubMed Central

Guo, Xianwu; Castillo-Ramírez, Santiago; González, Víctor; Bustos, Patricia; Luís Fernández-Vázquez, José; Santamaría, Rosa Isela; Arellano, Jesús; Cevallos, Miguel A; Dávila, Guillermo

2007-01-01

Background Fabaceae (legumes) is one of the largest families of flowering plants, and some members are important crops. In contrast to what we know about their great diversity or economic importance, our knowledge at the genomic level of chloroplast genomes (cpDNAs or plastomes) for these crops is limited. Results We sequenced the complete genome of the common bean (Phaseolus vulgaris cv. Negro Jamapa) chloroplast. The plastome of P. vulgaris is a 150,285 bp circular molecule. It has gene content similar to that of other legume plastomes, but contains two pseudogenes, rpl33 and rps16. A distinct inversion occurred at the junction points of trnH-GUG/rpl14 and rps19/rps8, as in adzuki bean [1]. These two pseudogenes and the inversion were confirmed in 10 varieties representing the two domestication centers of the bean. Genomic comparative analysis indicated that inversions generally occur in legume plastomes and the magnitude and localization of insertions/deletions (indels) also vary. The analysis of repeat sequences demonstrated that patterns and sequences of tandem repeats had an important impact on sequence diversification between legume plastomes and tandem repeats did not belong to dispersed repeats. Interestingly, P. vulgaris plastome had higher evolutionary rates of change on both genomic and gene levels than G. max, which could be the consequence of pressure from both mutation and natural selection. Conclusion Legume chloroplast genomes are widely diversified in gene content, gene order, indel structure, abundance and localization of repetitive sequences, intracellular sequence exchange and evolutionary rates. The P. vulgaris plastome is a rapidly evolving genome. PMID:17623083
Novel organization of the common nodulation genes in Rhizobium leguminosarum bv. phaseoli strains.

PubMed Central

Vázquez, M; Dávalos, A; de las Peñas, A; Sánchez, F; Quinto, C

1991-01-01

Nodulation by Rhizobium, Bradyrhizobium, and Azorhizobium species in the roots of legumes and nonlegumes requires the proper expression of plant genes and of both common and specific bacterial nodulation genes. The common nodABC genes form an operon or are physically mapped together in all species studied thus far. Rhizobium leguminosarum bv. phaseoli strains are classified in two groups. The type I group has reiterated nifHDK genes and a narrow host range of nodulation. The type II group has a single copy of the nifHDK genes and a wide host range of nodulation. We have found by genetic and nucleotide sequence analysis that in type I strain CE-3, the functional common nodA gene is separated from the nodBC genes by 20 kb and thus is transcriptionally separated from the latter genes. This novel organization could be the result of a complex rearrangement, as we found zones of identity between the two separated nodA and nodBC regions. Moreover, this novel organization of the common nodABC genes seems to be a general characteristic of R. leguminosarum bv. phaseoli type I strains. Despite the separation, the coordination of the expression of these genes seems not to be altered. PMID:1991718

Assessment of clinical analytical sensitivity and specificity of next-generation sequencing for detection of simple and complex mutations.

PubMed

Chin, Ephrem L H; da Silva, Cristina; Hegde, Madhuri

2013-02-19

Detecting mutations in disease genes by full gene sequence analysis is common in clinical diagnostic laboratories. Sanger dideoxy terminator sequencing allows for rapid development and implementation of sequencing assays in the clinical laboratory, but it has limited throughput, and due to cost constraints, only allows analysis of one or at most a few genes in a patient. Next-generation sequencing (NGS), on the other hand, has evolved rapidly, although to date it has mainly been used for large-scale genome sequencing projects and is beginning to be used in the clinical diagnostic testing. One advantage of NGS is that many genes can be analyzed easily at the same time, allowing for mutation detection when there are many possible causative genes for a specific phenotype. In addition, regions of a gene typically not tested for mutations, like deep intronic and promoter mutations, can also be detected. Here we use 20 previously characterized Sanger-sequenced positive controls in disease-causing genes to demonstrate the utility of NGS in a clinical setting using standard PCR based amplification to assess the analytical sensitivity and specificity of the technology for detecting all previously characterized changes (mutations and benign SNPs). The positive controls chosen for validation range from simple substitution mutations to complex deletion and insertion mutations occurring in autosomal dominant and recessive disorders. The NGS data was 100% concordant with the Sanger sequencing data identifying all 119 previously identified changes in the 20 samples. We have demonstrated that NGS technology is ready to be deployed in clinical laboratories. However, NGS and associated technologies are evolving, and clinical laboratories will need to invest significantly in staff and infrastructure to build the necessary foundation for success.
Gene Loss and Lineage-Specific Restriction-Modification Systems Associated with Niche Differentiation in the Campylobacter jejuni Sequence Type 403 Clonal Complex

PubMed Central

Morley, Laura; McNally, Alan; Paszkiewicz, Konrad; Corander, Jukka; Méric, Guillaume; Sheppard, Samuel K.; Blom, Jochen

2015-01-01

Campylobacter jejuni is a highly diverse species of bacteria commonly associated with infectious intestinal disease of humans and zoonotic carriage in poultry, cattle, pigs, and other animals. The species contains a large number of distinct clonal complexes that vary from host generalist lineages commonly found in poultry, livestock, and human disease cases to host-adapted specialized lineages primarily associated with livestock or poultry. Here, we present novel data on the ST403 clonal complex of C. jejuni, a lineage that has not been reported in avian hosts. Our data show that the lineage exhibits a distinctive pattern of intralineage recombination that is accompanied by the presence of lineage-specific restriction-modification systems. Furthermore, we show that the ST403 complex has undergone gene decay at a number of loci. Our data provide a putative link between the lack of association with avian hosts of C. jejuni ST403 and both gene gain and gene loss through nonsense mutations in coding sequences of genes, resulting in pseudogene formation. PMID:25795671
VIZARD: analysis of Affymetrix Arabidopsis GeneChip data

NASA Technical Reports Server (NTRS)

Moseyko, Nick; Feldman, Lewis J.

2002-01-01

SUMMARY: The Affymetrix GeneChip Arabidopsis genome array has proved to be a very powerful tool for the analysis of gene expression in Arabidopsis thaliana, the most commonly studied plant model organism. VIZARD is a Java program created at the University of California, Berkeley, to facilitate analysis of Arabidopsis GeneChip data. It includes several integrated tools for filtering, sorting, clustering and visualization of gene expression data as well as tools for the discovery of regulatory motifs in upstream sequences. VIZARD also includes annotation and upstream sequence databases for the majority of genes represented on the Affymetrix Arabidopsis GeneChip array. AVAILABILITY: VIZARD is available free of charge for educational, research, and not-for-profit purposes, and can be downloaded at http://www.anm.f2s.com/research/vizard/ CONTACT: moseyko@uclink4.berkeley.edu.
An Evolutionary Network of Genes Present in the Eukaryote Common Ancestor Polls Genomes on Eukaryotic and Mitochondrial Origin

PubMed Central

Thiergart, Thorsten; Landan, Giddy; Schenk, Marc; Dagan, Tal; Martin, William F.

2012-01-01

To test the predictions of competing and mutually exclusive hypotheses for the origin of eukaryotes, we identified from a sample of 27 sequenced eukaryotic and 994 sequenced prokaryotic genomes 571 genes that were present in the eukaryote common ancestor and that have homologues among eubacterial and archaebacterial genomes. Maximum-likelihood trees identified the prokaryotic genomes that most frequently contained genes branching as the sister to the eukaryotic nuclear homologues. Among the archaebacteria, euryarchaeote genomes most frequently harbored the sister to the eukaryotic nuclear gene, whereas among eubacteria, the α-proteobacteria were most frequently represented within the sister group. Only 3 genes out of 571 gave a 3-domain tree. Homologues from α-proteobacterial genomes that branched as the sister to nuclear genes were found more frequently in genomes of facultatively anaerobic members of the rhiozobiales and rhodospirilliales than in obligate intracellular ricketttsial parasites. Following α-proteobacteria, the most frequent eubacterial sister lineages were γ-proteobacteria, δ-proteobacteria, and firmicutes, which were also the prokaryote genomes least frequently found as monophyletic groups in our trees. Although all 22 higher prokaryotic taxa sampled (crenarchaeotes, γ-proteobacteria, spirochaetes, chlamydias, etc.) harbor genes that branch as the sister to homologues present in the eukaryotic common ancestor, that is not evidence of 22 different prokaryotic cells participating at eukaryote origins because prokaryotic “lineages” have laterally acquired genes for more than 1.5 billion years since eukaryote origins. The data underscore the archaebacterial (host) nature of the eukaryotic informational genes and the eubacterial (mitochondrial) nature of eukaryotic energy metabolism. The network linking genes of the eukaryote ancestor to contemporary homologues distributed across prokaryotic genomes elucidates eukaryote gene origins in a dialect cognizant of gene transfer in nature. PMID:22355196
Novel chaperonins are prevalent in the virioplankton and demonstrate links to viral biology and ecology

PubMed Central

Marine, Rachel L; Nasko, Daniel J; Wray, Jeffrey; Polson, Shawn W; Wommack, K Eric

2017-01-01

Chaperonins are protein-folding machinery found in all cellular life. Chaperonin genes have been documented within a few viruses, yet, surprisingly, analysis of metagenome sequence data indicated that chaperonin-carrying viruses are common and geographically widespread in marine ecosystems. Also unexpected was the discovery of viral chaperonin sequences related to thermosome proteins of archaea, indicating the presence of virioplankton populations infecting marine archaeal hosts. Virioplankton large subunit chaperonin sequences (GroELs) were divergent from bacterial sequences, indicating that viruses have carried this gene over long evolutionary time. Analysis of viral metagenome contigs indicated that: the order of large and small subunit genes was linked to the phylogeny of GroEL; both lytic and temperate phages may carry group I chaperonin genes; and viruses carrying a GroEL gene likely have large double-stranded DNA (dsDNA) genomes (>70 kb). Given these connections, it is likely that chaperonins are critical to the biology and ecology of virioplankton populations that carry these genes. Moreover, these discoveries raise the intriguing possibility that viral chaperonins may more broadly alter the structure and function of viral and cellular proteins in infected host cells. PMID:28731469
Novel chaperonins are prevalent in the virioplankton and demonstrate links to viral biology and ecology.

PubMed

Marine, Rachel L; Nasko, Daniel J; Wray, Jeffrey; Polson, Shawn W; Wommack, K Eric

2017-11-01

Chaperonins are protein-folding machinery found in all cellular life. Chaperonin genes have been documented within a few viruses, yet, surprisingly, analysis of metagenome sequence data indicated that chaperonin-carrying viruses are common and geographically widespread in marine ecosystems. Also unexpected was the discovery of viral chaperonin sequences related to thermosome proteins of archaea, indicating the presence of virioplankton populations infecting marine archaeal hosts. Virioplankton large subunit chaperonin sequences (GroELs) were divergent from bacterial sequences, indicating that viruses have carried this gene over long evolutionary time. Analysis of viral metagenome contigs indicated that: the order of large and small subunit genes was linked to the phylogeny of GroEL; both lytic and temperate phages may carry group I chaperonin genes; and viruses carrying a GroEL gene likely have large double-stranded DNA (dsDNA) genomes (>70 kb). Given these connections, it is likely that chaperonins are critical to the biology and ecology of virioplankton populations that carry these genes. Moreover, these discoveries raise the intriguing possibility that viral chaperonins may more broadly alter the structure and function of viral and cellular proteins in infected host cells.
The house spider genome reveals an ancient whole-genome duplication during arachnid evolution.

PubMed

Schwager, Evelyn E; Sharma, Prashant P; Clarke, Thomas; Leite, Daniel J; Wierschin, Torsten; Pechmann, Matthias; Akiyama-Oda, Yasuko; Esposito, Lauren; Bechsgaard, Jesper; Bilde, Trine; Buffry, Alexandra D; Chao, Hsu; Dinh, Huyen; Doddapaneni, HarshaVardhan; Dugan, Shannon; Eibner, Cornelius; Extavour, Cassandra G; Funch, Peter; Garb, Jessica; Gonzalez, Luis B; Gonzalez, Vanessa L; Griffiths-Jones, Sam; Han, Yi; Hayashi, Cheryl; Hilbrant, Maarten; Hughes, Daniel S T; Janssen, Ralf; Lee, Sandra L; Maeso, Ignacio; Murali, Shwetha C; Muzny, Donna M; Nunes da Fonseca, Rodrigo; Paese, Christian L B; Qu, Jiaxin; Ronshaugen, Matthew; Schomburg, Christoph; Schönauer, Anna; Stollewerk, Angelika; Torres-Oliva, Montserrat; Turetzek, Natascha; Vanthournout, Bram; Werren, John H; Wolff, Carsten; Worley, Kim C; Bucher, Gregor; Gibbs, Richard A; Coddington, Jonathan; Oda, Hiroki; Stanke, Mario; Ayoub, Nadia A; Prpic, Nikola-Michael; Flot, Jean-François; Posnien, Nico; Richards, Stephen; McGregor, Alistair P

2017-07-31

The duplication of genes can occur through various mechanisms and is thought to make a major contribution to the evolutionary diversification of organisms. There is increasing evidence for a large-scale duplication of genes in some chelicerate lineages including two rounds of whole genome duplication (WGD) in horseshoe crabs. To investigate this further, we sequenced and analyzed the genome of the common house spider Parasteatoda tepidariorum. We found pervasive duplication of both coding and non-coding genes in this spider, including two clusters of Hox genes. Analysis of synteny conservation across the P. tepidariorum genome suggests that there has been an ancient WGD in spiders. Comparison with the genomes of other chelicerates, including that of the newly sequenced bark scorpion Centruroides sculpturatus, suggests that this event occurred in the common ancestor of spiders and scorpions, and is probably independent of the WGDs in horseshoe crabs. Furthermore, characterization of the sequence and expression of the Hox paralogs in P. tepidariorum suggests that many have been subject to neo-functionalization and/or sub-functionalization since their duplication. Our results reveal that spiders and scorpions are likely the descendants of a polyploid ancestor that lived more than 450 MYA. Given the extensive morphological diversity and ecological adaptations found among these animals, rivaling those of vertebrates, our study of the ancient WGD event in Arachnopulmonata provides a new comparative platform to explore common and divergent evolutionary outcomes of polyploidization events across eukaryotes.
The Essential Genome of Escherichia coli K-12

PubMed Central

2018-01-01

ABSTRACT Transposon-directed insertion site sequencing (TraDIS) is a high-throughput method coupling transposon mutagenesis with short-fragment DNA sequencing. It is commonly used to identify essential genes. Single gene deletion libraries are considered the gold standard for identifying essential genes. Currently, the TraDIS method has not been benchmarked against such libraries, and therefore, it remains unclear whether the two methodologies are comparable. To address this, a high-density transposon library was constructed in Escherichia coli K-12. Essential genes predicted from sequencing of this library were compared to existing essential gene databases. To decrease false-positive identification of essential genes, statistical data analysis included corrections for both gene length and genome length. Through this analysis, new essential genes and genes previously incorrectly designated essential were identified. We show that manual analysis of TraDIS data reveals novel features that would not have been detected by statistical analysis alone. Examples include short essential regions within genes, orientation-dependent effects, and fine-resolution identification of genome and protein features. Recognition of these insertion profiles in transposon mutagenesis data sets will assist genome annotation of less well characterized genomes and provides new insights into bacterial physiology and biochemistry. PMID:29463657
Automated DNA mutation detection using universal conditions direct sequencing: application to ten muscular dystrophy genes

PubMed Central

2009-01-01

Background One of the most common and efficient methods for detecting mutations in genes is PCR amplification followed by direct sequencing. Until recently, the process of designing PCR assays has been to focus on individual assay parameters rather than concentrating on matching conditions for a set of assays. Primers for each individual assay were selected based on location and sequence concerns. The two primer sequences were then iteratively adjusted to make the individual assays work properly. This generally resulted in groups of assays with different annealing temperatures that required the use of multiple thermal cyclers or multiple passes in a single thermal cycler making diagnostic testing time-consuming, laborious and expensive. These factors have severely hampered diagnostic testing services, leaving many families without an answer for the exact cause of a familial genetic disease. A search of GeneTests for sequencing analysis of the entire coding sequence for genes that are known to cause muscular dystrophies returns only a small list of laboratories that perform comprehensive gene panels. The hypothesis for the study was that a complete set of universal assays can be designed to amplify and sequence any gene or family of genes using computer aided design tools. If true, this would allow automation and optimization of the mutation detection process resulting in reduced cost and increased throughput. Results An automated process has been developed for the detection of deletions, duplications/insertions and point mutations in any gene or family of genes and has been applied to ten genes known to bear mutations that cause muscular dystrophy: DMD; CAV3; CAPN3; FKRP; TRIM32; LMNA; SGCA; SGCB; SGCG; SGCD. Using this process, mutations have been found in five DMD patients and four LGMD patients (one in the FKRP gene, one in the CAV3 gene, and two likely causative heterozygous pairs of variations in the CAPN3 gene of two other patients). Methods and assay sequences are reported in this paper. Conclusion This automated process allows laboratories to discover DNA variations in a short time and at low cost. PMID:19835634
Automated DNA mutation detection using universal conditions direct sequencing: application to ten muscular dystrophy genes.

PubMed

Bennett, Richard R; Schneider, Hal E; Estrella, Elicia; Burgess, Stephanie; Cheng, Andrew S; Barrett, Caitlin; Lip, Va; Lai, Poh San; Shen, Yiping; Wu, Bai-Lin; Darras, Basil T; Beggs, Alan H; Kunkel, Louis M

2009-10-18

One of the most common and efficient methods for detecting mutations in genes is PCR amplification followed by direct sequencing. Until recently, the process of designing PCR assays has been to focus on individual assay parameters rather than concentrating on matching conditions for a set of assays. Primers for each individual assay were selected based on location and sequence concerns. The two primer sequences were then iteratively adjusted to make the individual assays work properly. This generally resulted in groups of assays with different annealing temperatures that required the use of multiple thermal cyclers or multiple passes in a single thermal cycler making diagnostic testing time-consuming, laborious and expensive.These factors have severely hampered diagnostic testing services, leaving many families without an answer for the exact cause of a familial genetic disease. A search of GeneTests for sequencing analysis of the entire coding sequence for genes that are known to cause muscular dystrophies returns only a small list of laboratories that perform comprehensive gene panels.The hypothesis for the study was that a complete set of universal assays can be designed to amplify and sequence any gene or family of genes using computer aided design tools. If true, this would allow automation and optimization of the mutation detection process resulting in reduced cost and increased throughput. An automated process has been developed for the detection of deletions, duplications/insertions and point mutations in any gene or family of genes and has been applied to ten genes known to bear mutations that cause muscular dystrophy: DMD; CAV3; CAPN3; FKRP; TRIM32; LMNA; SGCA; SGCB; SGCG; SGCD. Using this process, mutations have been found in five DMD patients and four LGMD patients (one in the FKRP gene, one in the CAV3 gene, and two likely causative heterozygous pairs of variations in the CAPN3 gene of two other patients). Methods and assay sequences are reported in this paper. This automated process allows laboratories to discover DNA variations in a short time and at low cost.
Development of phoH as a Novel Signature Gene for Assessing Marine Phage Diversity▿

PubMed Central

Goldsmith, Dawn B.; Crosti, Giuseppe; Dwivedi, Bhakti; McDaniel, Lauren D.; Varsani, Arvind; Suttle, Curtis A.; Weinbauer, Markus G.; Sandaa, Ruth-Anne; Breitbart, Mya

2011-01-01

Phages play a key role in the marine environment by regulating the transfer of energy between trophic levels and influencing global carbon and nutrient cycles. The diversity of marine phage communities remains difficult to characterize because of the lack of a signature gene common to all phages. Recent studies have demonstrated the presence of host-derived auxiliary metabolic genes in phage genomes, such as those belonging to the Pho regulon, which regulates phosphate uptake and metabolism under low-phosphate conditions. Among the completely sequenced phage genomes in GenBank, this study identified Pho regulon genes in nearly 40% of the marine phage genomes, while only 4% of nonmarine phage genomes contained these genes. While several Pho regulon genes were identified, phoH was the most prevalent, appearing in 42 out of 602 completely sequenced phage genomes. Phylogenetic analysis demonstrated that phage phoH sequences formed a cluster distinct from those of their bacterial hosts. PCR primers designed to amplify a region of the phoH gene were used to determine the diversity of phage phoH sequences throughout a depth profile in the Sargasso Sea and at six locations worldwide. phoH was present at all sites examined, and a high diversity of phoH sequences was recovered. Most phoH sequences belonged to clusters without any cultured representatives. Each depth and geographic location had a distinct phoH composition, although most phoH clusters were recovered from multiple sites. Overall, phoH is an effective signature gene for examining phage diversity in the marine environment. PMID:21926220
[Personal genome research and neurological diseases: overview].

PubMed

Toda, Tatsushi

2013-03-01

Neurological diseases include those caused by a single defective gene,e.g., Huntington's disease, other polyglutamine diseases, and muscular dystrophies, and those that are mostly sporadic but rarely show Mendelian inheritance in some families, e.g., Alzheimer's disease, Parkinson's disease, amyotrophic lateral sclerosis, and epilepsy. The latter diseases are considered polygenic disorders. Both sporadic and Mendelian cases of these diseases are believed to share some common pathological mechanisms. Since the detection of causal genes for the Mendelian cases, studies have been initiated on disease pathology. SNPs and rare gene variants play important roles in common neurological diseases. From a technological perspective, next-generation sequencers have become widely available and have contributed to the advancement of research based on individual genome sequences (personal genome). This paper presents an overview, as well as a historical context, of the contribution of personal genome research to neurological disease studies.
Common bacterial responses in six ecosystems exposed to 10 years of elevated atmospheric carbon dioxide.

PubMed

Dunbar, John; Eichorst, Stephanie A; Gallegos-Graves, La Verne; Silva, Shannon; Xie, Gary; Hengartner, N W; Evans, R David; Hungate, Bruce A; Jackson, Robert B; Megonigal, J Patrick; Schadt, Christopher W; Vilgalys, Rytas; Zak, Donald R; Kuske, Cheryl R

2012-05-01

Six terrestrial ecosystems in the USA were exposed to elevated atmospheric CO(2) in single or multifactorial experiments for more than a decade to assess potential impacts. We retrospectively assessed soil bacterial community responses in all six-field experiments and found ecosystem-specific and common patterns of soil bacterial community response to elevated CO(2) . Soil bacterial composition differed greatly across the six ecosystems. No common effect of elevated atmospheric CO(2) on bacterial biomass, richness and community composition across all of the ecosystems was identified, although significant responses were detected in individual ecosystems. The most striking common trend across the sites was a decrease of up to 3.5-fold in the relative abundance of Acidobacteria Group 1 bacteria in soils exposed to elevated CO(2) or other climate factors. The Acidobacteria Group 1 response observed in exploratory 16S rRNA gene clone library surveys was validated in one ecosystem by 100-fold deeper sequencing and semi-quantitative PCR assays. Collectively, the 16S rRNA gene sequencing approach revealed influences of elevated CO(2) on multiple ecosystems. Although few common trends across the ecosystems were detected in the small surveys, the trends may be harbingers of more substantive changes in less abundant, more sensitive taxa that can only be detected by deeper surveys. Representative bacterial 16S rRNA gene clone sequences were deposited in GenBank with Accession No. JQ366086–JQ387568. Published 2012. This article is a U.S. Government work and is in the public domain in the USA.
Transcriptional profiling of human femoral mesenchymal stem cells in osteoporosis and its association with adipogenesis.

PubMed

Choi, Yong Jun; Song, Insun; Jin, Yilan; Jin, Hyun-Seok; Ji, Hyung Min; Jeong, Seon-Yong; Won, Ye-Yeon; Chung, Yoon-Sok

2017-10-20

Genetic alterations are major contributing factors in the development of osteoporosis. Osteoblasts and adipocytes share a common origin, mesenchymal stem cells (MSCs), and their genetic determinants might be important in the relationship between osteoporosis and obesity. In the present study, we aimed to isolate differentially expressed genes (DEGs) in osteoporosis and normal controls using human MSCs, and elucidate the common pathways and genes related to osteoporosis and adipogenesis. Human MSCs were obtained from the bone marrow of femurs from postmenopausal women during orthopedic surgeries. RNA sequencing (RNA-seq) was carried out using next-generation sequencing (NGS) technology. DEGs were identified using RNA-seq data. Ingenuity pathway analysis (IPA) was used to elucidate the common pathway related to osteoporosis and adipogenesis. Candidate genes for the common pathway were validated with other independent osteoporosis and obese subjects using RT-PCR (reverse transcription-polymerase chain reaction) analysis. Fifty-three DEGs were identified between postmenopausal osteoporosis patients and normal bone mineral density (BMD) controls. Most of the genetic changes were related to the differentiation of cells. The nuclear receptor subfamily 4 group A (NR4A) family was identified as possible common genes related to osteogenesis and adipogenesis. The expression level of the mRNA of NR4A1 was significantly higher in osteoporosis patients than in controls (p=0.018). The expression level of the mRNA of NR4A2 was significantly higher in obese patients than in controls (p=0.041). Some genetic changes in MSCs are involved in the pathophysiology of osteoporosis. The NR4A family might comprise common genes related to osteoporosis and obesity. Copyright © 2017 Elsevier B.V. All rights reserved.
Question 7: Comparative Genomics and Early Cell Evolution: A Cautionary Methodological Note

NASA Astrophysics Data System (ADS)

Islas, Sara; Hernández-Morales, Ricardo; Lazcano, Antonio

2007-10-01

Inventories of the gene content of the last common ancestor (LCA), i.e., the cenancestor, include sequences that may have undergone horizontal transfer events, as well as sequences that have originated in different pre-cenancestral epochs. However, the universal distribution of highly conserved genes involved in RNA metabolism provide insights into early stages of cell evolution during which RNA played a much more conspicuous biological role, and is consistent with the hypothesis that extant living systems were preceded by an RNA/protein world. Insights into the traits of primitive entities from which the LCA evolved may be derived from the analysis of paralogous gene families, including those formed by sequences that resulted from internal elongation events. Three major types of paralogous gene families can be recognized. The importance of this grouping for understanding the traits of early cells is discussed.
Molecular characterization of Atractolytocestus sagittatus (Cestoda: Caryophyllidea), monozoic parasite of common carp, and its differentiation from the invasive species Atractolytocestus huronensis.

PubMed

Bazsalovicsová, Eva; Králová-Hromadová, Ivica; Stefka, Jan; Scholz, Tomáš

2012-05-01

Sequence structure of complete internal transcribed spacer 1 and 2 (ITS1 and ITS2) of the ribosomal DNA region and partial mitochondrial cytochrome c oxidase subunit I (cox1) gene sequences were studied in the monozoic tapeworm Atractolytocestus sagittatus (Kulakovskaya et Akhmerov, 1965) (Cestoda: Caryophyllidea), a parasite of common carp (Cyprinus carpio carpio L.). Intraindividual sequence diversity was observed in both ribosomal spacers. In ITS1, a total number of 19 recombinant clones yielded eight different sequence types (pairwise sequence identity, 99.7-100%) which, however, did not resemble the structure typical for divergent intragenomic ITS copies (paralogues). Polymorphism was displayed by several single nucleotide mutations present exclusively in single clones, but variation in the number of short repetitive motifs was not observed. In ITS2, a total of 21 recombinant clones yielded ten different sequence types (pairwise sequence identity, 97.5-100%). They were mostly characterized by a varying number of (TCGT)(n) repeats resulting in assortment of ITS2 sequences into two sequence variants, which reflected the structure specific for ITS paralogues. The third DNA region analysed, mitochondrial cox1 gene (669 bp) was detected to be 100% identical in all studied A. sagittatus individuals. Comparison of molecular data on A. sagittatus with those on Atractolytocestus huronensis Anthony, 1958, an invasive parasite of common carp, has shown that interspecific differences significantly exceeded intraspecific variation in both ribosomal spacers (81.4-82.5% in ITS1, 74.4-75.2% in ITS2) as well as in mitochondrial cox1, which confirms validity of both congeneric tapeworms parasitic in the same fish host.
Characterization of the glutathione S-transferase gene family through ESTs and expression analyses within common and pigmented cultivars of Citrus sinensis (L.) Osbeck.

PubMed

Licciardello, Concetta; D'Agostino, Nunzio; Traini, Alessandra; Recupero, Giuseppe Reforgiato; Frusciante, Luigi; Chiusano, Maria Luisa

2014-02-03

Glutathione S-transferases (GSTs) represent a ubiquitous gene family encoding detoxification enzymes able to recognize reactive electrophilic xenobiotic molecules as well as compounds of endogenous origin. Anthocyanin pigments require GSTs for their transport into the vacuole since their cytoplasmic retention is toxic to the cell. Anthocyanin accumulation in Citrus sinensis (L.) Osbeck fruit flesh determines different phenotypes affecting the typical pigmentation of Sicilian blood oranges. In this paper we describe: i) the characterization of the GST gene family in C. sinensis through a systematic EST analysis; ii) the validation of the EST assembly by exploiting the genome sequences of C. sinensis and C. clementina and their genome annotations; iii) GST gene expression profiling in six tissues/organs and in two different sweet orange cultivars, Cadenera (common) and Moro (pigmented). We identified 61 GST transcripts, described the full- or partial-length nature of the sequences and assigned to each sequence the GST class membership exploiting a comparative approach and the classification scheme proposed for plant species. A total of 23 full-length sequences were defined. Fifty-four of the 61 transcripts were successfully aligned to the C. sinensis and C. clementina genomes. Tissue specific expression profiling demonstrated that the expression of some GST transcripts was 'tissue-affected' and cultivar specific. A comparative analysis of C. sinensis GSTs with those from other plant species was also considered. Data from the current analysis are accessible at http://biosrv.cab.unina.it/citrusGST/, with the aim to provide a reference resource for C. sinensis GSTs. This study aimed at the characterization of the GST gene family in C. sinensis. Based on expression patterns from two different cultivars and on sequence-comparative analyses, we also highlighted that two sequences, a Phi class GST and a Mapeg class GST, could be involved in the conjugation of anthocyanin pigments and in their transport into the vacuole, specifically in fruit flesh of the pigmented cultivar.
A Bayesian taxonomic classification method for 16S rRNA gene sequences with improved species-level accuracy.

PubMed

Gao, Xiang; Lin, Huaiying; Revanna, Kashi; Dong, Qunfeng

2017-05-10

Species-level classification for 16S rRNA gene sequences remains a serious challenge for microbiome researchers, because existing taxonomic classification tools for 16S rRNA gene sequences either do not provide species-level classification, or their classification results are unreliable. The unreliable results are due to the limitations in the existing methods which either lack solid probabilistic-based criteria to evaluate the confidence of their taxonomic assignments, or use nucleotide k-mer frequency as the proxy for sequence similarity measurement. We have developed a method that shows significantly improved species-level classification results over existing methods. Our method calculates true sequence similarity between query sequences and database hits using pairwise sequence alignment. Taxonomic classifications are assigned from the species to the phylum levels based on the lowest common ancestors of multiple database hits for each query sequence, and further classification reliabilities are evaluated by bootstrap confidence scores. The novelty of our method is that the contribution of each database hit to the taxonomic assignment of the query sequence is weighted by a Bayesian posterior probability based upon the degree of sequence similarity of the database hit to the query sequence. Our method does not need any training datasets specific for different taxonomic groups. Instead only a reference database is required for aligning to the query sequences, making our method easily applicable for different regions of the 16S rRNA gene or other phylogenetic marker genes. Reliable species-level classification for 16S rRNA or other phylogenetic marker genes is critical for microbiome research. Our software shows significantly higher classification accuracy than the existing tools and we provide probabilistic-based confidence scores to evaluate the reliability of our taxonomic classification assignments based on multiple database matches to query sequences. Despite its higher computational costs, our method is still suitable for analyzing large-scale microbiome datasets for practical purposes. Furthermore, our method can be applied for taxonomic classification of any phylogenetic marker gene sequences. Our software, called BLCA, is freely available at https://github.com/qunfengdong/BLCA .
Targeted Deep Resequencing Identifies Coding Variants in the PEAR1 Gene That Play a Role in Platelet Aggregation

PubMed Central

Kim, Yoonhee; Suktitipat, Bhoom; Yanek, Lisa R.; Faraday, Nauder; Wilson, Alexander F.; Becker, Diane M.; Becker, Lewis C.; Mathias, Rasika A.

2013-01-01

Platelet aggregation is heritable, and genome-wide association studies have detected strong associations with a common intronic variant of the platelet endothelial aggregation receptor1 (PEAR1) gene both in African American and European American individuals. In this study, we used a sequencing approach to identify additional exonic variants in PEAR1 that may also determine variability in platelet aggregation in the GeneSTAR Study. A 0.3 Mb targeted region on chromosome 1q23.1 including the entire PEAR1 gene was Sanger sequenced in 104 subjects (45% male, 49% African American, age = 52±13) selected on the basis of hyper- and hypo- aggregation across three different agonists (collagen, epinephrine, and adenosine diphosphate). Single-variant and multi-variant burden tests for association were performed. Of the 235 variants identified through sequencing, 61 were novel, and three of these were missense variants. More rare variants (MAF<5%) were noted in African Americans compared to European Americans (108 vs. 45). The common intronic GWAS-identified variant (rs12041331) demonstrated the most significant association signal in African Americans (p = 4.020×10−4); no association was seen for additional exonic variants in this group. In contrast, multi-variant burden tests indicated that exonic variants play a more significant role in European Americans (p = 0.0099 for the collective coding variants compared to p = 0.0565 for intronic variant rs12041331). Imputation of the individual exonic variants in the rest of the GeneSTAR European American cohort (N = 1,965) supports the results noted in the sequenced discovery sample: p = 3.56×10−4, 2.27×10−7, 5.20×10−5 for coding synonymous variant rs56260937 and collagen, epinephrine and adenosine diphosphate induced platelet aggregation, respectively. Sequencing approaches confirm that a common intronic variant has the strongest association with platelet aggregation in African Americans, and show that exonic variants play an additional role in platelet aggregation in European Americans. PMID:23704978
Transcription Factor Information System (TFIS): A Tool for Detection of Transcription Factor Binding Sites.

PubMed

Narad, Priyanka; Kumar, Abhishek; Chakraborty, Amlan; Patni, Pranav; Sengupta, Abhishek; Wadhwa, Gulshan; Upadhyaya, K C

2017-09-01

Transcription factors are trans-acting proteins that interact with specific nucleotide sequences known as transcription factor binding site (TFBS), and these interactions are implicated in regulation of the gene expression. Regulation of transcriptional activation of a gene often involves multiple interactions of transcription factors with various sequence elements. Identification of these sequence elements is the first step in understanding the underlying molecular mechanism(s) that regulate the gene expression. For in silico identification of these sequence elements, we have developed an online computational tool named transcription factor information system (TFIS) for detecting TFBS for the first time using a collection of JAVA programs and is mainly based on TFBS detection using position weight matrix (PWM). The database used for obtaining position frequency matrices (PFM) is JASPAR and HOCOMOCO, which is an open-access database of transcription factor binding profiles. Pseudo-counts are used while converting PFM to PWM, and TFBS detection is carried out on the basis of percent score taken as threshold value. TFIS is equipped with advanced features such as direct sequence retrieving from NCBI database using gene identification number and accession number, detecting binding site for common TF in a batch of gene sequences, and TFBS detection after generating PWM from known raw binding sequences in addition to general detection methods. TFIS can detect the presence of potential TFBSs in both the directions at the same time. This feature increases its efficiency. And the results for this dual detection are presented in different colors specific to the orientation of the binding site. Results obtained by the TFIS are more detailed and specific to the detected TFs as integration of more informative links from various related web servers are added in the result pages like Gene Ontology, PAZAR database and Transcription Factor Encyclopedia in addition to NCBI and UniProt. Common TFs like SP1, AP1 and NF-KB of the Amyloid beta precursor gene is easily detected using TFIS along with multiple binding sites. In another scenario of embryonic developmental process, TFs of the FOX family (FOXL1 and FOXC1) were also identified. TFIS is platform-independent which is publicly available along with its support and documentation at http://tfistool.appspot.com and http://www.bioinfoplus.com/tfis/ . TFIS is licensed under the GNU General Public License, version 3 (GPL-3.0).

HomozygosityMapper2012--bridging the gap between homozygosity mapping and deep sequencing.

PubMed

Seelow, Dominik; Schuelke, Markus

2012-07-01

Homozygosity mapping is a common method to map recessive traits in consanguineous families. To facilitate these analyses, we have developed HomozygosityMapper, a web-based approach to homozygosity mapping. HomozygosityMapper allows researchers to directly upload the genotype files produced by the major genotyping platforms as well as deep sequencing data. It detects stretches of homozygosity shared by the affected individuals and displays them graphically. Users can interactively inspect the underlying genotypes, manually refine these regions and eventually submit them to our candidate gene search engine GeneDistiller to identify the most promising candidate genes. Here, we present the new version of HomozygosityMapper. The most striking new feature is the support of Next Generation Sequencing *.vcf files as input. Upon users' requests, we have implemented the analysis of common experimental rodents as well as of important farm animals. Furthermore, we have extended the options for single families and loss of heterozygosity studies. Another new feature is the export of *.bed files for targeted enrichment of the potential disease regions for deep sequencing strategies. HomozygosityMapper also generates files for conventional linkage analyses which are already restricted to the possible disease regions, hence superseding CPU-intensive genome-wide analyses. HomozygosityMapper is freely available at http://www.homozygositymapper.org/.
[The mutation analysis of PAH gene and prenatal diagnosis in classical phenylketonuria family].

PubMed

Yan, Yousheng; Hao, Shengju; Yao, Fengxia; Sun, Qingmei; Zheng, Lei; Zhang, Qinghua; Zhang, Chuan; Yang, Tao; Huang, Shangzhi

2014-12-01

To characterize the mutation spectrum of phenylalanine hydroxylase (PAH) gene and perform prenatal diagnosis for families with classical phenylketonuria. By stratified sequencing, mutations were detected in the exons and flaking introns of PAH gene of 44 families with classical phenylketonuria. 47 fetuses were diagnosed by combined sequencing with linkage analysis of three common short tandem repeats (STR) (PAH-STR, PAH-26 and PAH-32) in the PAH gene. Thirty-one types of mutations were identified. A total of 84 mutations were identified in 88 alleles (95.45%), in which the most common mutation have been R243Q (21.59%), EX6-96A>G (6.82%), IVS4-1G>A (5.86%) and IVS7+2T>A (5.86%). Most mutations were found in exons 3, 5, 6, 7, 11 and 12. The polymorphism information content (PIC) of these three STR markers was 0.71 (PAH-STR), 0.48 (PAH-26) and 0.40 (PAH-32), respectively. Prenatal diagnosis was performed successfully with the combined method in 47 fetuses of 44 classical phenylketonuria families. Among them, 11 (23.4%) were diagnosed as affected, 24 (51.1%) as carriers, and 12 (25.5%) as unaffected. Prenatal diagnosis can be achieved efficiently and accurately by stratified sequencing of PAH gene and linkage analysis of STR for classical phenylketonuria families.
Myelin protein zero gene sequencing diagnoses Charcot-Marie-Tooth Type 1B disease

DOE Office of Scientific and Technical Information (OSTI.GOV)

Su, Y.; Zhang, H.; Madrid, R.

1994-09-01

Charcot-Marie-Tooth disease (CMT), the most common genetic neuropathy, affects about 1 in 2600 people in Norway and is found worldwide. CMT Type 1 (CMT1) has slow nerve conduction with demyelinated Schwann cells. Autosomal dominant CMT Type 1B (CMT1B) results from mutations in the myelin protein zero gene which directs the synthesis of more than half of all Schwann cell protein. This gene was mapped to the chromosome 1q22-1q23.1 borderline by fluorescence in situ hybridization. The first 7 of 7 reported CMT1B mutations are unique. Thus the most effective means to identify CMT1B mutations in at-risk family members and fetuses ismore » to sequence the entire coding sequence in dominant or sporadic CMT patients without the CMT1A duplication. Of the 19 primers used in 16 pars to uniquely amplify the entire MPZ coding sequence, 6 primer pairs were used to amplify and sequence the 6 exons. The DyeDeoxy Terminator cycle sequencing method used with four different color fluorescent lables was superior to manual sequencing because it sequences more bases unambiguously from extracted genomic DNA samples within 24 hours. This protocol was used to test 28 CMT and Dejerine-Sottas patients without CMT1A gene duplication. Sequencing MPZ gene-specific amplified fragments identified 9 polymorphic sites within the 6 exons that encode the 248 amino acid MPZ protein. The large number of major CMT1B mutations identified by single strand sequencing are being verified by reverse strand sequencing and when possible, by restriction enzyme analysis. This protocol can be used to distringuish CMT1B patients from othre CMT phenotypes and to determine the CMT1B status of relatives both presymptomatically and prenatally.« less
Changes in the Composition of Drinking Water Bacterial Clone Libraries Introduced by Using Two Different 16S rRna Gene PCR Primers

EPA Science Inventory

Sequence analysis of 16S rRNA gene clone libraries is a popular tool used to describe the composition of natural microbial communities. Commonly, clone libraries are developed by direct cloning of 16S rRNA gene PCR products. Different primers are often employed in the initial amp...
Changes in the Composition of Drinking Water Bacterial Clone Libraries Introduced by Using Two Different 16S rRNA Gene PCR Primers

EPA Science Inventory

Sequence analysis of 16S rRNA gene clone libraries is a popular tool used to describe the composition of natural microbial communities. Commonly, clone libraries are developed by direct cloning of 16S rRNA gene PCR products. Different primers are often employed in the initial amp...
Arabidopsis intragenomic conserved noncoding sequence

PubMed Central

Thomas, Brian C.; Rapaka, Lakshmi; Lyons, Eric; Pedersen, Brent; Freeling, Michael

2007-01-01

After the most recent tetraploidy in the Arabidopsis lineage, most gene pairs lost one, but not both, of their duplicates. We manually inspected the 3,179 retained gene pairs and their surrounding gene space still present in the genome using a custom-made viewer application. The display of these pairs allowed us to define intragenic conserved noncoding sequences (CNSs), identify exon annotation errors, and discover potentially new genes. Using a strict algorithm to sort high-scoring pair sequences from the bl2seq data, we created a database of 14,944 intragenomic Arabidopsis CNSs. The mean CNS length is 31 bp, ranging from 15 to 285 bp. There are ≈1.7 CNSs associated with a typical gene, and Arabidopsis CNSs are found in all areas around exons, most frequently in the 5′ upstream region. Gene ontology classifications related to transcription, regulation, or “response to …” external or endogenous stimuli, especially hormones, tend to be significantly overrepresented among genes containing a large number of CNSs, whereas protein localization, transport, and metabolism are common among genes with no CNSs. There is a 1.5% overlap between these CNSs and the 218,982 putative RNAs in the Arabidopsis Small RNA Project database, allowing for two mismatches. These CNSs provide a unique set of noncoding sequences enriched for function. CNS function is implied by evolutionary conservation and independently supported because CNS-richness predicts regulatory gene ontology categories. PMID:17301222
Comparative RNA-Sequence Transcriptome Analysis of Phenolic Acid Metabolism in Salvia miltiorrhiza, a Traditional Chinese Medicine Model Plant

PubMed Central

Song, Zhenqiao; Guo, Linlin; Liu, Tian; Lin, Caicai; Wang, Jianhua

2017-01-01

Salvia miltiorrhiza Bunge is an important traditional Chinese medicine (TCM). In this study, two S. miltiorrhiza genotypes (BH18 and ZH23) with different phenolic acid concentrations were used for de novo RNA sequencing (RNA-seq). A total of 170,787 transcripts and 56,216 unigenes were obtained. There were 670 differentially expressed genes (DEGs) identified between BH18 and ZH23, 250 of which were upregulated in ZH23, with genes involved in the phenylpropanoid biosynthesis pathway being the most upregulated genes. Nine genes involved in the lignin biosynthesis pathway were upregulated in BH18 and thus result in higher lignin content in BH18. However, expression profiles of most genes involved in the core common upstream phenylpropanoid biosynthesis pathway were higher in ZH23 than that in BH18. These results indicated that genes involved in the core common upstream phenylpropanoid biosynthesis pathway might play an important role in downstream secondary metabolism and demonstrated that lignin biosynthesis was a putative partially competing pathway with phenolic acid biosynthesis. The results of this study expanded our understanding of the regulation of phenolic acid biosynthesis in S. miltiorrhiza. PMID:28194403
Accurate, Rapid Taxonomic Classification of Fungal Large-Subunit rRNA Genes

PubMed Central

Liu, Kuan-Liang; Porras-Alfaro, Andrea; Eichorst, Stephanie A.

2012-01-01

Taxonomic and phylogenetic fingerprinting based on sequence analysis of gene fragments from the large-subunit rRNA (LSU) gene or the internal transcribed spacer (ITS) region is becoming an integral part of fungal classification. The lack of an accurate and robust classification tool trained by a validated sequence database for taxonomic placement of fungal LSU genes is a severe limitation in taxonomic analysis of fungal isolates or large data sets obtained from environmental surveys. Using a hand-curated set of 8,506 fungal LSU gene fragments, we determined the performance characteristics of a naïve Bayesian classifier across multiple taxonomic levels and compared the classifier performance to that of a sequence similarity-based (BLASTN) approach. The naïve Bayesian classifier was computationally more rapid (>460-fold with our system) than the BLASTN approach, and it provided equal or superior classification accuracy. Classifier accuracies were compared using sequence fragments of 100 bp and 400 bp and two different PCR primer anchor points to mimic sequence read lengths commonly obtained using current high-throughput sequencing technologies. Accuracy was higher with 400-bp sequence reads than with 100-bp reads. It was also significantly affected by sequence location across the 1,400-bp test region. The highest accuracy was obtained across either the D1 or D2 variable region. The naïve Bayesian classifier provides an effective and rapid means to classify fungal LSU sequences from large environmental surveys. The training set and tool are publicly available through the Ribosomal Database Project (http://rdp.cme.msu.edu/classifier/classifier.jsp). PMID:22194300
Cloning, analysis and functional annotation of expressed sequence tags from the Earthworm Eisenia fetida

PubMed Central

Pirooznia, Mehdi; Gong, Ping; Guan, Xin; Inouye, Laura S; Yang, Kuan; Perkins, Edward J; Deng, Youping

2007-01-01

Background Eisenia fetida, commonly known as red wiggler or compost worm, belongs to the Lumbricidae family of the Annelida phylum. Little is known about its genome sequence although it has been extensively used as a test organism in terrestrial ecotoxicology. In order to understand its gene expression response to environmental contaminants, we cloned 4032 cDNAs or expressed sequence tags (ESTs) from two E. fetida libraries enriched with genes responsive to ten ordnance related compounds using suppressive subtractive hybridization-PCR. Results A total of 3144 good quality ESTs (GenBank dbEST accession number EH669363–EH672369 and EL515444–EL515580) were obtained from the raw clone sequences after cleaning. Clustering analysis yielded 2231 unique sequences including 448 contigs (from 1361 ESTs) and 1783 singletons. Comparative genomic analysis showed that 743 or 33% of the unique sequences shared high similarity with existing genes in the GenBank nr database. Provisional function annotation assigned 830 Gene Ontology terms to 517 unique sequences based on their homology with the annotated genomes of four model organisms Drosophila melanogaster, Mus musculus, Saccharomyces cerevisiae, and Caenorhabditis elegans. Seven percent of the unique sequences were further mapped to 99 Kyoto Encyclopedia of Genes and Genomes pathways based on their matching Enzyme Commission numbers. All the information is stored and retrievable at a highly performed, web-based and user-friendly relational database called EST model database or ESTMD version 2. Conclusion The ESTMD containing the sequence and annotation information of 4032 E. fetida ESTs is publicly accessible at . PMID:18047730
Evolutionary history of the enolase gene family.

PubMed

Tracy, M R; Hedges, S B

2000-12-23

The enzyme enolase [EC 4.2.1.11] is found in all organisms, with vertebrates exhibiting tissue-specific isozymes encoded by three genes: alpha (alpha), beta (beta), and gamma (gamma) enolase. Limited taxonomic sampling of enolase has obscured the timing of gene duplication events. To help clarify the evolutionary history of the gene family, cDNAs were sequenced from six taxa representing major lineages of vertebrates: Chiloscyllium punctatum (shark), Amia calva (bowfin), Salmo trutta (trout), Latimeria chalumnae (coelacanth), Lepidosiren paradoxa (South American lungfish), and Neoceratodus forsteri (Australian lungfish). Phylogenetic analysis of all enolase and related gene sequences revealed an early gene duplication event prior to the last common ancestor of living organisms. Several distantly related archaebacterial sequences were designated as 'enolase-2', whereas all other enolase sequences were designated 'enolase-1'. Two of the three isozymes of enolase-1, alpha- and beta-enolase, were discovered in actinopterygian, sarcopterygian, and chondrichthian fishes. Phylogenetic analysis of vertebrate enolases revealed that the two gene duplications leading to the three isozymes of enolase-1 occurred subsequent to the divergence of living agnathans, near the Proterozoic/Phanerozoic boundary (approximately 550Mya). Two copies of enolase, designated alpha(1) and alpha(2), were found in the trout and are presumed to be the result of a genome duplication event.
Next-generation sequencing using a pre-designed gene panel for the molecular diagnosis of congenital disorders in pediatric patients.

PubMed

Lim, Eileen C P; Brett, Maggie; Lai, Angeline H M; Lee, Siew-Peng; Tan, Ee-Shien; Jamuar, Saumya S; Ng, Ivy S L; Tan, Ene-Choo

2015-12-14

Next-generation sequencing (NGS) has revolutionized genetic research and offers enormous potential for clinical application. Sequencing the exome has the advantage of casting the net wide for all known coding regions while targeted gene panel sequencing provides enhanced sequencing depths and can be designed to avoid incidental findings in adult-onset conditions. A HaloPlex panel consisting of 180 genes within commonly altered chromosomal regions is available for use on both the Ion Personal Genome Machine (PGM) and MiSeq platforms to screen for causative mutations in these genes. We used this Haloplex ICCG panel for targeted sequencing of 15 patients with clinical presentations indicative of an abnormality in one of the 180 genes. Sequencing runs were done using the Ion 318 Chips on the Ion Torrent PGM. Variants were filtered for known polymorphisms and analysis was done to identify possible disease-causing variants before validation by Sanger sequencing. When possible, segregation of variants with phenotype in family members was performed to ascertain the pathogenicity of the variant. More than 97% of the target bases were covered at >20×. There was an average of 9.6 novel variants per patient. Pathogenic mutations were identified in five genes for six patients, with two novel variants. There were another five likely pathogenic variants, some of which were unreported novel variants. In a cohort of 15 patients, we were able to identify a likely genetic etiology in six patients (40%). Another five patients had candidate variants for which further evaluation and segregation analysis are ongoing. Our results indicate that the HaloPlex ICCG panel is useful as a rapid, high-throughput and cost-effective screening tool for 170 of the 180 genes. There is low coverage for some regions in several genes which might have to be supplemented by Sanger sequencing. However, comparing the cost, ease of analysis, and shorter turnaround time, it is a good alternative to exome sequencing for patients whose features are suggestive of a genetic etiology involving one of the genes in the panel.
Identification of interleukin genes in Pogona vitticeps using a de novo transcriptome assembly from RNA-seq data.

PubMed

Livernois, Alexandra; Hardy, Kristine; Domaschenz, Renae; Papanicolaou, Alexie; Georges, Arthur; Sarre, Stephen D; Rao, Sudha; Ezaz, Tariq; Deakin, Janine E

2016-10-01

Interleukins are a group of cytokines with complex immunomodulatory functions that are important for regulating immunity in vertebrate species. Reptiles and mammals last shared a common ancestor more than 350 million years ago, so it is not surprising that low sequence identity has prevented divergent interleukin genes from being identified in the central bearded dragon lizard, Pogona vitticeps, in its genome assembly. To determine the complete nucleotide sequences of key interleukin genes, we constructed full-length transcripts, using the Trinity platform, from short paired-end read RNA sequences from stimulated spleen cells. De novo transcript reconstruction and analysis allowed us to identify interleukin genes that are missing from the published P. vitticeps assembly. Identification of key cytokines in P. vitticeps will provide insight into the essential molecular mechanisms and evolution of interleukin gene families and allow for characterization of the immune response in a lizard for comparison with mammals.
Disruption of long-distance highly conserved noncoding elements in neurocristopathies.

PubMed

Amiel, Jeanne; Benko, Sabina; Gordon, Christopher T; Lyonnet, Stanislas

2010-12-01

One of the key discoveries of vertebrate genome sequencing projects has been the identification of highly conserved noncoding elements (CNEs). Some characteristics of CNEs include their high frequency in mammalian genomes, their potential regulatory role in gene expression, and their enrichment in gene deserts nearby master developmental genes. The abnormal development of neural crest cells (NCCs) leads to a broad spectrum of congenital malformation(s), termed neurocristopathies, and/or tumor predisposition. Here we review recent findings that disruptions of CNEs, within or at long distance from the coding sequences of key genes involved in NCC development, result in neurocristopathies via the alteration of tissue- or stage-specific long-distance regulation of gene expression. While most studies on human genetic disorders have focused on protein-coding sequences, these examples suggest that investigation of genomic alterations of CNEs will provide a broader understanding of the molecular etiology of both rare and common human congenital malformations. © 2010 New York Academy of Sciences.
[Analysis of cis-regulatory element distribution in gene promoters of Gossypium raimondii and Arabidopsis thaliana].

PubMed

Sun, Gao-Fei; He, Shou-Pu; Du, Xiong-Ming

2013-10-01

Cotton genomic studies have boomed since the release of Gossypium raimondii draft genome. In this study, cis-regulatory element (CRE) in 1 kb length sequence upstream 5' UTR of annotated genes were selected and scanned in the Arabidopsis thaliana (At) and Gossypium raimondii (Gr) genomes, based on the database of PLACE (Plant cis-acting Regulatory DNA Elements). According to the definition of this study, 44 (12.3%) and 57 (15.5%) CREs presented "peak-like" distribution in the 1 kb selected sequences of both genomes, respectively. Thirty-four of them were peak-like distributed in both genomes, which could be further categorized into 4 types based on their core sequences. The coincidence of TATABOX peak position and their actual position ((-) -30 bp) indicated that the position of a common CRE was conservative in different genes, which suggested that the peak position of these CREs was their possible actual position of transcription factors. The position of a common CRE was also different between the two genomes due to stronger length variation of 5' UTR in Gr than At. Furthermore, most of the peak-like CREs were located in the region of -110 bp-0 bp, which suggested that concentrated distribution might be conductive to the interaction of transcription factors, and then regulate the gene expression in downstream.
Application of the major capsid protein as a marker of the phylogenetic diversity of Emiliania huxleyi viruses.

PubMed

Rowe, Janet M; Fabre, Marie-Françoise; Gobena, Daniel; Wilson, William H; Wilhelm, Steven W

2011-05-01

Studies of the Phycodnaviridae have traditionally relied on the DNA polymerase (pol) gene as a biomarker. However, recent investigations have suggested that the major capsid protein (MCP) gene may be a reliable phylogenetic biomarker. We used MCP gene amplicons gathered across the North Atlantic to assess the diversity of Emiliania huxleyi-infecting Phycodnaviridae. Nucleotide sequences were examined across >6000 km of open ocean, with comparisons between concentrates of the virus-size fraction of seawater and of lysates generated by exposing host strains to these same virus concentrates. Analyses revealed that many sequences were only sampled once, while several were over-represented. Analyses also revealed nucleotide sequences distinct from previous coastal isolates. Examination of lysed cultures revealed a new richness in phylogeny, as MCP sequences previously unrepresented within the existing collection of E. huxleyi viruses (EhV) were associated with viruses lysing cultures. Sequences were compared with previously described EhV MCP sequences from the North Sea and a Norwegian Fjord, as well as from the Gulf of Maine. Principal component analysis indicates that location-specific distinctions exist despite the presence of sequences common across these environments. Overall, this investigation provides new sequence data and an assessment on the use of the MCP gene. © 2011 Federation of European Microbiological Societies Published by Blackwell Publishing Ltd. All rights reserved.
Skin Microbiome Surveys Are Strongly Influenced by Experimental Design.

PubMed

Meisel, Jacquelyn S; Hannigan, Geoffrey D; Tyldsley, Amanda S; SanMiguel, Adam J; Hodkinson, Brendan P; Zheng, Qi; Grice, Elizabeth A

2016-05-01

Culture-independent studies to characterize skin microbiota are increasingly common, due in part to affordable and accessible sequencing and analysis platforms. Compared to culture-based techniques, DNA sequencing of the bacterial 16S ribosomal RNA (rRNA) gene or whole metagenome shotgun (WMS) sequencing provides more precise microbial community characterizations. Most widely used protocols were developed to characterize microbiota of other habitats (i.e., gastrointestinal) and have not been systematically compared for their utility in skin microbiome surveys. Here we establish a resource for the cutaneous research community to guide experimental design in characterizing skin microbiota. We compare two widely sequenced regions of the 16S rRNA gene to WMS sequencing for recapitulating skin microbiome community composition, diversity, and genetic functional enrichment. We show that WMS sequencing most accurately recapitulates microbial communities, but sequencing of hypervariable regions 1-3 of the 16S rRNA gene provides highly similar results. Sequencing of hypervariable region 4 poorly captures skin commensal microbiota, especially Propionibacterium. WMS sequencing, which is resource and cost intensive, provides evidence of a community's functional potential; however, metagenome predictions based on 16S rRNA sequence tags closely approximate WMS genetic functional profiles. This study highlights the importance of experimental design for downstream results in skin microbiome surveys. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Skin microbiome surveys are strongly influenced by experimental design

PubMed Central

Meisel, Jacquelyn S.; Hannigan, Geoffrey D.; Tyldsley, Amanda S.; SanMiguel, Adam J.; Hodkinson, Brendan P.; Zheng, Qi; Grice, Elizabeth A.

2016-01-01

Culture-independent studies to characterize skin microbiota are increasingly common, due in part to affordable and accessible sequencing and analysis platforms. Compared to culture-based techniques, DNA sequencing of the bacterial 16S ribosomal RNA (rRNA) gene or whole metagenome shotgun (WMS) sequencing provide more precise microbial community characterizations. Most widely used protocols were developed to characterize microbiota of other habitats (i.e. gastrointestinal), and have not been systematically compared for their utility in skin microbiome surveys. Here we establish a resource for the cutaneous research community to guide experimental design in characterizing skin microbiota. We compare two widely sequenced regions of the 16S rRNA gene to WMS sequencing for recapitulating skin microbiome community composition, diversity, and genetic functional enrichment. We show that WMS sequencing most accurately recapitulates microbial communities, but sequencing of hypervariable regions 1-3 of the 16S rRNA gene provides highly similar results. Sequencing of hypervariable region 4 poorly captures skin commensal microbiota, especially Propionibacterium. WMS sequencing, which is resource- and cost-intensive, provides evidence of a community’s functional potential; however, metagenome predictions based on 16S rRNA sequence tags closely approximate WMS genetic functional profiles. This work highlights the importance of experimental design for downstream results in skin microbiome surveys. PMID:26829039
Fine-scale mergers of chloroplast and mitochondrial genes create functional, transcompartmentally chimeric mitochondrial genes.

PubMed

Hao, Weilong; Palmer, Jeffrey D

2009-09-29

The mitochondrial genomes of flowering plants possess a promiscuous proclivity for taking up sequences from the chloroplast genome. All characterized chloroplast integrants exist apart from native mitochondrial genes, and only a few, involving chloroplast tRNA genes that have functionally supplanted their mitochondrial counterparts, appear to be of functional consequence. We developed a novel computational approach to search for homologous recombination (gene conversion) in a large number of sequences and applied it to 22 mitochondrial and chloroplast gene pairs, which last shared common ancestry some 2 billion years ago. We found evidence of recurrent conversion of short patches of mitochondrial genes by chloroplast homologs during angiosperm evolution, but no evidence of gene conversion in the opposite direction. All 9 putative conversion events involve the atp1/atpA gene encoding the alpha subunit of ATP synthase, which is unusually well conserved between the 2 organelles and the only shared gene that is widely sequenced across plant mitochondria. Moreover, all conversions were limited to the 2 regions of greatest nucleotide and amino acid conservation of atp1/atpA. These observations probably reflect constraints operating on both the occurrence and fixation of recombination between ancient homologs. These findings indicate that recombination between anciently related sequences is more frequent than previously appreciated and creates functional mitochondrial genes of chimeric origin. These results also have implications for the widespread use of mitochondrial atp1 in phylogeny reconstruction.
The genome sequence of the model ascomycete fungus Podospora anserina

PubMed Central

Espagne, Eric; Lespinet, Olivier; Malagnac, Fabienne; Da Silva, Corinne; Jaillon, Olivier; Porcel, Betina M; Couloux, Arnaud; Aury, Jean-Marc; Ségurens, Béatrice; Poulain, Julie; Anthouard, Véronique; Grossetete, Sandrine; Khalili, Hamid; Coppin, Evelyne; Déquard-Chablat, Michelle; Picard, Marguerite; Contamine, Véronique; Arnaise, Sylvie; Bourdais, Anne; Berteaux-Lecellier, Véronique; Gautheret, Daniel; de Vries, Ronald P; Battaglia, Evy; Coutinho, Pedro M; Danchin, Etienne GJ; Henrissat, Bernard; Khoury, Riyad EL; Sainsard-Chanet, Annie; Boivin, Antoine; Pinan-Lucarré, Bérangère; Sellem, Carole H; Debuchy, Robert; Wincker, Patrick; Weissenbach, Jean; Silar, Philippe

2008-01-01

Background The dung-inhabiting ascomycete fungus Podospora anserina is a model used to study various aspects of eukaryotic and fungal biology, such as ageing, prions and sexual development. Results We present a 10X draft sequence of P. anserina genome, linked to the sequences of a large expressed sequence tag collection. Similar to higher eukaryotes, the P. anserina transcription/splicing machinery generates numerous non-conventional transcripts. Comparison of the P. anserina genome and orthologous gene set with the one of its close relatives, Neurospora crassa, shows that synteny is poorly conserved, the main result of evolution being gene shuffling in the same chromosome. The P. anserina genome contains fewer repeated sequences and has evolved new genes by duplication since its separation from N. crassa, despite the presence of the repeat induced point mutation mechanism that mutates duplicated sequences. We also provide evidence that frequent gene loss took place in the lineages leading to P. anserina and N. crassa. P. anserina contains a large and highly specialized set of genes involved in utilization of natural carbon sources commonly found in its natural biotope. It includes genes potentially involved in lignin degradation and efficient cellulose breakdown. Conclusion The features of the P. anserina genome indicate a highly dynamic evolution since the divergence of P. anserina and N. crassa, leading to the ability of the former to use specific complex carbon sources that match its needs in its natural biotope. PMID:18460219
DoOPSearch: a web-based tool for finding and analysing common conserved motifs in the promoter regions of different chordate and plant genes

PubMed Central

Sebestyén, Endre; Nagy, Tibor; Suhai, Sándor; Barta, Endre

2009-01-01

Background The comparative genomic analysis of a large number of orthologous promoter regions of the chordate and plant genes from the DoOP databases shows thousands of conserved motifs. Most of these motifs differ from any known transcription factor binding site (TFBS). To identify common conserved motifs, we need a specific tool to be able to search amongst them. Since conserved motifs from the DoOP databases are linked to genes, the result of such a search can give a list of genes that are potentially regulated by the same transcription factor(s). Results We have developed a new tool called DoOPSearch for the analysis of the conserved motifs in the promoter regions of chordate or plant genes. We used the orthologous promoters of the DoOP database to extract thousands of conserved motifs from different taxonomic groups. The advantage of this approach is that different sets of conserved motifs might be found depending on how broad the taxonomic coverage of the underlying orthologous promoter sequence collection is (consider e.g. primates vs. mammals or Brassicaceae vs. Viridiplantae). The DoOPSearch tool allows the users to search these motif collections or the promoter regions of DoOP with user supplied query sequences or any of the conserved motifs from the DoOP database. To find overrepresented gene ontologies, the gene lists obtained can be analysed further using a modified version of the GeneMerge program. Conclusion We present here a comparative genomics based promoter analysis tool. Our system is based on a unique collection of conserved promoter motifs characteristic of different taxonomic groups. We offer both a command line and a web-based tool for searching in these motif collections using user specified queries. These can be either short promoter sequences or consensus sequences of known transcription factor binding sites. The GeneMerge analysis of the search results allows the user to identify statistically overrepresented Gene Ontology terms that might provide a clue on the function of the motifs and genes. PMID:19534755

A reassessment of the evolutionary timescale of bat rabies viruses based upon glycoprotein gene sequences.

PubMed

Kuzmina, Natalia A; Kuzmin, Ivan V; Ellison, James A; Taylor, Steven T; Bergman, David L; Dew, Beverly; Rupprecht, Charles E

2013-10-01

Rabies, an acute progressive encephalomyelitis caused by viruses in the genus Lyssavirus, is one of the oldest known infectious diseases. Although dogs and other carnivores represent the greatest threat to public health as rabies reservoirs, it is commonly accepted that bats are the primary evolutionary hosts of lyssaviruses. Despite early historical documentation of rabies, molecular clock analyses indicate a quite young age of lyssaviruses, which is confusing. For example, the results obtained for partial and complete nucleoprotein gene sequences of rabies viruses (RABV), or for a limited number of glycoprotein gene sequences, indicated that the time of the most recent common ancestor (TMRCA) for current bat RABV diversity in the Americas lies in the seventeenth to eighteenth centuries and might be directly or indirectly associated with the European colonization. Conversely, several other reports demonstrated high genetic similarity between lyssavirus isolates, including RABV, obtained within a time interval of 25-50 years. In the present study, we attempted to re-estimate the age of several North American bat RABV lineages based on the largest set of complete and partial glycoprotein gene sequences compiled to date (n = 201) employing a codon substitution model. Although our results overlap with previous estimates in marginal areas of the 95 % high probability density (HPD), they suggest a longer evolutionary history of American bat RABV lineages (TMRCA at least 732 years, with a 95 % HPD 436-1107 years).
Evolution in the block: common elements of 5S rDNA organization and evolutionary patterns in distant fish genera.

PubMed

Campo, Daniel; García-Vázquez, Eva

2012-01-01

The 5S rDNA is organized in the genome as tandemly repeated copies of a structural unit composed of a coding sequence plus a nontranscribed spacer (NTS). The coding region is highly conserved in the evolution, whereas the NTS vary in both length and sequence. It has been proposed that 5S rRNA genes are members of a gene family that have arisen through concerted evolution. In this study, we describe the molecular organization and evolution of the 5S rDNA in the genera Lepidorhombus and Scophthalmus (Scophthalmidae) and compared it with already known 5S rDNA of the very different genera Merluccius (Merluccidae) and Salmo (Salmoninae), to identify common structural elements or patterns for understanding 5S rDNA evolution in fish. High intra- and interspecific diversity within the 5S rDNA family in all the genera can be explained by a combination of duplications, deletions, and transposition events. Sequence blocks with high similarity in all the 5S rDNA members across species were identified for the four studied genera, with evidences of intense gene conversion within noncoding regions. We propose a model to explain the evolution of the 5S rDNA, in which the evolutionary units are blocks of nucleotides rather than the entire sequences or single nucleotides. This model implies a "two-speed" evolution: slow within blocks (homogenized by recombination) and fast within the gene family (diversified by duplications and deletions).
Genomic analysis of NAC transcription factors in banana (Musa acuminata) and definition of NAC orthologous groups for monocots and dicots.

PubMed

Cenci, Albero; Guignon, Valentin; Roux, Nicolas; Rouard, Mathieu

2014-05-01

Identifying the molecular mechanisms underlying tolerance to abiotic stresses is important in crop breeding. A comprehensive understanding of the gene families associated with drought tolerance is therefore highly relevant. NAC transcription factors form a large plant-specific gene family involved in the regulation of tissue development and responses to biotic and abiotic stresses. The main goal of this study was to set up a framework of orthologous groups determined by an expert sequence comparison of NAC genes from both monocots and dicots. In order to clarify the orthologous relationships among NAC genes of different species, we performed an in-depth comparative study of four divergent taxa, in dicots and monocots, whose genomes have already been completely sequenced: Arabidopsis thaliana, Vitis vinifera, Musa acuminata and Oryza sativa. Due to independent evolution, NAC copy number is highly variable in these plant genomes. Based on an expert NAC sequence comparison, we propose forty orthologous groups of NAC sequences that were probably derived from an ancestor gene present in the most recent common ancestor of dicots and monocots. These orthologous groups provide a curated resource for large-scale protein sequence annotation of NAC transcription factors. The established orthology relationships also provide a useful reference for NAC function studies in newly sequenced genomes such as M. acuminata and other plant species.
Sequence analysis of the complete genome of Trichoplusia ni single nucleopolyhedrovirus and the identification of a baculoviral photolyase gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Willis, Leslie G.; Siepp, Robyn; Stewart, Taryn M.

2005-08-01

The genome of the Trichoplusia ni single nucleopolyhedrovirus (TnSNPV), a group II NPV which infects the cabbage looper (T. ni), has been completely sequenced and analyzed. The TnSNPV DNA genome consists of 134,394 bp and has an overall G + C content of 39%. Gene analysis predicted 144 open reading frames (ORFs) of 150 nucleotides or greater that showed minimal overlap. Comparisons with previously sequenced baculoviruses indicate that 119 TnSNPV ORFs were homologues of previously reported viral gene sequences. Ninety-four TnSNPV ORFs returned an Autographa californica multiple NPV (AcMNPV) homologue while 25 ORFs returned poor or no sequence matches withmore » the current databases. A putative photolyase gene was also identified that had highest amino acid identity to the photolyase genes of Chrysodeixis chalcites NPV (ChchNPV) (47%) and Danio rerio (zebrafish) (40%). In addition unlike all other baculoviruses no obvious homologous repeat (hr) sequences were identified. Comparison of the TnSNPV and AcMNPV genomes provides a unique opportunity to examine two baculoviruses that are highly virulent for a common insect host (T. ni) yet belong to diverse baculovirus taxonomic groups and possess distinct biological features. In vitro fusion assays demonstrated that the TnSNPV F protein induces membrane fusion and syncytia formation and were compared to syncytia formed by AcMNPV GP64.« less
Novel species including Mycobacterium fukienense sp. is found from tuberculosis patients in Fujian Province, China, using phylogenetic analysis of Mycobacterium chelonae/abscessus complex.

PubMed

Zhang, Yuan Yuan; Li, Yan Bing; Huang, Ming Xiang; Zhao, Xiu Qin; Zhang, Li Shui; Liu, Wen En; Wan, Kang Lin

2013-11-01

To identify the novel species 'Mycobacterium fukienense' sp. nov of Mycobacterium chelonae/abscessus complex from tuberculosis patients in Fujian Province, China. Five of 27 clinical Mycobacterium isolates (Cls) were previously identified as M. chelonae/abscessus complex by sequencing the hsp65, rpoB, 16S-23S rRNA internal transcribed spacer region (its), recA and sodA house-keeping genes commonly used to describe the molecular characteristics of Mycobacterium. Clinical Mycobacterium isolates were classified according to the gene sequence using a clustering analysis program. Sequence similarity within clusters and diversity between clusters were analyzed. The 5 isolates were identified with distinct sequences exhibiting 99.8% homology in the hsp65 gene. However, a complete lack of homology was observed among the sequences of the rpoB, 16S-23S rRNA internal transcribed spacer region (its), sodA, and recA genes as compared with the M. abscessus. Furthermore, no match for rpoB, sodA, and recA genes was identified among the published sequences. The novel species, Mycobacterium fukienense, is identified from tuberculosis patients in Fujian Province, China, which does not belong to any existing subspecies of M. chelonea/abscessus complex. Copyright © 2013 The Editorial Board of Biomedical and Environmental Sciences. Published by China CDC. All rights reserved.
Massively Parallel DNA Sequencing Facilitates Diagnosis of Patients with Usher Syndrome Type 1

PubMed Central

Yoshimura, Hidekane; Iwasaki, Satoshi; Nishio, Shin-ya; Kumakawa, Kozo; Tono, Tetsuya; Kobayashi, Yumiko; Sato, Hiroaki; Nagai, Kyoko; Ishikawa, Kotaro; Ikezono, Tetsuo; Naito, Yasushi; Fukushima, Kunihiro; Oshikawa, Chie; Kimitsuki, Takashi; Nakanishi, Hiroshi; Usami, Shin-ichi

2014-01-01

Usher syndrome is an autosomal recessive disorder manifesting hearing loss, retinitis pigmentosa and vestibular dysfunction, and having three clinical subtypes. Usher syndrome type 1 is the most severe subtype due to its profound hearing loss, lack of vestibular responses, and retinitis pigmentosa that appears in prepuberty. Six of the corresponding genes have been identified, making early diagnosis through DNA testing possible, with many immediate and several long-term advantages for patients and their families. However, the conventional genetic techniques, such as direct sequence analysis, are both time-consuming and expensive. Targeted exon sequencing of selected genes using the massively parallel DNA sequencing technology will potentially enable us to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using this technique combined with direct sequence analysis, we screened 17 unrelated Usher syndrome type 1 patients and detected probable pathogenic variants in the 16 of them (94.1%) who carried at least one mutation. Seven patients had the MYO7A mutation (41.2%), which is the most common type in Japanese. Most of the mutations were detected by only the massively parallel DNA sequencing. We report here four patients, who had probable pathogenic mutations in two different Usher syndrome type 1 genes, and one case of MYO7A/PCDH15 digenic inheritance. This is the first report of Usher syndrome mutation analysis using massively parallel DNA sequencing and the frequency of Usher syndrome type 1 genes in Japanese. Mutation screening using this technique has the power to quickly identify mutations of many causative genes while maintaining cost-benefit performance. In addition, the simultaneous mutation analysis of large numbers of genes is useful for detecting mutations in different genes that are possibly disease modifiers or of digenic inheritance. PMID:24618850
Massively parallel DNA sequencing facilitates diagnosis of patients with Usher syndrome type 1.

PubMed

Yoshimura, Hidekane; Iwasaki, Satoshi; Nishio, Shin-Ya; Kumakawa, Kozo; Tono, Tetsuya; Kobayashi, Yumiko; Sato, Hiroaki; Nagai, Kyoko; Ishikawa, Kotaro; Ikezono, Tetsuo; Naito, Yasushi; Fukushima, Kunihiro; Oshikawa, Chie; Kimitsuki, Takashi; Nakanishi, Hiroshi; Usami, Shin-Ichi

2014-01-01

Usher syndrome is an autosomal recessive disorder manifesting hearing loss, retinitis pigmentosa and vestibular dysfunction, and having three clinical subtypes. Usher syndrome type 1 is the most severe subtype due to its profound hearing loss, lack of vestibular responses, and retinitis pigmentosa that appears in prepuberty. Six of the corresponding genes have been identified, making early diagnosis through DNA testing possible, with many immediate and several long-term advantages for patients and their families. However, the conventional genetic techniques, such as direct sequence analysis, are both time-consuming and expensive. Targeted exon sequencing of selected genes using the massively parallel DNA sequencing technology will potentially enable us to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using this technique combined with direct sequence analysis, we screened 17 unrelated Usher syndrome type 1 patients and detected probable pathogenic variants in the 16 of them (94.1%) who carried at least one mutation. Seven patients had the MYO7A mutation (41.2%), which is the most common type in Japanese. Most of the mutations were detected by only the massively parallel DNA sequencing. We report here four patients, who had probable pathogenic mutations in two different Usher syndrome type 1 genes, and one case of MYO7A/PCDH15 digenic inheritance. This is the first report of Usher syndrome mutation analysis using massively parallel DNA sequencing and the frequency of Usher syndrome type 1 genes in Japanese. Mutation screening using this technique has the power to quickly identify mutations of many causative genes while maintaining cost-benefit performance. In addition, the simultaneous mutation analysis of large numbers of genes is useful for detecting mutations in different genes that are possibly disease modifiers or of digenic inheritance.
Environmental genomics of "Haloquadratum walsbyi" in a saltern crystallizer indicates a large pool of accessory genes in an otherwise coherent species

PubMed Central

Legault, Boris A; Lopez-Lopez, Arantxa; Alba-Casado, Jose Carlos; Doolittle, W Ford; Bolhuis, Henk; Rodriguez-Valera, Francisco; Papke, R Thane

2006-01-01

Background Mature saturated brine (crystallizers) communities are largely dominated (>80% of cells) by the square halophilic archaeon "Haloquadratum walsbyi". The recent cultivation of the strain HBSQ001 and thesequencing of its genome allows comparison with the metagenome of this taxonomically simplified environment. Similar studies carried out in other extreme environments have revealed very little diversity in gene content among the cell lineages present. Results The metagenome of the microbial community of a crystallizer pond has been analyzed by end sequencing a 2000 clone fosmid library and comparing the sequences obtained with the genome sequence of "Haloquadratum walsbyi". The genome of the sequenced strain was retrieved nearly complete within this environmental DNA library. However, many ORF's that could be ascribed to the "Haloquadratum" metapopulation by common genome characteristics or scaffolding to the strain genome were not present in the specific sequenced isolate. Particularly, three regions of the sequenced genome were associated with multiple rearrangements and the presence of different genes from the metapopulation. Many transposition and phage related genes were found within this pool which, together with the associated atypical GC content in these areas, supports lateral gene transfer mediated by these elements as the most probable genetic cause of this variability. Additionally, these sequences were highly enriched in putative regulatory and signal transduction functions. Conclusion These results point to a large pan-genome (total gene repertoire of the genus/species) even in this highly specialized extremophile and at a single geographic location. The extensive gene repertoire is what might be expected of a population that exploits a diverse nutrient pool, resulting from the degradation of biomass produced at lower salinities. PMID:16820057
Candidate Gene Identification with SNP Marker-Based Fine Mapping of Anthracnose Resistance Gene Co-4 in Common Bean.

PubMed

Burt, Andrew J; William, H Manilal; Perry, Gregory; Khanal, Raja; Pauls, K Peter; Kelly, James D; Navabi, Alireza

2015-01-01

Anthracnose, caused by Colletotrichum lindemuthianum, is an important fungal disease of common bean (Phaseolus vulgaris). Alleles at the Co-4 locus confer resistance to a number of races of C. lindemuthianum. A population of 94 F4:5 recombinant inbred lines of a cross between resistant black bean genotype B09197 and susceptible navy bean cultivar Nautica was used to identify markers associated with resistance in bean chromosome 8 (Pv08) where Co-4 is localized. Three SCAR markers with known linkage to Co-4 and a panel of single nucleotide markers were used for genotyping. A refined physical region on Pv08 with significant association with anthracnose resistance identified by markers was used in BLAST searches with the genomic sequence of common bean accession G19833. Thirty two unique annotated candidate genes were identified that spanned a physical region of 936.46 kb. A majority of the annotated genes identified had functional similarity to leucine rich repeats/receptor like kinase domains. Three annotated genes had similarity to 1, 3-β-glucanase domains. There were sequence similarities between some of the annotated genes found in the study and the genes associated with phosphoinositide-specific phosphilipases C associated with Co-x and the COK-4 loci found in previous studies. It is possible that the Co-4 locus is structured as a group of genes with functional domains dominated by protein tyrosine kinase along with leucine rich repeats/nucleotide binding site, phosphilipases C as well as β-glucanases.
Candidate Gene Identification with SNP Marker-Based Fine Mapping of Anthracnose Resistance Gene Co-4 in Common Bean

PubMed Central

Burt, Andrew J.; William, H. Manilal; Perry, Gregory; Khanal, Raja; Pauls, K. Peter; Kelly, James D.; Navabi, Alireza

2015-01-01

Anthracnose, caused by Colletotrichum lindemuthianum, is an important fungal disease of common bean (Phaseolus vulgaris). Alleles at the Co–4 locus confer resistance to a number of races of C. lindemuthianum. A population of 94 F4:5 recombinant inbred lines of a cross between resistant black bean genotype B09197 and susceptible navy bean cultivar Nautica was used to identify markers associated with resistance in bean chromosome 8 (Pv08) where Co–4 is localized. Three SCAR markers with known linkage to Co–4 and a panel of single nucleotide markers were used for genotyping. A refined physical region on Pv08 with significant association with anthracnose resistance identified by markers was used in BLAST searches with the genomic sequence of common bean accession G19833. Thirty two unique annotated candidate genes were identified that spanned a physical region of 936.46 kb. A majority of the annotated genes identified had functional similarity to leucine rich repeats/receptor like kinase domains. Three annotated genes had similarity to 1, 3-β-glucanase domains. There were sequence similarities between some of the annotated genes found in the study and the genes associated with phosphoinositide-specific phosphilipases C associated with Co-x and the COK–4 loci found in previous studies. It is possible that the Co–4 locus is structured as a group of genes with functional domains dominated by protein tyrosine kinase along with leucine rich repeats/nucleotide binding site, phosphilipases C as well as β-glucanases. PMID:26431031
The human myelin oligodendrocyte glycoprotein (MOG) gene: Complete nucleotide sequence and structural characterization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Paule Roth, M.; Malfroy, L.; Offer, C.

1995-07-20

Human myelin oligodendrocyte glycoprotein (MOG), a myelin component of the central nervous system, is a candidate target antigen for autoimmune-mediated demyelination. We have isolated and sequenced part of a cosmid clone that contains the entire human MOG gene. The primary nuclear transcript, extending from the putative start of transcription to the site of poly(A) addition, is 15,561 nucleotides in length. The human MOG gene contains 8 exons, separated by 7 introns; canonical intron/exon boundary sites are observed at each junction. The introns vary in size from 242 to 6484 bp and contain numerous repetitive DNA elements, including 14 Alu sequencesmore » within 3 introns. Another Alu element is located in the 3{prime}-untranslated region of the gene. Alu sequences were classified with respect to subfamily assignment. Seven hundred sixty-three nucleotides 5{prime} of the transcription start and 1214 nucleotides 3{prime} of the poly(A) addition sites were also sequenced. The 5{prime}-flanking region revealed the presence of several consensus sequences that could be relevant in the transcription of the MOG gene, in particular binding sites in common with other myelin gene promoters. Two polymorphic intragenic dinucleotide (CA){sub n} and tetranucleotide (TAAA){sub n} repeats were identified and may provide genetic marker tools for association and linkage studies. 50 refs., 3 figs., 3 tabs.« less
Transformation of Chloroplast Ribosomal RNA Genes in Chlamydomonas: Molecular and Genetic Characterization of Integration Events

PubMed Central

Newman, S. M.; Boynton, J. E.; Gillham, N. W.; Randolph-Anderson, B. L.; Johnson, A. M.; Harris, E. H.

1990-01-01

Transformation of chloroplast ribosomal RNA (rRNA) genes in Chlamydomonas has been achieved by the biolistic process using cloned chloroplast DNA fragments carrying mutations that confer antibiotic resistance. The sites of exchange employed during the integration of the donor DNA into the recipient genome have been localized using a combination of antibiotic resistance mutations in the 16S and 23S rRNA genes and restriction fragment length polymorphisms that flank these genes. Complete or nearly complete replacement of a region of the chloroplast genome in the recipient cell by the corresponding sequence from the donor plasmid was the most common integration event. Exchange events between the homologous donor and recipient sequences occurred preferentially near the vector:insert junctions. Insertion of the donor rRNA genes and flanking sequences into one inverted repeat of the recipient genome was followed by intramolecular copy correction so that both copies of the inverted repeat acquired identical sequences. Increased frequencies of rRNA gene transformants were achieved by reducing the copy number of the chloroplast genome in the recipient cells and by decreasing the heterology between donor and recipient DNA sequences flanking the selectable markers. In addition to producing bona fide chloroplast rRNA transformants, the biolistic process induced mutants resistant to low levels of streptomycin, typical of nuclear mutations in Chlamydomonas. PMID:1981764
Phylogenetic shadowing of primate sequences to find functional regions of the human genome.

PubMed

Boffelli, Dario; McAuliffe, Jon; Ovcharenko, Dmitriy; Lewis, Keith D; Ovcharenko, Ivan; Pachter, Lior; Rubin, Edward M

2003-02-28

Nonhuman primates represent the most relevant model organisms to understand the biology of Homo sapiens. The recent divergence and associated overall sequence conservation between individual members of this taxon have nonetheless largely precluded the use of primates in comparative sequence studies. We used sequence comparisons of an extensive set of Old World and New World monkeys and hominoids to identify functional regions in the human genome. Analysis of these data enabled the discovery of primate-specific gene regulatory elements and the demarcation of the exons of multiple genes. Much of the information content of the comprehensive primate sequence comparisons could be captured with a small subset of phylogenetically close primates. These results demonstrate the utility of intraprimate sequence comparisons to discover common mammalian as well as primate-specific functional elements in the human genome, which are unattainable through the evaluation of more evolutionarily distant species.
Pyviko: an automated Python tool to design gene knockouts in complex viruses with overlapping genes.

PubMed

Taylor, Louis J; Strebel, Klaus

2017-01-07

Gene knockouts are a common tool used to study gene function in various organisms. However, designing gene knockouts is complicated in viruses, which frequently contain sequences that code for multiple overlapping genes. Designing mutants that can be traced by the creation of new or elimination of existing restriction sites further compounds the difficulty in experimental design of knockouts of overlapping genes. While software is available to rapidly identify restriction sites in a given nucleotide sequence, no existing software addresses experimental design of mutations involving multiple overlapping amino acid sequences in generating gene knockouts. Pyviko performed well on a test set of over 240,000 gene pairs collected from viral genomes deposited in the National Center for Biotechnology Information Nucleotide database, identifying a point mutation which added a premature stop codon within the first 20 codons of the target gene in 93.2% of all tested gene-overprinted gene pairs. This shows that Pyviko can be used successfully in a wide variety of contexts to facilitate the molecular cloning and study of viral overprinted genes. Pyviko is an extensible and intuitive Python tool for designing knockouts of overlapping genes. Freely available as both a Python package and a web-based interface ( http://louiejtaylor.github.io/pyViKO/ ), Pyviko simplifies the experimental design of gene knockouts in complex viruses with overlapping genes.
Gene Discovery in the Apicomplexa as Revealed by EST Sequencing and Assembly of a Comparative Gene Database

PubMed Central

Li, Li; Brunk, Brian P.; Kissinger, Jessica C.; Pape, Deana; Tang, Keliang; Cole, Robert H.; Martin, John; Wylie, Todd; Dante, Mike; Fogarty, Steven J.; Howe, Daniel K.; Liberator, Paul; Diaz, Carmen; Anderson, Jennifer; White, Michael; Jerome, Maria E.; Johnson, Emily A.; Radke, Jay A.; Stoeckert, Christian J.; Waterston, Robert H.; Clifton, Sandra W.; Roos, David S.; Sibley, L. David

2003-01-01

Large-scale EST sequencing projects for several important parasites within the phylum Apicomplexa were undertaken for the purpose of gene discovery. Included were several parasites of medical importance (Plasmodium falciparum, Toxoplasma gondii) and others of veterinary importance (Eimeria tenella, Sarcocystis neurona, and Neospora caninum). A total of 55,192 ESTs, deposited into dbEST/GenBank, were included in the analyses. The resulting sequences have been clustered into nonredundant gene assemblies and deposited into a relational database that supports a variety of sequence and text searches. This database has been used to compare the gene assemblies using BLAST similarity comparisons to the public protein databases to identify putative genes. Of these new entries, ∼15%–20% represent putative homologs with a conservative cutoff of p < 10−9, thus identifying many conserved genes that are likely to share common functions with other well-studied organisms. Gene assemblies were also used to identify strain polymorphisms, examine stage-specific expression, and identify gene families. An interesting class of genes that are confined to members of this phylum and not shared by plants, animals, or fungi, was identified. These genes likely mediate the novel biological features of members of the Apicomplexa and hence offer great potential for biological investigation and as possible therapeutic targets. [The sequence data from this study have been submitted to dbEST division of GenBank under accession nos.: Toxoplasma gondii: –, –, –, –, – , –, –, –, –. Plasmodium falciparum: –, –, –, –. Sarcocystis neurona: , , , , , , , , , , , , , –, –, –, –, –. Eimeria tenella: –, –, –, –, –, –, –, –, – , –, –, –, –, –, –, –, –, –, –, –. Neospora caninum: –, –, , – , –, –.] PMID:12618375
Myophosphorylase (PYGM) mutations determined by next generation sequencing in a cohort from Turkey with McArdle disease.

PubMed

Inal-Gültekin, Güldal; Toptaş-Hekimoğlu, Bahar; Görmez, Zeliha; Gelişin, Özlem; Durmuş, Hacer; Ergüner, Bekir; Demirci, Hüseyin; Sağıroğlu, Mahmut Ş; Parman, Yeşim; Deymeer, Feza; Yılmaz-Aydoğan, Hülya; Pençe, Sadrettin; Bekircan-Kurt, Can Ebru; Tan, Ersin; Erdem-Özdamar, Sevim; Üstek, Duran; Giger, Urs; Öztürk, Oğuz; Serdaroğlu-Oflazer, Piraye

2017-11-01

This study aimed to identify PYGM mutations in patients with McArdle disease from Turkey by next generation sequencing (NGS). Genomic DNA was extracted from the blood of the McArdle patients (n = 67) and unrelated healthy volunteers (n = 53). The PYGM gene was sequenced with NGS and the observed mutations were validated by direct Sanger sequencing. A diagnostic algorithm was developed for patients with suspected McArdle disease. A total of 16 deleterious PYGM mutations were identified, of which 5 were novel, including 1 splice-site donor, 1 frame-shift, and 3 non-synonymous variants. The p.Met1Val (27-patients/11-families) was the most common PYGM mutation, followed by p.Arg576* (6/4), c.1827+7A>G (5/4), c.772+2_3delTG (5/3), p.Phe710del (4/2), p.Lys754Asnfs (2/1), and p.Arg50* (1/1). A molecular diagnostic flowchart is proposed for the McArdle patients in Turkey, covering the 6 most common PYGM mutations found in Turkey as well as the most common mutation in Europe. The diagnostic algorithm may alleviate the need for muscle biopsies in 77.6% of future patients. A prevalence of any of the mutations to a geographical region in Turkey was not identified. Furthermore, the NGS approach to sequence the entire PYGM gene was successful in detecting a common missense mutation and discovering novel mutations in this population study. Copyright © 2017 Elsevier B.V. All rights reserved.
PCV: An Alignment Free Method for Finding Homologous Nucleotide Sequences and its Application in Phylogenetic Study.

PubMed

Kumar, Rajnish; Mishra, Bharat Kumar; Lahiri, Tapobrata; Kumar, Gautam; Kumar, Nilesh; Gupta, Rahul; Pal, Manoj Kumar

2017-06-01

Online retrieval of the homologous nucleotide sequences through existing alignment techniques is a common practice against the given database of sequences. The salient point of these techniques is their dependence on local alignment techniques and scoring matrices the reliability of which is limited by computational complexity and accuracy. Toward this direction, this work offers a novel way for numerical representation of genes which can further help in dividing the data space into smaller partitions helping formation of a search tree. In this context, this paper introduces a 36-dimensional Periodicity Count Value (PCV) which is representative of a particular nucleotide sequence and created through adaptation from the concept of stochastic model of Kolekar et al. (American Institute of Physics 1298:307-312, 2010. doi: 10.1063/1.3516320 ). The PCV construct uses information on physicochemical properties of nucleotides and their positional distribution pattern within a gene. It is observed that PCV representation of gene reduces computational cost in the calculation of distances between a pair of genes while being consistent with the existing methods. The validity of PCV-based method was further tested through their use in molecular phylogeny constructs in comparison with that using existing sequence alignment methods.
Modeling Gene-Wise Dependencies Improves the Identification of Drug Response Biomarkers in Cancer Studies | Office of Cancer Genomics

Cancer.gov

Recent advances in biomedical and sequencing technologies have revealed the genomic landscape of common forms of human cancer in unprecedented detail. Of the genes that drive tumorigenesis when altered, for most cancers it is believed that there exist a small number of “mountains” (genes altered at high frequencies across the population), and a much larger number of “hills” (much less frequently altered genes).
High-Resolution Whole-Genome Sequencing Reveals That Specific Chromatin Domains from Most Human Chromosomes Associate with Nucleoli

PubMed Central

van Koningsbruggen, Silvana; Gierliński, Marek; Schofield, Pietá; Martin, David; Barton, Geoffey J.; Ariyurek, Yavuz; den Dunnen, Johan T.

2010-01-01

The nuclear space is mostly occupied by chromosome territories and nuclear bodies. Although this organization of chromosomes affects gene function, relatively little is known about the role of nuclear bodies in the organization of chromosomal regions. The nucleolus is the best-studied subnuclear structure and forms around the rRNA repeat gene clusters on the acrocentric chromosomes. In addition to rDNA, other chromatin sequences also surround the nucleolar surface and may even loop into the nucleolus. These additional nucleolar-associated domains (NADs) have not been well characterized. We present here a whole-genome, high-resolution analysis of chromatin endogenously associated with nucleoli. We have used a combination of three complementary approaches, namely fluorescence comparative genome hybridization, high-throughput deep DNA sequencing and photoactivation combined with time-lapse fluorescence microscopy. The data show that specific sequences from most human chromosomes, in addition to the rDNA repeat units, associate with nucleoli in a reproducible and heritable manner. NADs have in common a high density of AT-rich sequence elements, low gene density and a statistically significant enrichment in transcriptionally repressed genes. Unexpectedly, both the direct DNA sequencing and fluorescence photoactivation data show that certain chromatin loci can specifically associate with either the nucleolus, or the nuclear envelope. PMID:20826608
High-resolution whole-genome sequencing reveals that specific chromatin domains from most human chromosomes associate with nucleoli.

PubMed

van Koningsbruggen, Silvana; Gierlinski, Marek; Schofield, Pietá; Martin, David; Barton, Geoffey J; Ariyurek, Yavuz; den Dunnen, Johan T; Lamond, Angus I

2010-11-01

The nuclear space is mostly occupied by chromosome territories and nuclear bodies. Although this organization of chromosomes affects gene function, relatively little is known about the role of nuclear bodies in the organization of chromosomal regions. The nucleolus is the best-studied subnuclear structure and forms around the rRNA repeat gene clusters on the acrocentric chromosomes. In addition to rDNA, other chromatin sequences also surround the nucleolar surface and may even loop into the nucleolus. These additional nucleolar-associated domains (NADs) have not been well characterized. We present here a whole-genome, high-resolution analysis of chromatin endogenously associated with nucleoli. We have used a combination of three complementary approaches, namely fluorescence comparative genome hybridization, high-throughput deep DNA sequencing and photoactivation combined with time-lapse fluorescence microscopy. The data show that specific sequences from most human chromosomes, in addition to the rDNA repeat units, associate with nucleoli in a reproducible and heritable manner. NADs have in common a high density of AT-rich sequence elements, low gene density and a statistically significant enrichment in transcriptionally repressed genes. Unexpectedly, both the direct DNA sequencing and fluorescence photoactivation data show that certain chromatin loci can specifically associate with either the nucleolus, or the nuclear envelope.

Automated conserved non-coding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses

PubMed Central

Turco, Gina; Schnable, James C.; Pedersen, Brent; Freeling, Michael

2013-01-01

Conserved non-coding sequences (CNS) are islands of non-coding sequence that, like protein coding exons, show less divergence in sequence between related species than functionless DNA. Several CNSs have been demonstrated experimentally to function as cis-regulatory regions. However, the specific functions of most CNSs remain unknown. Previous searches for CNS in plants have either anchored on exons and only identified nearby sequences or required years of painstaking manual annotation. Here we present an open source tool that can accurately identify CNSs between any two related species with sequenced genomes, including both those immediately adjacent to exons and distal sequences separated by >12 kb of non-coding sequence. We have used this tool to characterize new motifs, associate CNSs with additional functions, and identify previously undetected genes encoding RNA and protein in the genomes of five grass species. We provide a list of 15,363 orthologous CNSs conserved across all grasses tested. We were also able to identify regulatory sequences present in the common ancestor of grasses that have been lost in one or more extant grass lineages. Lists of orthologous gene pairs and associated CNSs are provided for reference inbred lines of arabidopsis, Japonica rice, foxtail millet, sorghum, brachypodium, and maize. PMID:23874343
Transcriptional insulation of the human keratin 18 gene in transgenic mice.

PubMed Central

Neznanov, N; Thorey, I S; Ceceña, G; Oshima, R G

1993-01-01

Expression of the 10-kb human keratin 18 (K18) gene in transgenic mice results in efficient and appropriate tissue-specific expression in a variety of internal epithelial organs, including liver, lung, intestine, kidney, and the ependymal epithelium of brain, but not in spleen, heart, or skeletal muscle. Expression at the RNA level is directly proportional to the number of integrated K18 transgenes. These results indicate that the K18 gene is able to insulate itself both from the commonly observed cis-acting effects of the sites of integration and from the potential complications of duplicated copies of the gene arranged in head-to-tail fashion. To begin to identify the K18 gene sequences responsible for this property of transcriptional insulation, additional transgenic mouse lines containing deletions of either the 5' or 3' distal end of the K18 gene have been characterized. Deletion of 1.5 kb of the distal 5' flanking sequence has no effect upon either the tissue specificity or the copy number-dependent behavior of the transgene. In contrast, deletion of the 3.5-kb 3' flanking sequence of the gene results in the loss of the copy number-dependent behavior of the gene in liver and intestine. However, expression in kidney, lung, and brain remains efficient and copy number dependent in these transgenic mice. Furthermore, herpes simplex virus thymidine kinase gene expression is copy number dependent in transgenic mice when the gene is located between the distal 5'- and 3'-flanking sequences of the K18 gene. Each adult transgenic male expressed the thymidine kinase gene in testes and brain and proportionally to the number of integrated transgenes. We conclude that the characteristic of copy number-dependent expression of the K18 gene is tissue specific because the sequence requirements for transcriptional insulation in adult liver and intestine are different from those for lung and kidney. In addition, the behavior of the transgenic thymidine kinase gene in testes and brain suggests that the property of transcriptional insulation of the K18 gene may be conferred by the distal flanking sequences of the K18 gene and, additionally, may function for other genes. Images PMID:7681143
Homology of aspartyl- and lysyl-tRNA synthetases.

PubMed Central

Gampel, A; Tzagoloff, A

1989-01-01

The yeast nuclear gene MSD1 coding for mitochondrial aspartyl-tRNA synthetase has been cloned and sequenced. The identity of the gene is confirmed by the following evidence. (i) The primary structure of the protein derived from the gene sequence is similar to that of the yeast cytoplasmic aspartyl-tRNA synthetase. (ii) In situ disruption of MSD1 in a respiratory-competent haploid strain of yeast induces a pleiotropic phenotype consistent with a lesion in mitochondrial protein synthesis. (iii) Mitochondria from a mutant with a disrupted chromosomal copy of MSD1 are unable to acylate mitochondrial aspartyl-tRNA. The primary structures of the cytoplasmic and mitochondrial aspartyl-tRNA synthetases are similar to the yeast cytoplasmic lysyl-tRNA synthetase, suggesting that the two types of synthetases may have a common evolutionary origin. Searches of the current protein banks also have revealed a high degree of sequence similarity of the lysyl-tRNA synthetase to the product of the Escherichia coli herC gene and to the partial sequence of a protein encoded by an unidentified reading frame located adjacent to the E. coli frdA gene. Based on the sequence similarities and the map positions of the herC and frdA loci, we propose herC to be the structural gene of the constitutively expressed lysyl-tRNA synthetase of E. coli and the unidentified reading frame to be the structural gene of the heat-inducible lysyl-tRNA synthetase. Images PMID:2668951
The complete mitochondrial genome of Koerneria sudhausi (Diplogasteromorpha: Nematoda) supports monophyly of Diplogasteromorpha within Rhabditomorpha.

PubMed

Kim, Taeho; Kim, Jiyeon; Nadler, Steven A; Park, Joong-Ki

2016-05-01

Testing hypotheses of monophyly for different nematode groups in the context of broad representation of nematode diversity is central to understanding the patterns and processes of nematode evolution. Herein sequence information from mitochondrial genomes is used to test the monophyly of diplogasterids, which includes an important nematode model organism. The complete mitochondrial genome sequence of Koerneria sudhausi, a representative of Diplogasteromorpha, was determined and used for phylogenetic analyses along with 60 other nematode species. The mtDNA of K. sudhausi is comprised of 16,005 bp that includes 36 genes (12 protein-coding genes, 2 ribosomal RNA genes and 22 transfer RNA genes) encoded in the same direction. Phylogenetic trees inferred from amino acid and nucleotide sequence data for the 12 protein-coding genes strongly supported the sister relationship of K. sudhausi with Pristionchus pacificus, supporting Diplogasteromorpha. The gene order of K. sudhausi is identical to that most commonly found in members of the Rhabditomorpha + Ascaridomorpha + Diplogasteromorpha clade, with an exception of some tRNA translocations. Both the gene order pattern and sequence-based phylogenetic analyses support a close relationship between the diplogasterid species and Rhabditomorpha. The nesting of the two diplogasteromorph species within Rhabditomorpha is consistent with most molecular phylogenies for the group, but inconsistent with certain morphology-based hypotheses that asserted phylogenetic affinity between diplogasteromorphs and tylenchomorphs. Phylogenetic analysis of mitochondrial genome sequences strongly supports monophyly of the diplogasteromorpha.
Organization of the Escherichia coli K-12 gene cluster responsible for production of the extracellular polysaccharide colanic acid.

PubMed Central

Stevenson, G; Andrianopoulos, K; Hobbs, M; Reeves, P R

1996-01-01

Colanic acid (CA) is an extracellular polysaccharide produced by most Escherichia coli strains as well as by other species of the family Enterobacteriaceae. We have determined the sequence of a 23-kb segment of the E. coli K-12 chromosome which includes the cluster of genes necessary for production of CA. The CA cluster comprises 19 genes. Two other sequenced genes (orf1.3 and galF), which are situated between the CA cluster and the O-antigen cluster, were shown to be unnecessary for CA production. The CA cluster includes genes for synthesis of GDP-L-fucose, one of the precursors of CA, and the gene for one of the enzymes in this pathway (GDP-D-mannose 4,6-dehydratase) was identified by biochemical assay. Six of the inferred proteins show sequence similarity to glycosyl transferases, and two others have sequence similarity to acetyl transferases. Another gene (wzx) is predicted to encode a protein with multiple transmembrane segments and may function in export of the CA repeat unit from the cytoplasm into the periplasm in a process analogous to O-unit export. The first three genes of the cluster are predicted to encode an outer membrane lipoprotein, a phosphatase, and an inner membrane protein with an ATP-binding domain. Since homologs of these genes are found in other extracellular polysaccharide gene clusters, they may have a common function, such as export of polysaccharide from the cell. PMID:8759852
Mutation profile of BBS genes in Iranian patients with Bardet-Biedl syndrome: genetic characterization and report of nine novel mutations in five BBS genes.

PubMed

Fattahi, Zohreh; Rostami, Parvin; Najmabadi, Amin; Mohseni, Marzieh; Kahrizi, Kimia; Akbari, Mohammad Reza; Kariminejad, Ariana; Najmabadi, Hossein

2014-07-01

Bardet-Biedl syndrome (BBS) is a rare ciliopathy disorder that is clinically and genetically heterogeneous with 18 known genes. This study was performed to characterize responsible genes and mutation spectrum in a cohort of 14 Iranian families with BBS. Sanger sequencing of the most commonly mutated genes (BBS1, BBS2 and BBS10) accounting for ∼50% of BBS patients determined mutations only in BBS2, including three novel mutations. Next, three of the remaining patients were subjected to whole exome sequencing with 96% at 20 × depth of coverage that revealed novel BBS4 mutation. Observation of no mutation in the other patients represents the possible presence of novel genes. Screening of the remaining patients for six other genes (BBS3, BBS4, BBS6, BBS7, BBS9 and BBS12) revealed five novel mutations. This result represents another indication for the genetic heterogeneity of BBS and extends the mutational spectrum of the disease by introducing nine novel mutations in five BBS genes. In conclusion, although BBS1 and BBS10 are among the most commonly mutated genes in other populations like Caucasian, these two seem not to have an important role in Iranian patients. This suggests that a different strategy in molecular genetics diagnostic approaches in Middle Eastern countries such as Iran should be considered.
Human ribosomal RNA gene: nucleotide sequence of the transcription initiation region and comparison of three mammalian genes.

PubMed Central

Financsek, I; Mizumoto, K; Mishima, Y; Muramatsu, M

1982-01-01

The transcription initiation site of the human ribosomal RNA gene (rDNA) was located by using the single-strand specific nuclease protection method and by determining the first nucleotide of the in vitro capped 45S preribosomal RNA. The sequence of 1,211 nucleotides surrounding the initiation site was determined. The sequenced region was found to consist of 75% G and C and to contain a number of short direct and inverted repeats and palindromes. By comparison of the corresponding initiation regions of three mammalian species, several conserved sequences were found upstream and downstream from the transcription starting point. Two short A + T-rich sequences are present on human, mouse, and rat ribosomal RNA genes between the initiation site and 40 nucleotides upstream, and a C + T cluster is located at a position around -60. At and downstream from the initiation site, a common sequence, T-AG-C-T-G-A-C-A-C-G-C-T-G-T-C-C-T-CT-T, was found in the three genes from position -1 through +18. The strong conservation of these sequences suggests their functional significance in rDNA. The S1 nuclease protection experiments with cloned rDNA fragments indicated the presence in human 45S RNA of molecules several hundred nucleotides shorter than the supposed primary transcript. The first 19 nucleotides of these molecules appear identical--except for one mismatch--to the nucleotide sequence of the 5' end of a supposed early processing product of the mouse 45S RNA. Images PMID:6954460
Negligible impact of rare autoimmune-locus coding-region variants on missing heritability.

PubMed

Hunt, Karen A; Mistry, Vanisha; Bockett, Nicholas A; Ahmad, Tariq; Ban, Maria; Barker, Jonathan N; Barrett, Jeffrey C; Blackburn, Hannah; Brand, Oliver; Burren, Oliver; Capon, Francesca; Compston, Alastair; Gough, Stephen C L; Jostins, Luke; Kong, Yong; Lee, James C; Lek, Monkol; MacArthur, Daniel G; Mansfield, John C; Mathew, Christopher G; Mein, Charles A; Mirza, Muddassar; Nutland, Sarah; Onengut-Gumuscu, Suna; Papouli, Efterpi; Parkes, Miles; Rich, Stephen S; Sawcer, Steven; Satsangi, Jack; Simmonds, Matthew J; Trembath, Richard C; Walker, Neil M; Wozniak, Eva; Todd, John A; Simpson, Michael A; Plagnol, Vincent; van Heel, David A

2013-06-13

Genome-wide association studies (GWAS) have identified common variants of modest-effect size at hundreds of loci for common autoimmune diseases; however, a substantial fraction of heritability remains unexplained, to which rare variants may contribute. To discover rare variants and test them for association with a phenotype, most studies re-sequence a small initial sample size and then genotype the discovered variants in a larger sample set. This approach fails to analyse a large fraction of the rare variants present in the entire sample set. Here we perform simultaneous amplicon-sequencing-based variant discovery and genotyping for coding exons of 25 GWAS risk genes in 41,911 UK residents of white European origin, comprising 24,892 subjects with six autoimmune disease phenotypes and 17,019 controls, and show that rare coding-region variants at known loci have a negligible role in common autoimmune disease susceptibility. These results do not support the rare-variant synthetic genome-wide-association hypothesis (in which unobserved rare causal variants lead to association detected at common tag variants). Many known autoimmune disease risk loci contain multiple, independently associated, common and low-frequency variants, and so genes at these loci are a priori stronger candidates for harbouring rare coding-region variants than other genes. Our data indicate that the missing heritability for common autoimmune diseases may not be attributable to the rare coding-region variant portion of the allelic spectrum, but perhaps, as others have proposed, may be a result of many common-variant loci of weak effect.
Analysis of alterative cleavage and polyadenylation by 3′ region extraction and deep sequencing

PubMed Central

Hoque, Mainul; Ji, Zhe; Zheng, Dinghai; Luo, Wenting; Li, Wencheng; You, Bei; Park, Ji Yeon; Yehia, Ghassan; Tian, Bin

2012-01-01

Alternative cleavage and polyadenylation (APA) leads to mRNA isoforms with different coding sequences (CDS) and/or 3′ untranslated regions (3′UTRs). Using 3′ Region Extraction And Deep Sequencing (3′READS), a method which addresses the internal priming and oligo(A) tail issues that commonly plague polyA site (pA) identification, we comprehensively mapped pAs in the mouse genome, thoroughly annotating 3′ ends of genes and revealing over five thousand pAs (~8% of total) flanked by A-rich sequences, which have hitherto been overlooked. About 79% of mRNA genes and 66% of long non-coding RNA (lncRNA) genes have APA; but these two gene types have distinct usage patterns for pAs in introns and upstream exons. Promoter-distal pAs become relatively more abundant during embryonic development and cell differentiation, a trend affecting pAs in both 3′-most exons and upstream regions. Upregulated isoforms generally have stronger pAs, suggesting global modulation of the 3′ end processing activity in development and differentiation. PMID:23241633
A segment of rbcL gene as a potential tool for forensic discrimination of Cannabis sativa seized at Rio de Janeiro, Brazil.

PubMed

Mello, I C T; Ribeiro, A S D; Dias, V H G; Silva, R; Sabino, B D; Garrido, R G; Seldin, L; de Moura Neto, Rodrigo Soares

2016-03-01

Cannabis sativa, known by the common name marijuana, is the psychoactive drug most widely distributed in the world. Identification of Cannabis cultivars may be useful for association to illegal crops, which may reveal trafficking routes and related criminal groups. This study provides evidence for the performance of a segment of the rbcL gene, through genetic signature, as a tool for identification for C. sativa samples apprehended by the Rio de Janeiro Police, Brazil. The PCR amplified and further sequenced the fragment of approximately 561 bp of 24 samples of C. sativa rbcL gene and showed the same nucleotide sequences, suggesting a possible genetic similarity or identical varieties. Comparing with other Cannabaceae family sequences, we have found 99% of similarity between the Rio de Janeiro sequence and three other C. sativa rbcL genes. These findings suggest that the fragment utilized at this study is efficient in identifying C. sativa samples, therefore, useful in genetic discrimination of samples seized in forensic cases.
Gene Composer: database software for protein construct design, codon engineering, and gene synthesis

PubMed Central

Lorimer, Don; Raymond, Amy; Walchli, John; Mixon, Mark; Barrow, Adrienne; Wallace, Ellen; Grice, Rena; Burgin, Alex; Stewart, Lance

2009-01-01

Background To improve efficiency in high throughput protein structure determination, we have developed a database software package, Gene Composer, which facilitates the information-rich design of protein constructs and their codon engineered synthetic gene sequences. With its modular workflow design and numerous graphical user interfaces, Gene Composer enables researchers to perform all common bio-informatics steps used in modern structure guided protein engineering and synthetic gene engineering. Results An interactive Alignment Viewer allows the researcher to simultaneously visualize sequence conservation in the context of known protein secondary structure, ligand contacts, water contacts, crystal contacts, B-factors, solvent accessible area, residue property type and several other useful property views. The Construct Design Module enables the facile design of novel protein constructs with altered N- and C-termini, internal insertions or deletions, point mutations, and desired affinity tags. The modifications can be combined and permuted into multiple protein constructs, and then virtually cloned in silico into defined expression vectors. The Gene Design Module uses a protein-to-gene algorithm that automates the back-translation of a protein amino acid sequence into a codon engineered nucleic acid gene sequence according to a selected codon usage table with minimal codon usage threshold, defined G:C% content, and desired sequence features achieved through synonymous codon selection that is optimized for the intended expression system. The gene-to-oligo algorithm of the Gene Design Module plans out all of the required overlapping oligonucleotides and mutagenic primers needed to synthesize the desired gene constructs by PCR, and for physically cloning them into selected vectors by the most popular subcloning strategies. Conclusion We present a complete description of Gene Composer functionality, and an efficient PCR-based synthetic gene assembly procedure with mis-match specific endonuclease error correction in combination with PIPE cloning. In a sister manuscript we present data on how Gene Composer designed genes and protein constructs can result in improved protein production for structural studies. PMID:19383142
Gene composer: database software for protein construct design, codon engineering, and gene synthesis.

PubMed

Lorimer, Don; Raymond, Amy; Walchli, John; Mixon, Mark; Barrow, Adrienne; Wallace, Ellen; Grice, Rena; Burgin, Alex; Stewart, Lance

2009-04-21

To improve efficiency in high throughput protein structure determination, we have developed a database software package, Gene Composer, which facilitates the information-rich design of protein constructs and their codon engineered synthetic gene sequences. With its modular workflow design and numerous graphical user interfaces, Gene Composer enables researchers to perform all common bio-informatics steps used in modern structure guided protein engineering and synthetic gene engineering. An interactive Alignment Viewer allows the researcher to simultaneously visualize sequence conservation in the context of known protein secondary structure, ligand contacts, water contacts, crystal contacts, B-factors, solvent accessible area, residue property type and several other useful property views. The Construct Design Module enables the facile design of novel protein constructs with altered N- and C-termini, internal insertions or deletions, point mutations, and desired affinity tags. The modifications can be combined and permuted into multiple protein constructs, and then virtually cloned in silico into defined expression vectors. The Gene Design Module uses a protein-to-gene algorithm that automates the back-translation of a protein amino acid sequence into a codon engineered nucleic acid gene sequence according to a selected codon usage table with minimal codon usage threshold, defined G:C% content, and desired sequence features achieved through synonymous codon selection that is optimized for the intended expression system. The gene-to-oligo algorithm of the Gene Design Module plans out all of the required overlapping oligonucleotides and mutagenic primers needed to synthesize the desired gene constructs by PCR, and for physically cloning them into selected vectors by the most popular subcloning strategies. We present a complete description of Gene Composer functionality, and an efficient PCR-based synthetic gene assembly procedure with mis-match specific endonuclease error correction in combination with PIPE cloning. In a sister manuscript we present data on how Gene Composer designed genes and protein constructs can result in improved protein production for structural studies.
RNA Sequencing-Based Genome Reannotation of the Dermatophyte Arthroderma benhamiae and Characterization of Its Secretome and Whole Gene Expression Profile during Infection

PubMed Central

De Coi, Niccolò; Feuermann, Marc; Schmid-Siegert, Emanuel; Băguţ, Elena-Tatiana; Mignon, Bernard; Waridel, Patrice; Peter, Corinne; Pradervand, Sylvain

2016-01-01

ABSTRACT Dermatophytes are the most common agents of superficial mycoses in humans and animals. The aim of the present investigation was to systematically identify the extracellular, possibly secreted, proteins that are putative virulence factors and antigenic molecules of dermatophytes. A complete gene expression profile of Arthroderma benhamiae was obtained during infection of its natural host (guinea pig) using RNA sequencing (RNA-seq) technology. This profile was completed with those of the fungus cultivated in vitro in two media containing either keratin or soy meal protein as the sole source of nitrogen and in Sabouraud medium. More than 60% of transcripts deduced from RNA-seq data differ from those previously deposited for A. benhamiae. Using these RNA-seq data along with an automatic gene annotation procedure, followed by manual curation, we produced a new annotation of the A. benhamiae genome. This annotation comprised 7,405 coding sequences (CDSs), among which only 2,662 were identical to the currently available annotation, 383 were newly identified, and 15 secreted proteins were manually corrected. The expression profile of genes encoding proteins with a signal peptide in infected guinea pigs was found to be very different from that during in vitro growth when using keratin as the substrate. Especially, the sets of the 12 most highly expressed genes encoding proteases with a signal sequence had only the putative vacuolar aspartic protease gene PEP2 in common, during infection and in keratin medium. The most upregulated gene encoding a secreted protease during infection was that encoding subtilisin SUB6, which is a known major allergen in the related dermatophyte Trichophyton rubrum. IMPORTANCE Dermatophytoses (ringworm, jock itch, athlete’s foot, and nail infections) are the most common fungal infections, but their virulence mechanisms are poorly understood. Combining transcriptomic data obtained from growth under various culture conditions with data obtained during infection led to a significantly improved genome annotation. About 65% of the protein-encoding genes predicted with our protocol did not match the existing annotation for A. benhamiae. Comparing gene expression during infection on guinea pigs with keratin degradation in vitro, which is supposed to mimic the host environment, revealed the critical importance of using real in vivo conditions for investigating virulence mechanisms. The analysis of genes expressed in vivo, encoding cell surface and secreted proteins, particularly proteases, led to the identification of new allergen and virulence factor candidates. PMID:27822542
RNA Sequencing-Based Genome Reannotation of the Dermatophyte Arthroderma benhamiae and Characterization of Its Secretome and Whole Gene Expression Profile during Infection.

PubMed

Tran, Van Du T; De Coi, Niccolò; Feuermann, Marc; Schmid-Siegert, Emanuel; Băguţ, Elena-Tatiana; Mignon, Bernard; Waridel, Patrice; Peter, Corinne; Pradervand, Sylvain; Pagni, Marco; Monod, Michel

2016-01-01

Dermatophytes are the most common agents of superficial mycoses in humans and animals. The aim of the present investigation was to systematically identify the extracellular, possibly secreted, proteins that are putative virulence factors and antigenic molecules of dermatophytes. A complete gene expression profile of Arthroderma benhamiae was obtained during infection of its natural host (guinea pig) using RNA sequencing (RNA-seq) technology. This profile was completed with those of the fungus cultivated in vitro in two media containing either keratin or soy meal protein as the sole source of nitrogen and in Sabouraud medium. More than 60% of transcripts deduced from RNA-seq data differ from those previously deposited for A. benhamiae . Using these RNA-seq data along with an automatic gene annotation procedure, followed by manual curation, we produced a new annotation of the A. benhamiae genome. This annotation comprised 7,405 coding sequences (CDSs), among which only 2,662 were identical to the currently available annotation, 383 were newly identified, and 15 secreted proteins were manually corrected. The expression profile of genes encoding proteins with a signal peptide in infected guinea pigs was found to be very different from that during in vitro growth when using keratin as the substrate. Especially, the sets of the 12 most highly expressed genes encoding proteases with a signal sequence had only the putative vacuolar aspartic protease gene PEP2 in common, during infection and in keratin medium. The most upregulated gene encoding a secreted protease during infection was that encoding subtilisin SUB6, which is a known major allergen in the related dermatophyte Trichophyton rubrum . IMPORTANCE Dermatophytoses (ringworm, jock itch, athlete's foot, and nail infections) are the most common fungal infections, but their virulence mechanisms are poorly understood. Combining transcriptomic data obtained from growth under various culture conditions with data obtained during infection led to a significantly improved genome annotation. About 65% of the protein-encoding genes predicted with our protocol did not match the existing annotation for A. benhamiae . Comparing gene expression during infection on guinea pigs with keratin degradation in vitro , which is supposed to mimic the host environment, revealed the critical importance of using real in vivo conditions for investigating virulence mechanisms. The analysis of genes expressed in vivo , encoding cell surface and secreted proteins, particularly proteases, led to the identification of new allergen and virulence factor candidates.
Characterization of cDNAs and genomic DNAs for human threonyl- and cysteinyl-tRNA synthetases

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cruzen, M.E.

1993-01-01

Techniques of molecular biology were used to clone, sequence and map two human aminoacyl-tRNA synthetase (aaRS) cDNAs: threonyl-tRNA synthetase (ThrRS) a class II enzyme and cysteinyl-tRNA synthetase (CysRS) a class I enzyme. The predicted protein sequence of human ThrRS is highly homologous to that of lower eukaryotic and prokaryotic ThRSs, particularly in the regions containing the three structural motifs common to all class II synthetases. Signature regions 1 and 2, which characterize the class IIa subgroup (SerRS, ThrRS and HisRS) are highly conserved from bacteria to human. Structural predictions for human ThrRS based on the known structure of the closelymore » related SerRS from E.coli implicate strongly conserved residues in the signature sequences to be important in substrate binding. The amino terminal 100 residues of the deduced amino acid sequence of ThrRS shares structural similarity to SerRS consistent with forming an antiparallel helix implicated in tRNA binding. The 5' untranslated sequence of the human ThrRS gene shares short stretches of common sequence with the gene for hamster HisRS including a binding site for the promoter specific transcription factor sp-1. The deduced amino acid sequence of human CysRS has a high degree of sequence identify to E. coli CysRS. Human CysRS possesses the classic characteristics of a class I synthetase and is most closely related to the MetRS subgroup. The amino terminal half of human CysRS can be modeled as a nucleotide binding fold and shares significant sequence and structural similarity to the other enzymes in this subgroup. The CysRS structural gene (CARS) was mapped to human chromosome 11p15.5 by fluorescent in situ hybridization. CARS is the first aaRS gene to be mapped to chromosome 11. The steady state of both CysRS and ThrRs mRNA were quantitated in several human tissues. Message levels for these enzymes appear to be subjected to differential regulation in different cell types.« less
New and Common Haplotypes Shape Genetic Diversity in Asian Tiger Mosquito Populations from Costa Rica and Panamá.

PubMed

Futami, K; Valderrama, A; Baldi, M; Minakawa, N; Marín Rodríguez, R; Chaves, L F

2015-04-01

The Asian tiger mosquito, Aedes albopictus (Skuse) (Diptera: Culicidae), is a vector of several human pathogens. Ae. albopictus is also an invasive species that, over recent years, has expanded its range out of its native Asia. Ae. albopictus was suspected to be present in Central America since the 1990s, and its presence was confirmed by most Central American nations by 2010. Recently, this species has been regularly found, yet in low numbers, in limited areas of Panamá and Costa Rica (CR). Here, we report that short sequences (∼558 bp) of the mitochondrial cytochrome oxidase subunit 1 (COI) and NADH dehydrogenase subunit 5 genes of Ae. albopictus, had no haplotype diversity. Instead, there was a common haplotype for each gene in both CR and Panamá. In contrast, a long COI sequence (∼1,390 bp) revealed that haplotype diversity (±SD) was relatively high in CR (0.72±0.04) when compared with Panamá (0.33±0.13), below the global estimate for reported samples (0.89±0.01). The long COI sequence allowed us to identify seven (five new) haplotypes in CR and two (one new) in Panamá. A haplotype network for the long COI gene sequence showed that samples from CR and Panamá belong to a single large group. The long COI gene sequences suggest that haplotypes in Panamá and CR, although similar to each other, had a significant geographic differentiation (Kst=1.33; P<0.001). Thus, most of our results suggest a recent range expansion in CR and Panamá. © The Authors 2015. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Screening Currency Notes for Microbial Pathogens and Antibiotic Resistance Genes Using a Shotgun Metagenomic Approach

PubMed Central

Jalali, Saakshi; Kohli, Samantha; Latka, Chitra; Bhatia, Sugandha; Vellarikal, Shamsudheen Karuthedath; Sivasubbu, Sridhar; Scaria, Vinod; Ramachandran, Srinivasan

2015-01-01

Fomites are a well-known source of microbial infections and previous studies have provided insights into the sojourning microbiome of fomites from various sources. Paper currency notes are one of the most commonly exchanged objects and its potential to transmit pathogenic organisms has been well recognized. Approaches to identify the microbiome associated with paper currency notes have been largely limited to culture dependent approaches. Subsequent studies portrayed the use of 16S ribosomal RNA based approaches which provided insights into the taxonomical distribution of the microbiome. However, recent techniques including shotgun sequencing provides resolution at gene level and enable estimation of their copy numbers in the metagenome. We investigated the microbiome of Indian paper currency notes using a shotgun metagenome sequencing approach. Metagenomic DNA isolated from samples of frequently circulated denominations of Indian currency notes were sequenced using Illumina Hiseq sequencer. Analysis of the data revealed presence of species belonging to both eukaryotic and prokaryotic genera. The taxonomic distribution at kingdom level revealed contigs mapping to eukaryota (70%), bacteria (9%), viruses and archae (~1%). We identified 78 pathogens including Staphylococcus aureus, Corynebacterium glutamicum, Enterococcus faecalis, and 75 cellulose degrading organisms including Acidothermus cellulolyticus, Cellulomonas flavigena and Ruminococcus albus. Additionally, 78 antibiotic resistance genes were identified and 18 of these were found in all the samples. Furthermore, six out of 78 pathogens harbored at least one of the 18 common antibiotic resistance genes. To the best of our knowledge, this is the first report of shotgun metagenome sequence dataset of paper currency notes, which can be useful for future applications including as bio-surveillance of exchangeable fomites for infectious agents. PMID:26035208
dbWFA: a web-based database for functional annotation of Triticum aestivum transcripts

PubMed Central

Vincent, Jonathan; Dai, Zhanwu; Ravel, Catherine; Choulet, Frédéric; Mouzeyar, Said; Bouzidi, M. Fouad; Agier, Marie; Martre, Pierre

2013-01-01

The functional annotation of genes based on sequence homology with genes from model species genomes is time-consuming because it is necessary to mine several unrelated databases. The aim of the present work was to develop a functional annotation database for common wheat Triticum aestivum (L.). The database, named dbWFA, is based on the reference NCBI UniGene set, an expressed gene catalogue built by expressed sequence tag clustering, and on full-length coding sequences retrieved from the TriFLDB database. Information from good-quality heterogeneous sources, including annotations for model plant species Arabidopsis thaliana (L.) Heynh. and Oryza sativa L., was gathered and linked to T. aestivum sequences through BLAST-based homology searches. Even though the complexity of the transcriptome cannot yet be fully appreciated, we developed a tool to easily and promptly obtain information from multiple functional annotation systems (Gene Ontology, MapMan bin codes, MIPS Functional Categories, PlantCyc pathway reactions and TAIR gene families). The use of dbWFA is illustrated here with several query examples. We were able to assign a putative function to 45% of the UniGenes and 81% of the full-length coding sequences from TriFLDB. Moreover, comparison of the annotation of the whole T. aestivum UniGene set along with curated annotations of the two model species assessed the accuracy of the annotation provided by dbWFA. To further illustrate the use of dbWFA, genes specifically expressed during the early cell division or late storage polymer accumulation phases of T. aestivum grain development were identified using a clustering analysis and then annotated using dbWFA. The annotation of these two sets of genes was consistent with previous analyses of T. aestivum grain transcriptomes and proteomes. Database URL: urgi.versailles.inra.fr/dbWFA/ PMID:23660284
Presence of a consensus DNA motif at nearby DNA sequence of the mutation susceptible CG nucleotides.

PubMed

Chowdhury, Kaushik; Kumar, Suresh; Sharma, Tanu; Sharma, Ankit; Bhagat, Meenakshi; Kamai, Asangla; Ford, Bridget M; Asthana, Shailendra; Mandal, Chandi C

2018-01-10

Complexity in tissues affected by cancer arises from somatic mutations and epigenetic modifications in the genome. The mutation susceptible hotspots present within the genome indicate a non-random nature and/or a position specific selection of mutation. An association exists between the occurrence of mutations and epigenetic DNA methylation. This study is primarily aimed at determining mutation status, and identifying a signature for predicting mutation prone zones of tumor suppressor (TS) genes. Nearby sequences from the top five positions having a higher mutation frequency in each gene of 42 TS genes were selected from a cosmic database and were considered as mutation prone zones. The conserved motifs present in the mutation prone DNA fragments were identified. Molecular docking studies were done to determine putative interactions between the identified conserved motifs and enzyme methyltransferase DNMT1. Collective analysis of 42 TS genes found GC as the most commonly replaced and AT as the most commonly formed residues after mutation. Analysis of the top 5 mutated positions of each gene (210 DNA segments for 42 TS genes) identified that CG nucleotides of the amino acid codons (e.g., Arginine) are most susceptible to mutation, and found a consensus DNA "T/AGC/GAGGA/TG" sequence present in these mutation prone DNA segments. Similar to TS genes, analysis of 54 oncogenes not only found CG nucleotides of the amino acid Arg as the most susceptible to mutation, but also identified the presence of similar consensus DNA motifs in the mutation prone DNA fragments (270 DNA segments for 54 oncogenes) of oncogenes. Docking studies depicted that, upon binding of DNMT1 methylates to this consensus DNA motif (C residues of CpG islands), mutation was likely to occur. Thus, this study proposes that DNMT1 mediated methylation in chromosomal DNA may decrease if a foreign DNA segment containing this consensus sequence along with CG nucleotides is exogenously introduced to dividing cancer cells. Copyright © 2017 Elsevier B.V. All rights reserved.
VarDetect: a nucleotide sequence variation exploratory tool

PubMed Central

Ngamphiw, Chumpol; Kulawonganunchai, Supasak; Assawamakin, Anunchai; Jenwitheesuk, Ekachai; Tongsima, Sissades

2008-01-01

Background Single nucleotide polymorphisms (SNPs) are the most commonly studied units of genetic variation. The discovery of such variation may help to identify causative gene mutations in monogenic diseases and SNPs associated with predisposing genes in complex diseases. Accurate detection of SNPs requires software that can correctly interpret chromatogram signals to nucleotides. Results We present VarDetect, a stand-alone nucleotide variation exploratory tool that automatically detects nucleotide variation from fluorescence based chromatogram traces. Accurate SNP base-calling is achieved using pre-calculated peak content ratios, and is enhanced by rules which account for common sequence reading artifacts. The proposed software tool is benchmarked against four other well-known SNP discovery software tools (PolyPhred, novoSNP, Genalys and Mutation Surveyor) using fluorescence based chromatograms from 15 human genes. These chromatograms were obtained from sequencing 16 two-pooled DNA samples; a total of 32 individual DNA samples. In this comparison of automatic SNP detection tools, VarDetect achieved the highest detection efficiency. Availability VarDetect is compatible with most major operating systems such as Microsoft Windows, Linux, and Mac OSX. The current version of VarDetect is freely available at . PMID:19091032

On the Sequence-Directed Nature of Human Gene Mutation: The Role of Genomic Architecture and the Local DNA Sequence Environment in Mediating Gene Mutations Underlying Human Inherited Disease

PubMed Central

Cooper, David N.; Bacolla, Albino; Férec, Claude; Vasquez, Karen M.; Kehrer-Sawatzki, Hildegard; Chen, Jian-Min

2011-01-01

Different types of human gene mutation may vary in size, from structural variants (SVs) to single base-pair substitutions, but what they all have in common is that their nature, size and location are often determined either by specific characteristics of the local DNA sequence environment or by higher-order features of the genomic architecture. The human genome is now recognized to contain ‘pervasive architectural flaws’ in that certain DNA sequences are inherently mutation-prone by virtue of their base composition, sequence repetitivity and/or epigenetic modification. Here we explore how the nature, location and frequency of different types of mutation causing inherited disease are shaped in large part, and often in remarkably predictable ways, by the local DNA sequence environment. The mutability of a given gene or genomic region may also be influenced indirectly by a variety of non-canonical (non-B) secondary structures whose formation is facilitated by the underlying DNA sequence. Since these non-B DNA structures can interfere with subsequent DNA replication and repair, and may serve to increase mutation frequencies in generalized fashion (i.e. both in the context of subtle mutations and SVs), they have the potential to serve as a unifying concept in studies of mutational mechanisms underlying human inherited disease. PMID:21853507
Amino- and carboxyl-terminal amino acid sequences of proteins coded by gag gene of murine leukemia virus

PubMed Central

Oroszlan, Stephen; Henderson, Louis E.; Stephenson, John R.; Copeland, Terry D.; Long, Cedric W.; Ihle, James N.; Gilden, Raymond V.

1978-01-01

The amino- and carboxyl-terminal amino acid sequences of proteins (p10, p12, p15, and p30) coded by the gag gene of Rauscher and AKR murine leukemia viruses were determined. Among these proteins, p15 from both viruses appears to have a blocked amino end. Proline was found to be the common NH2 terminus of both p30s and both p12s, and alanine of both p10s. The amino-terminal sequences of p30s are identical, as are those of p10s, while the p12 sequences are clearly distinctive but also show substantial homology. The carboxyl-terminal amino acids of both viral p30s and p12s are leucine and phenylalanine, respectively. Rauscher leukemia virus p15 has tyrosine as the carboxyl terminus while AKR virus p15 has phenylalanine in this position. The compositional and sequence data provide definite chemical criteria for the identification of analogous gag gene products and for the comparison of viral proteins isolated in different laboratories. On the basis of amino acid sequences and the previously proposed H-p15-p12-p30-p10-COOH peptide sequence in the precursor polyprotein, a model for cleavage sites involved in the post-translational processing of the precursor coded for by the gag gene is proposed. PMID:206897
Common skate (Raja kenojei) secretes pentraxin into the cutaneous secretion: The first skin mucus lectin in cartilaginous fish.

PubMed

Tsutsui, Shigeyuki; Yamaguchi, Motoki; Hirasawa, Ai; Nakamura, Osamu; Watanabe, Tasuku

2009-08-01

A lactose-specific lectin with a molecular mass of about 25 kDa was purified from the skin mucus of a cartilaginous fish-the common skate (Raja kenojei). The complementary DNA sequence of the lectin was 1540 bp long and contained a reading frame encoding 226 amino acids, which showed approximately 38% identity to pentraxins of mammals and teleosts. Gene expression was observed in the skin, gill, stomach and intestine in the healthy skate. We also identified an isotype gene from the liver whose deduced amino-acid sequence shared 69.0% identity with the skin type gene. The antiserum detected protein in the skin, where the lectin is localized in the epidermal cells, and in the blood plasma. The lectin genes are multicopied in the common skate genome. Although pentraxins are acute phase proteins, mRNAs of both the isotypes were not upregulated after the in vivo challenge with formalin-killed Escherichia coli, which suggests that they are constantly present in the skin mucus and blood plasma to protect against pathogenic invasion. This lectin is the fifth type of lectin found in the cutaneous secretions of fish, demonstrating that skin mucus lectins have evolved with marked molecular diversity in fish.
The Sudden Dominance of bla CTX–M Harbouring Plasmids in Shigella spp. Circulating in Southern Vietnam

PubMed Central

Nhu, Nguyen Thi Khanh; Vinh, Ha; Nga, Tran Vu Thieu; Stabler, Richard; Duy, Pham Thanh; Thi Minh Vien, Le; van Doorn, H. Rogier; Cerdeño-Tárraga, Ana; Thomson, Nicholas; Campbell, James; Van Minh Hoang, Nguyen; Thi Thu Nga, Tran; Minh, Pham Van; Thuy, Cao Thu; Wren, Brendan; Farrar, Jeremy; Baker, Stephen

2010-01-01

Background Plasmid mediated antimicrobial resistance in the Enterobacteriaceae is a global problem. The rise of CTX-M class extended spectrum beta lactamases (ESBLs) has been well documented in industrialized countries. Vietnam is representative of a typical transitional middle income country where the spectrum of infectious diseases combined with the spread of drug resistance is shifting and bringing new healthcare challenges. Methodology We collected hospital admission data from the pediatric population attending the hospital for tropical diseases in Ho Chi Minh City with Shigella infections. Organisms were cultured from all enrolled patients and subjected to antimicrobial susceptibility testing. Those that were ESBL positive were subjected to further investigation. These investigations included PCR amplification for common ESBL genes, plasmid investigation, conjugation, microarray hybridization and DNA sequencing of a bla CTX–M encoding plasmid. Principal Findings We show that two different bla CTX-M genes are circulating in this bacterial population in this location. Sequence of one of the ESBL plasmids shows that rather than the gene being integrated into a preexisting MDR plasmid, the bla CTX-M gene is located on relatively simple conjugative plasmid. The sequenced plasmid (pEG356) carried the bla CTX-M-24 gene on an ISEcp1 element and demonstrated considerable sequence homology with other IncFI plasmids. Significance The rapid dissemination, spread of antimicrobial resistance and changing population of Shigella spp. concurrent with economic growth are pertinent to many other countries undergoing similar development. Third generation cephalosporins are commonly used empiric antibiotics in Ho Chi Minh City. We recommend that these agents should not be considered for therapy of dysentery in this setting. PMID:20544028
Comparative Genomics of Non-TNL Disease Resistance Genes from Six Plant Species.

PubMed

Nepal, Madhav P; Andersen, Ethan J; Neupane, Surendra; Benson, Benjamin V

2017-09-30

Disease resistance genes (R genes), as part of the plant defense system, have coevolved with corresponding pathogen molecules. The main objectives of this project were to identify non-Toll interleukin receptor, nucleotide-binding site, leucine-rich repeat (nTNL) genes and elucidate their evolutionary divergence across six plant genomes. Using reference sequences from Arabidopsis , we investigated nTNL orthologs in the genomes of common bean, Medicago , soybean, poplar, and rice. We used Hidden Markov Models for sequence identification, performed model-based phylogenetic analyses, visualized chromosomal positioning, inferred gene clustering, and assessed gene expression profiles. We analyzed 908 nTNL R genes in the genomes of the six plant species, and classified them into 12 subgroups based on the presence of coiled-coil (CC), nucleotide binding site (NBS), leucine rich repeat (LRR), resistance to Powdery mildew 8 (RPW8), and BED type zinc finger domains. Traditionally classified CC-NBS-LRR (CNL) genes were nested into four clades (CNL A-D) often with abundant, well-supported homogeneous subclades of Type-II R genes. CNL-D members were absent in rice, indicating a unique R gene retention pattern in the rice genome. Genomes from Arabidopsis , common bean, poplar and soybean had one chromosome without any CNL R genes. Medicago and Arabidopsis had the highest and lowest number of gene clusters, respectively. Gene expression analyses suggested unique patterns of expression for each of the CNL clades. Differential gene expression patterns of the nTNL genes were often found to correlate with number of introns and GC content, suggesting structural and functional divergence.
Comparative Genomics of Non-TNL Disease Resistance Genes from Six Plant Species

PubMed Central

Andersen, Ethan J.; Neupane, Surendra; Benson, Benjamin V.

2017-01-01

Disease resistance genes (R genes), as part of the plant defense system, have coevolved with corresponding pathogen molecules. The main objectives of this project were to identify non-Toll interleukin receptor, nucleotide-binding site, leucine-rich repeat (nTNL) genes and elucidate their evolutionary divergence across six plant genomes. Using reference sequences from Arabidopsis, we investigated nTNL orthologs in the genomes of common bean, Medicago, soybean, poplar, and rice. We used Hidden Markov Models for sequence identification, performed model-based phylogenetic analyses, visualized chromosomal positioning, inferred gene clustering, and assessed gene expression profiles. We analyzed 908 nTNL R genes in the genomes of the six plant species, and classified them into 12 subgroups based on the presence of coiled-coil (CC), nucleotide binding site (NBS), leucine rich repeat (LRR), resistance to Powdery mildew 8 (RPW8), and BED type zinc finger domains. Traditionally classified CC-NBS-LRR (CNL) genes were nested into four clades (CNL A-D) often with abundant, well-supported homogeneous subclades of Type-II R genes. CNL-D members were absent in rice, indicating a unique R gene retention pattern in the rice genome. Genomes from Arabidopsis, common bean, poplar and soybean had one chromosome without any CNL R genes. Medicago and Arabidopsis had the highest and lowest number of gene clusters, respectively. Gene expression analyses suggested unique patterns of expression for each of the CNL clades. Differential gene expression patterns of the nTNL genes were often found to correlate with number of introns and GC content, suggesting structural and functional divergence. PMID:28973974
Comprehensive analysis and discovery of drought-related NAC transcription factors in common bean.

PubMed

Wu, Jing; Wang, Lanfen; Wang, Shumin

2016-09-07

Common bean (Phaseolus vulgaris L.) is an important warm-season food legume. Drought is the most important environmental stress factor affecting large areas of common bean via plant death or reduced global production. The NAM, ATAF1/2 and CUC2 (NAC) domain protein family are classic transcription factors (TFs) involved in a variety of abiotic stresses, particularly drought stress. However, the NAC TFs in common bean have not been characterized. In the present study, 86 putative NAC TF proteins were identified from the common bean genome database and located on 11 common bean chromosomes. The proteins were phylogenetically clustered into 8 distinct subfamilies. The gene structure and motif composition of common bean NACs were similar in each subfamily. These results suggest that NACs in the same subfamily may possess conserved functions. The expression patterns of common bean NAC genes were also characterized. The majority of NACs exhibited specific temporal and spatial expression patterns. We identified 22 drought-related NAC TFs based on transcriptome data for drought-tolerant and drought-sensitive genotypes. Quantitative real-time PCR (qRT-PCR) was performed to confirm the expression patterns of the 20 drought-related NAC genes. Based on the common bean genome sequence, we analyzed the structural characteristics, genome distribution, and expression profiles of NAC gene family members and analyzed drought-responsive NAC genes. Our results provide useful information for the functional characterization of common bean NAC genes and rich resources and opportunities for understanding common bean drought stress tolerance mechanisms.
The structural analysis of the mitochondrial SSUrRNA implies a close phylogenetic relationship between mitochondria from plants and from the heterotrophic alga Prototheca wickerhamii.

PubMed

Wolff, G; Kück, U

1990-04-01

The gene for the mitochondrial small subunit rRNA (SSUrRNA) from the heterotrophic alga Prototheca wickerhamii has been isolated from a gene library of extranuclear DNA. Sequence and structural analyses allow the determination of a secondary structure model for this rRNA. In addition, several sequence motifs are present which are typically found in SSUrRNAs of various mitochondrial origins. Unexpectedly, the Prototheca RNA sequence has more features in common with mitochondrial SSUrRNAs from plants than with that from the green alga Chlamydomonas reinhardtii. The phylogenetic relationship between mitochondria from plants and algae is discussed.
Processing and population genetic analysis of multigenic datasets with ProSeq3 software.

PubMed

Filatov, Dmitry A

2009-12-01

The current tendency in molecular population genetics is to use increasing numbers of genes in the analysis. Here I describe a program for handling and population genetic analysis of DNA polymorphism data collected from multiple genes. The program includes a sequence/alignment editor and an internal relational database that simplify the preparation and manipulation of multigenic DNA polymorphism datasets. The most commonly used DNA polymorphism analyses are implemented in ProSeq3, facilitating population genetic analysis of large multigenic datasets. Extensive input/output options make ProSeq3 a convenient hub for sequence data processing and analysis. The program is available free of charge from http://dps.plants.ox.ac.uk/sequencing/proseq.htm.
A comprehensive benchmarking study of protocols and sequencing platforms for 16S rRNA community profiling

DOE PAGES

Podar, Mircea; Shakya, Migun; D'Amore, Rosalinda; ...

2016-01-14

In the last 5 years, the rapid pace of innovations and improvements in sequencing technologies has completely changed the landscape of metagenomic and metagenetic experiments. Therefore, it is critical to benchmark the various methodologies for interrogating the composition of microbial communities, so that we can assess their strengths and limitations. Here, the most common phylogenetic marker for microbial community diversity studies is the 16S ribosomal RNA gene and in the last 10 years the field has moved from sequencing a small number of amplicons and samples to more complex studies where thousands of samples and multiple different gene regions aremore » interrogated.« less
Genetic epidemiology of pharmacogenetic variants in South East Asian Malays using whole-genome sequences.

PubMed

Sivadas, A; Salleh, M Z; Teh, L K; Scaria, V

2017-10-01

Expanding the scope of pharmacogenomic research by including multiple global populations is integral to building robust evidence for its clinical translation. Deep whole-genome sequencing of diverse ethnic populations provides a unique opportunity to study rare and common pharmacogenomic markers that often vary in frequency across populations. In this study, we aim to build a diverse map of pharmacogenetic variants in South East Asian (SEA) Malay population using deep whole-genome sequences of 100 healthy SEA Malay individuals. We investigated the allelic diversity of potentially deleterious pharmacogenomic variants in SEA Malay population. Our analysis revealed 227 common and 466 rare potentially functional single nucleotide variants (SNVs) in 437 pharmacogenomic genes involved in drug metabolism, transport and target genes, including 74 novel variants. This study has created one of the most comprehensive maps of pharmacogenetic markers in any population from whole genomes and will hugely benefit pharmacogenomic investigations and drug dosage recommendations in SEA Malays.
What the Aspergillus genomes have told us.

PubMed

Nierman, W C; May, G; Kim, H S; Anderson, M J; Chen, D; Denning, D W

2005-05-01

The sequencing and annotation of the genomes of the first strains of Aspergillus nidulans, Aspergillus oryzae, and Aspergillus fumigatus will be seen in retrospect as a transformational event in Aspergillus biology. With this event the entire genetic composition of A. nidulans, the sexual experimental model organism of the genus Aspergillus, A. oryzae, the food biotechnology organism which is the product of centuries of cultivation, and A. fumigatus, the most common causative agent of invasive aspergillosis is now revealed to the extent that we are at present able to understand. Each genome exhibits a large set of genes common to the three as well as a much smaller set of genes unique to each. Moreover, these sequences serve as resources providing the major tool to expanding our understanding of the biology of each. Transcription profiling of A. fumigatus at high temperatures and comparative genomic hybridization between A. fumigatus and a closely related Aspergillus species provides microarray based examples of the beginning of functional analysis of the genomes of these organisms going forward from the genome sequence.
The genome sequence of Condylorrhiza vestigialis NPV, a novel baculovirus for the control of the Alamo moth on Populus spp. in Brazil.

PubMed

Castro, Maria Elita B; Melo, Fernando L; Tagliari, Marina; Inglis, Peter W; Craveiro, Saluana R; Ribeiro, Zilda Maria A; Ribeiro, Bergmann M; Báo, Sônia N

2017-09-01

Condylorrhiza vestigialis (Lepidoptera: Cambridae), commonly known as the Brazilian poplar moth or Alamo moth, is a serious defoliating pest of poplar, a crop of great economic importance for the production of wood, fiber, biofuel and other biomaterials as well as its significant ecological and environmental value. The complete genome sequence of a new alphabaculovirus isolated from C. vestigialis was determined and analyzed. Condylorrhiza vestigialis nucleopolyhedrovirus (CoveNPV) has a circular double-stranded DNA genome of 125,767bp with a GC content of 42.9%. One hundred and thirty-eight putative open reading frames were identified and annotated in the CoveNPV genome, including 38 core genes and 9 bros. Four homologous regions (hrs), a feature common to most baculoviruses, and 19 perfect and imperfect direct repeats (drs) were found. Phylogenetic analysis confirmed that CoveNPV is a Group I Alphabaculovirus and is most closely related to Anticarsia gemmatalis multiple nucleopolyhedrovirus (AgMNPV) and Choristoneura fumiferana DEF multiple nucleopolyhedrovirus CfDEFMNPV. The gp37 gene was not detected in the CoveNPV genome, although this gene is found in many NPVs. Two other common NPV genes, chitinase (v-chiA) and cathepsin (v-cath), that are responsible for host insect liquefaction and melanization, were also absent, where phylogenetic analysis suggests that the loss these genes occurred in the common ancestor of AgMNPV, CfDEFMNPV and CoveNPV, with subsequent reacquisition of these genes by CfDEFMNPV. The molecular biology and genetics of CoveNPV was formerly very little known and our expectation is that the findings presented here should accelerate research on this baculovirus, which will facilitate the use of CoveNPV in integrated pest management programs in Poplar crops. Copyright © 2017 Elsevier Inc. All rights reserved.
Fungal Genes in Context: Genome Architecture Reflects Regulatory Complexity and Function

PubMed Central

Noble, Luke M.; Andrianopoulos, Alex

2013-01-01

Gene context determines gene expression, with local chromosomal environment most influential. Comparative genomic analysis is often limited in scope to conserved or divergent gene and protein families, and fungi are well suited to this approach with low functional redundancy and relatively streamlined genomes. We show here that one aspect of gene context, the amount of potential upstream regulatory sequence maintained through evolution, is highly predictive of both molecular function and biological process in diverse fungi. Orthologs with large upstream intergenic regions (UIRs) are strongly enriched in information processing functions, such as signal transduction and sequence-specific DNA binding, and, in the genus Aspergillus, include the majority of experimentally studied, high-level developmental and metabolic transcriptional regulators. Many uncharacterized genes are also present in this class and, by implication, may be of similar importance. Large intergenic regions also share two novel sequence characteristics, currently of unknown significance: they are enriched for plus-strand polypyrimidine tracts and an information-rich, putative regulatory motif that was present in the last common ancestor of the Pezizomycotina. Systematic consideration of gene UIR in comparative genomics, particularly for poorly characterized species, could help reveal organisms’ regulatory priorities. PMID:23699226
A ZFY-like sequence in fish, with comments on the evolution of the ZFY family of genes in vertebrates

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zimmerer, E.J.; Threlkeld, L.

1995-08-01

ZFY-like genes have been observed in a variety of vertebrate species. Although originally implicated as the primary testis-determining gene in humans and other placental mammals, more recent evidence indicates a role(s) outside that of testis determination. In this study, DNA from five species of fish, Carasius auratus, Rivulus marmoratus, Xiphophorus maculatus, X. milleri, and X. nigrensis was subjected to Southern blot analysis using a PCR-amplified fragment of mouse ZFY-like sequence as a probe. Restriction fragment patterns were not polymorphic between sexes in any one species but showed a different pattern for each species. With one exception, Rivulus, a 3.1-kb bandmore » from the EcoRI digestion was common to all. Sequence and open reading frame analysis of this fragment showed a strong homology to other known vertebrate ZFY-like genes. Of particular interest in this gene is a novel third finger domain similar to one human and one alligator ZFY-like gene. Our studies and others provide evidence for a family of vertebrate ZFY genes, with those having this novel third finger being representative of the ancestral condition. 30 refs., 3 figs., 3 tabs.« less
Improved systematic tRNA gene annotation allows new insights into the evolution of mitochondrial tRNA structures and into the mechanisms of mitochondrial genome rearrangements

PubMed Central

Jühling, Frank; Pütz, Joern; Bernt, Matthias; Donath, Alexander; Middendorf, Martin; Florentz, Catherine; Stadler, Peter F.

2012-01-01

Transfer RNAs (tRNAs) are present in all types of cells as well as in organelles. tRNAs of animal mitochondria show a low level of primary sequence conservation and exhibit ‘bizarre’ secondary structures, lacking complete domains of the common cloverleaf. Such sequences are hard to detect and hence frequently missed in computational analyses and mitochondrial genome annotation. Here, we introduce an automatic annotation procedure for mitochondrial tRNA genes in Metazoa based on sequence and structural information in manually curated covariance models. The method, applied to re-annotate 1876 available metazoan mitochondrial RefSeq genomes, allows to distinguish between remaining functional genes and degrading ‘pseudogenes’, even at early stages of divergence. The subsequent analysis of a comprehensive set of mitochondrial tRNA genes gives new insights into the evolution of structures of mitochondrial tRNA sequences as well as into the mechanisms of genome rearrangements. We find frequent losses of tRNA genes concentrated in basal Metazoa, frequent independent losses of individual parts of tRNA genes, particularly in Arthropoda, and wide-spread conserved overlaps of tRNAs in opposite reading direction. Direct evidence for several recent Tandem Duplication-Random Loss events is gained, demonstrating that this mechanism has an impact on the appearance of new mitochondrial gene orders. PMID:22139921
Genomic V exons from whole genome shotgun data in reptiles.

PubMed

Olivieri, D N; von Haeften, B; Sánchez-Espinel, C; Faro, J; Gambón-Deza, F

2014-08-01

Reptiles and mammals diverged over 300 million years ago, creating two parallel evolutionary lineages amongst terrestrial vertebrates. In reptiles, two main evolutionary lines emerged: one gave rise to Squamata, while the other gave rise to Testudines, Crocodylia, and Aves. In this study, we determined the genomic variable (V) exons from whole genome shotgun sequencing (WGS) data in reptiles corresponding to the three main immunoglobulin (IG) loci and the four main T cell receptor (TR) loci. We show that Squamata lack the TRG and TRD genes, and snakes lack the IGKV genes. In representative species of Testudines and Crocodylia, the seven major IG and TR loci are maintained. As in mammals, genes of the IG loci can be grouped into well-defined IMGT clans through a multi-species phylogenetic analysis. We show that the reptilian IGHV and IGLV genes are distributed amongst the established mammalian clans, while their IGKV genes are found within a single clan, nearly exclusive from the mammalian sequences. The reptilian and mammalian TRAV genes cluster into six common evolutionary clades (since IMGT clans have not been defined for TR). In contrast, the reptilian TRBV genes cluster into three clades, which have few mammalian members. In this locus, the V exon sequences from mammals appear to have undergone different evolutionary diversification processes that occurred outside these shared reptilian clans. These sequences can be obtained in a freely available public repository (http://vgenerepertoire.org).
JCoDA: a tool for detecting evolutionary selection.

PubMed

Steinway, Steven N; Dannenfelser, Ruth; Laucius, Christopher D; Hayes, James E; Nayak, Sudhir

2010-05-27

The incorporation of annotated sequence information from multiple related species in commonly used databases (Ensembl, Flybase, Saccharomyces Genome Database, Wormbase, etc.) has increased dramatically over the last few years. This influx of information has provided a considerable amount of raw material for evaluation of evolutionary relationships. To aid in the process, we have developed JCoDA (Java Codon Delimited Alignment) as a simple-to-use visualization tool for the detection of site specific and regional positive/negative evolutionary selection amongst homologous coding sequences. JCoDA accepts user-inputted unaligned or pre-aligned coding sequences, performs a codon-delimited alignment using ClustalW, and determines the dN/dS calculations using PAML (Phylogenetic Analysis Using Maximum Likelihood, yn00 and codeml) in order to identify regions and sites under evolutionary selection. The JCoDA package includes a graphical interface for Phylip (Phylogeny Inference Package) to generate phylogenetic trees, manages formatting of all required file types, and streamlines passage of information between underlying programs. The raw data are output to user configurable graphs with sliding window options for straightforward visualization of pairwise or gene family comparisons. Additionally, codon-delimited alignments are output in a variety of common formats and all dN/dS calculations can be output in comma-separated value (CSV) format for downstream analysis. To illustrate the types of analyses that are facilitated by JCoDA, we have taken advantage of the well studied sex determination pathway in nematodes as well as the extensive sequence information available to identify genes under positive selection, examples of regional positive selection, and differences in selection based on the role of genes in the sex determination pathway. JCoDA is a configurable, open source, user-friendly visualization tool for performing evolutionary analysis on homologous coding sequences. JCoDA can be used to rapidly screen for genes and regions of genes under selection using PAML. It can be freely downloaded at http://www.tcnj.edu/~nayaklab/jcoda.
JCoDA: a tool for detecting evolutionary selection

PubMed Central

2010-01-01

Background The incorporation of annotated sequence information from multiple related species in commonly used databases (Ensembl, Flybase, Saccharomyces Genome Database, Wormbase, etc.) has increased dramatically over the last few years. This influx of information has provided a considerable amount of raw material for evaluation of evolutionary relationships. To aid in the process, we have developed JCoDA (Java Codon Delimited Alignment) as a simple-to-use visualization tool for the detection of site specific and regional positive/negative evolutionary selection amongst homologous coding sequences. Results JCoDA accepts user-inputted unaligned or pre-aligned coding sequences, performs a codon-delimited alignment using ClustalW, and determines the dN/dS calculations using PAML (Phylogenetic Analysis Using Maximum Likelihood, yn00 and codeml) in order to identify regions and sites under evolutionary selection. The JCoDA package includes a graphical interface for Phylip (Phylogeny Inference Package) to generate phylogenetic trees, manages formatting of all required file types, and streamlines passage of information between underlying programs. The raw data are output to user configurable graphs with sliding window options for straightforward visualization of pairwise or gene family comparisons. Additionally, codon-delimited alignments are output in a variety of common formats and all dN/dS calculations can be output in comma-separated value (CSV) format for downstream analysis. To illustrate the types of analyses that are facilitated by JCoDA, we have taken advantage of the well studied sex determination pathway in nematodes as well as the extensive sequence information available to identify genes under positive selection, examples of regional positive selection, and differences in selection based on the role of genes in the sex determination pathway. Conclusions JCoDA is a configurable, open source, user-friendly visualization tool for performing evolutionary analysis on homologous coding sequences. JCoDA can be used to rapidly screen for genes and regions of genes under selection using PAML. It can be freely downloaded at http://www.tcnj.edu/~nayaklab/jcoda. PMID:20507581
Genome-Wide Profiling of Histone Modifications (H3K9me2 and H4K12ac) and Gene Expression in Rust (Uromyces appendiculatus) Inoculated Common Bean (Phaseolus vulgaris L.).

PubMed

Ayyappan, Vasudevan; Kalavacharla, Venu; Thimmapuram, Jyothi; Bhide, Ketaki P; Sripathi, Venkateswara R; Smolinski, Tomasz G; Manoharan, Muthusamy; Thurston, Yaqoob; Todd, Antonette; Kingham, Bruce

2015-01-01

Histone modifications such as methylation and acetylation play a significant role in controlling gene expression in unstressed and stressed plants. Genome-wide analysis of such stress-responsive modifications and genes in non-model crops is limited. We report the genome-wide profiling of histone methylation (H3K9me2) and acetylation (H4K12ac) in common bean (Phaseolus vulgaris L.) under rust (Uromyces appendiculatus) stress using two high-throughput approaches, chromatin immunoprecipitation sequencing (ChIP-Seq) and RNA sequencing (RNA-Seq). ChIP-Seq analysis revealed 1,235 and 556 histone methylation and acetylation responsive genes from common bean leaves treated with the rust pathogen at 0, 12 and 84 hour-after-inoculation (hai), while RNA-Seq analysis identified 145 and 1,763 genes differentially expressed between mock-inoculated and inoculated plants. The combined ChIP-Seq and RNA-Seq analyses identified some key defense responsive genes (calmodulin, cytochrome p450, chitinase, DNA Pol II, and LRR) and transcription factors (WRKY, bZIP, MYB, HSFB3, GRAS, NAC, and NMRA) in bean-rust interaction. Differential methylation and acetylation affected a large proportion of stress-responsive genes including resistant (R) proteins, detoxifying enzymes, and genes involved in ion flux and cell death. The genes identified were functionally classified using Gene Ontology (GO) and EuKaryotic Orthologous Groups (KOGs). The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis identified a putative pathway with ten key genes involved in plant-pathogen interactions. This first report of an integrated analysis of histone modifications and gene expression involved in the bean-rust interaction as reported here provides a comprehensive resource for other epigenomic regulation studies in non-model species under stress.

Genome-Wide Profiling of Histone Modifications (H3K9me2 and H4K12ac) and Gene Expression in Rust (Uromyces appendiculatus) Inoculated Common Bean (Phaseolus vulgaris L.)

PubMed Central

Thimmapuram, Jyothi; Bhide, Ketaki P.; Sripathi, Venkateswara R.; Smolinski, Tomasz G.; Manoharan, Muthusamy; Thurston, Yaqoob; Todd, Antonette; Kingham, Bruce

2015-01-01

Histone modifications such as methylation and acetylation play a significant role in controlling gene expression in unstressed and stressed plants. Genome-wide analysis of such stress-responsive modifications and genes in non-model crops is limited. We report the genome-wide profiling of histone methylation (H3K9me2) and acetylation (H4K12ac) in common bean (Phaseolus vulgaris L.) under rust (Uromyces appendiculatus) stress using two high-throughput approaches, chromatin immunoprecipitation sequencing (ChIP-Seq) and RNA sequencing (RNA-Seq). ChIP-Seq analysis revealed 1,235 and 556 histone methylation and acetylation responsive genes from common bean leaves treated with the rust pathogen at 0, 12 and 84 hour-after-inoculation (hai), while RNA-Seq analysis identified 145 and 1,763 genes differentially expressed between mock-inoculated and inoculated plants. The combined ChIP-Seq and RNA-Seq analyses identified some key defense responsive genes (calmodulin, cytochrome p450, chitinase, DNA Pol II, and LRR) and transcription factors (WRKY, bZIP, MYB, HSFB3, GRAS, NAC, and NMRA) in bean-rust interaction. Differential methylation and acetylation affected a large proportion of stress-responsive genes including resistant (R) proteins, detoxifying enzymes, and genes involved in ion flux and cell death. The genes identified were functionally classified using Gene Ontology (GO) and EuKaryotic Orthologous Groups (KOGs). The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis identified a putative pathway with ten key genes involved in plant-pathogen interactions. This first report of an integrated analysis of histone modifications and gene expression involved in the bean-rust interaction as reported here provides a comprehensive resource for other epigenomic regulation studies in non-model species under stress. PMID:26167691
Distinct profiles of expressed sequence tags during intestinal regeneration in the sea cucumber Holothuria glaberrima

PubMed Central

Rojas-Cartagena, Carmencita; Ortíz-Pineda, Pablo; Ramírez-Gómez, Francisco; Suárez-Castillo, Edna C.; Matos-Cruz, Vanessa; Rodríguez, Carlos; Ortíz-Zuazaga, Humberto; García-Arrarás, José E.

2010-01-01

Repair and regeneration are key processes for tissue maintenance, and their disruption may lead to disease states. Little is known about the molecular mechanisms that underline the repair and regeneration of the digestive tract. The sea cucumber Holothuria glaberrima represents an excellent model to dissect and characterize the molecular events during intestinal regeneration. To study the gene expression profile, cDNA libraries were constructed from normal, 3-day, and 7-day regenerating intestines of H. glaberrima. Clones were randomly sequenced and queried against the nonredundant protein database at the National Center for Biotechnology Information. RT-PCR analyses were made of several genes to determine their expression profile during intestinal regeneration. A total of 5,173 sequences from three cDNA libraries were obtained. About 46.2, 35.6, and 26.2% of the sequences for the normal, 3-days, and 7-days cDNA libraries, respectively, shared significant similarity with known sequences in the protein database of GenBank but only present 10% of similarity among them. Analysis of the libraries in terms of functional processes, protein domains, and most common sequences suggests that a differential expression profile is taking place during the regeneration process. Further examination of the expressed sequence tag dataset revealed that 12 putative genes are differentially expressed at significant level (R > 6). Experimental validation by RT-PCR analysis reveals that at least three genes (unknown C-4677-1, melanotransferrin, and centaurin) present a differential expression during regeneration. These findings strongly suggest that the gene expression profile varies among regeneration stages and provide evidence for the existence of differential gene expression. PMID:17579180
Serratia marcescens outbreak in a neonatal intensive care unit (NICU): new insights from next-generation sequencing applications.

PubMed

Martineau, Christine; Li, Xuejing; Lalancette, Cindy; Perreault, Thérèse; Fournier, Eric; Tremblay, Julien; Gonzales, Milagros; Yergeau, Étienne; Quach, Caroline

2018-06-13

Serratia marcescens is an environmental bacterium commonly associated with outbreaks in neonatal intensive care units (NICU). Investigation of S. marcescens outbreaks requires efficient recovery and typing of clinical and environmental isolates. In this study, we described how the use of next-generation sequencing applications, such as bacterial whole-genome sequencing (WGS) and bacterial community profiling, could improve S. marcescens outbreak investigation. Phylogenomic links and potential antibiotic resistance genes and plasmids in S. marcescens isolates were investigated using WGS, while bacterial communities and relative abundances of Serratia in environmental samples were assessed using sequencing of bacterial phylogenetic marker genes (16S rRNA and gyrB genes). Typing results obtained using WGS for the ten S. marcescens isolates recovered during a NICU outbreak investigation were highly consistent with those from pulse-field gel electrophoresis (PFGE), the current gold standard typing method for this bacterium. WGS also allowed for the identification of genes associated with antibiotic resistance in all isolates, while no plasmid was detected. Sequencing of the 16S rRNA and gyrB genes both showed higher relative abundances of Serratia in environmental sampling sites that were in close contact with infected babies. Much lower relative abundances of Serratia were observed following disinfection of a room, indicating that the protocol used was efficient. Variations in the bacterial community composition and structure following room disinfection and between sampling sites were also identified through 16S rRNA gene sequencing. Globally, results from this study highlight the potential for next-generation sequencing tools to improve and facilitate outbreak investigation. Copyright © 2018 American Society for Microbiology.
Genome-wide signatures of convergent evolution in echolocating mammals

PubMed Central

Parker, Joe; Tsagkogeorga, Georgia; Cotton, James A.; Liu, Yuan; Provero, Paolo; Stupka, Elia; Rossiter, Stephen J.

2013-01-01

Evolution is typically thought to proceed through divergence of genes, proteins, and ultimately phenotypes1-3. However, similar traits might also evolve convergently in unrelated taxa due to similar selection pressures4,5. Adaptive phenotypic convergence is widespread in nature, and recent results from a handful of genes have suggested that this phenomenon is powerful enough to also drive recurrent evolution at the sequence level6-9. Where homoplasious substitutions do occur these have long been considered the result of neutral processes. However, recent studies have demonstrated that adaptive convergent sequence evolution can be detected in vertebrates using statistical methods that model parallel evolution9,10 although the extent to which sequence convergence between genera occurs across genomes is unknown. Here we analyse genomic sequence data in mammals that have independently evolved echolocation and show for the first time that convergence is not a rare process restricted to a handful of loci but is instead widespread, continuously distributed and commonly driven by natural selection acting on a small number of sites per locus. Systematic analyses of convergent sequence evolution in 805,053 amino acids within 2,326 orthologous coding gene sequences compared across 22 mammals (including four new bat genomes) revealed signatures consistent with convergence in nearly 200 loci. Strong and significant support for convergence among bats and the dolphin was seen in numerous genes linked to hearing or deafness, consistent with an involvement in echolocation. Surprisingly we also found convergence in many genes linked to vision: the convergent signal of many sensory genes was robustly correlated with the strength of natural selection. This first attempt to detect genome-wide convergent sequence evolution across divergent taxa reveals the phenomenon to be much more pervasive than previously recognised. PMID:24005325
Genome Fragmentation Is Not Confined to the Peridinin Plastid in Dinoflagellates

PubMed Central

Espelund, Mari; Minge, Marianne A.; Gabrielsen, Tove M.; Nederbragt, Alexander J.; Shalchian-Tabrizi, Kamran; Otis, Christian; Turmel, Monique; Lemieux, Claude; Jakobsen, Kjetill S.

2012-01-01

When plastids are transferred between eukaryote lineages through series of endosymbiosis, their environment changes dramatically. Comparison of dinoflagellate plastids that originated from different algal groups has revealed convergent evolution, suggesting that the host environment mainly influences the evolution of the newly acquired organelle. Recently the genome from the anomalously pigmented dinoflagellate Karlodinium veneficum plastid was uncovered as a conventional chromosome. To determine if this haptophyte-derived plastid contains additional chromosomal fragments that resemble the mini-circles of the peridin-containing plastids, we have investigated its genome by in-depth sequencing using 454 pyrosequencing technology, PCR and clone library analysis. Sequence analyses show several genes with significantly higher copy numbers than present in the chromosome. These genes are most likely extrachromosomal fragments, and the ones with highest copy numbers include genes encoding the chaperone DnaK(Hsp70), the rubisco large subunit (rbcL), and two tRNAs (trnE and trnM). In addition, some photosystem genes such as psaB, psaA, psbB and psbD are overrepresented. Most of the dnaK and rbcL sequences are found as shortened or fragmented gene sequences, typically missing the 3′-terminal portion. Both dnaK and rbcL are associated with a common sequence element consisting of about 120 bp of highly conserved AT-rich sequence followed by a trnE gene, possibly serving as a control region. Decatenation assays and Southern blot analysis indicate that the extrachromosomal plastid sequences do not have the same organization or lengths as the minicircles of the peridinin dinoflagellates. The fragmentation of the haptophyte-derived plastid genome K. veneficum suggests that it is likely a sign of a host-driven process shaping the plastid genomes of dinoflagellates. PMID:22719952
Using Drosophila melanogaster as a Model for Genotoxic Chemical Mutational Studies with a New Program, SnpSift

PubMed Central

Cingolani, Pablo; Patel, Viral M.; Coon, Melissa; Nguyen, Tung; Land, Susan J.; Ruden, Douglas M.; Lu, Xiangyi

2012-01-01

This paper describes a new program SnpSift for filtering differential DNA sequence variants between two or more experimental genomes after genotoxic chemical exposure. Here, we illustrate how SnpSift can be used to identify candidate phenotype-relevant variants including single nucleotide polymorphisms, multiple nucleotide polymorphisms, insertions, and deletions (InDels) in mutant strains isolated from genome-wide chemical mutagenesis of Drosophila melanogaster. First, the genomes of two independently isolated mutant fly strains that are allelic for a novel recessive male-sterile locus generated by genotoxic chemical exposure were sequenced using the Illumina next-generation DNA sequencer to obtain 20- to 29-fold coverage of the euchromatic sequences. The sequencing reads were processed and variants were called using standard bioinformatic tools. Next, SnpEff was used to annotate all sequence variants and their potential mutational effects on associated genes. Then, SnpSift was used to filter and select differential variants that potentially disrupt a common gene in the two allelic mutant strains. The potential causative DNA lesions were partially validated by capillary sequencing of polymerase chain reaction-amplified DNA in the genetic interval as defined by meiotic mapping and deletions that remove defined regions of the chromosome. Of the five candidate genes located in the genetic interval, the Pka-like gene CG12069 was found to carry a separate pre-mature stop codon mutation in each of the two allelic mutants whereas the other four candidate genes within the interval have wild-type sequences. The Pka-like gene is therefore a strong candidate gene for the male-sterile locus. These results demonstrate that combining SnpEff and SnpSift can expedite the identification of candidate phenotype-causative mutations in chemically mutagenized Drosophila strains. This technique can also be used to characterize the variety of mutations generated by genotoxic chemicals. PMID:22435069
The complete mitochondrial genome sequence of the spider habronattus oregonensis reveals rearranged and extremely truncated tRNAs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Masta, Susan E.; Boore, Jeffrey L.

2004-01-31

We sequenced the entire mitochondrial genome of the jumping spider Habronattus oregonensis of the arachnid order Araneae (Arthropoda: Chelicerata). A number of unusual features distinguish this genome from other chelicerate and arthropod mitochondrial genomes. Most of the transfer RNA gene sequences are greatly reduced in size and cannot be folded into typical cloverleaf-shaped secondary structures. At least nine of the tRNA sequences lack the potential to form TYC arm stem pairings, and instead are inferred to have TV-replacement loops. Furthermore, sequences that could encode the 3' aminoacyl acceptor stems in at least 10 tRNAs appear to be lacking, because fullymore » paired acceptor stems are not possible and because the downstream sequences instead encode adjacent genes. Hence, these appear to be among the smallest known tRNA genes. We postulate that an RNA editing mechanism must exist to restore the 3' aminoacyl acceptor stems in order to allow the tRNAs to function. At least seven tRN As are rearranged with respect to the chelicerate Limulus polyphemus, although the arrangement of the protein-coding genes is identical. Most mitochondrial protein-coding genes of H. oregonensis have ATN as initiation codons, as commonly found in arthropod mtDNAs, but cytochrome oxidase subunit 2 and 3 genes apparently use UUG as an initiation codon. Finally, many of the gene sequences overlap one another and are truncated. This 14,381 bp genome, the first mitochondrial genome of a spider yet sequenced, is one of the smallest arthropod mitochondrial genomes known. We suggest that post transcriptional RNA editing can likely maintain function of the tRNAs while permitting the accumulation of mutations that would otherwise be deleterious. Such mechanisms may have allowed for the minimization of the spider mitochondrial genome.« less
Simple Shared Motifs (SSM) in conserved region of promoters: a new approach to identify co-regulation patterns.

PubMed

Gruel, Jérémy; LeBorgne, Michel; LeMeur, Nolwenn; Théret, Nathalie

2011-09-12

Regulation of gene expression plays a pivotal role in cellular functions. However, understanding the dynamics of transcription remains a challenging task. A host of computational approaches have been developed to identify regulatory motifs, mainly based on the recognition of DNA sequences for transcription factor binding sites. Recent integration of additional data from genomic analyses or phylogenetic footprinting has significantly improved these methods. Here, we propose a different approach based on the compilation of Simple Shared Motifs (SSM), groups of sequences defined by their length and similarity and present in conserved sequences of gene promoters. We developed an original algorithm to search and count SSM in pairs of genes. An exceptional number of SSM is considered as a common regulatory pattern. The SSM approach is applied to a sample set of genes and validated using functional gene-set enrichment analyses. We demonstrate that the SSM approach selects genes that are over-represented in specific biological categories (Ontology and Pathways) and are enriched in co-expressed genes. Finally we show that genes co-expressed in the same tissue or involved in the same biological pathway have increased SSM values. Using unbiased clustering of genes, Simple Shared Motifs analysis constitutes an original contribution to provide a clearer definition of expression networks.
Simple Shared Motifs (SSM) in conserved region of promoters: a new approach to identify co-regulation patterns

PubMed Central

2011-01-01

Background Regulation of gene expression plays a pivotal role in cellular functions. However, understanding the dynamics of transcription remains a challenging task. A host of computational approaches have been developed to identify regulatory motifs, mainly based on the recognition of DNA sequences for transcription factor binding sites. Recent integration of additional data from genomic analyses or phylogenetic footprinting has significantly improved these methods. Results Here, we propose a different approach based on the compilation of Simple Shared Motifs (SSM), groups of sequences defined by their length and similarity and present in conserved sequences of gene promoters. We developed an original algorithm to search and count SSM in pairs of genes. An exceptional number of SSM is considered as a common regulatory pattern. The SSM approach is applied to a sample set of genes and validated using functional gene-set enrichment analyses. We demonstrate that the SSM approach selects genes that are over-represented in specific biological categories (Ontology and Pathways) and are enriched in co-expressed genes. Finally we show that genes co-expressed in the same tissue or involved in the same biological pathway have increased SSM values. Conclusions Using unbiased clustering of genes, Simple Shared Motifs analysis constitutes an original contribution to provide a clearer definition of expression networks. PMID:21910886
Toxicity phenotype does not correlate with phylogeny of Cylindrospermopsis raciborskii strains.

PubMed

Stucken, Karina; Murillo, Alejandro A; Soto-Liebe, Katia; Fuentes-Valdés, Juan J; Méndez, Marco A; Vásquez, Mónica

2009-02-01

Cylindrospermopsis raciborskii is a species of freshwater, bloom-forming cyanobacterium. C. raciborskii produces toxins, including cylindrospermopsin (hepatotoxin) and saxitoxin (neurotoxin), although non toxin-producing strains are also observed. In spite of differences in toxicity, C. raciborskii strains comprise a monophyletic group, based upon 16S rRNA gene sequence identities (greater than 99%). We performed phylogenetic analyses; 16S rRNA gene and 16S-23S rRNA gene internally transcribed spacer (ITS-1) sequence comparisons, and genomic DNA restriction fragment length polymorphism (RFLP), resolved by pulsed-field gel electrophoresis (PFGE), of strains of C. raciborskii, obtained mainly from the Australian phylogeographic cluster. Our results showed no correlation between toxic phenotype and phylogenetic association in the Australian strains. Analyses of the 16S rRNA gene and the respective ITS-1 sequences (long L, and short S) showed an independent evolution of each ribosomal operon. The genes putatively involved in the cylindrospermopsin biosynthetic pathway were present in one locus and only in the hepatotoxic strains, demonstrating a common genomic organization for these genes and the absence of mutated or inactivated biosynthetic genes in the non toxic strains. In summary, our results support the hypothesis that the genes involved in toxicity may have been transferred as an island by processes of gene lateral transfer, rather than convergent evolution.
Rapid birth-and-death evolution of the xenobiotic metabolizing NAT gene family in vertebrates with evidence of adaptive selection

PubMed Central

2013-01-01

Background The arylamine N-acetyltransferases (NATs) are a unique family of enzymes widely distributed in nature that play a crucial role in the detoxification of aromatic amine xenobiotics. Considering the temporal changes in the levels and toxicity of environmentally available chemicals, the metabolic function of NATs is likely to be under adaptive evolution to broaden or change substrate specificity over time, making NATs a promising subject for evolutionary analyses. In this study, we trace the molecular evolutionary history of the NAT gene family during the last ~450 million years of vertebrate evolution and define the likely role of gene duplication, gene conversion and positive selection in the evolutionary dynamics of this family. Results A phylogenetic analysis of 77 NAT sequences from 38 vertebrate species retrieved from public genomic databases shows that NATs are phylogenetically unstable genes, characterized by frequent gene duplications and losses even among closely related species, and that concerted evolution only played a minor role in the patterns of sequence divergence. Local signals of positive selection are detected in several lineages, probably reflecting response to changes in xenobiotic exposure. We then put a special emphasis on the study of the last ~85 million years of primate NAT evolution by determining the NAT homologous sequences in 13 additional primate species. Our phylogenetic analysis supports the view that the three human NAT genes emerged from a first duplication event in the common ancestor of Simiiformes, yielding NAT1 and an ancestral NAT gene which in turn, duplicated in the common ancestor of Catarrhini, giving rise to NAT2 and the NATP pseudogene. Our analysis suggests a main role of purifying selection in NAT1 protein evolution, whereas NAT2 was predicted to mostly evolve under positive selection to change its amino acid sequence over time. These findings are consistent with a differential role of the two human isoenzymes and support the involvement of NAT1 in endogenous metabolic pathways. Conclusions This study provides unequivocal evidence that the NAT gene family has evolved under a dynamic process of birth-and-death evolution in vertebrates, consistent with previous observations made in fungi. PMID:23497148
Sequence variations of the human MPDZ gene and association with alcoholism in subjects with European ancestry.

PubMed

Karpyak, Victor M; Kim, Jeong-Hyun; Biernacka, Joanna M; Wieben, Eric D; Mrazek, David A; Black, John L; Choi, Doo-Sup

2009-04-01

Mpdz gene variations are known contributors of acute alcohol withdrawal severity and seizures in mice. To investigate the relevance of these findings for human alcoholism, we resequenced 46 exons, exon-intron boundaries, and 2 kilobases in the 5' region of the human MPDZ gene in 61 subjects with a history of alcohol withdrawal seizures (AWS), 59 subjects with a history of alcohol withdrawal without AWS, and 64 Coriell samples from self-reported nonalcoholic subjects [all European American (EA) ancestry] and compared with the Mpdz sequences of 3 mouse strains with different propensity to AWS. To explore potential associations of the human MPDZ gene with alcoholism and AWS, single SNP and haplotype analyses were performed using 13 common variants. Sixty-seven new, mostly rare variants were discovered in the human MPDZ gene. Sequence comparison revealed that the human gene does not have variations identical to those comprising Mpdz gene haplotype associated with AWS in mice. We also found no significant association between MPDZ haplotypes and AWS in humans. However, a global test of haplotype association revealed a significant difference in haplotype frequencies between alcohol-dependent subjects without AWS and Coriell controls (p = 0.015), suggesting a potential role of MPDZ in alcoholism and/or related phenotypes other than AWS. Haplotype-specific tests for the most common haplotypes (frequency > 0.05), revealed a specific high-risk haplotype (p = 0.006, maximum statistic p = 0.051), containing rs13297480G allele also found to be significantly more prevalent in alcoholics without AWS compared with nonalcoholic Coriell subjects (p = 0.019). Sequencing of MPDZ gene in individuals with EA ancestry revealed no variations in the sites identical to those associated with AWS in mice. Exploratory haplotype and single SNP association analyses suggest a possible association between the MPDZ gene and alcohol dependence but not AWS. Further functional genomic analysis of MPDZ variants and investigation of their association with a broader array of alcoholism-related phenotypes could reveal additional genetic markers of alcoholism.
Aerobic Anoxygenic Photosynthesis Is Commonly Present within the Genus Limnohabitans.

PubMed

Kasalický, Vojtěch; Zeng, Yonghui; Piwosz, Kasia; Šimek, Karel; Kratochvilová, Hana; Koblížek, Michal

2018-01-01

The genus Limnohabitans ( Comamonadaceae , Betaproteobacteria ) is a common and a highly active component of freshwater bacterioplanktonic communities. To date, the genus has been considered to contain only heterotrophic species. In this study, we detected the photosynthesis genes pufLM and bchY in 28 of 46 strains from three Limnohabitans lineages. The pufM sequences obtained are very closely related to environmental pufM sequences detected in various freshwater habitats, indicating the ubiquity and potential importance of photoheterotrophic Limnohabitans in nature. Additionally, we sequenced and analyzed the genomes of 5 potentially photoheterotrophic Limnohabitans strains, to gain further insights into their phototrophic capacity. The structure of the photosynthesis gene cluster turned out to be highly conserved within the genus Limnohabitans and also among all potentially photosynthetic Betaproteobacteria strains. The expression of photosynthetic complexes was detected in a culture of Limnohabitans planktonicus II-D5 T using spectroscopic and pigment analyses. This was further verified by a novel combination of infrared microscopy and fluorescent in situ hybridization. IMPORTANCE The data presented document that the capacity to perform anoxygenic photosynthesis is common among the members of the genus Limnohabitans , indicating that they may have a novel role in freshwater habitats. Copyright © 2017 American Society for Microbiology.
The structure and evolution of angiosperm nuclear genomes.

PubMed

Bennetzen, J L

1998-04-01

Despite several decades of investigation, the organization of angiosperm genomes remained largely unknown until very recently. Data describing the sequence composition of large segments of genomes, covering hundreds of kilobases of contiguous sequence, have only become available in the past two years. Recent results indicate commonalities in the characteristics of many plant genomes, including in the structure of chromosomal components like telomeres and centromeres, and in the order and content of genes. Major differences between angiosperms have been associated mainly with repetitive DNAs, both gene families and mobile elements. Intriguing new studies have begun to characterize the dynamic three-dimensional structures of chromosomes and chromatin, and the relationship between genome structure and co-ordinated gene function.
Sequence variations of the alpha-globin genes: scanning of high CG content genes with DHPLC and DG-DGGE.

PubMed

Lacerra, Giuseppina; Fiorito, Mirella; Musollino, Gennaro; Di Noce, Francesca; Esposito, Maria; Nigro, Vincenzo; Gaudiano, Carlo; Carestia, Clementina

2004-10-01

The alpha-globin chains are encoded by two duplicated genes (HBA2 and HBA1, 5'-3') showing overall sequence homology >96% and average CG content >60%. alpha-Thalassemia, the most prevalent worldwide autosomal recessive disorder, is a hereditary anemia caused by sequence variations of these genes in about 25% of carriers. We evaluated the overall sensitivity and suitability of DHPLC and DG-DGGE in scanning both the alpha-globin genes by carrying out a retrospective analysis of 19 variant alleles in 29 genotypes. The HBA2 alleles c.1A>G, c.79G>A, and c.281T>G, and the HBA1 allele c.475C>A were new. Three pathogenic sequence variations were associated in cis with nonpathogenic variations in all families studied; they were the HBA2 variation c.2T>C associated with c.-24C>G, and the HBA2 variations c.391G>C and c.427T>C, both associated with c.565G>A. We set up original experimental conditions for DHPLC and DG-DGGE and analyzed 10 normal subjects, 46 heterozygotes, seven homozygotes, seven compound heterozygotes, and six compound heterozygotes for a hybrid gene. Both the methodologies gave reproducible results and no false-positive was detected. DHPLC showed 100% sensitivity and DG-DGGE nearly 90%. About 100% of the sequence from the cap site to the polyA addition site could be scanned by DHPLC, about 87% by DG-DGGE. It is noteworthy that the three most common pathogenic sequence variations (HBA2 alleles c.2T>C, c.95+2_95+6del, and c.523A>G) were unambiguously detected by both the methodologies. Genotype diagnosis must be confirmed with PCR sequencing of single amplicons or with an allele-specific method. This study can be helpful for scanning genes with high CG content and offers a model suitable for duplicated genes with high homology. Copyright 2004 Wiley-Liss, Inc.
Molecular analysis of two phytohemagglutinin genes and their expression in Phaseolus vulgaris cv. Pinto, a lectin-deficient cultivar of the bean.

PubMed

Voelker, T A; Staswick, P; Chrispeels, M J

1986-12-01

Phytohemagglutinin (PHA), the seed lectin of the common bean, Phaseolus vulgaris, is encoded by two highly homologous, tandemly linked genes, dlec1 and dlec2, which are coordinately expressed at high levels in developing cotyledons. Their respective transcripts translate into closely related polypeptides, PHA-E and PHA-L, constituents of the tetrameric lectin which accumulates at high levels in developing seeds. In the bean cultivar Pinto UI111, PHA-E is not detectable, and PHA-L accumulates at very reduced levels. To investigate the cause of the Pinto phenotype, we cloned and sequenced the two PHA genes of Pinto, called Pdlec1 and Pdlec2, and determined the abundance of their respective mRNAs in developing cotyledons. Both genes are more than 90% homologous to the normal PHA genes found in other cultivars. Pdlec1 carries a 1-bp frameshift mutation close to the 5' end of its coding sequence. Only very truncated polypeptides could be made from its mRNA. The gene Pdlec2 encodes a polypeptide, which resembles PHA-L and its predicted amino acid sequence agrees with the available Pinto PHA amino acid sequence data. Analysis of the mRNA of developing cotyledons revealed that the Pdlec1 message is reduced 600-fold, and Pdlec2 mRNA is reduced 20-fold with respect to mRNA levels in normal cultivars. A comparison of the sequences which are upstream from the coding sequence shows that Pdlec2 has a 100-bp deletion compared to the other genes (dlec1, dlec2 and Pdlec1). This deletion which contains a large tandem repeat may be responsible for the low level of expression of Pdlec2. The very low expression of Pdlec1 is as yet unexplained.
RNA-seq Data: Challenges in and Recommendations for Experimental Design and Analysis.

PubMed

Williams, Alexander G; Thomas, Sean; Wyman, Stacia K; Holloway, Alisha K

2014-10-01

RNA-seq is widely used to determine differential expression of genes or transcripts as well as identify novel transcripts, identify allele-specific expression, and precisely measure translation of transcripts. Thoughtful experimental design and choice of analysis tools are critical to ensure high-quality data and interpretable results. Important considerations for experimental design include number of replicates, whether to collect paired-end or single-end reads, sequence length, and sequencing depth. Common analysis steps in all RNA-seq experiments include quality control, read alignment, assigning reads to genes or transcripts, and estimating gene or transcript abundance. Our aims are two-fold: to make recommendations for common components of experimental design and assess tool capabilities for each of these steps. We also test tools designed to detect differential expression, since this is the most widespread application of RNA-seq. We hope that these analyses will help guide those who are new to RNA-seq and will generate discussion about remaining needs for tool improvement and development. Copyright © 2014 John Wiley & Sons, Inc.
Development of a polymerase chain reaction to distinguish monocellate cobra (Naja khouthia) bites from other common Thai snake species, using both venom extracts and bite-site swabs.

PubMed

Suntrarachun, S; Pakmanee, N; Tirawatnapong, T; Chanhome, L; Sitprija, V

2001-07-01

A PCR technique was used in this study to identify and distinguish monocellate cobra snake bites using snake venoms and swab specimens from snake bite-sites in mice from bites by other common Thai snakes. The sequences of nucleotide primers were selected for the cobrotoxin-encoding gene from the Chinese cobra (Naja atra) since the sequences of monocellate cobra (Naja kaouthia) venom are still unknown. However, the 113-bp fragment of cDNA of the cobrotoxin-encoding gene was detected in the monocellate cobra venom using RT-PCR. This gene was not found in the venoms of Ophiophagus hannah (king cobra), Bungarus fasciatus (banded krait), Daboia russelii siamensis (Siamese Russell's Viper, and Calloselasma rhodostoma (Malayan pit viper). Moreover, direct PCR could detect a 665-bp fragment of the cobrotoxin-encoding gene in the monocellate cobra venom but not the other snake venoms. Likewise, this gene was only observed in swab specimens from cobra snake bite-sites in mice. This is the first report demonstrating the ability of PCR to detect the cobrotoxin-encoding gene from snake venoms and swab specimens. Further studies are required for identification of this and other snakes from the bite-sites on human skin.
Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea.

PubMed

Makarova, Kira S; Sorokin, Alexander V; Novichkov, Pavel S; Wolf, Yuri I; Koonin, Eugene V

2007-11-27

An evolutionary classification of genes from sequenced genomes that distinguishes between orthologs and paralogs is indispensable for genome annotation and evolutionary reconstruction. Shortly after multiple genome sequences of bacteria, archaea, and unicellular eukaryotes became available, an attempt on such a classification was implemented in Clusters of Orthologous Groups of proteins (COGs). Rapid accumulation of genome sequences creates opportunities for refining COGs but also represents a challenge because of error amplification. One of the practical strategies involves construction of refined COGs for phylogenetically compact subsets of genomes. New Archaeal Clusters of Orthologous Genes (arCOGs) were constructed for 41 archaeal genomes (13 Crenarchaeota, 27 Euryarchaeota and one Nanoarchaeon) using an improved procedure that employs a similarity tree between smaller, group-specific clusters, semi-automatically partitions orthology domains in multidomain proteins, and uses profile searches for identification of remote orthologs. The annotation of arCOGs is a consensus between three assignments based on the COGs, the CDD database, and the annotations of homologs in the NR database. The 7538 arCOGs, on average, cover approximately 88% of the genes in a genome compared to a approximately 76% coverage in COGs. The finer granularity of ortholog identification in the arCOGs is apparent from the fact that 4538 arCOGs correspond to 2362 COGs; approximately 40% of the arCOGs are new. The archaeal gene core (protein-coding genes found in all 41 genome) consists of 166 arCOGs. The arCOGs were used to reconstruct gene loss and gene gain events during archaeal evolution and gene sets of ancestral forms. The Last Archaeal Common Ancestor (LACA) is conservatively estimated to possess 996 genes compared to 1245 and 1335 genes for the last common ancestors of Crenarchaeota and Euryarchaeota, respectively. It is inferred that LACA was a chemoautotrophic hyperthermophile that, in addition to the core archaeal functions, encoded more idiosyncratic systems, e.g., the CASS systems of antivirus defense and some toxin-antitoxin systems. The arCOGs provide a convenient, flexible framework for functional annotation of archaeal genomes, comparative genomics and evolutionary reconstructions. Genomic reconstructions suggest that the last common ancestor of archaea might have been (nearly) as advanced as the modern archaeal hyperthermophiles. ArCOGs and related information are available at: ftp://ftp.ncbi.nih.gov/pub/koonin/arCOGs/.
The bean. alpha. -amylase inhibitor is encoded by a lectin gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Moreno, J.; Altabella, T.; Chrispeels, M.J.

The common bean, Phaseolus vulgaris, contains an inhibitor of insect and mammalian {alpha}-amylases that does not inhibit plant {alpha}-amylase. This inhibitor functions as an anti-feedant or seed-defense protein. We purified this inhibitor by affinity chromatography and found that it consists of a series of glycoforms of two polypeptides (Mr 14,000-19,000). Partial amino acid sequencing was carried out, and the sequences obtained are identical with portions of the derived amino acid sequence of a lectin-like gene. This lectin gene encodes a polypeptide of MW 28,000, and the primary in vitro translation product identified by antibodies to the {alpha}-amylase inhibitor has themore » same size. Co- and posttranslational processing of this polypeptide results in glycosylated polypeptides of 14-19 kDa. Our interpretation of these results is that the bean lectins constitute a gene family that encodes diverse plant defense proteins, including phytohemagglutinin, arcelin and {alpha}-amylase inhibitor.« less

Identification of the WBSCR9 gene, encoding a novel transcriptional regulator, in the Williams-Beuren syndrome deletion at 7q11.23.

PubMed

Peoples, R J; Cisco, M J; Kaplan, P; Francke, U

1998-01-01

We have identified a novel gene (WBSCR9) within the common Williams-Beuren syndrome (WBS) deletion by interspecies sequence conservation. The WBSCR9 gene encodes a roughly 7-kb transcript with an open reading frame of 1483 amino acids and a predicted protein product size of 170.8 kDa. WBSCR9 is comprised of at least 20 exons extending over 60 kb. The transcript is expressed ubiquitously throughout development and is subject to alternative splicing. Functional motifs identified by sequence homology searches include a bromodomain; a PHD, or C4HC3, finger; several putative nuclear localization signals; four nuclear receptor binding motifs; a polyglutamate stretch and two PEST sequences. Bromodomains, PHD motifs and nuclear receptor binding motifs are cardinal features of proteins that are involved in chromatin remodeling and modulation of transcription. Haploinsufficiency for WBSCR9 gene products may contribute to the complex phenotype of WBS by interacting with tissue-specific regulatory factors during development.
Molecular genetics of cone-rod dystrophy in Chinese patients: New data from 61 probands and mutation overview of 163 probands.

PubMed

Huang, Li; Xiao, Xueshan; Li, Shiqiang; Jia, Xiaoyun; Wang, Panfeng; Sun, Wenmin; Xu, Yan; Xin, Wei; Guo, Xiangming; Zhang, Qingjiong

2016-05-01

Cone-rod dystrophy (CORD) is a common form of inherited retinal degeneration. Previously, we have conducted serial mutational analysis in probands with CORD either by Sanger sequencing or whole exome sequencing (WES). In the current study, variants in all genes from RetNet were selected from the whole exome sequencing data of 108 CORD probands (including 61 probands reported here for the first time) and were analyzed by multistep bioinformatics analysis, followed by Sanger sequencing and segregation validation. Data from the previous studies and new data from this study (163 probands in total) were summarized to provide an overview of the molecular genetics of CORD. The following potentially pathogenic mutations were identified in 93 of the 163 (57.1%) probands: CNGA3 (32.5%), ABCA4 (3.8%), ALMS1 (3.1%), GUCY2D (3.1%), CACNA1F (2.5%), CRX (1.8%), PDE6C (1.8%), CNGB3 (1.8%), GUCA1A (1.2%), UNC119 (0.6%), RPGRIP1 (1.2%), RDH12 (0.6%), KCNV2 (0.6%), C21orf2 (0.6%), CEP290 (0.6%), USH2A (0.6%) and SNRNP200 (0.6%). The 17 genes with mutations included 12 known CORD genes and five genes (ALMS1, RDH12, CEP290, USH2A, and SNRNP200) associated with other forms of retinal degeneration. Mutations in CNGA3 is most common in this cohort. This is a systematic molecular genetic analysis of Chinese patients with CORD. Copyright © 2016 Elsevier Ltd. All rights reserved.
Gene expression analysis of E. coli strains provides insights into the role of gene regulation in diversification

PubMed Central

Vital, Marius; Chai, Benli; Østman, Bjørn; Cole, James; Konstantinidis, Konstantinos T; Tiedje, James M

2015-01-01

Escherichia coli spans a genetic continuum from enteric strains to several phylogenetically distinct, atypical lineages that are rare in humans, but more common in extra-intestinal environments. To investigate the link between gene regulation, phylogeny and diversification in this species, we analyzed global gene expression profiles of four strains representing distinct evolutionary lineages, including a well-studied laboratory strain, a typical commensal (enteric) strain and two environmental strains. RNA-Seq was employed to compare the whole transcriptomes of strains grown under batch, chemostat and starvation conditions. Highly differentially expressed genes showed a significantly lower nucleotide sequence identity compared with other genes, indicating that gene regulation and coding sequence conservation are directly connected. Overall, distances between the strains based on gene expression profiles were largely dependent on the culture condition and did not reflect phylogenetic relatedness. Expression differences of commonly shared genes (all four strains) and E. coli core genes were consistently smaller between strains characterized by more similar primary habitats. For instance, environmental strains exhibited increased expression of stress defense genes under carbon-limited growth and entered a more pronounced survival-like phenotype during starvation compared with other strains, which stayed more alert for substrate scavenging and catabolism during no-growth conditions. Since those environmental strains show similar genetic distance to each other and to the other two strains, these findings cannot be simply attributed to genetic relatedness but suggest physiological adaptations. Our study provides new insights into ecologically relevant gene-expression and underscores the role of (differential) gene regulation for the diversification of the model bacterial species. PMID:25343512
Immune-Related Transcriptome of Coptotermes formosanus Shiraki Workers: The Defense Mechanism

PubMed Central

Hussain, Abid; Li, Yi-Feng; Cheng, Yu; Liu, Yang; Chen, Chuan-Cheng; Wen, Shuo-Yang

2013-01-01

Formosan subterranean termites, Coptotermes formosanus Shiraki, live socially in microbial-rich habitats. To understand the molecular mechanism by which termites combat pathogenic microbes, a full-length normalized cDNA library and four Suppression Subtractive Hybridization (SSH) libraries were constructed from termite workers infected with entomopathogenic fungi (Metarhizium anisopliae and Beauveria bassiana), Gram-positive Bacillus thuringiensis and Gram-negative Escherichia coli, and the libraries were analyzed. From the high quality normalized cDNA library, 439 immune-related sequences were identified. These sequences were categorized as pattern recognition receptors (47 sequences), signal modulators (52 sequences), signal transducers (137 sequences), effectors (39 sequences) and others (164 sequences). From the SSH libraries, 27, 17, 22 and 15 immune-related genes were identified from each SSH library treated with M. anisopliae, B. bassiana, B. thuringiensis and E. coli, respectively. When the normalized cDNA library was compared with the SSH libraries, 37 immune-related clusters were found in common; 56 clusters were identified in the SSH libraries, and 259 were identified in the normalized cDNA library. The immune-related gene expression pattern was further investigated using quantitative real time PCR (qPCR). Important immune-related genes were characterized, and their potential functions were discussed based on the integrated analysis of the results. We suggest that normalized cDNA and SSH libraries enable us to discover functional genes transcriptome. The results remarkably expand our knowledge about immune-inducible genes in C. formosanus Shiraki and enable the future development of novel control strategies for the management of Formosan subterranean termites. PMID:23874972
Computational identification of developmental enhancers:conservation and function of transcription factor binding-site clustersin drosophila melanogaster and drosophila psedoobscura

DOE Office of Scientific and Technical Information (OSTI.GOV)

Berman, Benjamin P.; Pfeiffer, Barret D.; Laverty, Todd R.

2004-08-06

The identification of sequences that control transcription in metazoans is a major goal of genome analysis. In a previous study, we demonstrated that searching for clusters of predicted transcription factor binding sites could discover active regulatory sequences, and identified 37 regions of the Drosophila melanogaster genome with high densities of predicted binding sites for five transcription factors involved in anterior-posterior embryonic patterning. Nine of these clusters overlapped known enhancers. Here, we report the results of in vivo functional analysis of 27 remaining clusters. We generated transgenic flies carrying each cluster attached to a basal promoter and reporter gene, and assayedmore » embryos for reporter gene expression. Six clusters are enhancers of adjacent genes: giant, fushi tarazu, odd-skipped, nubbin, squeeze and pdm2; three drive expression in patterns unrelated to those of neighboring genes; the remaining 18 do not appear to have enhancer activity. We used the Drosophila pseudoobscura genome to compare patterns of evolution in and around the 15 positive and 18 false-positive predictions. Although conservation of primary sequence cannot distinguish true from false positives, conservation of binding-site clustering accurately discriminates functional binding-site clusters from those with no function. We incorporated conservation of binding-site clustering into a new genome-wide enhancer screen, and predict several hundred new regulatory sequences, including 85 adjacent to genes with embryonic patterns. Measuring conservation of sequence features closely linked to function--such as binding-site clustering--makes better use of comparative sequence data than commonly used methods that examine only sequence identity.« less
Transposon mutagenesis of type III group B Streptococcus: correlation of capsule expression with virulence.

PubMed

Rubens, C E; Wessels, M R; Heggen, L M; Kasper, D L

1987-10-01

The capsular polysaccharide of type III group B Streptococcus (GBS) is thought to be a major factor in the virulence of this organism. Transposon mutagenesis was used to obtain isogenic strains of a GBS serotype III clinical isolate (COH 31r/s) with site-specific mutations in the gene(s) responsible for capsule production. The self-conjugative transposon Tn916 was transferred to strain COH 31r/s during incubation with Streptococcus faecalis strain CG110 on membrane filters. Eleven transconjugant clones did not bind type III GBS antiserum by immunoblot. Immunofluorescence, competitive ELISA, and electron microscopy confirmed the absence of detectable GBS type III capsular polysaccharide in one of the transconjugants, COH 31-15. Southern hybridization analysis with a Tn916 probe confirmed the presence of the transposon sequence within each mutant. A 3.0-kilobase EcoRI fragment that flanked the Tn916 sequence was subcloned from mutant COH 31-15. This fragment shared homology with DNA from the other GBS serotypes, suggesting a common sequence for capsulation shared by organisms of different capsular types. Loss of capsule expression resulted in loss of virulence in a neonatal rat model. We conclude that a gene common to all capsular types of GBS is required for surface expression of the type III capsule and that inactivation of this gene by Tn916 results in the loss of virulence.
Next-generation sequencing-based transcriptomic and proteomic analysis of the common reed, Phragmites australis (Poaceae), reveals genes involved in invasiveness and rhizome specificity.

PubMed

He, Ruifeng; Kim, Min-Jeong; Nelson, William; Balbuena, Tiago S; Kim, Ryan; Kramer, Robin; Crow, John A; May, Greg D; Thelen, Jay J; Soderlund, Carol A; Gang, David R

2012-02-01

The common reed (Phragmites australis), one of the most widely distributed of all angiosperms, uses its rhizomes (underground stems) to invade new territory, making it one of the most successful weedy species worldwide. Characterization of the rhizome transcriptome and proteome is needed to identify candidate genes and proteins involved in rhizome growth, development, metabolism, and invasiveness. We employed next-generation sequencing technologies including 454 and Illumina platforms to characterize the reed rhizome transcriptome and used quantitative proteomics techniques to identify the rhizome proteome. Combining 336514 Roche 454 Titanium reads and 103350802 Illumina paired-end reads in a de novo hybrid assembly yielded 124450 unique transcripts with an average length of 549 bp, of which 54317 were annotated. Rhizome-specific and differentially expressed transcripts were identified between rhizome apical tips (apical meristematic region) and rhizome elongation zones. A total of 1280 nonredundant proteins were identified and quantified using GeLC-MS/MS based label-free proteomics, where 174 and 77 proteins were preferentially expressed in the rhizome elongation zone and apical tip tissues, respectively. Genes involved in allelopathy and in controlling development and potentially invasiveness were identified. In addition to being a valuable sequence and protein data resource for studying plant rhizome species, our results provide useful insights into identifying specific genes and proteins with potential roles in rhizome differentiation, development, and function.
Exome sequencing links corticospinal motor neuron disease to common neurodegenerative disorders.

PubMed

Novarino, Gaia; Fenstermaker, Ali G; Zaki, Maha S; Hofree, Matan; Silhavy, Jennifer L; Heiberg, Andrew D; Abdellateef, Mostafa; Rosti, Basak; Scott, Eric; Mansour, Lobna; Masri, Amira; Kayserili, Hulya; Al-Aama, Jumana Y; Abdel-Salam, Ghada M H; Karminejad, Ariana; Kara, Majdi; Kara, Bulent; Bozorgmehri, Bita; Ben-Omran, Tawfeg; Mojahedi, Faezeh; El Din Mahmoud, Iman Gamal; Bouslam, Naima; Bouhouche, Ahmed; Benomar, Ali; Hanein, Sylvain; Raymond, Laure; Forlani, Sylvie; Mascaro, Massimo; Selim, Laila; Shehata, Nabil; Al-Allawi, Nasir; Bindu, P S; Azam, Matloob; Gunel, Murat; Caglayan, Ahmet; Bilguvar, Kaya; Tolun, Aslihan; Issa, Mahmoud Y; Schroth, Jana; Spencer, Emily G; Rosti, Rasim O; Akizu, Naiara; Vaux, Keith K; Johansen, Anide; Koh, Alice A; Megahed, Hisham; Durr, Alexandra; Brice, Alexis; Stevanin, Giovanni; Gabriel, Stacy B; Ideker, Trey; Gleeson, Joseph G

2014-01-31

Hereditary spastic paraplegias (HSPs) are neurodegenerative motor neuron diseases characterized by progressive age-dependent loss of corticospinal motor tract function. Although the genetic basis is partly understood, only a fraction of cases can receive a genetic diagnosis, and a global view of HSP is lacking. By using whole-exome sequencing in combination with network analysis, we identified 18 previously unknown putative HSP genes and validated nearly all of these genes functionally or genetically. The pathways highlighted by these mutations link HSP to cellular transport, nucleotide metabolism, and synapse and axon development. Network analysis revealed a host of further candidate genes, of which three were mutated in our cohort. Our analysis links HSP to other neurodegenerative disorders and can facilitate gene discovery and mechanistic understanding of disease.
Completion of the mitochondrial genome sequence of onion (Allium cepa L.) containing the CMS-S male-sterile cytoplasm and identification of an independent event of the ccmF N gene split.

PubMed

Kim, Bongju; Kim, Kyunghee; Yang, Tae-Jin; Kim, Sunggil

2016-11-01

Cytoplasmic male-sterility (CMS) conferred by the CMS-S cytoplasm has been most commonly used for onion (Allium cepa L.) F 1 hybrid seed production. We first report the complete mitochondrial genome sequence containing CMS-S cytoplasm in this study. Initially, seven contigs were de novo assembled from 150-bp paired-end raw reads produced from the total genomic DNA using the Illumina NextSeq500 platform. These contigs were connected into a single circular genome consisting of 316,363 bp (GenBank accession: KU318712) by PCR amplification. Although all 24 core protein-coding genes were present, no ribosomal protein-coding genes, except rps12, were identified in the onion mitochondrial genome. Unusual trans-splicing of the cox2 gene was verified, and the cox1 gene was identified as part of the chimeric orf725 gene, which is a candidate gene responsible for inducing CMS. In addition to orf725, two small chimeric genes were identified, but no transcripts were detected for these two open reading frames. Thirteen chloroplast-derived sequences, with sizes of 126-13,986 bp, were identified in the intergenic regions. Almost 10 % of the onion mitochondrial genome was composed of repeat sequences. The vast majority of repeats were short repeats of <100 base pairs. Interestingly, the gene encoding ccmF N was split into two genes. The ccmF N gene split is first identified outside the Brassicaceae family. The breakpoint in the onion ccmF N gene was different from that of other Brassicaceae species. This split of the ccmF N gene was also present in 30 other Allium species. The complete onion mitochondrial genome sequence reported in this study would be fundamental information for elucidation of onion CMS evolution.
Developmental rearrangement of cyanobacterial nif genes: nucleotide sequence, open reading frames, and cytochrome P-450 homology of the Anabaena sp. strain PCC 7120 nifD element.

PubMed Central

Lammers, P J; McLaughlin, S; Papin, S; Trujillo-Provencio, C; Ryncarz, A J

1990-01-01

An 11-kbp DNA element of unknown function interrupts the nifD gene in vegetative cells of Anabaena sp. strain PCC 7120. In developing heterocysts the nifD element excises from the chromosome via site-specific recombination between short repeat sequences that flank the element. The nucleotide sequence of the nifH-proximal half of the element was determined to elucidate the genetic potential of the element. Four open reading frames with the same relative orientation as the nifD element-encoded xisA gene were identified in the sequenced region. Each of the open reading frames was preceded by a reasonable ribosome-binding site and had biased codon utilization preferences consistent with low levels of expression. Open reading frame 3 was highly homologous with three cytochrome P-450 omega-hydroxylase proteins and showed regional homology to functionally significant domains common to the cytochrome P-450 superfamily. The sequence encoding open reading frame 2 was the most highly conserved portion of the sequenced region based on heterologous hybridization experiments with three genera of heterocystous cyanobacteria. Images PMID:2123860
Whole-Exome Sequencing to Identify Novel Biological Pathways Associated With Infertility After Pelvic Inflammatory Disease.

PubMed

Taylor, Brandie D; Zheng, Xiaojing; Darville, Toni; Zhong, Wujuan; Konganti, Kranti; Abiodun-Ojo, Olayinka; Ness, Roberta B; O'Connell, Catherine M; Haggerty, Catherine L

2017-01-01

Ideal management of sexually transmitted infections (STI) may require risk markers for pathology or vaccine development. Previously, we identified common genetic variants associated with chlamydial pelvic inflammatory disease (PID) and reduced fecundity. As this explains only a proportion of the long-term morbidity risk, we used whole-exome sequencing to identify biological pathways that may be associated with STI-related infertility. We obtained stored DNA from 43 non-Hispanic black women with PID from the PID Evaluation and Clinical Health Study. Infertility was assessed at a mean of 84 months. Principal component analysis revealed no population stratification. Potential covariates did not significantly differ between groups. Sequencing kernel association test was used to examine associations between aggregates of variants on a single gene and infertility. The results from the sequencing kernel association test were used to choose "focus genes" (P < 0.01; n = 150) for subsequent Ingenuity Pathway Analysis to identify "gene sets" that are enriched in biologically relevant pathways. Pathway analysis revealed that focus genes were enriched in canonical pathways including, IL-1 signaling, P2Y purinergic receptor signaling, and bone morphogenic protein signaling. Focus genes were enriched in pathways that impact innate and adaptive immunity, protein kinase A activity, cellular growth, and DNA repair. These may alter host resistance or immunopathology after infection. Targeted sequencing of biological pathways identified in this study may provide insight into STI-related infertility.
Characterization of defensin gene from abalone Haliotis discus hannai and its deduced protein

NASA Astrophysics Data System (ADS)

Hong, Xuguang; Sun, Xiuqin; Zheng, Minggang; Qu, Lingyun; Zan, Jindong; Zhang, Jinxing

2008-11-01

Defensin is one of preserved ancient host defensive materials formed in biological evolution. As a regulator and effector molecule, it is very important in animals’ acquired immune system. This paper reports the defensin gene from the mixed liver and kidney cDNA library of abalone Haliotis discus hannai Ino. Sequence analysis shows that the gene sequence of full-length cDNA encodes 42 mature peptides (including six Cys), molecular weight of 4 323 Da, and pI of 8.02. Amino acid sequence homology analysis shows that the peptides are highly similar (70% in common) to other insects defensin. Because of a typical insect-defensin structural character of mature peptide in the secondary structure, the polypeptide named Haliotis discus defensin (hd-def), a novel of antimicrobial peptides, belongs to insects defensin subfamily. The RT-PCR result of Haliotis discus defensin shows that the gene can be expressed only in the hepatopancreas by Gram-negative and positive bacteria stimulation, which is ascribed to inducible expression. Therefore, it is revealed that the Haliotis discus defensin gene expression was related to the antibacterial infection of Haliotis discus hannai Ino.
A single gene for lycopene cyclase, phytoene synthase, and regulation of carotene biosynthesis in Phycomyces

PubMed Central

Arrach, Nabil; Fernández-Martín, Rafael; Cerdá-Olmedo, Enrique; Avalos, Javier

2001-01-01

Previous complementation and mapping of mutations that change the usual yellow color of the Zygomycete Phycomyces blakesleeanus to white or red led to the definition of two structural genes for carotene biosynthesis. We have cloned one of these genes, carRA, by taking advantage of its close linkage to the other, carB, responsible for phytoene dehydrogenase. The sequences of the wild type and six mutants have been established, compared with sequences in other organisms, and correlated with the mutant phenotypes. The carRA and carB coding sequences are separated by 1,381 untranslated nucleotides and are divergently transcribed. Gene carRA contains separate domains for two enzymes, lycopene cyclase and phytoene synthase, and regulates the overall activity of the pathway and its response to physical and chemical stimuli from the environment. The lycopene cyclase domain of carRA derived from a duplication of a gene from a common ancestor of fungi and Brevibacterium linens; the phytoene synthase domain is similar to the phytoene and squalene synthases of many organisms; but the regulatory functions appear to be specific to Phycomyces. PMID:11172012
Identification and characterization of microRNAs in Phaseolus vulgaris by high-throughput sequencing

PubMed Central

2012-01-01

Background MicroRNAs (miRNAs) are endogenously encoded small RNAs that post-transcriptionally regulate gene expression. MiRNAs play essential roles in almost all plant biological processes. Currently, few miRNAs have been identified in the model food legume Phaseolus vulgaris (common bean). Recent advances in next generation sequencing technologies have allowed the identification of conserved and novel miRNAs in many plant species. Here, we used Illumina's sequencing by synthesis (SBS) technology to identify and characterize the miRNA population of Phaseolus vulgaris. Results Small RNA libraries were generated from roots, flowers, leaves, and seedlings of P. vulgaris. Based on similarity to previously reported plant miRNAs,114 miRNAs belonging to 33 conserved miRNA families were identified. Stem-loop precursors and target gene sequences for several conserved common bean miRNAs were determined from publicly available databases. Less conserved miRNA families and species-specific common bean miRNA isoforms were also characterized. Moreover, novel miRNAs based on the small RNAs were found and their potential precursors were predicted. In addition, new target candidates for novel and conserved miRNAs were proposed. Finally, we studied organ-specific miRNA family expression levels through miRNA read frequencies. Conclusions This work represents the first massive-scale RNA sequencing study performed in Phaseolus vulgaris to identify and characterize its miRNA population. It significantly increases the number of miRNAs, precursors, and targets identified in this agronomically important species. The miRNA expression analysis provides a foundation for understanding common bean miRNA organ-specific expression patterns. The present study offers an expanded picture of P. vulgaris miRNAs in relation to those of other legumes. PMID:22394504
De Novo Foliar Transcriptome of Chenopodium amaranticolor and Analysis of Its Gene Expression During Virus-Induced Hypersensitive Response

PubMed Central

Zhang, Yongqiang; Pei, Xinwu; Zhang, Chao; Lu, Zifeng; Wang, Zhixing; Jia, Shirong; Li, Weimin

2012-01-01

Background The hypersensitive response (HR) system of Chenopodium spp. confers broad-spectrum virus resistance. However, little knowledge exists at the genomic level for Chenopodium, thus impeding the advanced molecular research of this attractive feature. Hence, we took advantage of RNA-seq to survey the foliar transcriptome of C. amaranticolor, a Chenopodium species widely used as laboratory indicator for pathogenic viruses, in order to facilitate the characterization of the HR-type of virus resistance. Methodology and Principal Findings Using Illumina HiSeq™ 2000 platform, we obtained 39,868,984 reads with 3,588,208,560 bp, which were assembled into 112,452 unigenes (3,847 clusters and 108,605 singletons). BlastX search against the NCBI NR database identified 61,698 sequences with a cut-off E-value above 10−5. Assembled sequences were annotated with gene descriptions, GO, COG and KEGG terms, respectively. A total number of 738 resistance gene analogs (RGAs) and homology sequences of 6 key signaling proteins within the R proteins-directed signaling pathway were identified. Based on this transcriptome data, we investigated the gene expression profiles over the stage of HR induced by Tobacco mosaic virus and Cucumber mosaic virus by using digital gene expression analysis. Numerous candidate genes specifically or commonly regulated by these two distinct viruses at early and late stages of the HR were identified, and the dynamic changes of the differently expressed genes enriched in the pathway of plant-pathogen interaction were particularly emphasized. Conclusions To our knowledge, this study is the first description of the genetic makeup of C. amaranticolor, providing deep insight into the comprehensive gene expression information at transcriptional level in this species. The 738 RGAs as well as the differentially regulated genes, particularly the common genes regulated by both TMV and CMV, are suitable candidates which merit further functional characterization to dissect the molecular mechanisms and regulatory pathways of the HR-type of virus resistance in Chenopodium. PMID:23029338
De novo transcriptome sequencing of two cultivated jute species under salinity stress.

PubMed

Yang, Zemao; Yan, An; Lu, Ruike; Dai, Zhigang; Tang, Qing; Cheng, Chaohua; Xu, Ying; Su, Jianguang

2017-01-01

Soil salinity, a major environmental stress, reduces agricultural productivity by restricting plant development and growth. Jute (Corchorus spp.), a commercially important bast fiber crop, includes two commercially cultivated species, Corchorus capsularis and Corchorus olitorius. We conducted high-throughput transcriptome sequencing of 24 C. capsularis and C. olitorius samples under salt stress and found 127 common differentially expressed genes (DEGs); additionally, 4489 and 492 common DEGs were identified in the root and leaf tissues, respectively, of both Corchorus species. Further, 32, 196, and 11 common differentially expressed transcription factors (DTFs) were detected in the leaf, root, or both tissues, respectively. Several Gene Ontology (GO) terms were enriched in NY and YY. A Kyoto Encyclopedia of Genes and Genomes analysis revealed numerous DEGs in both species. Abscisic acid and cytokinin signal pathways enriched respectively about 20 DEGs in leaves and roots of both NY and YY. The Ca2+, mitogen-activated protein kinase signaling and oxidative phosphorylation pathways were also found to be related to the plant response to salt stress, as evidenced by the DEGs in the roots of both species. These results provide insight into salt stress response mechanisms in plants as well as a basis for future breeding of salt-tolerant cultivars.
Center for Inherited Disease Research (CIDR)

Cancer.gov

The Center for Inherited Disease Research (CIDR) Program at The Johns Hopkins University provides high-quality next generation sequencing and genotyping services to investigators working to discover genes that contribute to common diseases.
Phylogenetic relatedness determined between antibiotic resistance and 16S rRNA genes in actinobacteria.

PubMed

Sagova-Mareckova, Marketa; Ulanova, Dana; Sanderova, Petra; Omelka, Marek; Kamenik, Zdenek; Olsovska, Jana; Kopecky, Jan

2015-04-01

Distribution and evolutionary history of resistance genes in environmental actinobacteria provide information on intensity of antibiosis and evolution of specific secondary metabolic pathways at a given site. To this day, actinobacteria producing biologically active compounds were isolated mostly from soil but only a limited range of soil environments were commonly sampled. Consequently, soil remains an unexplored environment in search for novel producers and related evolutionary questions. Ninety actinobacteria strains isolated at contrasting soil sites were characterized phylogenetically by 16S rRNA gene, for presence of erm and ABC transporter resistance genes and antibiotic production. An analogous analysis was performed in silico with 246 and 31 strains from Integrated Microbial Genomes (JGI_IMG) database selected by the presence of ABC transporter genes and erm genes, respectively. In the isolates, distances of erm gene sequences were significantly correlated to phylogenetic distances based on 16S rRNA genes, while ABC transporter gene distances were not. The phylogenetic distance of isolates was significantly correlated to soil pH and organic matter content of isolation sites. In the analysis of JGI_IMG datasets the correlation between phylogeny of resistance genes and the strain phylogeny based on 16S rRNA genes or five housekeeping genes was observed for both the erm genes and ABC transporter genes in both actinobacteria and streptomycetes. However, in the analysis of sequences from genomes where both resistance genes occurred together the correlation was observed for both ABC transporter and erm genes in actinobacteria but in streptomycetes only in the erm gene. The type of erm resistance gene sequences was influenced by linkage to 16S rRNA gene sequences and site characteristics. The phylogeny of ABC transporter gene was correlated to 16S rRNA genes mainly above the genus level. The results support the concept of new specific secondary metabolite scaffolds occurring more likely in taxonomically distant producers but suggest that the antibiotic selection of gene pools is also influenced by site conditions.
Pooled Resequencing of 122 Ulcerative Colitis Genes in a Large Dutch Cohort Suggests Population-Specific Associations of Rare Variants in MUC2.

PubMed

Visschedijk, Marijn C; Alberts, Rudi; Mucha, Soren; Deelen, Patrick; de Jong, Dirk J; Pierik, Marieke; Spekhorst, Lieke M; Imhann, Floris; van der Meulen-de Jong, Andrea E; van der Woude, C Janneke; van Bodegraven, Adriaan A; Oldenburg, Bas; Löwenberg, Mark; Dijkstra, Gerard; Ellinghaus, David; Schreiber, Stefan; Wijmenga, Cisca; Rivas, Manuel A; Franke, Andre; van Diemen, Cleo C; Weersma, Rinse K

2016-01-01

Genome-wide association studies have revealed several common genetic risk variants for ulcerative colitis (UC). However, little is known about the contribution of rare, large effect genetic variants to UC susceptibility. In this study, we performed a deep targeted re-sequencing of 122 genes in Dutch UC patients in order to investigate the contribution of rare variants to the genetic susceptibility to UC. The selection of genes consists of 111 established human UC susceptibility genes and 11 genes that lead to spontaneous colitis when knocked-out in mice. In addition, we sequenced the promoter regions of 45 genes where known variants exert cis-eQTL-effects. Targeted pooled re-sequencing was performed on DNA of 790 Dutch UC cases. The Genome of the Netherlands project provided sequence data of 500 healthy controls. After quality control and prioritization based on allele frequency and pathogenicity probability, follow-up genotyping of 171 rare variants was performed on 1021 Dutch UC cases and 1166 Dutch controls. Single-variant association and gene-based analyses identified an association of rare variants in the MUC2 gene with UC. The associated variants in the Dutch population could not be replicated in a German replication cohort (1026 UC cases, 3532 controls). In conclusion, this study has identified a putative role for MUC2 on UC susceptibility in the Dutch population and suggests a population-specific contribution of rare variants to UC.
The complete mitochondrial genomes of three parasitic nematodes of birds: a unique gene order and insights into nematode phylogeny

PubMed Central

2013-01-01

Background Analyses of mitochondrial (mt) genome sequences in recent years challenge the current working hypothesis of Nematoda phylogeny proposed from morphology, ecology and nuclear small subunit rRNA gene sequences, and raise the need to sequence additional mt genomes for a broad range of nematode lineages. Results We sequenced the complete mt genomes of three Ascaridia species (family Ascaridiidae) that infest chickens, pigeons and parrots, respectively. These three Ascaridia species have an identical arrangement of mt genes to each other but differ substantially from other nematodes. Phylogenetic analyses of the mt genome sequences of the Ascaridia species, together with 62 other nematode species, support the monophylies of seven high-level taxa of the phylum Nematoda: 1) the subclass Dorylaimia; 2) the orders Rhabditida, Trichinellida and Mermithida; 3) the suborder Rhabditina; and 4) the infraorders Spiruromorpha and Oxyuridomorpha. Analyses of mt genome sequences, however, reject the monophylies of the suborders Spirurina and Tylenchina, and the infraorders Rhabditomorpha, Panagrolaimomorpha and Tylenchomorpha. Monophyly of the infraorder Ascaridomorpha varies depending on the methods of phylogenetic analysis. The Ascaridomorpha was more closely related to the infraorders Rhabditomorpha and Diplogasteromorpha (suborder Rhabditina) than they were to the other two infraorders of the Spirurina: Oxyuridorpha and Spiruromorpha. The closer relationship among Ascaridomorpha, Rhabditomorpha and Diplogasteromorpha was also supported by a shared common pattern of mitochondrial gene arrangement. Conclusions Analyses of mitochondrial genome sequences and gene arrangement has provided novel insights into the phylogenetic relationships among several major lineages of nematodes. Many lineages of nematodes, however, are underrepresented or not represented in these analyses. Expanding taxon sampling is necessary for future phylogenetic studies of nematodes with mt genome sequences. PMID:23800363

Detection of novel mutations that cause autosomal dominant retinitis pigmentosa in candidate genes by long-range PCR amplification and next-generation sequencing

PubMed Central

Dias, Miguel de Sousa; Hernan, Imma; Pascual, Beatriz; Borràs, Emma; Mañé, Begoña; Gamundi, Maria José

2013-01-01

Purpose To devise an effective method for detecting mutations in 12 genes (CA4, CRX, IMPDH1, NR2E3, RP9, PRPF3, PRPF8, PRPF31, PRPH2, RHO, RP1, and TOPORS) commonly associated with autosomal dominant retinitis pigmentosa (adRP) that account for more than 95% of known mutations. Methods We used long-range PCR (LR-PCR) amplification and next-generation sequencing (NGS) performed in a GS Junior 454 benchtop sequencing platform. Twenty LR-PCR fragments, between 3,000 and 10,000 bp, containing all coding exons and flanking regions of the 12 genes, were obtained from DNA samples of patients with adRP. Sequencing libraries were prepared with an enzymatic (Fragmentase technology) method. Results Complete coverage of the coding and flanking sequences of the 12 genes assayed was obtained with NGS, with an average sequence depth of 380× (ranging from 128× to 1,077×). Five previous known mutations in the adRP genes were detected with a sequence variation percentage between 35% and 65%. We also performed a parallel sequence analysis of four samples, three of them new patients with index adRP, in which two novel mutations were detected in RHO (p.Asn73del) and PRPF31 (p.Ile109del). Conclusions The results demonstrate that genomic LR-PCR amplification together with NGS is an effective method for analyzing individual patient samples for mutations in a monogenic heterogeneous disease such as adRP. This approach proved effective for the parallel analysis of adRP and has been introduced as routine. Additionally, this approach could be extended to other heterogeneous genetic diseases. PMID:23559859
Morphological and molecular characteristics of Malayfilaria sofiani Uni, Mat Udin & Takaoka n. g., n. sp. (Nematoda: Filarioidea) from the common treeshrew Tupaia glis Diard & Duvaucel (Mammalia: Scandentia) in Peninsular Malaysia.

PubMed

Uni, Shigehiko; Mat Udin, Ahmad Syihan; Agatsuma, Takeshi; Saijuntha, Weerachai; Junker, Kerstin; Ramli, Rosli; Omar, Hasmahzaiti; Lim, Yvonne Ai-Lian; Sivanandam, Sinnadurai; Lefoulon, Emilie; Martin, Coralie; Belabut, Daicus Martin; Kasim, Saharul; Abdullah Halim, Muhammad Rasul; Zainuri, Nur Afiqah; Bhassu, Subha; Fukuda, Masako; Matsubayashi, Makoto; Harada, Masashi; Low, Van Lun; Chen, Chee Dhang; Suganuma, Narifumi; Hashim, Rosli; Takaoka, Hiroyuki; Azirun, Mohd Sofian

2017-04-20

The filarial nematodes Wuchereria bancrofti (Cobbold, 1877), Brugia malayi (Brug, 1927) and B. timori Partono, Purnomo, Dennis, Atmosoedjono, Oemijati & Cross, 1977 cause lymphatic diseases in humans in the tropics, while B. pahangi (Buckley & Edeson, 1956) infects carnivores and causes zoonotic diseases in humans in Malaysia. Wuchereria bancrofti, W. kalimantani Palmieri, Pulnomo, Dennis & Marwoto, 1980 and six out of ten Brugia spp. have been described from Australia, Southeast Asia, Sri Lanka and India. However, the origin and evolution of the species in the Wuchereria-Brugia clade remain unclear. While investigating the diversity of filarial parasites in Malaysia, we discovered an undescribed species in the common treeshrew Tupaia glis Diard & Duvaucel (Mammalia: Scandentia). We examined 81 common treeshrews from 14 areas in nine states and the Federal Territory of Peninsular Malaysia for filarial parasites. Once any filariae that were found had been isolated, we examined their morphological characteristics and determined the partial sequences of their mitochondrial cytochrome c oxidase subunit 1 (cox1) and 12S rRNA genes. Polymerase chain reaction (PCR) products of the internal transcribed spacer 1 (ITS1) region were then cloned into the pGEM-T vector, and the recombinant plasmids were used as templates for sequencing. Malayfilaria sofiani Uni, Mat Udin & Takaoka, n. g., n. sp. is described based on the morphological characteristics of adults and microfilariae found in common treeshrews from Jeram Pasu, Kelantan, Malaysia. The Kimura 2-parameter distance between the cox1 gene sequences of the new species and W. bancrofti was 11.8%. Based on the three gene sequences, the new species forms a monophyletic clade with W. bancrofti and Brugia spp. The adult parasites were found in tissues surrounding the lymph nodes of the neck of common treeshrews. The newly described species appears most closely related to Wuchereria spp. and Brugia spp., but differs from these in several morphological characteristics. Molecular analyses based on the cox1 and 12S rRNA genes and the ITS1 region indicated that this species differs from both W. bancrofti and Brugia spp. at the genus level. We thus propose a new genus, Malayfilaria, along with the new species M. sofiani.
A Gibbs sampler for motif detection in phylogenetically close sequences

NASA Astrophysics Data System (ADS)

Siddharthan, Rahul; van Nimwegen, Erik; Siggia, Eric

2004-03-01

Genes are regulated by transcription factors that bind to DNA upstream of genes and recognize short conserved ``motifs'' in a random intergenic ``background''. Motif-finders such as the Gibbs sampler compare the probability of these short sequences being represented by ``weight matrices'' to the probability of their arising from the background ``null model'', and explore this space (analogous to a free-energy landscape). But closely related species may show conservation not because of functional sites but simply because they have not had sufficient time to diverge, so conventional methods will fail. We introduce a new Gibbs sampler algorithm that accounts for common ancestry when searching for motifs, while requiring minimal ``prior'' assumptions on the number and types of motifs, assessing the significance of detected motifs by ``tracking'' clusters that stay together. We apply this scheme to motif detection in sporulation-cycle genes in the yeast S. cerevisiae, using recent sequences of other closely-related Saccharomyces species.
Convergent evolution of marine mammals is associated with distinct substitutions in common genes

PubMed Central

Zhou, Xuming; Seim, Inge; Gladyshev, Vadim N.

2015-01-01

Phenotypic convergence is thought to be driven by parallel substitutions coupled with natural selection at the sequence level. Multiple independent evolutionary transitions of mammals to an aquatic environment offer an opportunity to test this thesis. Here, whole genome alignment of coding sequences identified widespread parallel amino acid substitutions in marine mammals; however, the majority of these changes were not unique to these animals. Conversely, we report that candidate aquatic adaptation genes, identified by signatures of likelihood convergence and/or elevated ratio of nonsynonymous to synonymous nucleotide substitution rate, are characterized by very few parallel substitutions and exhibit distinct sequence changes in each group. Moreover, no significant positive correlation was found between likelihood convergence and positive selection in all three marine lineages. These results suggest that convergence in protein coding genes associated with aquatic lifestyle is mainly characterized by independent substitutions and relaxed negative selection. PMID:26549748
Characterization of the glutathione S-transferase gene family through ESTs and expression analyses within common and pigmented cultivars of Citrus sinensis (L.) Osbeck

PubMed Central

2014-01-01

Background Glutathione S-transferases (GSTs) represent a ubiquitous gene family encoding detoxification enzymes able to recognize reactive electrophilic xenobiotic molecules as well as compounds of endogenous origin. Anthocyanin pigments require GSTs for their transport into the vacuole since their cytoplasmic retention is toxic to the cell. Anthocyanin accumulation in Citrus sinensis (L.) Osbeck fruit flesh determines different phenotypes affecting the typical pigmentation of Sicilian blood oranges. In this paper we describe: i) the characterization of the GST gene family in C. sinensis through a systematic EST analysis; ii) the validation of the EST assembly by exploiting the genome sequences of C. sinensis and C. clementina and their genome annotations; iii) GST gene expression profiling in six tissues/organs and in two different sweet orange cultivars, Cadenera (common) and Moro (pigmented). Results We identified 61 GST transcripts, described the full- or partial-length nature of the sequences and assigned to each sequence the GST class membership exploiting a comparative approach and the classification scheme proposed for plant species. A total of 23 full-length sequences were defined. Fifty-four of the 61 transcripts were successfully aligned to the C. sinensis and C. clementina genomes. Tissue specific expression profiling demonstrated that the expression of some GST transcripts was 'tissue-affected' and cultivar specific. A comparative analysis of C. sinensis GSTs with those from other plant species was also considered. Data from the current analysis are accessible at http://biosrv.cab.unina.it/citrusGST/, with the aim to provide a reference resource for C. sinensis GSTs. Conclusions This study aimed at the characterization of the GST gene family in C. sinensis. Based on expression patterns from two different cultivars and on sequence-comparative analyses, we also highlighted that two sequences, a Phi class GST and a Mapeg class GST, could be involved in the conjugation of anthocyanin pigments and in their transport into the vacuole, specifically in fruit flesh of the pigmented cultivar. PMID:24490620
Fetal origins of the TEL-AML1 fusion gene in identical twins with leukemia

PubMed Central

Ford, Anthony M.; Bennett, Caroline A.; Price, Cathy M.; Bruin, M. C. A.; Van Wering, Elisabeth R.; Greaves, Mel

1998-01-01

The TEL (ETV6)−AML1 (CBFA2) gene fusion is the most common reciprocal chromosomal rearrangement in childhood cancer occurring in ≈25% of the most predominant subtype of leukemia— common acute lymphoblastic leukemia. The TEL-AML1 genomic sequence has been characterized in a pair of monozygotic twins diagnosed at ages 3 years, 6 months and 4 years, 10 months with common acute lymphoblastic leukemia. The twin leukemic DNA shared the same unique (or clonotypic) but nonconstitutive TEL-AML1 fusion sequence. The most plausible explanation for this finding is a single cell origin of the TEL-AML fusion in one fetus in utero, probably as a leukemia-initiating mutation, followed by intraplacental metastasis of clonal progeny to the other twin. Clonal identity is further supported by the finding that the leukemic cells in the two twins shared an identical rearranged IGH allele. These data have implications for the etiology and natural history of childhood leukemia. PMID:9539781
Mitochondrial targeting sequence variants of the CHCHD2 gene are a risk for Lewy body disorders

PubMed Central

Ogaki, Kotaro; Koga, Shunsuke; Heckman, Michael G.; Fiesel, Fabienne C.; Ando, Maya; Labbé, Catherine; Lorenzo-Betancor, Oswaldo; Moussaud-Lamodière, Elisabeth L.; Soto-Ortolaza, Alexandra I.; Walton, Ronald L.; Strongosky, Audrey J.; Uitti, Ryan J.; McCarthy, Allan; Lynch, Timothy; Siuda, Joanna; Opala, Grzegorz; Rudzinska, Monika; Krygowska-Wajs, Anna; Barcikowska, Maria; Czyzewski, Krzysztof; Puschmann, Andreas; Nishioka, Kenya; Funayama, Manabu; Hattori, Nobutaka; Parisi, Joseph E.; Petersen, Ronald C.; Graff-Radford, Neill R.; Boeve, Bradley F.; Springer, Wolfdieter; Wszolek, Zbigniew K.; Dickson, Dennis W.

2015-01-01

Objective: To assess the role of CHCHD2 variants in patients with Parkinson disease (PD) and Lewy body disease (LBD) in Caucasian populations. Methods: All exons of the CHCHD2 gene were sequenced in a US Caucasian patient-control series (878 PD, 610 LBD, and 717 controls). Subsequently, exons 1 and 2 were sequenced in an Irish series (355 PD and 365 controls) and a Polish series (394 PD and 350 controls). Immunohistochemistry and immunofluorescence studies were performed on pathologic LBD cases with rare CHCHD2 variants. Results: We identified 9 rare exonic variants of unknown significance. These variants were more frequent in the combined group of PD and LBD patients compared to controls (0.6% vs 0.1%, p = 0.013). In addition, the presence of any rare variant was more common in patients with LBD (2.5% vs 1.0%, p = 0.050) compared to controls. Eight of these 9 variants were located within the gene's mitochondrial targeting sequence. Conclusions: Although the role of variants of the CHCHD2 gene in PD and LBD remains to be further elucidated, the rare variants in the mitochondrial targeting sequence may be a risk factor for Lewy body disorders, which may link CHCHD2 to other genetic forms of parkinsonism with mitochondrial dysfunction. PMID:26561290
Clinical next-generation sequencing in patients with non-small cell lung cancer.

PubMed

Hagemann, Ian S; Devarakonda, Siddhartha; Lockwood, Christina M; Spencer, David H; Guebert, Kalin; Bredemeyer, Andrew J; Al-Kateb, Hussam; Nguyen, TuDung T; Duncavage, Eric J; Cottrell, Catherine E; Kulkarni, Shashikant; Nagarajan, Rakesh; Seibert, Karen; Baggstrom, Maria; Waqar, Saiama N; Pfeifer, John D; Morgensztern, Daniel; Govindan, Ramaswamy

2015-02-15

A clinical assay was implemented to perform next-generation sequencing (NGS) of genes commonly mutated in multiple cancer types. This report describes the feasibility and diagnostic yield of this assay in 381 consecutive patients with non-small cell lung cancer (NSCLC). Clinical targeted sequencing of 23 genes was performed with DNA from formalin-fixed, paraffin-embedded (FFPE) tumor tissue. The assay used Agilent SureSelect hybrid capture followed by Illumina HiSeq 2000, MiSeq, or HiSeq 2500 sequencing in a College of American Pathologists-accredited, Clinical Laboratory Improvement Amendments-certified laboratory. Single-nucleotide variants and insertion/deletion events were reported. This assay was performed before methods were developed to detect rearrangements by NGS. Two hundred nine of all requisitioned samples (55%) were successfully sequenced. The most common reason for not performing the sequencing was an insufficient quantity of tissue available in the blocks (29%). Excisional, endoscopic, and core biopsy specimens were sufficient for testing in 95%, 66%, and 40% of the cases, respectively. The median turnaround time (TAT) in the pathology laboratory was 21 days, and there was a trend of an improved TAT with more rapid sequencing platforms. Sequencing yielded a mean coverage of 1318×. Potentially actionable mutations (ie, predictive or prognostic) were identified in 46% of 209 samples and were most commonly found in KRAS (28%), epidermal growth factor receptor (14%), phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (4%), phosphatase and tensin homolog (1%), and BRAF (1%). Five percent of the samples had multiple actionable mutations. A targeted therapy was instituted on the basis of NGS in 11% of the sequenced patients or in 6% of all patients. NGS-based diagnostics are feasible in NSCLC and provide clinically relevant information from readily available FFPE tissue. The sample type is associated with the probability of successful testing. © 2014 American Cancer Society.
Transcriptome Analysis of an Insecticide Resistant Housefly Strain: Insights about SNPs and Regulatory Elements in Cytochrome P450 Genes.

PubMed

Mahmood, Khalid; Højland, Dorte H; Asp, Torben; Kristensen, Michael

2016-01-01

Insecticide resistance in the housefly, Musca domestica, has been investigated for more than 60 years. It will enter a new era after the recent publication of the housefly genome and the development of multiple next generation sequencing technologies. The genetic background of the xenobiotic response can now be investigated in greater detail. Here, we investigate the 454-pyrosequencing transcriptome of the spinosad-resistant 791spin strain in relation to the housefly genome with focus on P450 genes. The de novo assembly of clean reads gave 35,834 contigs consisting of 21,780 sequences of the spinosad resistant strain. The 3,648 sequences were annotated with an enzyme code EC number and were mapped to 124 KEGG pathways with metabolic processes as most highly represented pathway. One hundred and twenty contigs were annotated as P450s covering 44 different P450 genes of housefly. Eight differentially expressed P450s genes were identified and investigated for SNPs, CpG islands and common regulatory motifs in promoter and coding regions. Functional annotation clustering of metabolic related genes and motif analysis of P450s revealed their association with epigenetic, transcription and gene expression related functions. The sequence variation analysis resulted in 12 SNPs and eight of them found in cyp6d1. There is variation in location, size and frequency of CpG islands and specific motifs were also identified in these P450s. Moreover, identified motifs were associated to GO terms and transcription factors using bioinformatic tools. Transcriptome data of a spinosad resistant strain provide together with genome data fundamental support for future research to understand evolution of resistance in houseflies. Here, we report for the first time the SNPs, CpG islands and common regulatory motifs in differentially expressed P450s. Taken together our findings will serve as a stepping stone to advance understanding of the mechanism and role of P450s in xenobiotic detoxification.
Pooled Sequencing of 531 Genes in Inflammatory Bowel Disease Identifies an Associated Rare Variant in BTNL2 and Implicates Other Immune Related Genes

PubMed Central

Prescott, Natalie J.; Lehne, Benjamin; Stone, Kristina; Lee, James C.; Taylor, Kirstin; Knight, Jo; Papouli, Efterpi; Mirza, Muddassar M.; Simpson, Michael A.; Spain, Sarah L.; Lu, Grace; Fraternali, Franca; Bumpstead, Suzannah J.; Gray, Emma; Amar, Ariella; Bye, Hannah; Green, Peter; Chung-Faye, Guy; Hayee, Bu’Hussain; Pollok, Richard; Satsangi, Jack; Parkes, Miles; Barrett, Jeffrey C.; Mansfield, John C.; Sanderson, Jeremy; Lewis, Cathryn M.; Weale, Michael E.; Schlitt, Thomas; Mathew, Christopher G.

2015-01-01

The contribution of rare coding sequence variants to genetic susceptibility in complex disorders is an important but unresolved question. Most studies thus far have investigated a limited number of genes from regions which contain common disease associated variants. Here we investigate this in inflammatory bowel disease by sequencing the exons and proximal promoters of 531 genes selected from both genome-wide association studies and pathway analysis in pooled DNA panels from 474 cases of Crohn’s disease and 480 controls. 80 variants with evidence of association in the sequencing experiment or with potential functional significance were selected for follow up genotyping in 6,507 IBD cases and 3,064 population controls. The top 5 disease associated variants were genotyped in an extension panel of 3,662 IBD cases and 3,639 controls, and tested for association in a combined analysis of 10,147 IBD cases and 7,008 controls. A rare coding variant p.G454C in the BTNL2 gene within the major histocompatibility complex was significantly associated with increased risk for IBD (p = 9.65x10−10, OR = 2.3[95% CI = 1.75–3.04]), but was independent of the known common associated CD and UC variants at this locus. Rare (<1%) and low frequency (1–5%) variants in 3 additional genes showed suggestive association (p<0.005) with either an increased risk (ARIH2 c.338-6C>T) or decreased risk (IL12B p.V298F, and NICN p.H191R) of IBD. These results provide additional insights into the involvement of the inhibition of T cell activation in the development of both sub-phenotypes of inflammatory bowel disease. We suggest that although rare coding variants may make a modest overall contribution to complex disease susceptibility, they can inform our understanding of the molecular pathways that contribute to pathogenesis. PMID:25671699
Homology and phylogeny and their automated inference

NASA Astrophysics Data System (ADS)

Fuellen, Georg

2008-06-01

The analysis of the ever-increasing amount of biological and biomedical data can be pushed forward by comparing the data within and among species. For example, an integrative analysis of data from the genome sequencing projects for various species traces the evolution of the genomes and identifies conserved and innovative parts. Here, I review the foundations and advantages of this “historical” approach and evaluate recent attempts at automating such analyses. Biological data is comparable if a common origin exists (homology), as is the case for members of a gene family originating via duplication of an ancestral gene. If the family has relatives in other species, we can assume that the ancestral gene was present in the ancestral species from which all the other species evolved. In particular, describing the relationships among the duplicated biological sequences found in the various species is often possible by a phylogeny, which is more informative than homology statements. Detecting and elaborating on common origins may answer how certain biological sequences developed, and predict what sequences are in a particular species and what their function is. Such knowledge transfer from sequences in one species to the homologous sequences of the other is based on the principle of ‘my closest relative looks and behaves like I do’, often referred to as ‘guilt by association’. To enable knowledge transfer on a large scale, several automated ‘phylogenomics pipelines’ have been developed in recent years, and seven of these will be described and compared. Overall, the examples in this review demonstrate that homology and phylogeny analyses, done on a large (and automated) scale, can give insights into function in biology and biomedicine.
Genetic characterization of infectious hematopoietic necrosis virus of coastal salmonid stocks in Washington State

USGS Publications Warehouse

Emmenegger, E.J.; Kurath, G.

2002-01-01

Infectious hematopoietic necrosis virus (IHNV) is a pathogen that infects many Pacific salmonid stocks from the watersheds of North America. Previous studies have thoroughly characterized the genetic diversity of IHNV isolates from Alaska and the Hagerman Valley in Idaho. To enhance understanding of the evolution and viral transmission patterns of IHNV within the Pacific Northwest geographic range, we analyzed the G gene of IHNV isolates from the coastal watersheds of Washington State by ribonuclease protection assay (RPA) and nucleotide sequencing. The RPA analysis of 23 isolates indicated that the Skagit basin IHNV isolates were relatively homogeneous as a result of the dominance of one G gene haplotype (S). Sequence analysis of 303 bases in the middle of the G gene (midG region) of 61 isolates confirmed the high frequency of a Skagit River basin sequence and identified another sequence commonly found in isolates from the Lake Washington basin. Overall, both the RPA and sequence analysis showed that the Washington coastal IHNV isolates are genetically homogeneous and have little genetic diversity. This is similar to the genetic diversity pattern of IHNV from Alaska and contrasts sharply with the high genetic diversity demonstrated for IHNV isolates from fish farms along the Snake River in Idaho. The high degree of sequence and haplotype similarity between the Washington coastal IHNV isolates and those from Alaska and British Columbia suggests that they have a common viral ancestor. Phylogenetic analyses of the isolates we studied and those from different regions throughout the virus's geographic range confirms a conserved pattern of evolution of the virus in salmonid stocks north of the Columbia River, which forms Washington's southern border.
Dissemination of NDM-1-Producing Enterobacteriaceae Mediated by the IncX3-Type Plasmid

PubMed Central

Fu, Ying; Du, Xiaoxing; Shen, Yuqin; Yu, Yunsong

2015-01-01

The emergence and spread of NDM-1-producing Enterobacteriaceae have resulted in a worldwide public health risk that has affected some provinces of China. China is an exceptionally large country, and there is a crucial need to investigate the epidemic of bla NDM-1-positive Enterobacteriaceae in our province. A total of 186 carbapenem-resistant Enterobacteriaceae isolates (CRE) were collected in a grade-3 hospital in Zhejiang province. Carbapenem-resistant genes, including bla KPC, bla IMP, bla VIM, bla OXA-48 and bla NDM-1 were screened and sequenced. Ninety isolates were identified as harboring the bla KPC-2 genes, and five bla NDM-1-positive isolates were uncovered. XbaI-PFGE revealed that three bla NDM-1-positive K. pneumoniae isolates belonged to two different clones. S1-PFGE and southern blot suggested that the bla NDM-1 genes were located on IncX3-type plasmids with two different sizes ranging from 33.3 to 54.7 kb (n=4) and 104.5 to 138.9 kb (n=1), respectively, all of which could easily transfer to Escherichia coli by conjugation and electrotransformation. The high-throughput sequencing of two plasmids was performed leading to the identification of a smaller 54-kb plasmid, which had high sequence similarity with a previously reported pCFNDM-CN, and a larger plasmid in which only a 7.8-kb sequence of a common gene environment around bla NDM-1 (bla NDM-1-trpF- dsbC-cutA1-groEL-ΔInsE,) was detected. PCR mapping and sequencing demonstrated that four smaller bla NDM-1 plasmids contained a common gene environment around bla NDM-1 (IS5-bla NDM-1-trpF- dsbC-cutA1-groEL). We monitored the CRE epidemic in our hospital and determined that KPC-2 carbapenemase was a major risk to patient health and the IncX3-type plasmid played a vital role in the spread of the bla NDM-1 gene among the CRE. PMID:26047502
Intra and Interspecific Variations of Gene Expression Levels in Yeast Are Largely Neutral: (Nei Lecture, SMBE 2016, Gold Coast).

PubMed

Yang, Jian-Rong; Maclean, Calum J; Park, Chungoo; Zhao, Huabin; Zhang, Jianzhi

2017-09-01

It is commonly, although not universally, accepted that most intra and interspecific genome sequence variations are more or less neutral, whereas a large fraction of organism-level phenotypic variations are adaptive. Gene expression levels are molecular phenotypes that bridge the gap between genotypes and corresponding organism-level phenotypes. Yet, it is unknown whether natural variations in gene expression levels are mostly neutral or adaptive. Here we address this fundamental question by genome-wide profiling and comparison of gene expression levels in nine yeast strains belonging to three closely related Saccharomyces species and originating from five different ecological environments. We find that the transcriptome-based clustering of the nine strains approximates the genome sequence-based phylogeny irrespective of their ecological environments. Remarkably, only ∼0.5% of genes exhibit similar expression levels among strains from a common ecological environment, no greater than that among strains with comparable phylogenetic relationships but different environments. These and other observations strongly suggest that most intra and interspecific variations in yeast gene expression levels result from the accumulation of random mutations rather than environmental adaptations. This finding has profound implications for understanding the driving force of gene expression evolution, genetic basis of phenotypic adaptation, and general role of stochasticity in evolution. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
The nif Gene Operon of the Methanogenic Archaeon Methanococcus maripaludis

PubMed Central

Kessler, Peter S.; Blank, Carrine; Leigh, John A.

1998-01-01

Nitrogen fixation occurs in two domains, Archaea and Bacteria. We have characterized a nif (nitrogen fixation) gene cluster in the methanogenic archaeon Methanococcus maripaludis. Sequence analysis revealed eight genes, six with sequence similarity to known nif genes and two with sequence similarity to glnB. The gene order, nifH, ORF105 (similar to glnB), ORF121 (similar to glnB), nifD, nifK, nifE, nifN, and nifX, was the same as that found in part in other diazotrophic methanogens and except for the presence of the glnB-like genes, also resembled the order found in many members of the Bacteria. Using transposon insertion mutagenesis, we determined that an 8-kb region required for nitrogen fixation corresponded to the nif gene cluster. Northern analysis revealed the presence of either a single 7.6-kb nif mRNA transcript or 10 smaller mRNA species containing portions of the large transcript. Polar effects of transposon insertions demonstrated that all of these mRNAs arose from a single promoter region, where transcription initiated 80 bp 5′ to nifH. Distinctive features of the nif gene cluster include the presence of the six primary nif genes in a single operon, the placement of the two glnB-like genes within the cluster, the apparent physical separation of the cluster from any other nif genes that might be in the genome, the fragmentation pattern of the mRNA, and the regulation of expression by a repression mechanism described previously. Our study and others with methanogenic archaea reporting multiple mRNAs arising from gene clusters with only a single putative promoter sequence suggest that mRNA processing following transcription may be a common occurrence in methanogens. PMID:9515920
The Awesome Power of Yeast Evolutionary Genetics: New Genome Sequences and Strain Resources for the Saccharomyces sensu stricto Genus

PubMed Central

Scannell, Devin R.; Zill, Oliver A.; Rokas, Antonis; Payen, Celia; Dunham, Maitreya J.; Eisen, Michael B.; Rine, Jasper; Johnston, Mark; Hittinger, Chris Todd

2011-01-01

High-quality, well-annotated genome sequences and standardized laboratory strains fuel experimental and evolutionary research. We present improved genome sequences of three species of Saccharomyces sensu stricto yeasts: S. bayanus var. uvarum (CBS 7001), S. kudriavzevii (IFO 1802T and ZP 591), and S. mikatae (IFO 1815T), and describe their comparison to the genomes of S. cerevisiae and S. paradoxus. The new sequences, derived by assembling millions of short DNA sequence reads together with previously published Sanger shotgun reads, have vastly greater long-range continuity and far fewer gaps than the previously available genome sequences. New gene predictions defined a set of 5261 protein-coding orthologs across the five most commonly studied Saccharomyces yeasts, enabling a re-examination of the tempo and mode of yeast gene evolution and improved inferences of species-specific gains and losses. To facilitate experimental investigations, we generated genetically marked, stable haploid strains for all three of these Saccharomyces species. These nearly complete genome sequences and the collection of genetically marked strains provide a valuable toolset for comparative studies of gene function, metabolism, and evolution, and render Saccharomyces sensu stricto the most experimentally tractable model genus. These resources are freely available and accessible through www.SaccharomycesSensuStricto.org. PMID:22384314
Whole genome sequencing of an African American family highlights toll like receptor 6 variants in Kawasaki disease susceptibility.

PubMed

Kim, Jihoon; Shimizu, Chisato; Kingsmore, Stephen F; Veeraraghavan, Narayanan; Levy, Eric; Ribeiro Dos Santos, Andre M; Yang, Hai; Flatley, Jay; Hoang, Long Truong; Hibberd, Martin L; Tremoulet, Adriana H; Harismendy, Olivier; Ohno-Machado, Lucila; Burns, Jane C

2017-01-01

Kawasaki disease (KD) is the most common acquired pediatric heart disease. We analyzed Whole Genome Sequences (WGS) from a 6-member African American family in which KD affected two of four children. We sought rare, potentially causative genotypes by sequentially applying the following WGS filters: sequence quality scores, inheritance model (recessive homozygous and compound heterozygous), predicted deleteriousness, allele frequency, genes in KD-associated pathways or with significant associations in published KD genome-wide association studies (GWAS), and with differential expression in KD blood transcriptomes. Biologically plausible genotypes were identified in twelve variants in six genes in the two affected children. The affected siblings were compound heterozygous for the rare variants p.Leu194Pro and p.Arg247Lys in Toll-like receptor 6 (TLR6), which affect TLR6 signaling. The affected children were also homozygous for three common, linked (r2 = 1) intronic single nucleotide variants (SNVs) in TLR6 (rs56245262, rs56083757 and rs7669329), that have previously shown association with KD in cohorts of European descent. Using transcriptome data from pre-treatment whole blood of KD subjects (n = 146), expression quantitative trait loci (eQTL) analyses were performed. Subjects homozygous for the intronic risk allele (A allele of TLR6 rs56245262) had differential expression of Interleukin-6 (IL-6) as a function of genotype (p = 0.0007) and a higher erythrocyte sedimentation rate at diagnosis. TLR6 plays an important role in pathogen-associated molecular pattern recognition, and sequence variations may affect binding affinities that in turn influence KD susceptibility. This integrative genomic approach illustrates how the analysis of WGS in multiplex families with a complex genetic disease allows examination of both the common disease-common variant and common disease-rare variant hypotheses.
Diverse Antibiotic Resistance Genes in Dairy Cow Manure

PubMed Central

Wichmann, Fabienne; Udikovic-Kolic, Nikolina; Andrew, Sheila; Handelsman, Jo

2014-01-01

ABSTRACT Application of manure from antibiotic-treated animals to crops facilitates the dissemination of antibiotic resistance determinants into the environment. However, our knowledge of the identity, diversity, and patterns of distribution of these antibiotic resistance determinants remains limited. We used a new combination of methods to examine the resistome of dairy cow manure, a common soil amendment. Metagenomic libraries constructed with DNA extracted from manure were screened for resistance to beta-lactams, phenicols, aminoglycosides, and tetracyclines. Functional screening of fosmid and small-insert libraries identified 80 different antibiotic resistance genes whose deduced protein sequences were on average 50 to 60% identical to sequences deposited in GenBank. The resistance genes were frequently found in clusters and originated from a taxonomically diverse set of species, suggesting that some microorganisms in manure harbor multiple resistance genes. Furthermore, amid the great genetic diversity in manure, we discovered a novel clade of chloramphenicol acetyltransferases. Our study combined functional metagenomics with third-generation PacBio sequencing to significantly extend the roster of functional antibiotic resistance genes found in animal gut bacteria, providing a particularly broad resource for understanding the origins and dispersal of antibiotic resistance genes in agriculture and clinical settings. PMID:24757214
Unraveling transcriptional control and cis-regulatory codes using the software suite GeneACT

PubMed Central

Cheung, Tom Hiu; Kwan, Yin Lam; Hamady, Micah; Liu, Xuedong

2006-01-01

Deciphering gene regulatory networks requires the systematic identification of functional cis-acting regulatory elements. We present a suite of web-based bioinformatics tools, called GeneACT , that can rapidly detect evolutionarily conserved transcription factor binding sites or microRNA target sites that are either unique or over-represented in differentially expressed genes from DNA microarray data. GeneACT provides graphic visualization and extraction of common regulatory sequence elements in the promoters and 3'-untranslated regions that are conserved across multiple mammalian species. PMID:17064417
Single sea urchin phagocytes express messages of a single sequence from the diverse Sp185/333 gene family in response to bacterial challenge.

PubMed

Majeske, Audrey J; Oren, Matan; Sacchi, Sandro; Smith, L Courtney

2014-12-01

Immune systems in animals rely on fast and efficient responses to a wide variety of pathogens. The Sp185/333 gene family in the purple sea urchin, Strongylocentrotus purpuratus, consists of an estimated 50 (±10) members per genome that share a basic gene structure but show high sequence diversity, primarily due to the mosaic appearance of short blocks of sequence called elements. The genes show significantly elevated expression in three subpopulations of phagocytes responding to marine bacteria. The encoded Sp185/333 proteins are highly diverse and have central effector functions in the immune system. In this study we report the Sp185/333 gene expression in single sea urchin phagocytes. Sea urchins challenged with heat-killed marine bacteria resulted in a typical increase in coelomocyte concentration within 24 h, which included an increased proportion of phagocytes expressing Sp185/333 proteins. Phagocyte fractions enriched from coelomocytes were used in limiting dilutions to obtain samples of single cells that were evaluated for Sp185/333 gene expression by nested RT-PCR. Amplicon sequences showed identical or nearly identical Sp185/333 amplicon sequences in single phagocytes with matches to six known Sp185/333 element patterns, including both common and rare element patterns. This suggested that single phagocytes show restricted expression from the Sp185/333 gene family and infers a diverse, flexible, and efficient response to pathogens. This type of expression pattern from a family of immune response genes in single cells has not been identified previously in other invertebrates. Copyright © 2014 by The American Association of Immunologists, Inc.

The complete mitochondrial genome of the common sea slater, Ligia oceanica (Crustacea, Isopoda) bears a novel gene order and unusual control region features

PubMed Central

Kilpert, Fabian; Podsiadlowski, Lars

2006-01-01

Background Sequence data and other characters from mitochondrial genomes (gene translocations, secondary structure of RNA molecules) are useful in phylogenetic studies among metazoan animals from population to phylum level. Moreover, the comparison of complete mitochondrial sequences gives valuable information about the evolution of small genomes, e.g. about different mechanisms of gene translocation, gene duplication and gene loss, or concerning nucleotide frequency biases. The Peracarida (gammarids, isopods, etc.) comprise about 21,000 species of crustaceans, living in many environments from deep sea floor to arid terrestrial habitats. Ligia oceanica is a terrestrial isopod living at rocky seashores of the european North Sea and Atlantic coastlines. Results The study reveals the first complete mitochondrial DNA sequence from a peracarid crustacean. The mitochondrial genome of Ligia oceanica is a circular double-stranded DNA molecule, with a size of 15,289 bp. It shows several changes in mitochondrial gene order compared to other crustacean species. An overview about mitochondrial gene order of all crustacean taxa yet sequenced is also presented. The largest non-coding part (the putative mitochondrial control region) of the mitochondrial genome of Ligia oceanica is unexpectedly not AT-rich compared to the remainder of the genome. It bears two repeat regions (4× 10 bp and 3× 64 bp), and a GC-rich hairpin-like secondary structure. Some of the transfer RNAs show secondary structures which derive from the usual cloverleaf pattern. While some tRNA genes are putative targets for RNA editing, trnR could not be localized at all. Conclusion Gene order is not conserved among Peracarida, not even among isopods. The two isopod species Ligia oceanica and Idotea baltica show a similarly derived gene order, compared to the arthropod ground pattern and to the amphipod Parhyale hawaiiensis, suggesting that most of the translocation events were already present the last common ancestor of these isopods. Beyond that, the positions of three tRNA genes differ in the two isopod species. Strand bias in nucleotide frequency is reversed in both isopod species compared to other Malacostraca. This is probably due to a reversal of the replication origin, which is further supported by the fact that the hairpin structure typically found in the control region shows a reversed orientation in the isopod species, compared to other crustaceans. PMID:16987408
Screening strategies for a highly polymorphic gene: DHPLC analysis of the Fanconi anemia group A gene.

PubMed

Rischewski, J; Schneppenheim, R

2001-01-30

Patients with Fanconi anemia (Fanc) are at risk of developing leukemia. Mutations of the group A gene (FancA) are most common. A multitude of polymorphisms and mutations within the 43 exons of the gene are described. To examine the role of heterozygosity as a risk factor for malignancies, a partially automatized screening method to identify aberrations was needed. We report on our experience with DHPLC (WAVE (Transgenomic)). PCR amplification of all 43 exons from one individual was performed on one microtiter plate on a gradient thermocycler. DHPLC analysis conditions were established via melting curves, prediction software, and test runs with aberrant samples. PCR products were analyzed twice: native, and after adding a WT-PCR product. Retention patterns were compared with previously identified polymorphic PCR products or mutants. We have defined the mutation screening conditions for all 43 exons of FancA using DHPLC. So far, 40 different sequence variations have been detected in more than 100 individuals. The native analysis identifies heterozygous individuals, and the second run detects homozygous aberrations. Retention patterns are specific for the underlying sequence aberration, thus reducing sequencing demand and costs. DHPLC is a valuable tool for reproducible recognition of known sequence aberrations and screening for unknown mutations in the highly polymorphic FancA gene.
Nucleotide sequence of the Saccharomyces cerevisiae PUT4 proline-permease-encoding gene: similarities between CAN1, HIP1 and PUT4 permeases.

PubMed

Vandenbol, M; Jauniaux, J C; Grenson, M

1989-11-15

The complete nucleotide (nt) sequence of the PUT4 gene, whose product is required for high-affinity proline active transport in the yeast Saccharomyces cerevisiae, is presented. The sequence contains a single long open reading frame of 1881 nt, encoding a polypeptide with a calculated Mr of 68,795. The predicted protein is strongly hydrophobic and exhibits six potential glycosylation sites. Its hydropathy profile suggests the presence of twelve membrane-spanning regions flanked by hydrophilic N- and C-terminal domains. The N terminus does not resemble signal sequences found in secreted proteins. These features are characteristic of integral membrane proteins catalyzing translocation of ligands across cellular membranes. Protein sequence comparisons indicate strong resemblance to the arginine and histidine permeases of S. cerevisiae, but no marked sequence similarity to the proline permease of Escherichia coli or to other known prokaryotic or eukaryotic transport proteins. The strong similarity between the three yeast amino acid permeases suggests a common ancestor for the three proteins.
Genome-editing Technologies for Gene and Cell Therapy.

PubMed

Maeder, Morgan L; Gersbach, Charles A

2016-03-01

Gene therapy has historically been defined as the addition of new genes to human cells. However, the recent advent of genome-editing technologies has enabled a new paradigm in which the sequence of the human genome can be precisely manipulated to achieve a therapeutic effect. This includes the correction of mutations that cause disease, the addition of therapeutic genes to specific sites in the genome, and the removal of deleterious genes or genome sequences. This review presents the mechanisms of different genome-editing strategies and describes each of the common nuclease-based platforms, including zinc finger nucleases, transcription activator-like effector nucleases (TALENs), meganucleases, and the CRISPR/Cas9 system. We then summarize the progress made in applying genome editing to various areas of gene and cell therapy, including antiviral strategies, immunotherapies, and the treatment of monogenic hereditary disorders. The current challenges and future prospects for genome editing as a transformative technology for gene and cell therapy are also discussed.
Genome-editing Technologies for Gene and Cell Therapy

PubMed Central

Maeder, Morgan L; Gersbach, Charles A

2016-01-01

Gene therapy has historically been defined as the addition of new genes to human cells. However, the recent advent of genome-editing technologies has enabled a new paradigm in which the sequence of the human genome can be precisely manipulated to achieve a therapeutic effect. This includes the correction of mutations that cause disease, the addition of therapeutic genes to specific sites in the genome, and the removal of deleterious genes or genome sequences. This review presents the mechanisms of different genome-editing strategies and describes each of the common nuclease-based platforms, including zinc finger nucleases, transcription activator-like effector nucleases (TALENs), meganucleases, and the CRISPR/Cas9 system. We then summarize the progress made in applying genome editing to various areas of gene and cell therapy, including antiviral strategies, immunotherapies, and the treatment of monogenic hereditary disorders. The current challenges and future prospects for genome editing as a transformative technology for gene and cell therapy are also discussed. PMID:26755333
Comparative Genomics of Bacteriophage of the Genus Seuratvirus

PubMed Central

Sazinas, Pavelas; Redgwell, Tamsin; Rihtman, Branko; Grigonyte, Aurelija; Michniewski, Slawomir; Scanlan, David J; Hobman, Jon

2018-01-01

Abstract Despite being more abundant and having smaller genomes than their bacterial host, relatively few bacteriophages have had their genomes sequenced. Here, we isolated 14 bacteriophages from cattle slurry and performed de novo genome sequencing, assembly, and annotation. The commonly used marker genes polB and terL showed these bacteriophages to be closely related to members of the genus Seuratvirus. We performed a core-gene analysis using the 14 new and four closely related genomes. A total of 58 core genes were identified, the majority of which has no known function. These genes were used to construct a core-gene phylogeny, the results of which confirmed the new isolates to be part of the genus Seuratvirus and expanded the number of species within this genus to four. All bacteriophages within the genus contained the genes queCDE encoding enzymes involved in queuosine biosynthesis. We suggest these genes are carried as a mechanism to modify DNA in order to protect these bacteriophages against host endonucleases. PMID:29272407
Intestinal flora of FAP patients containing APC-like sequences.

PubMed

Hainova, K; Adamcikova, Z; Ciernikova, S; Stevurkova, V; Tyciakova, S; Zajac, V

2014-01-01

Colorectal cancer mortality is one of the most common cause of cancer-related mortality. A multiple risk factors are associated with colorectal cancer, including hereditary, enviromental and inflammatory syndromes affecting the gastrointestinal tract. Familial adenomatous polyposis (FAP) is characterized by the emergence of hundreds to thousands of colorectal adenomatous polyps and FAP syndrome is caused by mutations within the adenomatous polyposis coli (APC) tumor suppressor gene. We analyzed 21 rectal bacterial subclones isolated from FAP patient 41-1 with confirmed 5bp ACAAA deletion within codons 1060-1063 for the presence of APC-like sequences in longest exon 15. The studied section was defined by primers 15Efor-15Erev, what correlates with mutation cluster region (MCR) in which the 75% of all APC germline mutations were detected. More than 90% homology was showed by sequencing and subsequent software comparison. The expression of APC-like sequences was demostrated by Western blot analysis using monoclonal and polyclonal antibodies against APC protein. To study missing link between the DNA analysis (PCR, DNA sequencing) and protein expresion experiments (Western blotting) we analyzed bacterial transcripts containing the 15Efor-15Erev sequence of APC gene by reverse transcription-PCR, what indicated that an APC gene derived fragment may be produced. We observed 97-100 % homology after computer comparison of cDNA PCR products. Our results suggest that presence of APC-like sequences in intestinal/rectal bacteria is enrichment of bacterial genetic information in which horizontal gene transfer between humans and microflora play an important role.
Characterization of a Novel Simian Immunodeficiency Virus (SIVmonNG1) Genome Sequence from a Mona Monkey (Cercopithecus mona)

PubMed Central

Barlow, Katrina L.; Ajao, Adebowale Oluwafemi; Clewley, Jonathan P.

2003-01-01

A novel simian immunodeficiency virus (SIV) sequence has been recovered from RNA extracted from the serum of a mona monkey (Cercopithecus mona) wild born in Nigeria. The sequence was obtained by using novel generic (degenerate) PCR primers and spans from two-thirds into the gag gene to the 3′ poly(A) tail of the SIVmonNG1 RNA genome. Analysis of the open reading frames revealed that the SIVmonNG1 genome codes for a Vpu protein, in addition to Gag, Pol, Vif, Vpr, Tat, Rev, Env, and Nef proteins. Previously, only lentiviruses infecting humans (human immunodeficiency virus type 1 [HIV-1]) and chimpanzees (SIVcpz) were known to have a vpu gene; more recently, this has also been found in SIVgsn from Cercopithecus nictitans. Overall, SIVmonNG1 most closely resembles SIVgsn: the env gene sequence groups with HIV-1/SIVcpz env sequences, whereas the pol gene sequence clusters closely with the pol sequence of SIVsyk from Cercopithecus albogaris. By bootscanning and similarity plotting, the first half of pol resembles SIVsyk, whereas the latter part is closer to SIVcol from Colobus guereza. The similarities between the complex mosaic genomes of SIVmonNG1 and SIVgsn are consistent with a shared or common lineage. These data further highlight the intricate nature of the relationships between the SIVs from different primate species and will be helpful for unraveling these associations. PMID:12768007
Novel insights into the composition, variation, organization, and expression of the low-molecular-weight glutenin subunit gene family in common wheat

PubMed Central

Zhang, Xiaofei; Liu, Dongcheng; Zhang, Jianghua; Jiang, Wei; Luo, Guangbin; Yang, Wenlong; Sun, Jiazhu; Tong, Yiping; Cui, Dangqun; Zhang, Aimin

2013-01-01

Low-molecular-weight glutenin subunits (LMW-GS), encoded by a complex multigene family, play an important role in the processing quality of wheat flour. Although members of this gene family have been identified in several wheat varieties, the allelic variation and composition of LMW-GS genes in common wheat are not well understood. In the present study, using the LMW-GS gene molecular marker system and the full-length gene cloning method, a comprehensive molecular analysis of LMW-GS genes was conducted in a representative population, the micro-core collections (MCC) of Chinese wheat germplasm. Generally, >15 LMW-GS genes were identified from individual MCC accessions, of which 4–6 were located at the Glu-A3 locus, 3–5 at the Glu-B3 locus, and eight at the Glu-D3 locus. LMW-GS genes at the Glu-A3 locus showed the highest allelic diversity, followed by the Glu-B3 genes, while the Glu-D3 genes were extremely conserved among MCC accessions. Expression and sequence analysis showed that 9–13 active LMW-GS genes were present in each accession. Sequence identity analysis showed that all i-type genes present at the Glu-A3 locus formed a single group, the s-type genes located at Glu-B3 and Glu-D3 loci comprised a unique group, while high-diversity m-type genes were classified into four groups and detected in all Glu-3 loci. These results contribute to the functional analysis of LMW-GS genes and facilitate improvement of bread-making quality by wheat molecular breeding programmes. PMID:23536608
Candidate genes for congenital diaphragmatic hernia from animalmodels: sequencing of fog2 and pdgfra reveals rare variants indiaphragmatic hernia patients

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bleyl, S.B.; Moshrefi, A.; Shaw, G.M.

2007-05-11

Congenital diaphragmatic hernia (CDH) is a common, lifethreatening birth defect. Although there is strong evidence implicatinggenetic factors in its pathogenesis, few causative genes have beenidentified, and in isolated CDH, only one de novo, nonsense mutation hasbeen reported in FOG2 in a female with posterior diaphragmaticeventration. We report here that the homozygous null mouse for the Pdgfragene has posterolateral diaphragmatic defects and thus is a model forhuman CDH. We hypothesized that mutations in this gene could cause humanCDH. We sequenced PDGFRa and FOG2 in 96 patients with CDH, of which 53had isolated CDH (55.2 percent), 36 had CDH and additional anomalies(37.5more » percent), and 7 had CDH and known chromosome aberrations (7.3percent). For FOG2, we identified novel sequence alterations predictingp.M703L and p.T843A in two patients with isolated CDH that were absent in526 and 564 control chromosomes respectively. These altered amino acidswere highly conserved. However, due to the lack of available parental DNAsamples we were not able to determine if the sequence alterations were denovo. For PDGFRa, we found a single variant predicting p.L967V in apatient with CDH and multiple anomalies that was absent in 768 controlchromosomes. This patient also had one cell with trisomy 15 on skinfibroblast culture, a finding of uncertain significance. Although ourstudy identified sequence variants in FOG2 and PDGFRa, we have notdefinitively established the variants as mutations and we found noevidence that CDH commonly results from mutations in thesegenes.« less
Molecular Identification of Unusual Pathogenic Yeast Isolates by Large Ribosomal Subunit Gene Sequencing: 2 Years of Experience at the United Kingdom Mycology Reference Laboratory▿

PubMed Central

Linton, Christopher J.; Borman, Andrew M.; Cheung, Grace; Holmes, Ann D.; Szekely, Adrien; Palmer, Michael D.; Bridge, Paul D.; Campbell, Colin K.; Johnson, Elizabeth M.

2007-01-01

Rapid identification of yeast isolates from clinical samples is particularly important given their innately variable antifungal susceptibility profiles. We present here an analysis of the utility of PCR amplification and sequence analysis of the hypervariable D1/D2 region of the 26S rRNA gene for the identification of yeast species submitted to the United Kingdom Mycology Reference Laboratory over a 2-year period. A total of 3,033 clinical isolates were received from 2004 to 2006 encompassing 50 different yeast species. While more than 90% of the isolates, corresponding to the most common Candida species, could be identified by using the AUXACOLOR2 yeast identification kit, 153 isolates (5%), comprised of 47 species, could not be identified by using this system and were subjected to molecular identification via 26S rRNA gene sequencing. These isolates included some common species that exhibited atypical biochemical and phenotypic profiles and also many rarer yeast species that are infrequently encountered in the clinical setting. All 47 species requiring molecular identification were unambiguously identified on the basis of D1/D2 sequences, and the molecular identities correlated well with the observed biochemical profiles of the various organisms. Together, our data underscore the utility of molecular techniques as a reference adjunct to conventional methods of yeast identification. Further, we show that PCR amplification and sequencing of the D1/D2 region reliably identifies more than 45 species of clinically significant yeasts and can also potentially identify new pathogenic yeast species. PMID:17251397
Detection of nonauthorized genetically modified organisms using differential quantitative polymerase chain reaction: application to 35S in maize.

PubMed

Cankar, Katarina; Chauvensy-Ancel, Valérie; Fortabat, Marie-Noelle; Gruden, Kristina; Kobilinsky, André; Zel, Jana; Bertheau, Yves

2008-05-15

Detection of nonauthorized genetically modified organisms (GMOs) has always presented an analytical challenge because the complete sequence data needed to detect them are generally unavailable although sequence similarity to known GMOs can be expected. A new approach, differential quantitative polymerase chain reaction (PCR), for detection of nonauthorized GMOs is presented here. This method is based on the presence of several common elements (e.g., promoter, genes of interest) in different GMOs. A statistical model was developed to study the difference between the number of molecules of such a common sequence and the number of molecules identifying the approved GMO (as determined by border-fragment-based PCR) and the donor organism of the common sequence. When this difference differs statistically from zero, the presence of a nonauthorized GMO can be inferred. The interest and scope of such an approach were tested on a case study of different proportions of genetically modified maize events, with the P35S promoter as the Cauliflower Mosaic Virus common sequence. The presence of a nonauthorized GMO was successfully detected in the mixtures analyzed and in the presence of (donor organism of P35S promoter). This method could be easily transposed to other common GMO sequences and other species and is applicable to other detection areas such as microbiology.
Genetics of Human Cardiovascular Disease

PubMed Central

Kathiresan, Sekar; Srivastava, Deepak

2012-01-01

Cardiovascular disease encompasses a range of conditions extending from myocardial infarction to congenital heart disease most of which are heritable. Enormous effort has been invested in understanding the genes and specific DNA sequence variants responsible for this heritability. Here, we review the lessons learned for monogenic and common, complex forms of cardiovascular disease. We also discuss key challenges that remain for gene discovery and for moving from genomic localization to mechanistic insights with an emphasis on the impact of next generation sequencing and the use of pluripotent human cells to understand the mechanism by which genetic variation contributes to disease. PMID:22424232
Characterization of genetic elements required for site-specific integration of Lactobacillus delbrueckii subsp. bulgaricus bacteriophage mv4 and construction of an integration-proficient vector for Lactobacillus plantarum.

PubMed Central

Dupont, L; Boizet-Bonhoure, B; Coddeville, M; Auvray, F; Ritzenthaler, P

1995-01-01

Temperate phage mv4 integrates its DNA into the chromosome of Lactobacillus delbrueckii subsp. bulgaricus strains via site-specific recombination. Nucleotide sequencing of a 2.2-kb attP-containing phage fragment revealed the presence of four open reading frames. The larger open reading frame, close to the attP site, encoded a 427-amino-acid polypeptide with similarity in its C-terminal domain to site-specific recombinases of the integrase family. Comparison of the sequences of attP, bacterial attachment site attB, and host-phage junctions attL and attR identified a 17-bp common core sequence, where strand exchange occurs during recombination. Analysis of the attB sequence indicated that the core region overlaps the 3' end of a tRNA(Ser) gene. Phage mv4 DNA integration into the tRNA(Ser) gene preserved an intact tRNA(Ser) gene at the attL site. An integration vector based on the mv4 attP site and int gene was constructed. This vector transforms a heterologous host, L. plantarum, through site-specific integration into the tRNA(Ser) gene of the genome and will be useful for development of an efficient integration system for a number of additional bacterial species in which an identical tRNA gene is present. PMID:7836291
Integrating a DNA barcoding project with an ecological survey: a case study on temperate intertidal polychaete communities in Qingdao, China

NASA Astrophysics Data System (ADS)

Zhou, Hong; Zhang, Zhinan; Chen, Haiyan; Sun, Renhua; Wang, Hui; Guo, Lei; Pan, Haijian

2010-07-01

In this study, we integrated a DNA barcoding project with an ecological survey on intertidal polychaete communities and investigated the utility of CO1 gene sequence as a DNA barcode for the classification of the intertidal polychaetes. Using 16S rDNA as a complementary marker and combining morphological and ecological characterization, some of dominant and common polychaete species from Chinese coasts were assessed for their taxonomic status. We obtained 22 haplotype gene sequences of 13 taxa, including 10 CO1 sequences and 12 16S rDNA sequences. Based on intra- and inter-specific distances, we built phylogenetic trees using the neighbor-joining method. Our study suggested that the mitochondrial CO1 gene was a valid DNA barcoding marker for species identification in polychaetes, but other genes, such as 16S rDNA, could be used as a complementary genetic marker. For more accurate species identification and effective testing of species hypothesis, DNA barcoding should be incorporated with morphological, ecological, biogeographical, and phylogenetic information. The application of DNA barcoding and molecular identification in the ecological survey on the intertidal polychaete communities demonstrated the feasibility of integrating DNA taxonomy and ecology.
Exome capture sequencing reveals new insights into hepatitis B virus-induced hepatocellular carcinoma at the early stage of tumorigenesis.

PubMed

Chen, Yong; Wang, Lijuan; Xu, Hexiang; Liu, Xingxiang; Zhao, Yingren

2013-10-01

Hepatocellular carcinoma (HCC), the most common type of liver cancer, is the third primary cause of cancer-related mortality worldwide. The molecular mechanisms underlying the initiation and formation of HCC remain obscure. In the present study, we performed exome sequencing using tumor and normal tissues from 3 hepatitis B virus (HBV)-positive BCLC stage A HCC patients. Bioinformatic analysis was performed to find candidate protein-altering somatic mutations. Eighty damaging mutations were validated and 59 genes were reported to be mutated in HBV-related HCCs for the first time here. Further analysis using whole genome sequencing (WGS) data of 88 HBV-related HCC patients from the European Genome-phenome Archive database showed that mutations in 33 of the 59 genes were also detected in other samples. Variants of two newly found genes, ZNF717 and PARP4, were detected in more than 10% of the WGS samples. Several other genes, such as FLNA and CNTN2, are also noteworthy. Thus, the exome sequencing analysis of three BCLC stage A patients provides new insights into the molecular events governing the early steps of HBV-induced HCC tumorigenesis.
Comparative analysis on the structural features of the 5' flanking region of κ-casein genes from six different species

PubMed Central

Gerencsér, Ákos; Barta, Endre; Boa, Simon; Kastanis, Petros; Bösze, Zsuzsanna; Whitelaw, C Bruce A

2002-01-01

κ-casein plays an essential role in the formation, stabilisation and aggregation of milk micelles. Control of κ-casein expression reflects this essential role, although an understanding of the mechanisms involved lags behind that of the other milk protein genes. We determined the 5'-flanking sequences for the murine, rabbit and human κ-casein genes and compared them to the published ruminant sequences. The most conserved region was not the proximal promoter region but an approximately 400 bp long region centred 800 bp upstream of the TATA box. This region contained two highly conserved MGF/STAT5 sites with common spacing relative to each other. In this region, six conserved short stretches of similarity were also found which did not correspond to known transcription factor consensus sites. On the contrary to ruminant and human 5' regulatory sequences, the rabbit and murine 5'-flanking regions did not harbour any kind of repetitive elements. We generated a phylogenetic tree of the six species based on multiple alignment of the κ-casein sequences. This study identified conserved candidate transcriptional regulatory elements within the κ-casein gene promoter. PMID:11929628
Optimal Scaling of Digital Transcriptomes

PubMed Central

Glusman, Gustavo; Caballero, Juan; Robinson, Max; Kutlu, Burak; Hood, Leroy

2013-01-01

Deep sequencing of transcriptomes has become an indispensable tool for biology, enabling expression levels for thousands of genes to be compared across multiple samples. Since transcript counts scale with sequencing depth, counts from different samples must be normalized to a common scale prior to comparison. We analyzed fifteen existing and novel algorithms for normalizing transcript counts, and evaluated the effectiveness of the resulting normalizations. For this purpose we defined two novel and mutually independent metrics: (1) the number of “uniform” genes (genes whose normalized expression levels have a sufficiently low coefficient of variation), and (2) low Spearman correlation between normalized expression profiles of gene pairs. We also define four novel algorithms, one of which explicitly maximizes the number of uniform genes, and compared the performance of all fifteen algorithms. The two most commonly used methods (scaling to a fixed total value, or equalizing the expression of certain ‘housekeeping’ genes) yielded particularly poor results, surpassed even by normalization based on randomly selected gene sets. Conversely, seven of the algorithms approached what appears to be optimal normalization. Three of these algorithms rely on the identification of “ubiquitous” genes: genes expressed in all the samples studied, but never at very high or very low levels. We demonstrate that these include a “core” of genes expressed in many tissues in a mutually consistent pattern, which is suitable for use as an internal normalization guide. The new methods yield robustly normalized expression values, which is a prerequisite for the identification of differentially expressed and tissue-specific genes as potential biomarkers. PMID:24223126
Molecular analysis of beta-globin gene mutations among Thai beta-thalassemia children: results from a single center study

PubMed Central

Boonyawat, Boonchai; Monsereenusorn, Chalinee; Traivaree, Chanchai

2014-01-01

Background Beta-thalassemia is one of the most common genetic disorders in Thailand. Clinical phenotype ranges from silent carrier to clinically manifested conditions including severe beta-thalassemia major and mild beta-thalassemia intermedia. Objective This study aimed to characterize the spectrum of beta-globin gene mutations in pediatric patients who were followed-up in Phramongkutklao Hospital. Patients and methods Eighty unrelated beta-thalassemia patients were enrolled in this study including 57 with beta-thalassemia/hemoglobin E, eight with homozygous beta-thalassemia, and 15 with heterozygous beta-thalassemia. Mutation analysis was performed by multiplex amplification refractory mutation system (M-ARMS), direct DNA sequencing of beta-globin gene, and gap polymerase chain reaction for 3.4 kb deletion detection, respectively. Results A total of 13 different beta-thalassemia mutations were identified among 88 alleles. The most common mutation was codon 41/42 (-TCTT) (37.5%), followed by codon 17 (A>T) (26.1%), IVS-I-5 (G>C) (8%), IVS-II-654 (C>T) (6.8%), IVS-I-1 (G>T) (4.5%), and codon 71/72 (+A) (2.3%), and all these six common mutations (85.2%) were detected by M-ARMS. Six uncommon mutations (10.2%) were identified by DNA sequencing including 4.5% for codon 35 (C>A) and 1.1% initiation codon mutation (ATG>AGG), codon 15 (G>A), codon 19 (A>G), codon 27/28 (+C), and codon 123/124/125 (-ACCCCACC), respectively. The 3.4 kb deletion was detected at 4.5%. The most common genotype of beta-thalassemia major patients was codon 41/42 (-TCTT)/codon 26 (G>A) or betaE accounting for 40%. Conclusion All of the beta-thalassemia alleles have been characterized by a combination of techniques including M-ARMS, DNA sequencing, and gap polymerase chain reaction for 3.4 kb deletion detection. Thirteen mutations account for 100% of the beta-thalassemia genes among the pediatric patients in our study. PMID:25525381
Recombination in feline immunodeficiency virus from feral and companion domestic cats.

PubMed

Hayward, Jessica J; Rodrigo, Allen G

2008-06-17

Recombination is a relatively common phenomenon in retroviruses. We investigated recombination in Feline Immunodeficiency Virus from naturally-infected New Zealand domestic cats (Felis catus) by sequencing regions of the gag, pol and env genes. The occurrence of intragenic recombination was highest in env, with evidence of recombination in 6.4% (n = 156) of all cats. A further recombinant was identified in each of the gag (n = 48) and pol (n = 91) genes. Comparisons of phylogenetic trees across genes identified cases of incongruence, indicating intergenic recombination. Three (7.7%, n = 39) of these incongruencies were found to be significantly different using the Shimodaira-Hasegawa test.Surprisingly, our phylogenies from the gag and pol genes showed that no New Zealand sequences group with reference subtype C sequences within intrasubtype pairwise distances. Indeed, we find one and two distinct unknown subtype groups in gag and pol, respectively. These observations cause us to speculate that these New Zealand FIV strains have undergone several recombination events between subtype A parent strains and undefined unknown subtype strains, similar to the evolutionary history hypothesised for HIV-1 "subtype E".Endpoint dilution sequencing was used to confirm the consensus sequences of the putative recombinants and unknown subtype groups, providing evidence for the authenticity of these sequences. Endpoint dilution sequencing also resulted in the identification of a dual infection event in the env gene. In addition, an intrahost recombination event between variants of the same subtype in the pol gene was established. This is the first known example of naturally-occurring recombination in a cat with infection of the parent strains. Evidence of intragenic recombination in the gag, pol and env regions, and complex intergenic recombination, of FIV from naturally-infected domestic cats in New Zealand was found. Strains of unknown subtype were identified in all three gene regions. These results have implications for the use of the current FIV vaccine in New Zealand.

Analyses of Expressed Sequence Tags from Apple1

PubMed Central

Newcomb, Richard D.; Crowhurst, Ross N.; Gleave, Andrew P.; Rikkerink, Erik H.A.; Allan, Andrew C.; Beuning, Lesley L.; Bowen, Judith H.; Gera, Emma; Jamieson, Kim R.; Janssen, Bart J.; Laing, William A.; McArtney, Steve; Nain, Bhawana; Ross, Gavin S.; Snowden, Kimberley C.; Souleyre, Edwige J.F.; Walton, Eric F.; Yauk, Yar-Khing

2006-01-01

The domestic apple (Malus domestica; also known as Malus pumila Mill.) has become a model fruit crop in which to study commercial traits such as disease and pest resistance, grafting, and flavor and health compound biosynthesis. To speed the discovery of genes involved in these traits, develop markers to map genes, and breed new cultivars, we have produced a substantial expressed sequence tag collection from various tissues of apple, focusing on fruit tissues of the cultivar Royal Gala. Over 150,000 expressed sequence tags have been collected from 43 different cDNA libraries representing 34 different tissues and treatments. Clustering of these sequences results in a set of 42,938 nonredundant sequences comprising 17,460 tentative contigs and 25,478 singletons, together representing what we predict are approximately one-half the expressed genes from apple. Many potential molecular markers are abundant in the apple transcripts. Dinucleotide repeats are found in 4,018 nonredundant sequences, mainly in the 5′-untranslated region of the gene, with a bias toward one repeat type (containing AG, 88%) and against another (repeats containing CG, 0.1%). Trinucleotide repeats are most common in the predicted coding regions and do not show a similar degree of sequence bias in their representation. Bi-allelic single-nucleotide polymorphisms are highly abundant with one found, on average, every 706 bp of transcribed DNA. Predictions of the numbers of representatives from protein families indicate the presence of many genes involved in disease resistance and the biosynthesis of flavor and health-associated compounds. Comparisons of some of these gene families with Arabidopsis (Arabidopsis thaliana) suggest instances where there have been duplications in the lineages leading to apple of biosynthetic and regulatory genes that are expressed in fruit. This resource paves the way for a concerted functional genomics effort in this important temperate fruit crop. PMID:16531485
Evaluating High-Throughput Ab Initio Gene Finders to Discover Proteins Encoded in Eukaryotic Pathogen Genomes Missed by Laboratory Techniques

PubMed Central

Goodswen, Stephen J.; Kennedy, Paul J.; Ellis, John T.

2012-01-01

Next generation sequencing technology is advancing genome sequencing at an unprecedented level. By unravelling the code within a pathogen’s genome, every possible protein (prior to post-translational modifications) can theoretically be discovered, irrespective of life cycle stages and environmental stimuli. Now more than ever there is a great need for high-throughput ab initio gene finding. Ab initio gene finders use statistical models to predict genes and their exon-intron structures from the genome sequence alone. This paper evaluates whether existing ab initio gene finders can effectively predict genes to deduce proteins that have presently missed capture by laboratory techniques. An aim here is to identify possible patterns of prediction inaccuracies for gene finders as a whole irrespective of the target pathogen. All currently available ab initio gene finders are considered in the evaluation but only four fulfil high-throughput capability: AUGUSTUS, GeneMark_hmm, GlimmerHMM, and SNAP. These gene finders require training data specific to a target pathogen and consequently the evaluation results are inextricably linked to the availability and quality of the data. The pathogen, Toxoplasma gondii, is used to illustrate the evaluation methods. The results support current opinion that predicted exons by ab initio gene finders are inaccurate in the absence of experimental evidence. However, the results reveal some patterns of inaccuracy that are common to all gene finders and these inaccuracies may provide a focus area for future gene finder developers. PMID:23226328
The transcriptome of the novel dinoflagellate Oxyrrhis marina (Alveolata: Dinophyceae): response to salinity examined by 454 sequencing

PubMed Central

2011-01-01

Background The heterotrophic dinoflagellate Oxyrrhis marina is increasingly studied in experimental, ecological and evolutionary contexts. Its basal phylogenetic position within the dinoflagellates make O. marina useful for understanding the origin of numerous unusual features of the dinoflagellate lineage; its broad distribution has lent O. marina to the study of protist biogeography; and nutritive flexibility and eurytopy have made it a common lab rat for the investigation of physiological responses of marine heterotrophic flagellates. Nevertheless, genome-scale resources for O. marina are scarce. Here we present a 454-based transcriptome survey for this organism. In addition, we assess sequence read abundance, as a proxy for gene expression, in response to salinity, an environmental factor potentially important in determining O. marina spatial distributions. Results Sequencing generated ~57 Mbp of data which assembled into 7, 398 contigs. Approximately 24% of contigs were nominally identified by BLAST. A further clustering of contigs (at ≥ 90% identity) revealed 164 transcript variant clusters, the largest of which (Phosphoribosylaminoimidazole-succinocarboxamide synthase) was composed of 28 variants displaying predominately synonymous variation. In a genomic context, a sample of 5 different genes were demonstrated to occur as tandem repeats, separated by short (~200-340 bp) inter-genic regions. For HSP90 several intergenic variants were detected suggesting a potentially complex genomic arrangement. In response to salinity, analysis of 454 read abundance highlighted 9 and 20 genes over or under expressed at 50 PSU, respectively. However, 454 read abundance and subsequent qPCR validation did not correlate well - suggesting that measures of gene expression via ad hoc analysis of sequence read abundance require careful interpretation. Conclusion Here we indicate that tandem gene arrangements and the occurrence of multiple transcribed gene variants are common and indicate potentially complex genomic arrangements in O. marina. Comparison of the reported data set with existing O. marina and other dinoflagellates ESTs indicates little sequence overlap likely as a result of the relatively limited extent of genome scale sequence data currently available for the dinoflagellates. This is one of the first 454-based transcriptome surveys of an ancestral dinoflagellate taxon and will undoubtedly prove useful for future comparative studies aimed at reconstructing the origin of novel features of the dinoflagellates. PMID:22014029
Exome sequencing and arrayCGH detection of gene sequence and copy number variation between ILS and ISS mouse strains.

PubMed

Dumas, Laura; Dickens, C Michael; Anderson, Nathan; Davis, Jonathan; Bennett, Beth; Radcliffe, Richard A; Sikela, James M

2014-06-01

It has been well documented that genetic factors can influence predisposition to develop alcoholism. While the underlying genomic changes may be of several types, two of the most common and disease associated are copy number variations (CNVs) and sequence alterations of protein coding regions. The goal of this study was to identify CNVs and single-nucleotide polymorphisms that occur in gene coding regions that may play a role in influencing the risk of an individual developing alcoholism. Toward this end, two mouse strains were used that have been selectively bred based on their differential sensitivity to alcohol: the Inbred long sleep (ILS) and Inbred short sleep (ISS) mouse strains. Differences in initial response to alcohol have been linked to risk for alcoholism, and the ILS/ISS strains are used to investigate the genetics of initial sensitivity to alcohol. Array comparative genomic hybridization (arrayCGH) and exome sequencing were conducted to identify CNVs and gene coding sequence differences, respectively, between ILS and ISS mice. Mouse arrayCGH was performed using catalog Agilent 1 × 244 k mouse arrays. Subsequently, exome sequencing was carried out using an Illumina HiSeq 2000 instrument. ArrayCGH detected 74 CNVs that were strain-specific (38 ILS/36 ISS), including several ISS-specific deletions that contained genes implicated in brain function and neurotransmitter release. Among several interesting coding variations detected by exome sequencing was the gain of a premature stop codon in the alpha-amylase 2B (AMY2B) gene specifically in the ILS strain. In total, exome sequencing detected 2,597 and 1,768 strain-specific exonic gene variants in the ILS and ISS mice, respectively. This study represents the most comprehensive and detailed genomic comparison of ILS and ISS mouse strains to date. The two complementary genome-wide approaches identified strain-specific CNVs and gene coding sequence variations that should provide strong candidates to contribute to the alcohol-related phenotypic differences associated with these strains.
Comprehensive Evaluation of the Association of APOE Genetic Variation with Plasma Lipoprotein Traits in U.S. Whites and African Blacks

PubMed Central

Radwan, Zaheda H.; Wang, Xingbin; Waqar, Fahad; Pirim, Dilek; Niemsiri, Vipavee; Hokanson, John E.; Hamman, Richard F.; Bunker, Clareann H.; Barmada, M. Michael; Demirci, F. Yesim; Kamboh, M. Ilyas

2014-01-01

Although common APOE genetic variation has a major influence on plasma LDL-cholesterol, its role in affecting HDL-cholesterol and triglycerides is not well established. Recent genome-wide association studies suggest that APOE also affects plasma variation in HDL-cholesterol and triglycerides. It is thus important to resequence the APOE gene to identify both common and uncommon variants that affect plasma lipid profile. Here, we have sequenced the APOE gene in 190 subjects with extreme HDL-cholesterol levels selected from two well-defined epidemiological samples of U.S. non-Hispanic Whites (NHWs) and African Blacks followed by genotyping of identified variants in the entire datasets (623 NHWs, 788 African Blacks) and association analyses with major lipid traits. We identified a total of 40 sequence variants, of which 10 are novel. A total of 32 variants, including common tagSNPs (≥5% frequency) and all uncommon variants (<5% frequency) were successfully genotyped and considered for genotype-phenotype associations. Other than the established associations of APOE*2 and APOE*4 with LDL-cholesterol, we have identified additional independent associations with LDL-cholesterol. We have also identified multiple associations of uncommon and common APOE variants with HDL-cholesterol and triglycerides. Our comprehensive sequencing and genotype-phenotype analyses indicate that APOE genetic variation impacts HDL-cholesterol and triglycerides in addition to affecting LDL-cholesterol. PMID:25502880
Development of a PCR assay to detect cyprinid herpesvirus 1 in koi and common carp.

PubMed

Viadanna, Pedro H O; Miller-Morgan, Tim; Peterson, Trace; Way, Keith; Stone, David M; Marty, Gary D; Pilarski, Fabiana; Hedrick, Ronald P; Waltzek, Thomas B

2017-02-08

Cyprinid herpesvirus 1 (CyHV1) infects all scaled and color varieties of common carp Cyprinus carpio, including koi. While it is most often associated with unsightly growths known as 'carp pox,' the underlying lesion (epidermal hyperplasia) can arise from a variety of disease processes. CyHV1-induced epidermal hyperplasia may occur transiently in response to water temperature, and thus histopathology cannot be used in isolation to assess CyHV1 infection status. To address this problem, here we describe a PCR assay targeted to the putative thymidine kinase gene of CyHV1. The PCR assay generates a 141 bp amplicon and reliably detects down to 10 copies of control plasmid DNA sequence (analytic sensitivity). The PCR does not cross-detect genomic DNA from cyprinid herpesvirus 2 and 3 (analytic specificity). The CyHV1 PCR effectively detected viral DNA in koi and common carp sampled from various locations in the UK, USA, Brazil, and Japan. Viral DNA was detected in both normal appearing and grossly affected epidermal tissues from koi experiencing natural epizootics. The new CyHV1 PCR provides an additional approach to histopathology for the rapid detection of CyHV1. Analysis of the thymidine kinase gene sequences determined for 7 PCR-positive carp originating from disparate geographical regions identified 3 sequence types, with 1 type occurring in both koi and common carp.
Benchmarking of Methods for Genomic Taxonomy

DOE PAGES

Larsen, Mette V.; Cosentino, Salvatore; Lukjancenko, Oksana; ...

2014-02-26

One of the first issues that emerges when a prokaryotic organism of interest is encountered is the question of what it is—that is, which species it is. The 16S rRNA gene formed the basis of the first method for sequence-based taxonomy and has had a tremendous impact on the field of microbiology. Nevertheless, the method has been found to have a number of shortcomings. In this paper, we trained and benchmarked five methods for whole-genome sequence-based prokaryotic species identification on a common data set of complete genomes: (i) SpeciesFinder, which is based on the complete 16S rRNA gene; (ii) Reads2Typemore » that searches for species-specific 50-mers in either the 16S rRNA gene or the gyrB gene (for the Enterobacteraceae family); (iii) the ribosomal multilocus sequence typing (rMLST) method that samples up to 53 ribosomal genes; (iv) TaxonomyFinder, which is based on species-specific functional protein domain profiles; and finally (v) KmerFinder, which examines the number of cooccurring k-mers (substrings of k nucleotides in DNA sequence data). The performances of the methods were subsequently evaluated on three data sets of short sequence reads or draft genomes from public databases. In total, the evaluation sets constituted sequence data from more than 11,000 isolates covering 159 genera and 243 species. Our results indicate that methods that sample only chromosomal, core genes have difficulties in distinguishing closely related species which only recently diverged. Finally, the KmerFinder method had the overall highest accuracy and correctly identified from 93% to 97% of the isolates in the evaluations sets.« less
hSAGEing: an improved SAGE-based software for identification of human tissue-specific or common tumor markers and suppressors.

PubMed

Yang, Cheng-Hong; Chuang, Li-Yeh; Shih, Tsung-Mu; Chang, Hsueh-Wei

2010-12-17

SAGE (serial analysis of gene expression) is a powerful method of analyzing gene expression for the entire transcriptome. There are currently many well-developed SAGE tools. However, the cross-comparison of different tissues is seldom addressed, thus limiting the identification of common- and tissue-specific tumor markers. To improve the SAGE mining methods, we propose a novel function for cross-tissue comparison of SAGE data by combining the mathematical set theory and logic with a unique "multi-pool method" that analyzes multiple pools of pair-wise case controls individually. When all the settings are in "inclusion", the common SAGE tag sequences are mined. When one tissue type is in "inclusion" and the other types of tissues are not in "inclusion", the selected tissue-specific SAGE tag sequences are generated. They are displayed in tags-per-million (TPM) and fold values, as well as visually displayed in four kinds of scales in a color gradient pattern. In the fold visualization display, the top scores of the SAGE tag sequences are provided, along with cluster plots. A user-defined matrix file is designed for cross-tissue comparison by selecting libraries from publically available databases or user-defined libraries. The hSAGEing tool provides a combination of friendly cross-tissue analysis and an interface for comparing SAGE libraries for the first time. Some up- or down-regulated genes with tissue-specific or common tumor markers and suppressors are identified computationally. The tool is useful and convenient for in silico cancer transcriptomic studies and is freely available at http://bio.kuas.edu.tw/hSAGEing.
The polyphenol oxidase gene family in land plants: Lineage-specific duplication and expansion

PubMed Central

2012-01-01

Background Plant polyphenol oxidases (PPOs) are enzymes that typically use molecular oxygen to oxidize ortho-diphenols to ortho-quinones. These commonly cause browning reactions following tissue damage, and may be important in plant defense. Some PPOs function as hydroxylases or in cross-linking reactions, but in most plants their physiological roles are not known. To better understand the importance of PPOs in the plant kingdom, we surveyed PPO gene families in 25 sequenced genomes from chlorophytes, bryophytes, lycophytes, and flowering plants. The PPO genes were then analyzed in silico for gene structure, phylogenetic relationships, and targeting signals. Results Many previously uncharacterized PPO genes were uncovered. The moss, Physcomitrella patens, contained 13 PPO genes and Selaginella moellendorffii (spike moss) and Glycine max (soybean) each had 11 genes. Populus trichocarpa (poplar) contained a highly diversified gene family with 11 PPO genes, but several flowering plants had only a single PPO gene. By contrast, no PPO-like sequences were identified in several chlorophyte (green algae) genomes or Arabidopsis (A. lyrata and A. thaliana). We found that many PPOs contained one or two introns often near the 3’ terminus. Furthermore, N-terminal amino acid sequence analysis using ChloroP and TargetP 1.1 predicted that several putative PPOs are synthesized via the secretory pathway, a unique finding as most PPOs are predicted to be chloroplast proteins. Phylogenetic reconstruction of these sequences revealed that large PPO gene repertoires in some species are mostly a consequence of independent bursts of gene duplication, while the lineage leading to Arabidopsis must have lost all PPO genes. Conclusion Our survey identified PPOs in gene families of varying sizes in all land plants except in the genus Arabidopsis. While we found variation in intron numbers and positions, overall PPO gene structure is congruent with the phylogenetic relationships based on primary sequence data. The dynamic nature of this gene family differentiates PPO from other oxidative enzymes, and is consistent with a protein important for a diversity of functions relating to environmental adaptation. PMID:22897796
Design and Construction of a Single-Tube, LATE-PCR, Multiplex Endpoint Assay with Lights-On/Lights-Off Probes for the Detection of Pathogens Associated with Sepsis

PubMed Central

Carver-Brown, Rachel K.; Reis, Arthur H.; Rice, Lisa M.; Czajka, John W.; Wangh, Lawrence J.

2012-01-01

Aims. The goal of this study was to construct a single tube molecular diagnostic multiplex assay for the detection of microbial pathogens commonly associated with septicemia, using LATE-PCR and Lights-On/Lights-Off probe technology. Methods and Results. The assay described here identified pathogens associated with sepsis by amplification and analysis of the 16S ribosomal DNA gene sequence for bacteria and specific gene sequences for fungi. A sequence from an unidentified gene in Lactococcus lactis subsp. cremoris served as a positive control for assay function. LATE-PCR was used to generate single-stranded amplicons that were then analyzed at endpoint over a wide temperature range in a specific fluorescent color. Each bacterial target was identified by its pattern of hybridization to Lights-On/Lights-Off probes derived from molecular beacons. Complex mixtures of targets were also detected. Conclusions. All microbial targets were identified in samples containing low starting copy numbers of pathogen genomic DNA, both as individual targets and in complex mixtures. Significance and Impact of the Study. This assay uses new technology to achieve an advance in the field of molecular diagnostics: a single-tube multiplex assay for identification of pathogens commonly associated with sepsis. PMID:23326668
cDNA nucleotide sequence coding for stearoyl-CoA desaturase and its expression in the zebrafish (Danio rerio) embryo.

PubMed

Hsieh, S L; Liu, R W; Wu, C H; Cheng, W T; Kuo, Ching-Ming

2003-12-01

A cDNA sequence of stearoyl-CoA desaturase (SCD) was determined from zebrafish (Danio rerio) and compared to the corresponding genes in several teleosts. Zebrafish SCD cDNA has a size of 1,061 bp, encodes a polypeptide of 325 amino acids, and shares 88, 85, 84, and 83% similarities with tilapia (Oreochromis mossambicus), grass carp (Ctenopharyngodon idella), common carp (Cyprinus carpio), and milkfish (Chanos chanos), respectively. This 1,061 bp sequence specifies a protein that, in common with other fatty acid desaturases, contains three histidine boxes, believed to be involved in catalysis. These observations suggested that SCD genes are highly conserved. In addition, an oligonucleotide probe complementary to zebrafish SCD mRNA was hybridized to mRNA of approximately 396 bases with Northern blot analysis. The Northern blot and RT-PCR analyses showed that the SCD mRNA was expressed predominantly in the liver, intestine, gill, and muscle, while a lower level was found in the brain. Furthermore, we utilized whole-mount in situ hybridization and real-time quantitative RT-PCR to identify expression of the zebrafish SCD gene at five different stages of development. This revealed that very high levels of transcripts were found in zebrafish at all stages during embryogenesis and early development. Copyright 2003 Wiley-Liss, Inc.
Next-Generation Sequencing-based genomic profiling of brain metastases of primary ovarian cancer identifies high number of BRCA-mutations.

PubMed

Balendran, S; Liebmann-Reindl, S; Berghoff, A S; Reischer, T; Popitsch, N; Geier, C B; Kenner, L; Birner, P; Streubel, B; Preusser, M

2017-07-01

Ovarian cancer represents the most common gynaecological malignancy and has the highest mortality of all female reproductive cancers. It has a rare predilection to develop brain metastases (BM). In this study, we evaluated the mutational profile of ovarian cancer metastases through Next-Generation Sequencing (NGS) with the aim of identifying potential clinically actionable genetic alterations with options for small molecule targeted therapy. Library preparation was conducted using Illumina TruSight Rapid Capture Kit in combination with a cancer specific enrichment kit covering 94 genes. BRCA-mutations were confirmed by using TruSeq Custom Amplicon Low Input Kit in combination with a custom-designed BRCA gene panel. In our cohort all eight sequenced BM samples exhibited a multitude of variant alterations, each with unique molecular profiles. The 37 identified variants were distributed over 22 cancer-related genes (23.4%). The number of mutated genes per sample ranged from 3 to 7 with a median of 4.5. The most commonly altered genes were BRCA1/2, TP53, and ATM. In total, 7 out of 8 samples revealed either a BRCA1 or a BRCA2 pathogenic mutation. Furthermore, all eight BM samples showed mutations in at least one DNA repair gene. Our NGS study of BM of ovarian carcinoma revealed a significant number of BRCA-mutations beside TP53, ATM and CHEK2 mutations. These findings strongly suggest the implication of BRCA and DNA repair malfunction in ovarian cancer metastasizing to the brain. Based on these findings, pharmacological PARP inhibition could be one potential targeted therapeutic for brain metastatic ovarian cancer patients.
The canine sarcoglycan delta gene: BAC clone contig assembly, chromosome assignment and interrogation as a candidate gene for dilated cardiomyopathy in Dobermann dogs.

PubMed

Stabej, P; Leegwater, P A J; Imholz, S; Versteeg, S A; Zijlstra, C; Stokhof, A A; Domanjko-Petriè, A; van Oost, B A

2005-01-01

Dilated cardiomyopathy (DCM) is a common disease of the myocardium recognized in human, dog and experimental animals. Genetic factors are responsible for a large proportion of cases in humans, and 17 genes with DCM causing mutations have been identified. The genetic origin of DCM in the Dobermann dogs has been suggested, but no disease genes have been identified to date. In this paper, we describe the characterization and evaluation of the canine sarcoglycan delta (SGCD), a gene implicated in DCM in human and hamster. Bacterial artificial chromosomes (BACs) containing the canine SGCD gene were isolated with probes for exon 3 and exons 4-8 and were characterized by Southern blot analysis. BAC end sequences were obtained for four BACs. Three of the BACs overlapped and could be ordered relative to each other and the end sequences of all four BACs could be anchored on the preliminary assembly of the dog genome sequence (www. ensembl.org). One of the BACs of the partial contig was localized by fluorescent in situ hybridization to canine chromosome 4q22, in agreement with the dog genome sequence. Two highly informative polymorphic microsatellite markers in intron 7 of the SGCD gene were identified. In 25 DCM-affected and 13 non DCM-affected dogs seven different haplotypes could be distinguished. However, no association between any of the SGCD variants and the disease locus was apparent.
Design and verification of a pangenome microarray oligonucleotide probe set for Dehalococcoides spp.

PubMed

Hug, Laura A; Salehi, Maryam; Nuin, Paulo; Tillier, Elisabeth R; Edwards, Elizabeth A

2011-08-01

Dehalococcoides spp. are an industrially relevant group of Chloroflexi bacteria capable of reductively dechlorinating contaminants in groundwater environments. Existing Dehalococcoides genomes revealed a high level of sequence identity within this group, including 98 to 100% 16S rRNA sequence identity between strains with diverse substrate specificities. Common molecular techniques for identification of microbial populations are often not applicable for distinguishing Dehalococcoides strains. Here we describe an oligonucleotide microarray probe set designed based on clustered Dehalococcoides genes from five different sources (strain DET195, CBDB1, BAV1, and VS genomes and the KB-1 metagenome). This "pangenome" probe set provides coverage of core Dehalococcoides genes as well as strain-specific genes while optimizing the potential for hybridization to closely related, previously unknown Dehalococcoides strains. The pangenome probe set was compared to probe sets designed independently for each of the five Dehalococcoides strains. The pangenome probe set demonstrated better predictability and higher detection of Dehalococcoides genes than strain-specific probe sets on nontarget strains with <99% average nucleotide identity. An in silico analysis of the expected probe hybridization against the recently released Dehalococcoides strain GT genome and additional KB-1 metagenome sequence data indicated that the pangenome probe set performs more robustly than the combined strain-specific probe sets in the detection of genes not included in the original design. The pangenome probe set represents a highly specific, universal tool for the detection and characterization of Dehalococcoides from contaminated sites. It has the potential to become a common platform for Dehalococcoides-focused research, allowing meaningful comparisons between microarray experiments regardless of the strain examined.
Whole Exome Sequencing of Patients with Steroid-Resistant Nephrotic Syndrome.

PubMed

Warejko, Jillian K; Tan, Weizhen; Daga, Ankana; Schapiro, David; Lawson, Jennifer A; Shril, Shirlee; Lovric, Svjetlana; Ashraf, Shazia; Rao, Jia; Hermle, Tobias; Jobst-Schwan, Tilman; Widmeier, Eugen; Majmundar, Amar J; Schneider, Ronen; Gee, Heon Yung; Schmidt, J Magdalena; Vivante, Asaf; van der Ven, Amelie T; Ityel, Hadas; Chen, Jing; Sadowski, Carolin E; Kohl, Stefan; Pabst, Werner L; Nakayama, Makiko; Somers, Michael J G; Rodig, Nancy M; Daouk, Ghaleb; Baum, Michelle; Stein, Deborah R; Ferguson, Michael A; Traum, Avram Z; Soliman, Neveen A; Kari, Jameela A; El Desoky, Sherif; Fathy, Hanan; Zenker, Martin; Bakkaloglu, Sevcan A; Müller, Dominik; Noyan, Aytul; Ozaltin, Fatih; Cadnapaphornchai, Melissa A; Hashmi, Seema; Hopcian, Jeffrey; Kopp, Jeffrey B; Benador, Nadine; Bockenhauer, Detlef; Bogdanovic, Radovan; Stajić, Nataša; Chernin, Gil; Ettenger, Robert; Fehrenbach, Henry; Kemper, Markus; Munarriz, Reyner Loza; Podracka, Ludmila; Büscher, Rainer; Serdaroglu, Erkin; Tasic, Velibor; Mane, Shrikant; Lifton, Richard P; Braun, Daniela A; Hildebrandt, Friedhelm

2018-01-06

Steroid-resistant nephrotic syndrome overwhelmingly progresses to ESRD. More than 30 monogenic genes have been identified to cause steroid-resistant nephrotic syndrome. We previously detected causative mutations using targeted panel sequencing in 30% of patients with steroid-resistant nephrotic syndrome. Panel sequencing has a number of limitations when compared with whole exome sequencing. We employed whole exome sequencing to detect monogenic causes of steroid-resistant nephrotic syndrome in an international cohort of 300 families. Three hundred thirty-five individuals with steroid-resistant nephrotic syndrome from 300 families were recruited from April of 1998 to June of 2016. Age of onset was restricted to <25 years of age. Exome data were evaluated for 33 known monogenic steroid-resistant nephrotic syndrome genes. In 74 of 300 families (25%), we identified a causative mutation in one of 20 genes known to cause steroid-resistant nephrotic syndrome. In 11 families (3.7%), we detected a mutation in a gene that causes a phenocopy of steroid-resistant nephrotic syndrome. This is consistent with our previously published identification of mutations using a panel approach. We detected a causative mutation in a known steroid-resistant nephrotic syndrome gene in 38% of consanguineous families and in 13% of nonconsanguineous families, and 48% of children with congenital nephrotic syndrome. A total of 68 different mutations were detected in 20 of 33 steroid-resistant nephrotic syndrome genes. Fifteen of these mutations were novel. NPHS1 , PLCE1 , NPHS2 , and SMARCAL1 were the most common genes in which we detected a mutation. In another 28% of families, we detected mutations in one or more candidate genes for steroid-resistant nephrotic syndrome. Whole exome sequencing is a sensitive approach toward diagnosis of monogenic causes of steroid-resistant nephrotic syndrome. A molecular genetic diagnosis of steroid-resistant nephrotic syndrome may have important consequences for the management of treatment and kidney transplantation in steroid-resistant nephrotic syndrome. Copyright © 2018 by the American Society of Nephrology.
Compound heterozygous MYO7A mutations segregating Usher syndrome type 2 in a Han family.

PubMed

Zong, Ling; Chen, Kaitian; Wu, Xuan; Liu, Min; Jiang, Hongyan

2016-11-01

Identification of rare deafness genes for inherited congenital sensorineural hearing impairment remains difficult, because a large variety of genes are implicated. In this study we applied targeted capture and next-generation sequencing to uncover the underlying gene in a three-generation Han family segregating recessive inherited hearing loss and retinitis pigmentosa. After excluding mutations in common deafness genes GJB2, SLC26A4 and the mitochondrial gene, genomic DNA of the proband of a Han family was subjected to targeted next-generation sequencing. The candidate mutations were confirmed by Sanger sequencing and subsequently analyzed with in silico tools. An unreported splice site mutation c.3924+1G > C compound with c.6028G > A in the MYO7A gene were detected to cosegregate with the phenotype in this pedigree. Both mutations, located in the evolutionarily conserved FERM domain in myosin VIIA, were predicted to be pathogenic. In this family, profound sensorineural hearing impairment and retinitis pigmentosa without vestibular disorder, constituted the typical Usher syndrome type 2. Identification of novel mutation in compound heterozygosity in MYO7A gene revealed the genetic origin of Usher syndrome type 2 in this Han family. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Identifying metabolic enzymes with multiple types of association evidence

PubMed Central

Kharchenko, Peter; Chen, Lifeng; Freund, Yoav; Vitkup, Dennis; Church, George M

2006-01-01

Background Existing large-scale metabolic models of sequenced organisms commonly include enzymatic functions which can not be attributed to any gene in that organism. Existing computational strategies for identifying such missing genes rely primarily on sequence homology to known enzyme-encoding genes. Results We present a novel method for identifying genes encoding for a specific metabolic function based on a local structure of metabolic network and multiple types of functional association evidence, including clustering of genes on the chromosome, similarity of phylogenetic profiles, gene expression, protein fusion events and others. Using E. coli and S. cerevisiae metabolic networks, we illustrate predictive ability of each individual type of association evidence and show that significantly better predictions can be obtained based on the combination of all data. In this way our method is able to predict 60% of enzyme-encoding genes of E. coli metabolism within the top 10 (out of 3551) candidates for their enzymatic function, and as a top candidate within 43% of the cases. Conclusion We illustrate that a combination of genome context and other functional association evidence is effective in predicting genes encoding metabolic enzymes. Our approach does not rely on direct sequence homology to known enzyme-encoding genes, and can be used in conjunction with traditional homology-based metabolic reconstruction methods. The method can also be used to target orphan metabolic activities. PMID:16571130
The genetic architecture of type 2 diabetes.

PubMed

Fuchsberger, Christian; Flannick, Jason; Teslovich, Tanya M; Mahajan, Anubha; Agarwala, Vineeta; Gaulton, Kyle J; Ma, Clement; Fontanillas, Pierre; Moutsianas, Loukas; McCarthy, Davis J; Rivas, Manuel A; Perry, John R B; Sim, Xueling; Blackwell, Thomas W; Robertson, Neil R; Rayner, N William; Cingolani, Pablo; Locke, Adam E; Tajes, Juan Fernandez; Highland, Heather M; Dupuis, Josee; Chines, Peter S; Lindgren, Cecilia M; Hartl, Christopher; Jackson, Anne U; Chen, Han; Huyghe, Jeroen R; van de Bunt, Martijn; Pearson, Richard D; Kumar, Ashish; Müller-Nurasyid, Martina; Grarup, Niels; Stringham, Heather M; Gamazon, Eric R; Lee, Jaehoon; Chen, Yuhui; Scott, Robert A; Below, Jennifer E; Chen, Peng; Huang, Jinyan; Go, Min Jin; Stitzel, Michael L; Pasko, Dorota; Parker, Stephen C J; Varga, Tibor V; Green, Todd; Beer, Nicola L; Day-Williams, Aaron G; Ferreira, Teresa; Fingerlin, Tasha; Horikoshi, Momoko; Hu, Cheng; Huh, Iksoo; Ikram, Mohammad Kamran; Kim, Bong-Jo; Kim, Yongkang; Kim, Young Jin; Kwon, Min-Seok; Lee, Juyoung; Lee, Selyeong; Lin, Keng-Han; Maxwell, Taylor J; Nagai, Yoshihiko; Wang, Xu; Welch, Ryan P; Yoon, Joon; Zhang, Weihua; Barzilai, Nir; Voight, Benjamin F; Han, Bok-Ghee; Jenkinson, Christopher P; Kuulasmaa, Teemu; Kuusisto, Johanna; Manning, Alisa; Ng, Maggie C Y; Palmer, Nicholette D; Balkau, Beverley; Stančáková, Alena; Abboud, Hanna E; Boeing, Heiner; Giedraitis, Vilmantas; Prabhakaran, Dorairaj; Gottesman, Omri; Scott, James; Carey, Jason; Kwan, Phoenix; Grant, George; Smith, Joshua D; Neale, Benjamin M; Purcell, Shaun; Butterworth, Adam S; Howson, Joanna M M; Lee, Heung Man; Lu, Yingchang; Kwak, Soo-Heon; Zhao, Wei; Danesh, John; Lam, Vincent K L; Park, Kyong Soo; Saleheen, Danish; So, Wing Yee; Tam, Claudia H T; Afzal, Uzma; Aguilar, David; Arya, Rector; Aung, Tin; Chan, Edmund; Navarro, Carmen; Cheng, Ching-Yu; Palli, Domenico; Correa, Adolfo; Curran, Joanne E; Rybin, Denis; Farook, Vidya S; Fowler, Sharon P; Freedman, Barry I; Griswold, Michael; Hale, Daniel Esten; Hicks, Pamela J; Khor, Chiea-Chuen; Kumar, Satish; Lehne, Benjamin; Thuillier, Dorothée; Lim, Wei Yen; Liu, Jianjun; van der Schouw, Yvonne T; Loh, Marie; Musani, Solomon K; Puppala, Sobha; Scott, William R; Yengo, Loïc; Tan, Sian-Tsung; Taylor, Herman A; Thameem, Farook; Wilson, Gregory; Wong, Tien Yin; Njølstad, Pål Rasmus; Levy, Jonathan C; Mangino, Massimo; Bonnycastle, Lori L; Schwarzmayr, Thomas; Fadista, João; Surdulescu, Gabriela L; Herder, Christian; Groves, Christopher J; Wieland, Thomas; Bork-Jensen, Jette; Brandslund, Ivan; Christensen, Cramer; Koistinen, Heikki A; Doney, Alex S F; Kinnunen, Leena; Esko, Tõnu; Farmer, Andrew J; Hakaste, Liisa; Hodgkiss, Dylan; Kravic, Jasmina; Lyssenko, Valeriya; Hollensted, Mette; Jørgensen, Marit E; Jørgensen, Torben; Ladenvall, Claes; Justesen, Johanne Marie; Käräjämäki, Annemari; Kriebel, Jennifer; Rathmann, Wolfgang; Lannfelt, Lars; Lauritzen, Torsten; Narisu, Narisu; Linneberg, Allan; Melander, Olle; Milani, Lili; Neville, Matt; Orho-Melander, Marju; Qi, Lu; Qi, Qibin; Roden, Michael; Rolandsson, Olov; Swift, Amy; Rosengren, Anders H; Stirrups, Kathleen; Wood, Andrew R; Mihailov, Evelin; Blancher, Christine; Carneiro, Mauricio O; Maguire, Jared; Poplin, Ryan; Shakir, Khalid; Fennell, Timothy; DePristo, Mark; de Angelis, Martin Hrabé; Deloukas, Panos; Gjesing, Anette P; Jun, Goo; Nilsson, Peter; Murphy, Jacquelyn; Onofrio, Robert; Thorand, Barbara; Hansen, Torben; Meisinger, Christa; Hu, Frank B; Isomaa, Bo; Karpe, Fredrik; Liang, Liming; Peters, Annette; Huth, Cornelia; O'Rahilly, Stephen P; Palmer, Colin N A; Pedersen, Oluf; Rauramaa, Rainer; Tuomilehto, Jaakko; Salomaa, Veikko; Watanabe, Richard M; Syvänen, Ann-Christine; Bergman, Richard N; Bharadwaj, Dwaipayan; Bottinger, Erwin P; Cho, Yoon Shin; Chandak, Giriraj R; Chan, Juliana C N; Chia, Kee Seng; Daly, Mark J; Ebrahim, Shah B; Langenberg, Claudia; Elliott, Paul; Jablonski, Kathleen A; Lehman, Donna M; Jia, Weiping; Ma, Ronald C W; Pollin, Toni I; Sandhu, Manjinder; Tandon, Nikhil; Froguel, Philippe; Barroso, Inês; Teo, Yik Ying; Zeggini, Eleftheria; Loos, Ruth J F; Small, Kerrin S; Ried, Janina S; DeFronzo, Ralph A; Grallert, Harald; Glaser, Benjamin; Metspalu, Andres; Wareham, Nicholas J; Walker, Mark; Banks, Eric; Gieger, Christian; Ingelsson, Erik; Im, Hae Kyung; Illig, Thomas; Franks, Paul W; Buck, Gemma; Trakalo, Joseph; Buck, David; Prokopenko, Inga; Mägi, Reedik; Lind, Lars; Farjoun, Yossi; Owen, Katharine R; Gloyn, Anna L; Strauch, Konstantin; Tuomi, Tiinamaija; Kooner, Jaspal Singh; Lee, Jong-Young; Park, Taesung; Donnelly, Peter; Morris, Andrew D; Hattersley, Andrew T; Bowden, Donald W; Collins, Francis S; Atzmon, Gil; Chambers, John C; Spector, Timothy D; Laakso, Markku; Strom, Tim M; Bell, Graeme I; Blangero, John; Duggirala, Ravindranath; Tai, E Shyong; McVean, Gilean; Hanis, Craig L; Wilson, James G; Seielstad, Mark; Frayling, Timothy M; Meigs, James B; Cox, Nancy J; Sladek, Rob; Lander, Eric S; Gabriel, Stacey; Burtt, Noël P; Mohlke, Karen L; Meitinger, Thomas; Groop, Leif; Abecasis, Goncalo; Florez, Jose C; Scott, Laura J; Morris, Andrew P; Kang, Hyun Min; Boehnke, Michael; Altshuler, David; McCarthy, Mark I

2016-08-04

The genetic architecture of common traits, including the number, frequency, and effect sizes of inherited variants that contribute to individual risk, has been long debated. Genome-wide association studies have identified scores of common variants associated with type 2 diabetes, but in aggregate, these explain only a fraction of the heritability of this disease. Here, to test the hypothesis that lower-frequency variants explain much of the remainder, the GoT2D and T2D-GENES consortia performed whole-genome sequencing in 2,657 European individuals with and without diabetes, and exome sequencing in 12,940 individuals from five ancestry groups. To increase statistical power, we expanded the sample size via genotyping and imputation in a further 111,548 subjects. Variants associated with type 2 diabetes after sequencing were overwhelmingly common and most fell within regions previously identified by genome-wide association studies. Comprehensive enumeration of sequence variation is necessary to identify functional alleles that provide important clues to disease pathophysiology, but large-scale sequencing does not support the idea that lower-frequency variants have a major role in predisposition to type 2 diabetes.
The genetic architecture of type 2 diabetes

PubMed Central

Ma, Clement; Fontanillas, Pierre; Moutsianas, Loukas; McCarthy, Davis J; Rivas, Manuel A; Perry, John R B; Sim, Xueling; Blackwell, Thomas W; Robertson, Neil R; Rayner, N William; Cingolani, Pablo; Locke, Adam E; Tajes, Juan Fernandez; Highland, Heather M; Dupuis, Josee; Chines, Peter S; Lindgren, Cecilia M; Hartl, Christopher; Jackson, Anne U; Chen, Han; Huyghe, Jeroen R; van de Bunt, Martijn; Pearson, Richard D; Kumar, Ashish; Müller-Nurasyid, Martina; Grarup, Niels; Stringham, Heather M; Gamazon, Eric R; Lee, Jaehoon; Chen, Yuhui; Scott, Robert A; Below, Jennifer E; Chen, Peng; Huang, Jinyan; Go, Min Jin; Stitzel, Michael L; Pasko, Dorota; Parker, Stephen C J; Varga, Tibor V; Green, Todd; Beer, Nicola L; Day-Williams, Aaron G; Ferreira, Teresa; Fingerlin, Tasha; Horikoshi, Momoko; Hu, Cheng; Huh, Iksoo; Ikram, Mohammad Kamran; Kim, Bong-Jo; Kim, Yongkang; Kim, Young Jin; Kwon, Min-Seok; Lee, Juyoung; Lee, Selyeong; Lin, Keng-Han; Maxwell, Taylor J; Nagai, Yoshihiko; Wang, Xu; Welch, Ryan P; Yoon, Joon; Zhang, Weihua; Barzilai, Nir; Voight, Benjamin F; Han, Bok-Ghee; Jenkinson, Christopher P; Kuulasmaa, Teemu; Kuusisto, Johanna; Manning, Alisa; Ng, Maggie C Y; Palmer, Nicholette D; Balkau, Beverley; Stančáková, Alena; Abboud, Hanna E; Boeing, Heiner; Giedraitis, Vilmantas; Prabhakaran, Dorairaj; Gottesman, Omri; Scott, James; Carey, Jason; Kwan, Phoenix; Grant, George; Smith, Joshua D; Neale, Benjamin M; Purcell, Shaun; Butterworth, Adam S; Howson, Joanna M M; Lee, Heung Man; Lu, Yingchang; Kwak, Soo-Heon; Zhao, Wei; Danesh, John; Lam, Vincent K L; Park, Kyong Soo; Saleheen, Danish; So, Wing Yee; Tam, Claudia H T; Afzal, Uzma; Aguilar, David; Arya, Rector; Aung, Tin; Chan, Edmund; Navarro, Carmen; Cheng, Ching-Yu; Palli, Domenico; Correa, Adolfo; Curran, Joanne E; Rybin, Denis; Farook, Vidya S; Fowler, Sharon P; Freedman, Barry I; Griswold, Michael; Hale, Daniel Esten; Hicks, Pamela J; Khor, Chiea-Chuen; Kumar, Satish; Lehne, Benjamin; Thuillier, Dorothée; Lim, Wei Yen; Liu, Jianjun; van der Schouw, Yvonne T; Loh, Marie; Musani, Solomon K; Puppala, Sobha; Scott, William R; Yengo, Loïc; Tan, Sian-Tsung; Taylor, Herman A; Thameem, Farook; Wilson, Gregory; Wong, Tien Yin; Njølstad, Pål Rasmus; Levy, Jonathan C; Mangino, Massimo; Bonnycastle, Lori L; Schwarzmayr, Thomas; Fadista, João; Surdulescu, Gabriela L; Herder, Christian; Groves, Christopher J; Wieland, Thomas; Bork-Jensen, Jette; Brandslund, Ivan; Christensen, Cramer; Koistinen, Heikki A; Doney, Alex S F; Kinnunen, Leena; Esko, Tõnu; Farmer, Andrew J; Hakaste, Liisa; Hodgkiss, Dylan; Kravic, Jasmina; Lyssenko, Valeriya; Hollensted, Mette; Jørgensen, Marit E; Jørgensen, Torben; Ladenvall, Claes; Justesen, Johanne Marie; Käräjämäki, Annemari; Kriebel, Jennifer; Rathmann, Wolfgang; Lannfelt, Lars; Lauritzen, Torsten; Narisu, Narisu; Linneberg, Allan; Melander, Olle; Milani, Lili; Neville, Matt; Orho-Melander, Marju; Qi, Lu; Qi, Qibin; Roden, Michael; Rolandsson, Olov; Swift, Amy; Rosengren, Anders H; Stirrups, Kathleen; Wood, Andrew R; Mihailov, Evelin; Blancher, Christine; Carneiro, Mauricio O; Maguire, Jared; Poplin, Ryan; Shakir, Khalid; Fennell, Timothy; DePristo, Mark; de Angelis, Martin Hrabé; Deloukas, Panos; Gjesing, Anette P; Jun, Goo; Nilsson, Peter; Murphy, Jacquelyn; Onofrio, Robert; Thorand, Barbara; Hansen, Torben; Meisinger, Christa; Hu, Frank B; Isomaa, Bo; Karpe, Fredrik; Liang, Liming; Peters, Annette; Huth, Cornelia; O'Rahilly, Stephen P; Palmer, Colin N A; Pedersen, Oluf; Rauramaa, Rainer; Tuomilehto, Jaakko; Salomaa, Veikko; Watanabe, Richard M; Syvänen, Ann-Christine; Bergman, Richard N; Bharadwaj, Dwaipayan; Bottinger, Erwin P; Cho, Yoon Shin; Chandak, Giriraj R; Chan, Juliana C N; Chia, Kee Seng; Daly, Mark J; Ebrahim, Shah B; Langenberg, Claudia; Elliott, Paul; Jablonski, Kathleen A; Lehman, Donna M; Jia, Weiping; Ma, Ronald C W; Pollin, Toni I; Sandhu, Manjinder; Tandon, Nikhil; Froguel, Philippe; Barroso, Inês; Teo, Yik Ying; Zeggini, Eleftheria; Loos, Ruth J F; Small, Kerrin S; Ried, Janina S; DeFronzo, Ralph A; Grallert, Harald; Glaser, Benjamin; Metspalu, Andres; Wareham, Nicholas J; Walker, Mark; Banks, Eric; Gieger, Christian; Ingelsson, Erik; Im, Hae Kyung; Illig, Thomas; Franks, Paul W; Buck, Gemma; Trakalo, Joseph; Buck, David; Prokopenko, Inga; Mägi, Reedik; Lind, Lars; Farjoun, Yossi; Owen, Katharine R; Gloyn, Anna L; Strauch, Konstantin; Tuomi, Tiinamaija; Kooner, Jaspal Singh; Lee, Jong-Young; Park, Taesung; Donnelly, Peter; Morris, Andrew D; Hattersley, Andrew T; Bowden, Donald W; Collins, Francis S; Atzmon, Gil; Chambers, John C; Spector, Timothy D; Laakso, Markku; Strom, Tim M; Bell, Graeme I; Blangero, John; Duggirala, Ravindranath; Tai, E Shyong; McVean, Gilean; Hanis, Craig L; Wilson, James G; Seielstad, Mark; Frayling, Timothy M; Meigs, James B; Cox, Nancy J; Sladek, Rob; Lander, Eric S; Gabriel, Stacey; Burtt, Noël P; Mohlke, Karen L; Meitinger, Thomas; Groop, Leif; Abecasis, Goncalo; Florez, Jose C; Scott, Laura J; Morris, Andrew P; Kang, Hyun Min; Boehnke, Michael; Altshuler, David; McCarthy, Mark I

2016-01-01

The genetic architecture of common traits, including the number, frequency, and effect sizes of inherited variants that contribute to individual risk, has been long debated. Genome-wide association studies have identified scores of common variants associated with type 2 diabetes, but in aggregate, these explain only a fraction of heritability. To test the hypothesis that lower-frequency variants explain much of the remainder, the GoT2D and T2D-GENES consortia performed whole genome sequencing in 2,657 Europeans with and without diabetes, and exome sequencing in a total of 12,940 subjects from five ancestral groups. To increase statistical power, we expanded sample size via genotyping and imputation in a further 111,548 subjects. Variants associated with type 2 diabetes after sequencing were overwhelmingly common and most fell within regions previously identified by genome-wide association studies. Comprehensive enumeration of sequence variation is necessary to identify functional alleles that provide important clues to disease pathophysiology, but large-scale sequencing does not support a major role for lower-frequency variants in predisposition to type 2 diabetes. PMID:27398621
Computational identification of developmental enhancers:conservation and function of transcription factor binding-site clustersin drosophila melanogaster and drosophila psedoobscura

DOE Office of Scientific and Technical Information (OSTI.GOV)

Berman, Benjamin P.; Pfeiffer, Barret D.; Laverty, Todd R.

2004-08-06

Background The identification of sequences that control transcription in metazoans is a major goal of genome analysis. In a previous study, we demonstrated that searching for clusters of predicted transcription factor binding sites could discover active regulatory sequences, and identified 37 regions of the Drosophila melanogaster genome with high densities of predicted binding sites for five transcription factors involved in anterior-posterior embryonic patterning. Nine of these clusters overlapped known enhancers. Here, we report the results of in vivo functional analysis of 27 remaining clusters. Results We generated transgenic flies carrying each cluster attached to a basal promoter and reporter gene,more » and assayed embryos for reporter gene expression. Six clusters are enhancers of adjacent genes: giant, fushi tarazu, odd-skipped, nubbin, squeeze and pdm2; three drive expression in patterns unrelated to those of neighboring genes; the remaining 18 do not appear to have enhancer activity. We used the Drosophila pseudoobscura genome to compare patterns of evolution in and around the 15 positive and 18 false-positive predictions. Although conservation of primary sequence cannot distinguish true from false positives, conservation of binding-site clustering accurately discriminates functional binding-site clusters from those with no function. We incorporated conservation of binding-site clustering into a new genome-wide enhancer screen, and predict several hundred new regulatory sequences, including 85 adjacent to genes with embryonic patterns. Conclusions Measuring conservation of sequence features closely linked to function - such as binding-site clustering - makes better use of comparative sequence data than commonly used methods that examine only sequence identity.« less

Genomewide investigation of adaptation to harmful algal blooms in common bottlenose dolphins (Tursiops truncatus).

PubMed

Cammen, Kristina M; Schultz, Thomas F; Rosel, Patricia E; Wells, Randall S; Read, Andrew J

2015-09-01

Harmful algal blooms (HABs), which can be lethal in marine species and cause illness in humans, are increasing worldwide. In the Gulf of Mexico, HABs of Karenia brevis produce neurotoxic brevetoxins that cause large-scale marine mortality events. The long history of such blooms, combined with the potentially severe effects of exposure, may have produced a strong selective pressure for evolved resistance. Advances in next-generation sequencing, in particular genotyping-by-sequencing, greatly enable the genomic study of such adaptation in natural populations. We used restriction site-associated DNA (RAD) sequencing to investigate brevetoxicosis resistance in common bottlenose dolphins (Tursiops truncatus). To improve our understanding of the epidemiology and aetiology of brevetoxicosis and the potential for evolved resistance in an upper trophic level predator, we sequenced pools of genomic DNA from dolphins sampled from both coastal and estuarine populations in Florida and during multiple HAB-associated mortality events. We sequenced 129 594 RAD loci and analysed 7431 single nucleotide polymorphisms (SNPs). The allele frequencies of many of these polymorphic loci differed significantly between live and dead dolphins. Some loci associated with survival showed patterns suggesting a common genetic-based mechanism of resistance to brevetoxins in bottlenose dolphins along the Gulf coast of Florida, but others suggested regionally specific mechanisms of resistance or reflected differences among HABs. We identified candidate genes that may be the evolutionary target for brevetoxin resistance by searching the dolphin genome for genes adjacent to survival-associated SNPs. © 2015 John Wiley & Sons Ltd.
Targeted discovery of single-nucleotide polymorphisms in an unmarked wheat chromosomal region containing the Hessian fly resistance gene H33

USDA-ARS?s Scientific Manuscript database

The highly effective Hessian fly-resistance gene, H33, was introgressed from durum wheat into common wheat and genetically mapped to chromosome 3AS, in previous research. However, H33 located to a region that is well-known to be devoid of molecular markers, with the closest flanking simple sequence ...
Differential Impact of the "FMR1" Gene on Visual Processing in Fragile X Syndrome

ERIC Educational Resources Information Center

Kogan, Cary S.; Boutet, Isabelle; Cornish, Kim; Zangenehpour, Shahin; Mullen, Kathy T.; Holden, Jeanette J. A.; Kaloustian, Vazken M. Der; Andermann, Eva; Chaudhuri, Avi

2004-01-01

Fragile X syndrome (FXS) is the most common form of heritable mental retardation, affecting (~ around) 1 in 4000 males. The syndrome arises from expansion of a trinucleotide repeat in the 5'-untranslated region of the fragile X mental retardation 1 ("FMR1") gene, leading to methylation of the promoter sequence and lack of the fragile X mental…
A single determinant dominates the rate of yeast protein evolution.

PubMed

Drummond, D Allan; Raval, Alpan; Wilke, Claus O

2006-02-01

A gene's rate of sequence evolution is among the most fundamental evolutionary quantities in common use, but what determines evolutionary rates has remained unclear. Here, we carry out the first combined analysis of seven predictors (gene expression level, dispensability, protein abundance, codon adaptation index, gene length, number of protein-protein interactions, and the gene's centrality in the interaction network) previously reported to have independent influences on protein evolutionary rates. Strikingly, our analysis reveals a single dominant variable linked to the number of translation events which explains 40-fold more variation in evolutionary rate than any other, suggesting that protein evolutionary rate has a single major determinant among the seven predictors. The dominant variable explains nearly half the variation in the rate of synonymous and protein evolution. We show that the two most commonly used methods to disentangle the determinants of evolutionary rate, partial correlation analysis and ordinary multivariate regression, produce misleading or spurious results when applied to noisy biological data. We overcome these difficulties by employing principal component regression, a multivariate regression of evolutionary rate against the principal components of the predictor variables. Our results support the hypothesis that translational selection governs the rate of synonymous and protein sequence evolution in yeast.
Massive Losses of Taste Receptor Genes in Toothed and Baleen Whales

PubMed Central

Feng, Ping; Zheng, Jinsong; Rossiter, Stephen J.; Wang, Ding; Zhao, Huabin

2014-01-01

Taste receptor genes are functionally important in animals, with a surprising exception in the bottlenose dolphin, which shows extensive losses of sweet, umami, and bitter taste receptor genes. To examine the generality of taste gene loss, we examined seven toothed whales and five baleen whales and sequenced the complete repertoire of three sweet/umami (T1Rs) and ten bitter (T2Rs) taste receptor genes. We found all amplified T1Rs and T2Rs to be pseudogenes in all 12 whales, with a shared premature stop codon in 10 of the 13 genes, which demonstrated massive losses of taste receptor genes in the common ancestor of whales. Furthermore, we analyzed three genome sequences from two toothed whales and one baleen whale and found that the sour taste marker gene Pkd2l1 is a pseudogene, whereas the candidate salty taste receptor genes are intact and putatively functional. Additionally, we examined three genes that are responsible for taste signal transduction and found the relaxation of functional constraints on taste signaling pathways along the ancestral branch leading to whales. Together, our results strongly suggest extensive losses of sweet, umami, bitter, and sour tastes in whales, and the relaxation of taste function most likely arose in the common ancestor of whales between 36 and 53 Ma. Therefore, whales represent the first animal group to lack four of five primary tastes, probably driven by the marine environment with high concentration of sodium, the feeding behavior of swallowing prey whole, and the dietary switch from plants to meat in the whale ancestor. PMID:24803572
RNA-Seq Analysis of Developing Pecan (Carya illinoinensis) Embryos Reveals Parallel Expression Patterns among Allergen and Lipid Metabolism Genes.

PubMed

Mattison, Christopher P; Rai, Ruhi; Settlage, Robert E; Hinchliffe, Doug J; Madison, Crista; Bland, John M; Brashear, Suzanne; Graham, Charles J; Tarver, Matthew R; Florane, Christopher; Bechtel, Peter J

2017-02-22

The pecan nut is a nutrient-rich part of a healthy diet full of beneficial fatty acids and antioxidants, but can also cause allergic reactions in people suffering from food allergy to the nuts. The transcriptome of a developing pecan nut was characterized to identify the gene expression occurring during the process of nut development and to highlight those genes involved in fatty acid metabolism and those that commonly act as food allergens. Pecan samples were collected at several time points during the embryo development process including the water, gel, dough, and mature nut stages. Library preparation and sequencing were performed using Illumina-based mRNA HiSeq with RNA from four time points during the growing season during August and September 2012. Sequence analysis with Trinotate software following the Trinity protocol identified 133,000 unigenes with 52,267 named transcripts and 45,882 annotated genes. A total of 27,312 genes were defined by GO annotation. Gene expression clustering analysis identified 12 different gene expression profiles, each containing a number of genes. Three pecan seed storage proteins that commonly act as allergens, Car i 1, Car i 2, and Car i 4, were significantly up-regulated during the time course. Up-regulated fatty acid metabolism genes that were identified included acyl-[ACP] desaturase and omega-6 desaturase genes involved in oleic and linoleic acid metabolism. Notably, a few of the up-regulated acyl-[ACP] desaturase and omega-6 desaturase genes that were identified have expression patterns similar to the allergen genes based upon gene expression clustering and qPCR analysis. These findings suggest the possibility of coordinated accumulation of lipids and allergens during pecan nut embryogenesis.
Comparison of the Exomes of Common Carp (Cyprinus carpio) and Zebrafish (Danio rerio)

PubMed Central

Henkel, Christiaan V.; Dirks, Ron P.; Jansen, Hans J.; Forlenza, Maria; Wiegertjes, Geert F.; Howe, Kerstin; van den Thillart, Guido E.E.J.M.

2012-01-01

Abstract Research on common carp, Cyprinus carpio, is beneficial for zebrafish research because of resources available owing to its large body size, such as the availability of sufficient organ material for transcriptomics, proteomics, and metabolomics. Here we describe the shot gun sequencing of a clonal double-haploid common carp line. The assembly consists of 511891 scaffolds with an N50 of 17 kb, predicting a total genome size of 1.4–1.5 Gb. A detailed analysis of the ten largest scaffolds indicates that the carp genome has a considerably lower repeat coverage than zebrafish, whilst the average intron size is significantly smaller, making it comparable to the fugu genome. The quality of the scaffolding was confirmed by comparisons with RNA deep sequencing data sets and a manual analysis for synteny with the zebrafish, especially the Hox gene clusters. In the ten largest scaffolds analyzed, the synteny of genes is almost complete. Comparisons of predicted exons of common carp with those of the zebrafish revealed only few genes specific for either zebrafish or carp, most of these being of unknown function. This supports the hypothesis of an additional genome duplication event in the carp evolutionary history, which—due to a higher degree of compactness—did not result in a genome larger than that of zebrafish. PMID:22715948
Detection of inherited mutations for breast and ovarian cancer using genomic capture and massively parallel sequencing

PubMed Central

Walsh, Tom; Lee, Ming K.; Casadei, Silvia; Thornton, Anne M.; Stray, Sunday M.; Pennil, Christopher; Nord, Alex S.; Mandell, Jessica B.; Swisher, Elizabeth M.; King, Mary-Claire

2010-01-01

Inherited loss-of-function mutations in the tumor suppressor genes BRCA1, BRCA2, and multiple other genes predispose to high risks of breast and/or ovarian cancer. Cancer-associated inherited mutations in these genes are collectively quite common, but individually rare or even private. Genetic testing for BRCA1 and BRCA2 mutations has become an integral part of clinical practice, but testing is generally limited to these two genes and to women with severe family histories of breast or ovarian cancer. To determine whether massively parallel, “next-generation” sequencing would enable accurate, thorough, and cost-effective identification of inherited mutations for breast and ovarian cancer, we developed a genomic assay to capture, sequence, and detect all mutations in 21 genes, including BRCA1 and BRCA2, with inherited mutations that predispose to breast or ovarian cancer. Constitutional genomic DNA from subjects with known inherited mutations, ranging in size from 1 to >100,000 bp, was hybridized to custom oligonucleotides and then sequenced using a genome analyzer. Analysis was carried out blind to the mutation in each sample. Average coverage was >1200 reads per base pair. After filtering sequences for quality and number of reads, all single-nucleotide substitutions, small insertion and deletion mutations, and large genomic duplications and deletions were detected. There were zero false-positive calls of nonsense mutations, frameshift mutations, or genomic rearrangements for any gene in any of the test samples. This approach enables widespread genetic testing and personalized risk assessment for breast and ovarian cancer. PMID:20616022
An RNA-Seq based gene expression atlas of the common bean.

PubMed

O'Rourke, Jamie A; Iniguez, Luis P; Fu, Fengli; Bucciarelli, Bruna; Miller, Susan S; Jackson, Scott A; McClean, Philip E; Li, Jun; Dai, Xinbin; Zhao, Patrick X; Hernandez, Georgina; Vance, Carroll P

2014-10-06

Common bean (Phaseolus vulgaris) is grown throughout the world and comprises roughly 50% of the grain legumes consumed worldwide. Despite this, genetic resources for common beans have been lacking. Next generation sequencing, has facilitated our investigation of the gene expression profiles associated with biologically important traits in common bean. An increased understanding of gene expression in common bean will improve our understanding of gene expression patterns in other legume species. Combining recently developed genomic resources for Phaseolus vulgaris, including predicted gene calls, with RNA-Seq technology, we measured the gene expression patterns from 24 samples collected from seven tissues at developmentally important stages and from three nitrogen treatments. Gene expression patterns throughout the plant were analyzed to better understand changes due to nodulation, seed development, and nitrogen utilization. We have identified 11,010 genes differentially expressed with a fold change ≥ 2 and a P-value < 0.05 between different tissues at the same time point, 15,752 genes differentially expressed within a tissue due to changes in development, and 2,315 genes expressed only in a single tissue. These analyses identified 2,970 genes with expression patterns that appear to be directly dependent on the source of available nitrogen. Finally, we have assembled this data in a publicly available database, The Phaseolus vulgaris Gene Expression Atlas (Pv GEA), http://plantgrn.noble.org/PvGEA/ . Using the website, researchers can query gene expression profiles of their gene of interest, search for genes expressed in different tissues, or download the dataset in a tabular form. These data provide the basis for a gene expression atlas, which will facilitate functional genomic studies in common bean. Analysis of this dataset has identified genes important in regulating seed composition and has increased our understanding of nodulation and impact of the nitrogen source on assimilation and distribution throughout the plant.
Transcriptomic analysis reveals differential gene expression in response to aluminium in common bean (Phaseolus vulgaris) genotypes

PubMed Central

Eticha, Dejene; Zahn, Marc; Bremer, Melanie; Yang, Zhongbao; Rangel, Andrés F.; Rao, Idupulapati M.; Horst, Walter J.

2010-01-01

Background and Aims Aluminium (Al) resistance in common bean is known to be due to exudation of citrate from the root after a lag phase, indicating the induction of gene transcription and protein synthesis. The aims of this study were to identify Al-induced differentially expressed genes and to analyse the expression of candidate genes conferring Al resistance in bean. Methods The suppression subtractive hybridization (SSH) method was used to identify differentially expressed genes in an Al-resistant bean genotype (‘Quimbaya’) during the induction period. Using quantitative real-time PCR the expression patterns of selected genes were compared between an Al-resistant and an Al-sensitive genotype (‘VAX 1’) treated with Al for up to 24 h. Key Results Short-term Al treatment resulted in up-regulation of stress-induced genes and down-regulation of genes involved in metabolism. However, the expressions of genes encoding enzymes involved in citrate metabolism were not significantly affected by Al. Al treatment dramatically increased the expression of common bean expressed sequence tags belonging to the citrate transporter gene family MATE (multidrug and toxin extrusion family protein) in both the Al-resistant and -sensitive genotype in close agreement with Al-induced citrate exudation. Conclusions The expression of a citrate transporter MATE gene is crucial for citrate exudation in common bean. However, although the expression of the citrate transporter is a prerequisite for citrate exudation, genotypic Al resistance in common bean particularly depends on the capacity to sustain the synthesis of citrate for maintaining the cytosolic citrate pool that enables exudation. PMID:20237115
Two Δ6-desaturase-like genes in common carp (Cyprinus carpio var. Jian): structure characterization, mRNA expression, temperature and nutritional regulation.

PubMed

Ren, Hong-tao; Zhang, Guang-qin; Li, Jian-lin; Tang, Yong-kai; Li, Hong-xia; Yu, Ju-hua; Xu, Pao

2013-08-01

Δ6-Desaturase is the rate-limiting enzyme involved in highly unsaturated fatty acid (HUFA) biosynthesis. There is very little information on the evolution and functional characterization of Δ6Fad-a and Δ6Fad-b in common carp (Cyprinus carpio var. Jian). In the present study, the genomic sequences and structures of two putative Δ6-desaturase-like genes in common carp genome were obtained. We investigated the mRNA expression patterns of Δ6Fad-a and Δ6Fad-b in tissue, hatching carp embryos, larvae by temperature shock and juveniles under nutritional regulation. Our results showed that the two Δ6Fad genes had identical coding exon structures, being comprised of 12 coding exons, and with introns of distinct size and sequence composition. They were not allelic variants of a single gene. Both Δ6Fad genes were highly expressed in liver, intestine (pyloric caeca) and brain. The Δ6Fad-a and Δ6Fad-b mRNAs showed an increase in expression from newly hatched to 25 days after hatching. The expression levels of Δ6Fad-a were obviously regulated by temperature, whereas Δ6Fad-b was not affected by temperature. The regulation of Δ6Fad-a and Δ6Fad-b in response to dietary fatty acid composition was determined in liver, brain and intestine (pyloric caeca) of common carp fed with diets: diet1with fish oil (FO) rich in n-3 HUFA, diet2 with corn oil (CO, 18:2n-6) and diet3 with linseed oil (LO, 18:3n-3). The differential expression of Δ6Fad-a and Δ6Fad-b genes in liver, brain and intestine in common carps was fed with different oil sources, respectively. Further work is in progress to determine the mechanism of differential expression of the Δ6Fad-a and Δ6Fad-b genes in different tissues and the roles of transcription factors in regulating HUFA synthesis. Copyright © 2013 Elsevier B.V. All rights reserved.
PIMMS (Pragmatic Insertional Mutation Mapping System) Laboratory Methodology a Readily Accessible Tool for Identification of Essential Genes in Streptococcus

PubMed Central

Blanchard, Adam M.; Egan, Sharon A.; Emes, Richard D.; Warry, Andrew; Leigh, James A.

2016-01-01

The Pragmatic Insertional Mutation Mapping (PIMMS) laboratory protocol was developed alongside various bioinformatics packages (Blanchard et al., 2015) to enable detection of essential and conditionally essential genes in Streptococcus and related bacteria. This extended the methodology commonly used to locate insertional mutations in individual mutants to the analysis of mutations in populations of bacteria. In Streptococcus uberis, a pyogenic Streptococcus associated with intramammary infection and mastitis in ruminants, the mutagen pGhost9:ISS1 was shown to integrate across the entire genome. Analysis of >80,000 mutations revealed 196 coding sequences, which were not be mutated and a further 67 where mutation only occurred beyond the 90th percentile of the coding sequence. These sequences showed good concordance with sequences within the database of essential genes and typically matched sequences known to be associated with basic cellular functions. Due to the broad utility of this mutagen and the simplicity of the methodology it is anticipated that PIMMS will be of value to a wide range of laboratories in functional genomic analysis of a wide range of Gram positive bacteria (Streptococcus, Enterococcus, and Lactococcus) of medical, veterinary, and industrial significance. PMID:27826289
Deep sequencing and genome-wide analysis reveals the expansion of MicroRNA genes in the gall midge Mayetiola destructor

PubMed Central

2013-01-01

Background MicroRNAs (miRNAs) are small non-coding RNAs that play critical roles in regulating post transcriptional gene expression. Gall midges encompass a large group of insects that are of economic importance and also possess fascinating biological traits. The gall midge Mayetiola destructor, commonly known as the Hessian fly, is a destructive pest of wheat and model organism for studying gall midge biology and insect – host plant interactions. Results In this study, we systematically analyzed miRNAs from the Hessian fly. Deep-sequencing a Hessian fly larval transcriptome led to the identification of 89 miRNA species that are either identical or very similar to known miRNAs from other insects, and 184 novel miRNAs that have not been reported from other species. A genome-wide search through a draft Hessian fly genome sequence identified a total of 611 putative miRNA-encoding genes based on sequence similarity and the existence of a stem-loop structure for miRNA precursors. Analysis of the 611 putative genes revealed a striking feature: the dramatic expansion of several miRNA gene families. The largest family contained 91 genes that encoded 20 different miRNAs. Microarray analyses revealed the expression of miRNA genes was strictly regulated during Hessian fly larval development and abundance of many miRNA genes were affected by host genotypes. Conclusion The identification of a large number of miRNAs for the first time from a gall midge provides a foundation for further studies of miRNA functions in gall midge biology and behavior. The dramatic expansion of identical or similar miRNAs provides a unique system to study functional relations among miRNA iso-genes as well as changes in sequence specificity due to small changes in miRNAs and in their mRNA targets. These results may also facilitate the identification of miRNA genes for potential pest control through transgenic approaches. PMID:23496979
The detection and phylogenetic analysis of the alkane 1-monooxygenase gene of members of the genus Rhodococcus.

PubMed

Táncsics, András; Benedek, Tibor; Szoboszlay, Sándor; Veres, Péter G; Farkas, Milán; Máthé, István; Márialigeti, Károly; Kukolya, József; Lányi, Szabolcs; Kriszt, Balázs

2015-02-01

Naturally occurring and anthropogenic petroleum hydrocarbons are potential carbon sources for many bacteria. The AlkB-related alkane hydroxylases, which are integral membrane non-heme iron enzymes, play a key role in the microbial degradation of many of these hydrocarbons. Several members of the genus Rhodococcus are well-known alkane degraders and are known to harbor multiple alkB genes encoding for different alkane 1-monooxygenases. In the present study, 48 Rhodococcus strains, representing 35 species of the genus, were investigated to find out whether there was a dominant type of alkB gene widespread among species of the genus that could be used as a phylogenetic marker. Phylogenetic analysis of rhodococcal alkB gene sequences indicated that a certain type of alkB gene was present in almost every member of the genus Rhodococcus. These alkB genes were common in a unique nucleotide sequence stretch absent from other types of rhodococcal alkB genes that encoded a conserved amino acid motif: WLG(I/V/L)D(G/D)GL. The sequence identity of the targeted alkB gene in Rhodococcus ranged from 78.5 to 99.2% and showed higher nucleotide sequence variation at the inter-species level compared to the 16S rRNA gene (93.9-99.8%). The results indicated that the alkB gene type investigated might be applicable for: (i) differentiating closely related Rhodococcus species, (ii) properly assigning environmental isolates to existing Rhodococcus species, and finally (iii) assessing whether a new Rhodococcus isolate represents a novel species of the genus. Copyright © 2014 Elsevier GmbH. All rights reserved.
Mitochondrial DNA of Vitis vinifera and the issue of rampant horizontal gene transfer.

PubMed

Goremykin, Vadim V; Salamini, Francesco; Velasco, Riccardo; Viola, Roberto

2009-01-01

The mitochondrial genome of grape (Vitis vinifera), the largest organelle genome sequenced so far, is presented. The genome is 773,279 nt long and has the highest coding capacity among known angiosperm mitochondrial DNAs (mtDNAs). The proportion of promiscuous DNA of plastid origin in the genome is also the largest ever reported for an angiosperm mtDNA, both in absolute and relative terms. In all, 42.4% of chloroplast genome of Vitis has been incorporated into its mitochondrial genome. In order to test if horizontal gene transfer (HGT) has also contributed to the gene content of the grape mtDNA, we built phylogenetic trees with the coding sequences of mitochondrial genes of grape and their homologs from plant mitochondrial genomes. Many incongruent gene tree topologies were obtained. However, the extent of incongruence between these gene trees is not significantly greater than that observed among optimal trees for chloroplast genes, the common ancestry of which has never been in doubt. In both cases, we attribute this incongruence to artifacts of tree reconstruction, insufficient numbers of characters, and gene paralogy. This finding leads us to question the recent phylogenetic interpretation of Bergthorsson et al. (2003, 2004) and Richardson and Palmer (2007) that rampant HGT into the mtDNA of Amborella best explains phylogenetic incongruence between mitochondrial gene trees for angiosperms. The only evidence for HGT into the Vitis mtDNA found involves fragments of two coding sequences stemming from two closteroviruses that cause the leaf roll disease of this plant. We also report that analysis of sequences shared by both chloroplast and mitochondrial genomes provides evidence for a previously unknown gene transfer route from the mitochondrion to the chloroplast.
Assessment of reference gene stability influenced by extremely divergent disease symptoms in Solanum lycopersicum L.

PubMed

Wieczorek, Przemysław; Wrzesińska, Barbara; Obrępalska-Stęplowska, Aleksandra

2013-12-01

Tomato (Solanum lycopersicum L.) is one of the most important vegetables of great worldwide economic value. The scientific importance of the vegetable results from the fact that the genome of S. lycopersicum has been sequenced. This allows researchers to study fundamental mechanisms playing an essential role during tomato development and response to environmental factors contributing significantly to cell metabolism alterations. Parallel with the development of contemporary genetics and the constant increase in sequencing data, progress has to be aligned with improvement of experimental methods used for studying genes functions and gene expression levels, of which the quantitative polymerase chain reaction (qPCR) is still the most reliable. As well as with other nucleic acid-based methods used for comparison of the abundance of specific RNAs, the RT-qPCR data have to be normalised to the levels of RNAs represented stably in a cell. To achieve the goal, the so-called housekeeping genes (i.e., RNAs encoding, for instance, proteins playing an important role in the cell metabolism or structure maintenance), are used for normalisation of the target gene expression data. However, a number of studies have indicated the transcriptional instability of commonly used reference genes analysed in different situations or conditions; for instance, the origin of cells, tissue types, or environmental or other experimental conditions. The expression of ten common housekeeping genes of S. lycopersicum, namely EF1α, TUB, CAC, EXP, RPL8, GAPDH, TBP, ACT, SAND and 18S rRNA were examined during viral infections of tomato. Changes in the expression levels of the genes were estimated by comparison of the non-inoculated tomato plants with those infected with commonly known tomato viral pathogens, Tomato torrado virus, Cucumber mosaic virus, Tobacco mosaic virus and Pepino mosaic virus, inducing a diverse range of disease symptoms on the common host, ranging from mild leaves chlorosis to very severe stem necrosis. It is emphasised that despite the wide range of diverse disease symptoms it is concluded that ACT, CAC and EF1α could be used as the most suitable reference genes in studies of host-virus interactions in tomato. Copyright © 2013 Elsevier B.V. All rights reserved.
Genome sequencing and analysis of a type A Clostridium perfringens isolate from a case of bovine clostridial abomasitis.

PubMed

Nowell, Victoria J; Kropinski, Andrew M; Songer, J Glenn; MacInnes, Janet I; Parreira, Valeria R; Prescott, John F

2012-01-01

Clostridium perfringens is a common inhabitant of the avian and mammalian gastrointestinal tracts and can behave commensally or pathogenically. Some enteric diseases caused by type A C. perfringens, including bovine clostridial abomasitis, remain poorly understood. To investigate the potential basis of virulence in strains causing this disease, we sequenced the genome of a type A C. perfringens isolate (strain F262) from a case of bovine clostridial abomasitis. The ∼3.34 Mbp chromosome of C. perfringens F262 is predicted to contain 3163 protein-coding genes, 76 tRNA genes, and an integrated plasmid sequence, Cfrag (∼18 kb). In addition, sequences of two complete circular plasmids, pF262C (4.8 kb) and pF262D (9.1 kb), and two incomplete plasmid fragments, pF262A (48.5 kb) and pF262B (50.0 kb), were identified. Comparison of the chromosome sequence of C. perfringens F262 to complete C. perfringens chromosomes, plasmids and phages revealed 261 unique genes. No novel toxin genes related to previously described clostridial toxins were identified: 60% of the 261 unique genes were hypothetical proteins. There was a two base pair deletion in virS, a gene reported to encode the main sensor kinase involved in virulence gene activation. Despite this frameshift mutation, C. perfringens F262 expressed perfringolysin O, alpha-toxin and the beta2-toxin, suggesting that another regulation system might contribute to the pathogenicity of this strain. Two complete plasmids, pF262C (4.8 kb) and pF262D (9.1 kb), unique to this strain of C. perfringens were identified.
Genome Sequencing and Analysis of a Type A Clostridium perfringens Isolate from a Case of Bovine Clostridial Abomasitis

PubMed Central

Nowell, Victoria J.; Kropinski, Andrew M.; Songer, J. Glenn; MacInnes, Janet I.; Parreira, Valeria R.; Prescott, John F.

2012-01-01

Clostridium perfringens is a common inhabitant of the avian and mammalian gastrointestinal tracts and can behave commensally or pathogenically. Some enteric diseases caused by type A C. perfringens, including bovine clostridial abomasitis, remain poorly understood. To investigate the potential basis of virulence in strains causing this disease, we sequenced the genome of a type A C. perfringens isolate (strain F262) from a case of bovine clostridial abomasitis. The ∼3.34 Mbp chromosome of C. perfringens F262 is predicted to contain 3163 protein-coding genes, 76 tRNA genes, and an integrated plasmid sequence, Cfrag (∼18 kb). In addition, sequences of two complete circular plasmids, pF262C (4.8 kb) and pF262D (9.1 kb), and two incomplete plasmid fragments, pF262A (48.5 kb) and pF262B (50.0 kb), were identified. Comparison of the chromosome sequence of C. perfringens F262 to complete C. perfringens chromosomes, plasmids and phages revealed 261 unique genes. No novel toxin genes related to previously described clostridial toxins were identified: 60% of the 261 unique genes were hypothetical proteins. There was a two base pair deletion in virS, a gene reported to encode the main sensor kinase involved in virulence gene activation. Despite this frameshift mutation, C. perfringens F262 expressed perfringolysin O, alpha-toxin and the beta2-toxin, suggesting that another regulation system might contribute to the pathogenicity of this strain. Two complete plasmids, pF262C (4.8 kb) and pF262D (9.1 kb), unique to this strain of C. perfringens were identified. PMID:22412860
Monogenic Autoinflammatory Diseases with Mendelian Inheritance: Genes, Mutations, and Genotype/Phenotype Correlations

PubMed Central

Martorana, Davide; Bonatti, Francesco; Mozzoni, Paola; Vaglio, Augusto; Percesepe, Antonio

2017-01-01

Autoinflammatory diseases (AIDs) are a genetically heterogeneous group of diseases caused by mutations of genes encoding proteins, which play a pivotal role in the regulation of the inflammatory response. In the pathogenesis of AIDs, the role of the genetic background is triggered by environmental factors through the modulation of the innate immune system. Monogenic AIDs are characterized by Mendelian inheritance and are caused by highly penetrant genetic variants in single genes. During the last years, remarkable progress has been made in the identification of disease-associated genes by using new technologies, such as next-generation sequencing, which has allowed the genetic characterization in undiagnosed patients and in sporadic cases by means of targeted resequencing of a gene panel and whole exome sequencing. In this review, we delineate the genetics of the monogenic AIDs, report the role of the most common gene mutations, and describe the evidences of the most sound genotype/phenotype correlations in AID. PMID:28421071
Coexpression networks implicate human midfetal deep cortical projection neurons in the pathogenesis of autism

PubMed Central

Willsey, A. Jeremy; Sanders, Stephan J.; Li, Mingfeng; Dong, Shan; Tebbenkamp, Andrew T.; Muhle, Rebecca A.; Reilly, Steven K.; Lin, Leon; Fertuzinhos, Sofia; Miller, Jeremy A.; Murtha, Michael T.; Bichsel, Candace; Niu, Wei; Cotney, Justin; Ercan-Sencicek, A. Gulhan; Gockley, Jake; Gupta, Abha; Han, Wenqi; He, Xin; Hoffman, Ellen; Klei, Lambertus; Lei, Jing; Liu, Wenzhong; Liu, Li; Lu, Cong; Xu, Xuming; Zhu, Ying; Mane, Shrikant M.; Lein, Edward S.; Wei, Liping; Noonan, James P.; Roeder, Kathryn; Devlin, Bernie; Šestan, Nenad; State, Matthew W.

2013-01-01

SUMMARY Autism spectrum disorder (ASD) is a complex developmental syndrome of unknown etiology. Recent studies employing exome- and genome-wide sequencing have identified nine high-confidence ASD (hcASD) genes. Working from the hypothesis that ASD-associated mutations in these biologically pleiotropic genes will disrupt intersecting developmental processes to contribute to a common phenotype, we have attempted to identify time periods, brain regions, and cell types in which these genes converge. We have constructed coexpression networks based on the hcASD “seed” genes, leveraging a rich expression data set encompassing multiple human brain regions across human development and into adulthood. By assessing enrichment of an independent set of probable ASD (pASD) genes, derived from the same sequencing studies, we demonstrate a key point of convergence in midfetal layer 5/6 cortical projection neurons. This approach informs when, where, and in what cell types mutations in these specific genes may be productively studied to clarify ASD pathophysiology. PMID:24267886

Quality Control Test for Sequence-Phenotype Assignments

PubMed Central

Ortiz, Maria Teresa Lara; Rosario, Pablo Benjamín Leon; Luna-Nevarez, Pablo; Gamez, Alba Savin; Martínez-del Campo, Ana; Del Rio, Gabriel

2015-01-01

Relating a gene mutation to a phenotype is a common task in different disciplines such as protein biochemistry. In this endeavour, it is common to find false relationships arising from mutations introduced by cells that may be depurated using a phenotypic assay; yet, such phenotypic assays may introduce additional false relationships arising from experimental errors. Here we introduce the use of high-throughput DNA sequencers and statistical analysis aimed to identify incorrect DNA sequence-phenotype assignments and observed that 10–20% of these false assignments are expected in large screenings aimed to identify critical residues for protein function. We further show that this level of incorrect DNA sequence-phenotype assignments may significantly alter our understanding about the structure-function relationship of proteins. We have made available an implementation of our method at http://bis.ifc.unam.mx/en/software/chispas. PMID:25700273
Investigating Genetic Alterations in Bladder Cancer | Center for Cancer Research

Cancer.gov

Bladder cancer (BC) is the fifth most common cancer worldwide and the sixth most common cancer in the U.S. Mutations in a number of oncogenes and tumor suppressor genes were previously associated with invasive or noninvasive forms of the disease. More recently, next generation sequencing (NGS) of bladder tumors from over 100 Chinese patients revealed alterations in additional genes, including those encoding chromatin remodeling enzymes, like lysine specific histone demethylase 6A (KDM6A) and ARID1A, and spindle checkpoint proteins. Because the NGS studies analyzed tumors from patients of a single ethnicity, the results may not be representative of alterations in other patients. To expand on these findings, Michael Nickerson, Ph.D., and Michael Dean, Ph.D., of CCR’s Cancer and Inflammation Program and their colleagues, including Dan Theodorescu, M.D., Ph.D., Director of the University of Colorado NCI-Designated Comprehensive Cancer Center, performed exome sequencing of 14 tumors and targeted sequencing of another 40 tumors all from non-Asian patients diagnosed with BC in the U.S.
Metagenomic gene annotation by a homology-independent approach

DOE Office of Scientific and Technical Information (OSTI.GOV)

Froula, Jeff; Zhang, Tao; Salmeen, Annette

2011-06-02

Fully understanding the genetic potential of a microbial community requires functional annotation of all the genes it encodes. The recently developed deep metagenome sequencing approach has enabled rapid identification of millions of genes from a complex microbial community without cultivation. Current homology-based gene annotation fails to detect distantly-related or structural homologs. Furthermore, homology searches with millions of genes are very computational intensive. To overcome these limitations, we developed rhModeller, a homology-independent software pipeline to efficiently annotate genes from metagenomic sequencing projects. Using cellulases and carbonic anhydrases as two independent test cases, we demonstrated that rhModeller is much faster than HMMERmore » but with comparable accuracy, at 94.5percent and 99.9percent accuracy, respectively. More importantly, rhModeller has the ability to detect novel proteins that do not share significant homology to any known protein families. As {approx}50percent of the 2 million genes derived from the cow rumen metagenome failed to be annotated based on sequence homology, we tested whether rhModeller could be used to annotate these genes. Preliminary results suggest that rhModeller is robust in the presence of missense and frameshift mutations, two common errors in metagenomic genes. Applying the pipeline to the cow rumen genes identified 4,990 novel cellulases candidates and 8,196 novel carbonic anhydrase candidates.In summary, we expect rhModeller to dramatically increase the speed and quality of metagnomic gene annotation.« less
Penguins reduced olfactory receptor genes common to other waterbirds

PubMed Central

Lu, Qin; Wang, Kai; Lei, Fumin; Yu, Dan; Zhao, Huabin

2016-01-01

The sense of smell, or olfaction, is fundamental in the life of animals. However, penguins (Aves: Sphenisciformes) possess relatively small olfactory bulbs compared with most other waterbirds such as Procellariiformes and Gaviiformes. To test whether penguins have a reduced reliance on olfaction, we analyzed the draft genome sequences of the two penguins, which diverged at the origin of the order Sphenisciformes; we also examined six closely related species with available genomes, and identified 29 one-to-one orthologous olfactory receptor genes (i.e. ORs) that are putatively functionally conserved and important across the eight birds. To survey the 29 one-to-one orthologous ORs in penguins and their relatives, we newly generated 34 sequences that are missing from the draft genomes. Through the analysis of totaling 378 OR sequences, we found that, of these functionally important ORs common to other waterbirds, penguins have a significantly greater percentage of OR pseudogenes than other waterbirds, suggesting a reduction of olfactory capability. The penguin-specific reduction of olfactory capability arose in the common ancestor of penguins between 23 and 60 Ma, which may have resulted from the aquatic specializations for underwater vision. Our study provides genetic evidence for a possible reduction of reliance on olfaction in penguins. PMID:27527385
Automated Gene Ontology annotation for anonymous sequence data.

PubMed

Hennig, Steffen; Groth, Detlef; Lehrach, Hans

2003-07-01

Gene Ontology (GO) is the most widely accepted attempt to construct a unified and structured vocabulary for the description of genes and their products in any organism. Annotation by GO terms is performed in most of the current genome projects, which besides generality has the advantage of being very convenient for computer based classification methods. However, direct use of GO in small sequencing projects is not easy, especially for species not commonly represented in public databases. We present a software package (GOblet), which performs annotation based on GO terms for anonymous cDNA or protein sequences. It uses the species independent GO structure and vocabulary together with a series of protein databases collected from various sites, to perform a detailed GO annotation by sequence similarity searches. The sensitivity and the reference protein sets can be selected by the user. GOblet runs automatically and is available as a public service on our web server. The paper also addresses the reliability of automated GO annotations by using a reference set of more than 6000 human proteins. The GOblet server is accessible at http://goblet.molgen.mpg.de.
Characterization of Pseudomonas syringae pv. syringae, Causal Agent of Citrus Blast of Mandarin in Montenegro.

PubMed

Ivanović, Žarko; Perović, Tatjana; Popović, Tatjana; Blagojević, Jovana; Trkulja, Nenad; Hrnčić, Snježana

2017-02-01

Citrus blast caused by bacterium Pseudomonas syringae is a very important disease of citrus occuring in many areas of the world, but with few data about genetic structure of the pathogen involved. Considering the above fact, this study reports genetic characterization of 43 P. syringae isolates obtained from plant tissue displaying citrus blast symptoms on mandarin ( Citrus reticulata ) in Montenegro, using multilocus sequence analysis of gyrB , rpoD , and gap1 gene sequences. Gene sequences from a collection of 54 reference pathotype strains of P. syringae from the Plant Associated and Environmental Microbes Database (PAMDB) was used to establish a genetic relationship with our isolates obtained from mandarin. Phylogenetic analyses of gyrB , rpoD , and gap1 gene sequences showed that P. syringae pv. syringae causes citrus blast in mandarin in Montenegro, and belongs to genomospecies 1. Genetic homogeneity of isolates suggested that the Montenegrian population might be clonal which indicates a possible common source of infection. These findings may assist in further epidemiological studies of this pathogen and for determining mandarin breeding strategies for P. syringae control.
Characterization of Pseudomonas syringae pv. syringae, Causal Agent of Citrus Blast of Mandarin in Montenegro

PubMed Central

Ivanović, Žarko; Perović, Tatjana; Popović, Tatjana; Blagojević, Jovana; Trkulja, Nenad; Hrnčić, Snježana

2017-01-01

Citrus blast caused by bacterium Pseudomonas syringae is a very important disease of citrus occuring in many areas of the world, but with few data about genetic structure of the pathogen involved. Considering the above fact, this study reports genetic characterization of 43 P. syringae isolates obtained from plant tissue displaying citrus blast symptoms on mandarin (Citrus reticulata) in Montenegro, using multilocus sequence analysis of gyrB, rpoD, and gap1 gene sequences. Gene sequences from a collection of 54 reference pathotype strains of P. syringae from the Plant Associated and Environmental Microbes Database (PAMDB) was used to establish a genetic relationship with our isolates obtained from mandarin. Phylogenetic analyses of gyrB, rpoD, and gap1 gene sequences showed that P. syringae pv. syringae causes citrus blast in mandarin in Montenegro, and belongs to genomospecies 1. Genetic homogeneity of isolates suggested that the Montenegrian population might be clonal which indicates a possible common source of infection. These findings may assist in further epidemiological studies of this pathogen and for determining mandarin breeding strategies for P. syringae control. PMID:28167885
Molecular Analysis of Glucose-6-Phosphate Dehydrogenase Gene Mutations in Bangladeshi Individuals.

PubMed

Sarker, Suprovath Kumar; Islam, Md Tarikul; Eckhoff, Grace; Hossain, Mohammad Amir; Qadri, Syeda Kashfi; Muraduzzaman, A K M; Bhuyan, Golam Sarower; Shahidullah, Mohammod; Mannan, Mohammad Abdul; Tahura, Sarabon; Hussain, Manzoor; Akhter, Shahida; Nahar, Nazmun; Shirin, Tahmina; Qadri, Firdausi; Mannoor, Kaiissar

2016-01-01

Glucose-6-phosphate dehydrogenase (G6PD) deficiency is a common X-linked human enzyme defect of red blood cells (RBCs). Individuals with this gene defect appear normal until exposed to oxidative stress which induces hemolysis. Consumption of certain foods such as fava beans, legumes; infection with bacteria or virus; and use of certain drugs such as primaquine, sulfa drugs etc. may result in lysis of RBCs in G6PD deficient individuals. The genetic defect that causes G6PD deficiency has been identified mostly as single base missense mutations. One hundred and sixty G6PD gene mutations, which lead to amino acid substitutions, have been described worldwide. The purpose of this study was to detect G6PD gene mutations in hospital-based settings in the local population of Dhaka city, Bangladesh. Qualitative fluorescent spot test and quantitative enzyme activity measurement using RANDOX G6PDH kit were performed for analysis of blood specimens and detection of G6PD-deficient participants. For G6PD-deficient samples, PCR was done with six sets of primers specific for G6PD gene. Automated Sanger sequencing of the PCR products was performed to identify the mutations in the gene. Based on fluorescence spot test and quantitative enzyme assay followed by G6PD gene sequencing, 12 specimens (11 males and one female) among 121 clinically suspected patient-specimens were found to be deficient, suggesting a frequency of 9.9% G6PD deficiency. Sequencing of the G6PD-deficient samples revealed c.C131G substitution (exon-3: Ala44Gly) in six samples, c.G487A substitution (exon-6:Gly163Ser) in five samples and c.G949A substitution (exon-9: Glu317Lys) of coding sequence in one sample. These mutations either affect NADP binding or disrupt protein structure. From the study it appears that Ala44Gly and Gly163Ser are the most common G6PD mutations in Dhaka, Bangladesh. This is the first study of G6PD mutations in Bangladesh.
Molecular Analysis of Glucose-6-Phosphate Dehydrogenase Gene Mutations in Bangladeshi Individuals

PubMed Central

Sarker, Suprovath Kumar; Hossain, Mohammad Amir; Qadri, Syeda Kashfi; Muraduzzaman, A. K. M.; Bhuyan, Golam Sarower; Shahidullah, Mohammod; Mannan, Mohammad Abdul; Tahura, Sarabon; Hussain, Manzoor; Akhter, Shahida; Nahar, Nazmun; Shirin, Tahmina; Qadri, Firdausi; Mannoor, Kaiissar

2016-01-01

Glucose-6-phosphate dehydrogenase (G6PD) deficiency is a common X-linked human enzyme defect of red blood cells (RBCs). Individuals with this gene defect appear normal until exposed to oxidative stress which induces hemolysis. Consumption of certain foods such as fava beans, legumes; infection with bacteria or virus; and use of certain drugs such as primaquine, sulfa drugs etc. may result in lysis of RBCs in G6PD deficient individuals. The genetic defect that causes G6PD deficiency has been identified mostly as single base missense mutations. One hundred and sixty G6PD gene mutations, which lead to amino acid substitutions, have been described worldwide. The purpose of this study was to detect G6PD gene mutations in hospital-based settings in the local population of Dhaka city, Bangladesh. Qualitative fluorescent spot test and quantitative enzyme activity measurement using RANDOX G6PDH kit were performed for analysis of blood specimens and detection of G6PD-deficient participants. For G6PD-deficient samples, PCR was done with six sets of primers specific for G6PD gene. Automated Sanger sequencing of the PCR products was performed to identify the mutations in the gene. Based on fluorescence spot test and quantitative enzyme assay followed by G6PD gene sequencing, 12 specimens (11 males and one female) among 121 clinically suspected patient-specimens were found to be deficient, suggesting a frequency of 9.9% G6PD deficiency. Sequencing of the G6PD-deficient samples revealed c.C131G substitution (exon-3: Ala44Gly) in six samples, c.G487A substitution (exon-6:Gly163Ser) in five samples and c.G949A substitution (exon-9: Glu317Lys) of coding sequence in one sample. These mutations either affect NADP binding or disrupt protein structure. From the study it appears that Ala44Gly and Gly163Ser are the most common G6PD mutations in Dhaka, Bangladesh. This is the first study of G6PD mutations in Bangladesh. PMID:27880809
Genomic characterization of biliary tract cancers identifies driver genes and predisposing mutations.

PubMed

Wardell, Christopher P; Fujita, Masashi; Yamada, Toru; Simbolo, Michele; Fassan, Matteo; Karlic, Rosa; Polak, Paz; Kim, Jaegil; Hatanaka, Yutaka; Maejima, Kazuhiro; Lawlor, Rita T; Nakanishi, Yoshitsugu; Mitsuhashi, Tomoko; Fujimoto, Akihiro; Furuta, Mayuko; Ruzzenente, Andrea; Conci, Simone; Oosawa, Ayako; Sasaki-Oku, Aya; Nakano, Kaoru; Tanaka, Hiroko; Yamamoto, Yujiro; Michiaki, Kubo; Kawakami, Yoshiiku; Aikata, Hiroshi; Ueno, Masaki; Hayami, Shinya; Gotoh, Kunihito; Ariizumi, Shun-Ichi; Yamamoto, Masakazu; Yamaue, Hiroki; Chayama, Kazuaki; Miyano, Satoru; Getz, Gad; Scarpa, Aldo; Hirano, Satoshi; Nakamura, Toru; Nakagawa, Hidewaki

2018-05-01

Biliary tract cancers (BTCs) are clinically and pathologically heterogeneous and respond poorly to treatment. Genomic profiling can offer a clearer understanding of their carcinogenesis, classification and treatment strategy. We performed large-scale genome sequencing analyses on BTCs to investigate their somatic and germline driver events and characterize their genomic landscape. We analyzed 412 BTC samples from Japanese and Italian populations, 107 by whole-exome sequencing (WES), 39 by whole-genome sequencing (WGS), and a further 266 samples by targeted sequencing. The subtypes were 136 intrahepatic cholangiocarcinomas (ICCs), 101 distal cholangiocarcinomas (DCCs), 109 peri-hilar type cholangiocarcinomas (PHCs), and 66 gallbladder or cystic duct cancers (GBCs/CDCs). We identified somatic alterations and searched for driver genes in BTCs, finding pathogenic germline variants of cancer-predisposing genes. We predicted cell-of-origin for BTCs by combining somatic mutation patterns and epigenetic features. We identified 32 significantly and commonly mutated genes including TP53, KRAS, SMAD4, NF1, ARID1A, PBRM1, and ATR, some of which negatively affected patient prognosis. A novel deletion of MUC17 at 7q22.1 affected patient prognosis. Cell-of-origin predictions using WGS and epigenetic features suggest hepatocyte-origin of hepatitis-related ICCs. Deleterious germline mutations of cancer-predisposing genes such as BRCA1, BRCA2, RAD51D, MLH1, or MSH2 were detected in 11% (16/146) of BTC patients. BTCs have distinct genetic features including somatic events and germline predisposition. These findings could be useful to establish treatment and diagnostic strategies for BTCs based on genetic information. We here analyzed genomic features of 412 BTC samples from Japanese and Italian populations. A total of 32 significantly and commonly mutated genes were identified, some of which negatively affected patient prognosis, including a novel deletion of MUC17 at 7q22.1. Cell-of-origin predictions using WGS and epigenetic features suggest hepatocyte-origin of hepatitis-related ICCs. Deleterious germline mutations of cancer-predisposing genes were detected in 11% of patients with BTC. BTCs have distinct genetic features including somatic events and germline predisposition. Copyright © 2018 European Association for the Study of the Liver. Published by Elsevier B.V. All rights reserved.
Multiplexed resequencing analysis to identify rare variants in pooled DNA with barcode indexing using next-generation sequencer.

PubMed

Mitsui, Jun; Fukuda, Yoko; Azuma, Kyo; Tozaki, Hirokazu; Ishiura, Hiroyuki; Takahashi, Yuji; Goto, Jun; Tsuji, Shoji

2010-07-01

We have recently found that multiple rare variants of the glucocerebrosidase gene (GBA) confer a robust risk for Parkinson disease, supporting the 'common disease-multiple rare variants' hypothesis. To develop an efficient method of identifying rare variants in a large number of samples, we applied multiplexed resequencing using a next-generation sequencer to identification of rare variants of GBA. Sixteen sets of pooled DNAs from six pooled DNA samples were prepared. Each set of pooled DNAs was subjected to polymerase chain reaction to amplify the target gene (GBA) covering 6.5 kb, pooled into one tube with barcode indexing, and then subjected to extensive sequence analysis using the SOLiD System. Individual samples were also subjected to direct nucleotide sequence analysis. With the optimization of data processing, we were able to extract all the variants from 96 samples with acceptable rates of false-positive single-nucleotide variants.
Draft Genome Sequences of Two Species of "Difficult-to-Identify" Human-Pathogenic Corynebacteria: Implications for Better Identification Tests.

PubMed

Pacheco, Luis G C; Mattos-Guaraldi, Ana L; Santos, Carolina S; Veras, Adonney A O; Guimarães, Luis C; Abreu, Vinícius; Pereira, Felipe L; Soares, Siomar C; Dorella, Fernanda A; Carvalho, Alex F; Leal, Carlos G; Figueiredo, Henrique C P; Ramos, Juliana N; Vieira, Veronica V; Farfour, Eric; Guiso, Nicole; Hirata, Raphael; Azevedo, Vasco; Silva, Artur; Ramos, Rommel T J

2015-01-01

Non-diphtheriae Corynebacterium species have been increasingly recognized as the causative agents of infections in humans. Differential identification of these bacteria in the clinical microbiology laboratory by the most commonly used biochemical tests is challenging, and normally requires additional molecular methods. Herein, we present the annotated draft genome sequences of two isolates of "difficult-to-identify" human-pathogenic corynebacterial species: C. xerosis and C. minutissimum. The genome sequences of ca. 2.7 Mbp, with a mean number of 2,580 protein encoding genes, were also compared with the publicly available genome sequences of strains of C. amycolatum and C. striatum. These results will aid the exploration of novel biochemical reactions to improve existing identification tests as well as the development of more accurate molecular identification methods through detection of species-specific target genes for isolate's identification or drug susceptibility profiling.
Mining and gene ontology based annotation of SSR markers from expressed sequence tags of Humulus lupulus

PubMed Central

Singh, Swati; Gupta, Sanchita; Mani, Ashutosh; Chaturvedi, Anoop

2012-01-01

Humulus lupulus is commonly known as hops, a member of the family moraceae. Currently many projects are underway leading to the accumulation of voluminous genomic and expressed sequence tag sequences in public databases. The genetically characterized domains in these databases are limited due to non-availability of reliable molecular markers. The large data of EST sequences are available in hops. The simple sequence repeat markers extracted from EST data are used as molecular markers for genetic characterization, in the present study. 25,495 EST sequences were examined and assembled to get full-length sequences. Maximum frequency distribution was shown by mononucleotide SSR motifs i.e. 60.44% in contig and 62.16% in singleton where as minimum frequency are observed for hexanucleotide SSR in contig (0.09%) and pentanucleotide SSR in singletons (0.12%). Maximum trinucleotide motifs code for Glutamic acid (GAA) while AT/TA were the most frequent repeat of dinucleotide SSRs. Flanking primer pairs were designed in-silico for the SSR containing sequences. Functional categorization of SSRs containing sequences was done through gene ontology terms like biological process, cellular component and molecular function. PMID:22368382
Evaluating the use of diversity indices to distinguish between microbial communities with different traits.

PubMed

Feranchuk, Sergey; Belkova, Natalia; Potapova, Ulyana; Kuzmin, Dmitry; Belikov, Sergei

2018-05-23

Several measures of biodiversity are commonly used to describe microbial communities, analyzed using 16S gene sequencing. A wide range of available experiments on 16S gene sequencing allows us to present a framework for a comparison of various diversity indices. The criterion for the comparison is the statistical significance of the difference in index values for microbial communities with different traits, within the same experiment. The results of the evaluation indicate that Shannon diversity is the most effective measure among the commonly used diversity indices. The results also indicate that, within the present framework, the Gini coefficient as a diversity index is comparable to Shannon diversity, despite the fact that the Gini coefficient, as a diversity estimator, is far less popular in microbiology than several other measures. Copyright © 2018 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.
Genetics of impulsive behaviour

PubMed Central

Bevilacqua, Laura; Goldman, David

2013-01-01

Impulsivity, defined as the tendency to act without foresight, comprises a multitude of constructs and is associated with a variety of psychiatric disorders. Dissecting different aspects of impulsive behaviour and relating these to specific neurobiological circuits would improve our understanding of the etiology of complex behaviours for which impulsivity is key, and advance genetic studies in this behavioural domain. In this review, we will discuss the heritability of some impulsivity constructs and their possible use as endophenotypes (heritable, disease-associated intermediate phenotypes). Several functional genetic variants associated with impulsive behaviour have been identified by the candidate gene approach and re-sequencing, and whole genome strategies can be implemented for discovery of novel rare and common alleles influencing impulsivity. Via deep sequencing an uncommon HTR2B stop codon, common in one population, was discovered, with implications for understanding impulsive behaviour in both humans and rodents and for future gene discovery. PMID:23440466
Introduced T cell receptor variable region gene segments recombine in pre-B cells: evidence that B and T cells use a common recombinase.

PubMed

Yancopoulos, G D; Blackwell, T K; Suh, H; Hood, L; Alt, F W

1986-01-31

We have recently proposed that a common recombinase performs all of the many variable region gene assembly events in B and T cells, and that the specificity of these joining events is mediated by regulating the "accessibility" of the involved gene segments. To test this possibility, we have introduced "accessible" T cell receptor (TCR) variable region gene segments into a pre-B cell line capable of recombining endogenous and transfected immunoglobulin (Ig) variable region gene segments. Although the corresponding "inaccessible" endogenous TCR gene segments do not rearrange in this line or in B cells in general, the introduced TCR gene segments join very frequently and, in fact, closely resemble introduced Ig gene segments in their recombination characteristics. These observations suggest a new role for conventional Ig transcriptional enhancers--recombinational enhancement. Our studies provide insight into additional aspects of the joining mechanism such as N region insertion, aberrant joining, and recombination-recognition sequence requirements for joining.
Complete mitochondrial genome of Yangtze River wild common carp (Cyprinus carpio haematopterus) and Russian scattered scale mirror carp (Cyprinus carpio carpio).

PubMed

Hu, Guang Fu; Liu, Xiang Jiang; Zou, Gui Wei; Li, Zhong; Liang, Hong-Wei; Hu, Shao-Na

2016-01-01

We sequenced the complete mitogenomes of (Cyprinus carpio haematopterus) and Russian scattered scale mirror carp (Cyprinus carpio carpio). Comparison of these two mitogenomes revealed that the mitogenomes of these two common carp strains were remarkably similar in genome length, gene order and content, and AT content. There were only 55 bp variations in 16,581 nucleotides. About 1 bp variation was located in rRNAs, 2 bp in tRNAs, 9 bp in the control region and 43 bp in protein-coding genes. Furthermore, forty-three variable nucleotides in the protein-coding genes of the two strains led to four variable amino acids, which were located in the ND2, ATPase 6, ND5 and ND6 genes, respectively.
Cetaceans evolution: insights from the genome sequences of common minke whales.

PubMed

Park, Jung Youn; An, Yong-Rock; Kanda, Naohisa; An, Chul-Min; An, Hye Suck; Kang, Jung-Ha; Kim, Eun Mi; An, Du-Hae; Jung, Hojin; Joung, Myunghee; Park, Myung Hum; Yoon, Sook Hee; Lee, Bo-Young; Lee, Taeheon; Kim, Kyu-Won; Park, Won Cheoul; Shin, Dong Hyun; Lee, Young Sub; Kim, Jaemin; Kwak, Woori; Kim, Hyeon Jeong; Kwon, Young-Jun; Moon, Sunjin; Kim, Yuseob; Burt, David W; Cho, Seoae; Kim, Heebal

2015-01-22

Whales have captivated the human imagination for millennia. These incredible cetaceans are the only mammals that have adapted to life in the open oceans and have been a source of human food, fuel and tools around the globe. The transition from land to water has led to various aquatic specializations related to hairless skin and ability to regulate their body temperature in cold water. We present four common minke whale (Balaenoptera acutorostrata) genomes with depth of ×13 ~ ×17 coverage and perform resequencing technology without a reference sequence. Our results indicated the time to the most recent common ancestors of common minke whales to be about 2.3574 (95% HPD, 1.1521 - 3.9212) million years ago. Further, we found that genes associated with epilation and tooth-development showed signatures of positive selection, supporting the morphological uniqueness of whales. This whole-genome sequencing offers a chance to better understand the evolutionary journey of one of the largest mammals on earth.
Divergent evolution of multiple virus-resistance genes from a progenitor in Capsicum spp.

PubMed

Kim, Saet-Byul; Kang, Won-Hee; Huy, Hoang Ngoc; Yeom, Seon-In; An, Jeong-Tak; Kim, Seungill; Kang, Min-Young; Kim, Hyun Jung; Jo, Yeong Deuk; Ha, Yeaseong; Choi, Doil; Kang, Byoung-Cheorl

2017-01-01

Plants have evolved hundreds of nucleotide-binding and leucine-rich domain proteins (NLRs) as potential intracellular immune receptors, but the evolutionary mechanism leading to the ability to recognize specific pathogen effectors is elusive. Here, we cloned Pvr4 (a Potyvirus resistance gene in Capsicum annuum) and Tsw (a Tomato spotted wilt virus resistance gene in Capsicum chinense) via a genome-based approach using independent segregating populations. The genes both encode typical NLRs and are located at the same locus on pepper chromosome 10. Despite the fact that these two genes recognize completely different viral effectors, the genomic structures and coding sequences of the two genes are strikingly similar. Phylogenetic studies revealed that these two immune receptors diverged from a progenitor gene of a common ancestor. Our results suggest that sequence variations caused by gene duplication and neofunctionalization may underlie the evolution of the ability to specifically recognize different effectors. These findings thereby provide insight into the divergent evolution of plant immune receptors. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
Comparative Genotypes, Staphylococcal Cassette Chromosome mec (SCCmec) Genes and Antimicrobial Resistance amongst Staphylococcus epidermidis and Staphylococcus haemolyticus Isolates from Infections in Humans and Companion Animals.

PubMed

McManus, Brenda A; Coleman, David C; Deasy, Emily C; Brennan, Gráinne I; O' Connell, Brian; Monecke, Stefan; Ehricht, Ralf; Leggett, Bernadette; Leonard, Nola; Shore, Anna C

2015-01-01

This study compares the characteristics of Staphylococcus epidermidis (SE) and Staphylococcus haemolyticus (SH) isolates from epidemiologically unrelated infections in humans (Hu) (28 SE-Hu; 8 SH-Hu) and companion animals (CpA) (12 SE-CpA; 13 SH-CpA). All isolates underwent antimicrobial susceptibility testing, multilocus sequence typing and DNA microarray profiling to detect antimicrobial resistance and SCCmec-associated genes. All methicillin-resistant (MR) isolates (33/40 SE, 20/21 SH) underwent dru and mecA allele typing. Isolates were predominantly assigned to sequence types (STs) within a single clonal complex (CC2, SE, 84.8%; CC1, SH, 95.2%). SCCmec IV predominated among MRSE with ST2-MRSE-IVc common to both Hu (40.9%) and CpA (54.5%). Identical mecA alleles and nontypeable dru types (dts) were identified in one ST2-MRSE-IVc Hu and CpA isolate, however, all mecA alleles and 2/4 dts detected among 18 ST2-MRSE-IVc isolates were closely related, sharing >96.5% DNA sequence homology. Although only one ST-SCCmec type combination (ST1 with a non-typeable [NT] SCCmec NT9 [class C mec and ccrB4]) was common to four MRSH-Hu and one MRSH-CpA, all MRSH isolates were closely related based on similar STs, SCCmec genes (V/VT or components thereof), mecA alleles and dts. Overall, 39.6% of MR isolates harbored NT SCCmec elements, and ACME was more common amongst MRSE and CpA isolates. Multidrug resistance (MDR) was detected among 96.7% of isolates but they differed in the prevalence of specific macrolide, aminoglycoside and trimethoprim resistance genes amongst SE and SH isolates. Ciprofloxacin, rifampicin, chloramphenicol [fexA, cat-pC221], tetracycline [tet(K)], aminoglycosides [aadD, aphA3] and fusidic acid [fusB] resistance was significantly more common amongst CpA isolates. SE and SH isolates causing infections in Hu and CpA hosts belong predominantly to STs within a single lineage, harboring similar but variable SCCmec genes, mecA alleles and dts. Host and staphylococcal species-specific characteristics were identified in relation to antimicrobial resistance genes and phenotypes, SCCmec and ACME.

Comparative Genotypes, Staphylococcal Cassette Chromosome mec (SCCmec) Genes and Antimicrobial Resistance amongst Staphylococcus epidermidis and Staphylococcus haemolyticus Isolates from Infections in Humans and Companion Animals

PubMed Central

McManus, Brenda A.; Coleman, David C.; Deasy, Emily C.; Brennan, Gráinne I.; O’ Connell, Brian; Monecke, Stefan; Ehricht, Ralf; Leggett, Bernadette; Leonard, Nola; Shore, Anna C.

2015-01-01

This study compares the characteristics of Staphylococcus epidermidis (SE) and Staphylococcus haemolyticus (SH) isolates from epidemiologically unrelated infections in humans (Hu) (28 SE-Hu; 8 SH-Hu) and companion animals (CpA) (12 SE-CpA; 13 SH-CpA). All isolates underwent antimicrobial susceptibility testing, multilocus sequence typing and DNA microarray profiling to detect antimicrobial resistance and SCCmec-associated genes. All methicillin-resistant (MR) isolates (33/40 SE, 20/21 SH) underwent dru and mecA allele typing. Isolates were predominantly assigned to sequence types (STs) within a single clonal complex (CC2, SE, 84.8%; CC1, SH, 95.2%). SCCmec IV predominated among MRSE with ST2-MRSE-IVc common to both Hu (40.9%) and CpA (54.5%). Identical mecA alleles and nontypeable dru types (dts) were identified in one ST2-MRSE-IVc Hu and CpA isolate, however, all mecA alleles and 2/4 dts detected among 18 ST2-MRSE-IVc isolates were closely related, sharing >96.5% DNA sequence homology. Although only one ST-SCCmec type combination (ST1 with a non-typeable [NT] SCCmec NT9 [class C mec and ccrB4]) was common to four MRSH-Hu and one MRSH-CpA, all MRSH isolates were closely related based on similar STs, SCCmec genes (V/VT or components thereof), mecA alleles and dts. Overall, 39.6% of MR isolates harbored NT SCCmec elements, and ACME was more common amongst MRSE and CpA isolates. Multidrug resistance (MDR) was detected among 96.7% of isolates but they differed in the prevalence of specific macrolide, aminoglycoside and trimethoprim resistance genes amongst SE and SH isolates. Ciprofloxacin, rifampicin, chloramphenicol [fexA, cat-pC221], tetracycline [tet(K)], aminoglycosides [aadD, aphA3] and fusidic acid [fusB] resistance was significantly more common amongst CpA isolates. SE and SH isolates causing infections in Hu and CpA hosts belong predominantly to STs within a single lineage, harboring similar but variable SCCmec genes, mecA alleles and dts. Host and staphylococcal species-specific characteristics were identified in relation to antimicrobial resistance genes and phenotypes, SCCmec and ACME. PMID:26379051
An asparagine residue at the N-terminus affects the maturation process of low molecular weight glutenin subunits of wheat endosperm

PubMed Central

2014-01-01

Background Wheat glutenin polymers are made up of two main subunit types, the high- (HMW-GS) and low- (LMW-GS) molecular weight subunits. These latter are represented by heterogeneous proteins. The most common, based on the first amino acid of the mature sequence, are known as LMW-m and LMW-s types. The mature sequences differ as a consequence of three extra amino acids (MET-) at the N-terminus of LMW-m types. The nucleotide sequences of their encoding genes are, however, nearly identical, so that the relationship between gene and protein sequences is difficult to ascertain. It has been hypothesized that the presence of an asparagine residue in position 23 of the complete coding sequence for the LMW-s type might account for the observed three-residue shortened sequence, as a consequence of cleavage at the asparagine by an asparaginyl endopeptidase. Results We performed site-directed mutagenesis of a LMW-s gene to replace asparagine at position 23 with threonine and thus convert it to a candidate LMW-m type gene. Similarly, a candidate LMW-m type gene was mutated at position 23 to replace threonine with asparagine. Next, we produced transgenic durum wheat (cultivar Svevo) lines by introducing the mutated versions of the LMW-m and LMW-s genes, along with the wild type counterpart of the LMW-m gene. Proteomic comparisons between the transgenic and null segregant plants enabled identification of transgenic proteins by mass spectrometry analyses and Edman N-terminal sequencing. Conclusions Our results show that the formation of LMW-s type relies on the presence of an asparagine residue close to the N-terminus generated by signal peptide cleavage, and that LMW-GS can be quantitatively processed most likely by vacuolar asparaginyl endoproteases, suggesting that those accumulated in the vacuole are not sequestered into stable aggregates that would hinder the action of proteolytic enzymes. Rather, whatever is the mechanism of glutenin polymer transport to the vacuole, the proteins remain available for proteolytic processing, and can be converted to the mature form by the removal of a short N-terminal sequence. PMID:24629124
Violation of an Evolutionarily Conserved Immunoglobulin Diversity Gene Sequence Preference Promotes Production of dsDNA-Specific IgG Antibodies

PubMed Central

Silva-Sanchez, Aaron; Liu, Cun Ren; Vale, Andre M.; Khass, Mohamed; Kapoor, Pratibha; Elgavish, Ada; Ivanov, Ivaylo I.; Ippolito, Gregory C.; Schelonka, Robert L.; Schoeb, Trenton R.; Burrows, Peter D.; Schroeder, Harry W.

2015-01-01

Variability in the developing antibody repertoire is focused on the third complementarity determining region of the H chain (CDR-H3), which lies at the center of the antigen binding site where it often plays a decisive role in antigen binding. The power of VDJ recombination and N nucleotide addition has led to the common conception that the sequence of CDR-H3 is unrestricted in its variability and random in its composition. Under this view, the immune response is solely controlled by somatic positive and negative clonal selection mechanisms that act on individual B cells to promote production of protective antibodies and prevent the production of self-reactive antibodies. This concept of a repertoire of random antigen binding sites is inconsistent with the observation that diversity (DH) gene segment sequence content by reading frame (RF) is evolutionarily conserved, creating biases in the prevalence and distribution of individual amino acids in CDR-H3. For example, arginine, which is often found in the CDR-H3 of dsDNA binding autoantibodies, is under-represented in the commonly used DH RFs rearranged by deletion, but is a frequent component of rarely used inverted RF1 (iRF1), which is rearranged by inversion. To determine the effect of altering this germline bias in DH gene segment sequence on autoantibody production, we generated mice that by genetic manipulation are forced to utilize an iRF1 sequence encoding two arginines. Over a one year period we collected serial serum samples from these unimmunized, specific pathogen-free mice and found that more than one-fifth of them contained elevated levels of dsDNA-binding IgG, but not IgM; whereas mice with a wild type DH sequence did not. Thus, germline bias against the use of arginine enriched DH sequence helps to reduce the likelihood of producing self-reactive antibodies. PMID:25706374
DNA sequence variation and selection of tag single-nucleotide polymorphisms at candidate genes for drought-stress response in Pinus taeda L.

PubMed

González-Martínez, Santiago C; Ersoz, Elhan; Brown, Garth R; Wheeler, Nicholas C; Neale, David B

2006-03-01

Genetic association studies are rapidly becoming the experimental approach of choice to dissect complex traits, including tolerance to drought stress, which is the most common cause of mortality and yield losses in forest trees. Optimization of association mapping requires knowledge of the patterns of nucleotide diversity and linkage disequilibrium and the selection of suitable polymorphisms for genotyping. Moreover, standard neutrality tests applied to DNA sequence variation data can be used to select candidate genes or amino acid sites that are putatively under selection for association mapping. In this article, we study the pattern of polymorphism of 18 candidate genes for drought-stress response in Pinus taeda L., an important tree crop. Data analyses based on a set of 21 putatively neutral nuclear microsatellites did not show population genetic structure or genomewide departures from neutrality. Candidate genes had moderate average nucleotide diversity at silent sites (pi(sil) = 0.00853), varying 100-fold among single genes. The level of within-gene LD was low, with an average pairwise r2 of 0.30, decaying rapidly from approximately 0.50 to approximately 0.20 at 800 bp. No apparent LD among genes was found. A selective sweep may have occurred at the early-response-to-drought-3 (erd3) gene, although population expansion can also explain our results and evidence for selection was not conclusive. One other gene, ccoaomt-1, a methylating enzyme involved in lignification, showed dimorphism (i.e., two highly divergent haplotype lineages at equal frequency), which is commonly associated with the long-term action of balancing selection. Finally, a set of haplotype-tagging SNPs (htSNPs) was selected. Using htSNPs, a reduction of genotyping effort of approximately 30-40%, while sampling most common allelic variants, can be gained in our ongoing association studies for drought tolerance in pine.
Fastidious Gram-Negatives: Identification by the Vitek 2 Neisseria-Haemophilus Card and by Partial 16S rRNA Gene Sequencing Analysis.

PubMed

Sönksen, Ute Wolff; Christensen, Jens Jørgen; Nielsen, Lisbeth; Hesselbjerg, Annemarie; Hansen, Dennis Schrøder; Bruun, Brita

2010-12-31

Taxonomy and identification of fastidious Gram negatives are evolving and challenging. We compared identifications achieved with the Vitek 2 Neisseria-Haemophilus (NH) card and partial 16S rRNA gene sequence (526 bp stretch) analysis with identifications obtained with extensive phenotypic characterization using 100 fastidious Gram negative bacteria. Seventy-five strains represented 21 of the 26 taxa included in the Vitek 2 NH database and 25 strains represented related species not included in the database. Of the 100 strains, 31 were the type strains of the species. Vitek 2 NH identification results: 48 of 75 database strains were correctly identified, 11 strains gave `low discrimination´, seven strains were unidentified, and nine strains were misidentified. Identification of 25 non-database strains resulted in 14 strains incorrectly identified as belonging to species in the database. Partial 16S rRNA gene sequence analysis results: For 76 strains phenotypic and sequencing identifications were identical, for 23 strains the sequencing identifications were either probable or possible, and for one strain only the genus was confirmed. Thus, the Vitek 2 NH system identifies most of the commonly occurring species included in the database. Some strains of rarely occurring species and strains of non-database species closely related to database species cause problems. Partial 16S rRNA gene sequence analysis performs well, but does not always suffice, additional phenotypical characterization being useful for final identification.
Fastidious Gram-Negatives: Identification by the Vitek 2 Neisseria-Haemophilus Card and by Partial 16S rRNA Gene Sequencing Analysis

PubMed Central

Sönksen, Ute Wolff; Christensen, Jens Jørgen; Nielsen, Lisbeth; Hesselbjerg, Annemarie; Hansen, Dennis Schrøder; Bruun, Brita

2010-01-01

Taxonomy and identification of fastidious Gram negatives are evolving and challenging. We compared identifications achieved with the Vitek 2 Neisseria-Haemophilus (NH) card and partial 16S rRNA gene sequence (526 bp stretch) analysis with identifications obtained with extensive phenotypic characterization using 100 fastidious Gram negative bacteria. Seventy-five strains represented 21 of the 26 taxa included in the Vitek 2 NH database and 25 strains represented related species not included in the database. Of the 100 strains, 31 were the type strains of the species. Vitek 2 NH identification results: 48 of 75 database strains were correctly identified, 11 strains gave `low discrimination´, seven strains were unidentified, and nine strains were misidentified. Identification of 25 non-database strains resulted in 14 strains incorrectly identified as belonging to species in the database. Partial 16S rRNA gene sequence analysis results: For 76 strains phenotypic and sequencing identifications were identical, for 23 strains the sequencing identifications were either probable or possible, and for one strain only the genus was confirmed. Thus, the Vitek 2 NH system identifies most of the commonly occurring species included in the database. Some strains of rarely occurring species and strains of non-database species closely related to database species cause problems. Partial 16S rRNA gene sequence analysis performs well, but does not always suffice, additional phenotypical characterization being useful for final identification. PMID:21347215
Inducible in vivo DNA footprints define sequences necessary for UV light activation of the parsley chalcone synthase gene.

PubMed Central

Schulze-Lefert, P; Dangl, J L; Becker-André, M; Hahlbrock, K; Schulz, W

1989-01-01

We began characterization of the protein--DNA interactions necessary for UV light induced transcriptional activation of the gene encoding chalcone synthase (CHS), a key plant defense enzyme. Three light dependent in vivo footprints appear on a 90 bp stretch of the CHS promoter with a time course correlated with the onset of CHS transcription. We define a minimal light responsive promoter by functional analysis of truncated CHS promoter fusions with a reporter gene in transient expression experiments in parsley protoplasts. Two of the three footprinted sequence 'boxes' reside within the minimal promoter. Replacement of 10 bp within either of these 'boxes' leads to complete loss of light responsiveness. We conclude that these sequences define the necessary cis elements of the minimal CHS promoter's light responsive element. One of the functionally defined 'boxes' is homologous to an element implicated in regulation of genes involved in photosynthesis. These data represent the first example in a plant defense gene of an induced change in protein--DNA contacts necessary for transcriptional activation. Also, our data argue strongly that divergent light induced biosynthetic pathways share common regulatory units. Images PMID:2566481
Enhanced green fluorescent protein (egfp) gene expression in Tetraselmis subcordiformis chloroplast with endogenous regulators.

PubMed

Cui, Yulin; Zhao, Jialin; Hou, Shichang; Qin, Song

2016-05-01

On the basis of fundamental genetic transformation technologies, the goal of this study was to optimize Tetraselmis subcordiformis chloroplast transformation through the use of endogenous regulators. The genes rrn16S, rbcL, psbA, and psbC are commonly highly expressed in chloroplasts, and the regulators of these genes are often used in chloroplast transformation. For lack of a known chloroplast genome sequence, the genome-walking method was used here to obtain full sequences of T. subcordiformis endogenous regulators. The resulting regulators, including three promoters, two terminators, and a ribosome combination sequence, were inserted into the previously constructed plasmid pPSC-R, with the egfp gene included as a reporter gene, and five chloroplast expression vectors prepared. These vectors were successfully transformed into T. subcordiformis by particle bombardment and the efficiency of each vector tested by assessing EGFP fluorescence via microscopy. The results showed that these vectors exhibited higher efficiency than the former vector pPSC-G carrying exogenous regulators, and the vector pRFA with Prrn, psbA-5'RE, and TpsbA showed the highest efficiency. This research provides a set of effective endogenous regulators for T. subcordiformis and will facilitate future fundamental studies of this alga.
A tree of life based on ninety-eight expressed genes conserved across diverse eukaryotic species

PubMed Central

Jayaswal, Pawan Kumar; Dogra, Vivek; Shanker, Asheesh; Sharma, Tilak Raj

2017-01-01

Rapid advances in DNA sequencing technologies have resulted in the accumulation of large data sets in the public domain, facilitating comparative studies to provide novel insights into the evolution of life. Phylogenetic studies across the eukaryotic taxa have been reported but on the basis of a limited number of genes. Here we present a genome-wide analysis across different plant, fungal, protist, and animal species, with reference to the 36,002 expressed genes of the rice genome. Our analysis revealed 9831 genes unique to rice and 98 genes conserved across all 49 eukaryotic species analysed. The 98 genes conserved across diverse eukaryotes mostly exhibited binding and catalytic activities and shared common sequence motifs; and hence appeared to have a common origin. The 98 conserved genes belonged to 22 functional gene families including 26S protease, actin, ADP–ribosylation factor, ATP synthase, casein kinase, DEAD-box protein, DnaK, elongation factor 2, glyceraldehyde 3-phosphate, phosphatase 2A, ras-related protein, Ser/Thr protein phosphatase family protein, tubulin, ubiquitin and others. The consensus Bayesian eukaryotic tree of life developed in this study demonstrated widely separated clades of plants, fungi, and animals. Musa acuminata provided an evolutionary link between monocotyledons and dicotyledons, and Salpingoeca rosetta provided an evolutionary link between fungi and animals, which indicating that protozoan species are close relatives of fungi and animals. The divergence times for 1176 species pairs were estimated accurately by integrating fossil information with synonymous substitution rates in the comprehensive set of 98 genes. The present study provides valuable insight into the evolution of eukaryotes. PMID:28922368
Mutation spectrum of genes associated with steroid-resistant nephrotic syndrome in Chinese children.

PubMed

Wang, Ying; Dang, Xiqiang; He, Qingnan; Zhen, Yan; He, Xiaoxie; Yi, Zhuwen; Zhu, Kuichun

2017-08-20

Approximately 20% of children with idiopathic nephrotic syndrome do not respond to steroid therapy. More than 30 genes have been identified as disease-causing genes for the steroid-resistant nephrotic syndrome (SRNS). Few reports were from the Chinese population. The coding regions of genes commonly associated with SRNS were analyzed to characterize the gene mutation spectrum in children with SRNS in central China. The first phase study involved 38 children with five genes (NPHS1, NPHS2, PLCE1, WT1, and TRPC6) by Sanger sequencing. The second phase study involved 33 children with 17 genes by next generation DNA sequencing (NGS. 22 new patients, and 11 patients from first phase study but without positive findings). Overall deleterious or putatively deleterious gene variants were identified in 19 patients (31.7%), including four NPHS1 variants among five patients and three PLCE1 variants among four other patients. Variants in COL4A3, COL4A4, or COL4A5 were found in six patients. Eight novel variants were identified, including two in NPHS1, two in PLCE1, one in NPHS2, LAMB2, COL4A3, and COL4A4, respectively. 55.6% of the children with variants failed to respond to immunosuppressive agent therapy, while the resistance rate in children without variants was 44.4%. Our results show that screening for deleterious variants in some common genes in children clinically suspected with SRNS might be helpful for disease diagnosis as well as prediction of treatment efficacy and prognosis. Copyright © 2017 Elsevier B.V. All rights reserved.
Profilings of MicroRNAs in the Liver of Common Carp (Cyprinus carpio) Infected with Flavobacterium columnare

PubMed Central

Zhao, Lijuan; Lu, Hong; Meng, Qinglei; Wang, Jinfu; Wang, Weimin; Yang, Ling; Lin, Li

2016-01-01

MicroRNAs (miRNAs) play important roles in regulation of many biological processes in eukaryotes, including pathogen infection and host interactions. Flavobacterium columnare (FC) infection can cause great economic loss of common carp (Cyprinus carpio) which is one of the most important cultured fish in the world. However, miRNAs in response to FC infection in common carp has not been characterized. To identify specific miRNAs involved in common carp infected with FC, we performed microRNA sequencing using livers of common carp infected with and without FC. A total of 698 miRNAs were identified, including 142 which were identified and deposited in the miRbase database (Available online: http://www.mirbase.org/) and 556 had only predicted miRNAs. Among the deposited miRNAs, eight miRNAs were first identified in common carp. Thirty of the 698 miRNAs were differentially expressed miRNAs (DIE-miRNAs) between the FC infected and control samples. From the DIE-miRNAs, seven were selected randomly and their expression profiles were confirmed to be consistent with the microRNA sequencing results using RT-PCR and qRT-PCR. In addition, a total of 27,363 target genes of the 30 DIE-miRNAs were predicted. The target genes were enriched in five Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, including focal adhesion, extracellular matrix (ECM)-receptor interaction, erythroblastic leukemia viral oncogene homolog (ErbB) signaling pathway, regulation of actin cytoskeleton, and adherent junction. The miRNA expression profile of the liver of common carp infected with FC will pave the way for the development of effective strategies to fight against FC infection. PMID:27092486
Probable Diagnosis of a Patient with Niemann-Pick Disease Type C: Managing Pitfalls of Exome Sequencing.

PubMed

Zeiger, William A; Jamal, Nasheed I; Scheuner, Maren T; Pittman, Patricia; Raymond, Kimiyo M; Morra, Massimo; Mishra, Shri K

2018-02-17

Here, we present a case of a 31-year-old man with progressive cognitive decline, ataxia, and dystonia. Extensive laboratory, radiographic, and targeted genetic studies over the course of several years failed to yield a diagnosis. Initial whole exome sequencing through a commercial laboratory identified several variants of uncertain significance; however, follow-up clinical examination and testing ruled each of these out. Eventually, repeat whole exome sequencing identified a known pathogenic intronic variant in the NPC1 gene (NM_000271.4, c.1554-1009G>A) and an additional heterozygous exonic variant of uncertain significance in the NPC1 gene (NM_000271.4, c.2524T>C). Follow-up biochemical testing was consistent with a diagnosis of probable Niemann-Pick disease Type C (NP-C). This case illustrates the potential of whole exome sequencing for diagnosing rare complex neurologic diseases. It also identifies several potential common pitfalls that must be navigated by clinicians when interpreting commercial whole exome sequencing results.
The Complete Genome Sequence of the Plant Growth-Promoting Bacterium Pseudomonas sp. UW4

PubMed Central

Duan, Jin; Jiang, Wei; Cheng, Zhenyu; Heikkila, John J.; Glick, Bernard R.

2013-01-01

The plant growth-promoting bacterium (PGPB) Pseudomonas sp. UW4, previously isolated from the rhizosphere of common reeds growing on the campus of the University of Waterloo, promotes plant growth in the presence of different environmental stresses, such as flooding, high concentrations of salt, cold, heavy metals, drought and phytopathogens. In this work, the genome sequence of UW4 was obtained by pyrosequencing and the gaps between the contigs were closed by directed PCR. The P. sp. UW4 genome contains a single circular chromosome that is 6,183,388 bp with a 60.05% G+C content. The bacterial genome contains 5,423 predicted protein-coding sequences that occupy 87.2% of the genome. Nineteen genomic islands (GIs) were predicted and thirty one complete putative insertion sequences were identified. Genes potentially involved in plant growth promotion such as indole-3-acetic acid (IAA) biosynthesis, trehalose production, siderophore production, acetoin synthesis, and phosphate solubilization were determined. Moreover, genes that contribute to the environmental fitness of UW4 were also observed including genes responsible for heavy metal resistance such as nickel, copper, cadmium, zinc, molybdate, cobalt, arsenate, and chromate. Whole-genome comparison with other completely sequenced Pseudomonas strains and phylogeny of four concatenated “housekeeping” genes (16S rRNA, gyrB, rpoB and rpoD) of 128 Pseudomonas strains revealed that UW4 belongs to the fluorescens group, jessenii subgroup. PMID:23516524
De novo transcriptome sequencing of two cultivated jute species under salinity stress

PubMed Central

Dai, Zhigang; Tang, Qing; Cheng, Chaohua; Xu, Ying

2017-01-01

Soil salinity, a major environmental stress, reduces agricultural productivity by restricting plant development and growth. Jute (Corchorus spp.), a commercially important bast fiber crop, includes two commercially cultivated species, Corchorus capsularis and Corchorus olitorius. We conducted high-throughput transcriptome sequencing of 24 C. capsularis and C. olitorius samples under salt stress and found 127 common differentially expressed genes (DEGs); additionally, 4489 and 492 common DEGs were identified in the root and leaf tissues, respectively, of both Corchorus species. Further, 32, 196, and 11 common differentially expressed transcription factors (DTFs) were detected in the leaf, root, or both tissues, respectively. Several Gene Ontology (GO) terms were enriched in NY and YY. A Kyoto Encyclopedia of Genes and Genomes analysis revealed numerous DEGs in both species. Abscisic acid and cytokinin signal pathways enriched respectively about 20 DEGs in leaves and roots of both NY and YY. The Ca2+, mitogen-activated protein kinase signaling and oxidative phosphorylation pathways were also found to be related to the plant response to salt stress, as evidenced by the DEGs in the roots of both species. These results provide insight into salt stress response mechanisms in plants as well as a basis for future breeding of salt-tolerant cultivars. PMID:29059212
Recombination-dependent replication and gene conversion homogenize repeat sequences and diversify plastid genome structure.

PubMed

Ruhlman, Tracey A; Zhang, Jin; Blazier, John C; Sabir, Jamal S M; Jansen, Robert K

2017-04-01

There is a misinterpretation in the literature regarding the variable orientation of the small single copy region of plastid genomes (plastomes). The common phenomenon of small and large single copy inversion, hypothesized to occur through intramolecular recombination between inverted repeats (IR) in a circular, single unit-genome, in fact, more likely occurs through recombination-dependent replication (RDR) of linear plastome templates. If RDR can be primed through both intra- and intermolecular recombination, then this mechanism could not only create inversion isomers of so-called single copy regions, but also an array of alternative sequence arrangements. We used Illumina paired-end and PacBio single-molecule real-time (SMRT) sequences to characterize repeat structure in the plastome of Monsonia emarginata (Geraniaceae). We used OrgConv and inspected nucleotide alignments to infer ancestral nucleotides and identify gene conversion among repeats and mapped long (>1 kb) SMRT reads against the unit-genome assembly to identify alternative sequence arrangements. Although M. emarginata lacks the canonical IR, we found that large repeats (>1 kilobase; kb) represent ∼22% of the plastome nucleotide content. Among the largest repeats (>2 kb), we identified GC-biased gene conversion and mapping filtered, long SMRT reads to the M. emarginata unit-genome assembly revealed alternative, substoichiometric sequence arrangements. We offer a model based on RDR and gene conversion between long repeated sequences in the M. emarginata plastome and provide support that both intra-and intermolecular recombination between large repeats, particularly in repeat-rich plastomes, varies unit-genome structure while homogenizing the nucleotide sequence of repeats. © 2017 Botanical Society of America.
Whole-exome sequencing identified a variant in EFTUD2 gene in establishing a genetic diagnosis.

PubMed

Rengasamy Venugopalan, S; Farrow, E G; Lypka, M

2017-06-01

Craniofacial anomalies are complex and have an overlapping phenotype. Mandibulofacial Dysostosis and Oculo-Auriculo-Vertebral Spectrum are conditions that share common craniofacial phenotype and present a challenge in arriving at a diagnosis. In this report, we present a case of female proband who was given a differential diagnosis of Treacher Collins syndrome or Hemifacial Microsomia without certainty. Prior genetic testing reported negative for 22q deletion and FGFR screenings. The objective of this study was to demonstrate the critical role of whole-exome sequencing in establishing a genetic diagnosis of the proband. The participants were 14½-year-old affected female proband/parent trio. Proband/parent trio were enrolled in the study. Surgical tissue sample from the proband and parental blood samples were collected and prepared for whole-exome sequencing. Illumina HiSeq 2500 instrument was used for sequencing (125 nucleotide reads/84X coverage). Analyses of variants were performed using custom-developed software, RUNES and VIKING. Variant analyses following whole-exome sequencing identified a heterozygous de novo pathogenic variant, c.259C>T (p.Gln87*), in EFTUD2 (NM_004247.3) gene in the proband. Previous studies have reported that the variants in EFTUD2 gene were associated with Mandibulofacial Dysostosis with Microcephaly. Patients with facial asymmetry, micrognathia, choanal atresia and microcephaly should be analyzed for variants in EFTUD2 gene. Next-generation sequencing techniques, such as whole-exome sequencing offer great promise to improve the understanding of etiologies of sporadic genetic diseases. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
De novo sequencing and analysis of the transcriptome during the browning of fresh-cut Luffa cylindrica 'Fusi-3' fruits.

PubMed

Zhu, Haisheng; Liu, Jianting; Wen, Qingfang; Chen, Mindong; Wang, Bin; Zhang, Qianrong; Xue, Zhuzheng

2017-01-01

Fresh-cut luffa (Luffa cylindrica) fruits commonly undergo browning. However, little is known about the molecular mechanisms regulating this process. We used the RNA-seq technique to analyze the transcriptomic changes occurring during the browning of fresh-cut fruits from luffa cultivar 'Fusi-3'. Over 90 million high-quality reads were assembled into 58,073 Unigenes, and 60.86% of these were annotated based on sequences in four public databases. We detected 35,282 Unigenes with significant hits to sequences in the NCBInr database, and 24,427 Unigenes encoded proteins with sequences that were similar to those of known proteins in the Swiss-Prot database. Additionally, 20,546 and 13,021 Unigenes were similar to existing sequences in the Eukaryotic Orthologous Groups of proteins and Kyoto Encyclopedia of Genes and Genomes databases, respectively. Furthermore, 27,301 Unigenes were differentially expressed during the browning of fresh-cut luffa fruits (i.e., after 1-6 h). Moreover, 11 genes from five gene families (i.e., PPO, PAL, POD, CAT, and SOD) identified as potentially associated with enzymatic browning as well as four WRKY transcription factors were observed to be differentially regulated in fresh-cut luffa fruits. With the assistance of rapid amplification of cDNA ends technology, we obtained the full-length sequences of the 15 Unigenes. We also confirmed these Unigenes were expressed by quantitative real-time polymerase chain reaction analysis. This study provides a comprehensive transcriptome sequence resource, and may facilitate further studies aimed at identifying genes affecting luffa fruit browning for the exploitation of the underlying mechanism.
Metabolic Pathway Assignment of Plant Genes based on Phylogenetic Profiling–A Feasibility Study

PubMed Central

Weißenborn, Sandra; Walther, Dirk

2017-01-01

Despite many developed experimental and computational approaches, functional gene annotation remains challenging. With the rapidly growing number of sequenced genomes, the concept of phylogenetic profiling, which predicts functional links between genes that share a common co-occurrence pattern across different genomes, has gained renewed attention as it promises to annotate gene functions based on presence/absence calls alone. We applied phylogenetic profiling to the problem of metabolic pathway assignments of plant genes with a particular focus on secondary metabolism pathways. We determined phylogenetic profiles for 40,960 metabolic pathway enzyme genes with assigned EC numbers from 24 plant species based on sequence and pathway annotation data from KEGG and Ensembl Plants. For gene sequence family assignments, needed to determine the presence or absence of particular gene functions in the given plant species, we included data of all 39 species available at the Ensembl Plants database and established gene families based on pairwise sequence identities and annotation information. Aside from performing profiling comparisons, we used machine learning approaches to predict pathway associations from phylogenetic profiles alone. Selected metabolic pathways were indeed found to be composed of gene families of greater than expected phylogenetic profile similarity. This was particularly evident for primary metabolism pathways, whereas for secondary pathways, both the available annotation in different species as well as the abstraction of functional association via distinct pathways proved limiting. While phylogenetic profile similarity was generally not found to correlate with gene co-expression, direct physical interactions of proteins were reflected by a significantly increased profile similarity suggesting an application of phylogenetic profiling methods as a filtering step in the identification of protein-protein interactions. This feasibility study highlights the potential and challenges associated with phylogenetic profiling methods for the detection of functional relationships between genes as well as the need to enlarge the set of plant genes with proven secondary metabolism involvement as well as the limitations of distinct pathways as abstractions of relationships between genes. PMID:29163570
Bayesian Inference of Allele-Specific Gene Expression Indicates Abundant Cis-Regulatory Variation in Natural Flycatcher Populations

PubMed Central

Wang, Mi

2017-01-01

Abstract Polymorphism in cis-regulatory sequences can lead to different levels of expression for the two alleles of a gene, providing a starting point for the evolution of gene expression. Little is known about the genome-wide abundance of genetic variation in gene regulation in natural populations but analysis of allele-specific expression (ASE) provides a means for investigating such variation. We performed RNA-seq of multiple tissues from population samples of two closely related flycatcher species and developed a Bayesian algorithm that maximizes data usage by borrowing information from the whole data set and combines several SNPs per transcript to detect ASE. Of 2,576 transcripts analyzed in collared flycatcher, ASE was detected in 185 (7.2%) and a similar frequency was seen in the pied flycatcher. Transcripts with statistically significant ASE commonly showed the major allele in >90% of the reads, reflecting that power was highest when expression was heavily biased toward one of the alleles. This would suggest that the observed frequencies of ASE likely are underestimates. The proportion of ASE transcripts varied among tissues, being lowest in testis and highest in muscle. Individuals often showed ASE of particular transcripts in more than one tissue (73.4%), consistent with a genetic basis for regulation of gene expression. The results suggest that genetic variation in regulatory sequences commonly affects gene expression in natural populations and that it provides a seedbed for phenotypic evolution via divergence in gene expression. PMID:28453623
Global sequence diversity of the lactate dehydrogenase gene in Plasmodium falciparum.

PubMed

Simpalipan, Phumin; Pattaradilokrat, Sittiporn; Harnyuttanakorn, Pongchai

2018-01-09

Antigen-detecting rapid diagnostic tests (RDTs) have been recommended by the World Health Organization for use in remote areas to improve malaria case management. Lactate dehydrogenase (LDH) of Plasmodium falciparum is one of the main parasite antigens employed by various commercial RDTs. It has been hypothesized that the poor detection of LDH-based RDTs is attributed in part to the sequence diversity of the gene. To test this, the present study aimed to investigate the genetic diversity of the P. falciparum ldh gene in Thailand and to construct the map of LDH sequence diversity in P. falciparum populations worldwide. The ldh gene was sequenced for 50 P. falciparum isolates in Thailand and compared with hundreds of sequences from P. falciparum populations worldwide. Several indices of molecular variation were calculated, including the proportion of polymorphic sites, the average nucleotide diversity index (π), and the haplotype diversity index (H). Tests of positive selection and neutrality tests were performed to determine signatures of natural selection on the gene. Mean genetic distance within and between species of Plasmodium ldh was analysed to infer evolutionary relationships. Nucleotide sequences of P. falciparum ldh could be classified into 9 alleles, encoding 5 isoforms of LDH. L1a was the most common allelic type and was distributed in P. falciparum populations worldwide. Plasmodium falciparum ldh sequences were highly conserved, with haplotype and nucleotide diversity values of 0.203 and 0.0004, respectively. The extremely low genetic diversity was maintained by purifying selection, likely due to functional constraints. Phylogenetic analysis inferred the close genetic relationship of P. falciparum to malaria parasites of great apes, rather than to other human malaria parasites. This study revealed the global genetic variation of the ldh gene in P. falciparum, providing knowledge for improving detection of LDH-based RDTs and supporting the candidacy of LDH as a therapeutic drug target.

Ancient diversity and geographical sub-structuring in African buffalo Theileria parva populations revealed through metagenetic analysis of antigen-encoding loci.

PubMed

Hemmink, Johanneke D; Sitt, Tatjana; Pelle, Roger; de Klerk-Lorist, Lin-Mari; Shiels, Brian; Toye, Philip G; Morrison, W Ivan; Weir, William

2018-03-01

An infection and treatment protocol involving infection with a mixture of three parasite isolates and simultaneous treatment with oxytetracycline is currently used to vaccinate cattle against Theileria parva. While vaccination results in high levels of protection in some regions, little or no protection is observed in areas where animals are challenged predominantly by parasites of buffalo origin. A previous study involving sequencing of two antigen-encoding genes from a series of parasite isolates indicated that this is associated with greater antigenic diversity in buffalo-derived T. parva. The current study set out to extend these analyses by applying high-throughput sequencing to ex vivo samples from naturally infected buffalo to determine the extent of diversity in a set of antigen-encoding genes. Samples from two populations of buffalo, one in Kenya and the other in South Africa, were examined to investigate the effect of geographical distance on the nature of sequence diversity. The results revealed a number of significant findings. First, there was a variable degree of nucleotide sequence diversity in all gene segments examined, with the percentage of polymorphic nucleotides ranging from 10% to 69%. Second, large numbers of allelic variants of each gene were found in individual animals, indicating multiple infection events. Third, despite the observed diversity in nucleotide sequences, several of the gene products had highly conserved amino acid sequences, and thus represent potential candidates for vaccine development. Fourth, although compelling evidence for population differentiation between the Kenyan and South African T. parva parasites was identified, analysis of molecular variance for each gene revealed that the majority of the underlying nucleotide sequence polymorphism was common to both areas, indicating that much of this aspect of genetic variation in the parasite population arose prior to geographic separation. Copyright © 2018 The Authors. Published by Elsevier Ltd.. All rights reserved.
Characterization of genomic sequence showing strong association with polyembryony among diverse Citrus species and cultivars, and its synteny with Vitis and Populus.

PubMed

Nakano, Michiharu; Shimada, Takehiko; Endo, Tomoko; Fujii, Hiroshi; Nesumi, Hirohisa; Kita, Masayuki; Ebina, Masumi; Shimizu, Tokurou; Omura, Mitsuo

2012-02-01

Polyembryony, in which multiple somatic nucellar cell-derived embryos develop in addition to the zygotic embryo in a seed, is common in the genus Citrus. Previous genetic studies indicated polyembryony is mainly determined by a single locus, but the underlying molecular mechanism is still unclear. As a step towards identification and characterization of the gene or genes responsible for nucellar embryogenesis in Citrus, haplotype-specific physical maps around the polyembryony locus were constructed. By sequencing three BAC clones aligned on the polyembryony haplotype, a single contiguous draft sequence consisting of 380 kb containing 70 predicted open reading frames (ORFs) was reconstructed. Single nucleotide polymorphism genotypes detected in the sequenced genomic region showed strong association with embryo type in Citrus, indicating a common polyembryony locus is shared among widely diverse Citrus cultivars and species. The arrangement of the predicted ORFs in the characterized genomic region showed high collinearity to the genomic sequence of chromosome 4 of Vitis vinifera and linkage group VI of Populus trichocarpa, suggesting that the syntenic relationship among these species is conserved even though V. vinifera and P. trichocarpa are non-apomictic species. This is the first study to characterize in detail the genomic structure of an apomixis locus determining adventitious embryony. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
The physical map of wheat chromosome 1BS provides insights into its gene space organization and evolution

PubMed Central

2013-01-01

Background The wheat genome sequence is an essential tool for advanced genomic research and improvements. The generation of a high-quality wheat genome sequence is challenging due to its complex 17 Gb polyploid genome. To overcome these difficulties, sequencing through the construction of BAC-based physical maps of individual chromosomes is employed by the wheat genomics community. Here, we present the construction of the first comprehensive physical map of chromosome 1BS, and illustrate its unique gene space organization and evolution. Results Fingerprinted BAC clones were assembled into 57 long scaffolds, anchored and ordered with 2,438 markers, covering 83% of chromosome 1BS. The BAC-based chromosome 1BS physical map and gene order of the orthologous regions of model grass species were consistent, providing strong support for the reliability of the chromosome 1BS assembly. The gene space for chromosome 1BS spans the entire length of the chromosome arm, with 76% of the genes organized in small gene islands, accompanied by a two-fold increase in gene density from the centromere to the telomere. Conclusions This study provides new evidence on common and chromosome-specific features in the organization and evolution of the wheat genome, including a non-uniform distribution of gene density along the centromere-telomere axis, abundance of non-syntenic genes, the degree of colinearity with other grass genomes and a non-uniform size expansion along the centromere-telomere axis compared with other model cereal genomes. The high-quality physical map constructed in this study provides a solid basis for the assembly of a reference sequence of chromosome 1BS and for breeding applications. PMID:24359668
Implications of natural selection in shaping 99.4% nonsynonymous DNA identity between humans and chimpanzees: Enlarging genus Homo

PubMed Central

Wildman, Derek E.; Uddin, Monica; Liu, Guozhen; Grossman, Lawrence I.; Goodman, Morris

2003-01-01

What do functionally important DNA sites, those scrutinized and shaped by natural selection, tell us about the place of humans in evolution? Here we compare ≈90 kb of coding DNA nucleotide sequence from 97 human genes to their sequenced chimpanzee counterparts and to available sequenced gorilla, orangutan, and Old World monkey counterparts, and, on a more limited basis, to mouse. The nonsynonymous changes (functionally important), like synonymous changes (functionally much less important), show chimpanzees and humans to be most closely related, sharing 99.4% identity at nonsynonymous sites and 98.4% at synonymous sites. On a time scale, the coding DNA divergencies separate the human–chimpanzee clade from the gorilla clade at between 6 and 7 million years ago and place the most recent common ancestor of humans and chimpanzees at between 5 and 6 million years ago. The evolutionary rate of coding DNA in the catarrhine clade (Old World monkey and ape, including human) is much slower than in the lineage to mouse. Among the genes examined, 30 show evidence of positive selection during descent of catarrhines. Nonsynonymous substitutions by themselves, in this subset of positively selected genes, group humans and chimpanzees closest to each other and have chimpanzees diverge about as much from the common human–chimpanzee ancestor as humans do. This functional DNA evidence supports two previously offered taxonomic proposals: family Hominidae should include all extant apes; and genus Homo should include three extant species and two subgenera, Homo (Homo) sapiens (humankind), Homo (Pan) troglodytes (common chimpanzee), and Homo (Pan) paniscus (bonobo chimpanzee). PMID:12766228
Implications of natural selection in shaping 99.4% nonsynonymous DNA identity between humans and chimpanzees: enlarging genus Homo.

PubMed

Wildman, Derek E; Uddin, Monica; Liu, Guozhen; Grossman, Lawrence I; Goodman, Morris

2003-06-10

What do functionally important DNA sites, those scrutinized and shaped by natural selection, tell us about the place of humans in evolution? Here we compare approximately 90 kb of coding DNA nucleotide sequence from 97 human genes to their sequenced chimpanzee counterparts and to available sequenced gorilla, orangutan, and Old World monkey counterparts, and, on a more limited basis, to mouse. The nonsynonymous changes (functionally important), like synonymous changes (functionally much less important), show chimpanzees and humans to be most closely related, sharing 99.4% identity at nonsynonymous sites and 98.4% at synonymous sites. On a time scale, the coding DNA divergencies separate the human-chimpanzee clade from the gorilla clade at between 6 and 7 million years ago and place the most recent common ancestor of humans and chimpanzees at between 5 and 6 million years ago. The evolutionary rate of coding DNA in the catarrhine clade (Old World monkey and ape, including human) is much slower than in the lineage to mouse. Among the genes examined, 30 show evidence of positive selection during descent of catarrhines. Nonsynonymous substitutions by themselves, in this subset of positively selected genes, group humans and chimpanzees closest to each other and have chimpanzees diverge about as much from the common human-chimpanzee ancestor as humans do. This functional DNA evidence supports two previously offered taxonomic proposals: family Hominidae should include all extant apes; and genus Homo should include three extant species and two subgenera, Homo (Homo) sapiens (humankind), Homo (Pan) troglodytes (common chimpanzee), and Homo (Pan) paniscus (bonobo chimpanzee).
Phylogenetic relationships and timing of diversification in gonorynchiform fishes inferred using nuclear gene DNA sequences (Teleostei: Ostariophysi).

PubMed

Near, Thomas J; Dornburg, Alex; Friedman, Matt

2014-11-01

The Gonorynchiformes are the sister lineage of the species-rich Otophysi and provide important insights into the diversification of ostariophysan fishes. Phylogenies of gonorynchiforms inferred using morphological characters and mtDNA gene sequences provide differing resolutions with regard to the sister lineage of all other gonorynchiforms (Chanos vs. Gonorynchus) and support for monophyly of the two miniaturized lineages Cromeria and Grasseichthys. In this study the phylogeny and divergence times of gonorynchiforms are investigated with DNA sequences sampled from nine nuclear genes and a published morphological character matrix. Bayesian phylogenetic analyses reveal substantial congruence among individual gene trees with inferences from eight genes placing Gonorynchus as the sister lineage to all other gonorynchiforms. Seven gene trees resolve Cromeria and Grasseichthys as a clade, supporting previous inferences using morphological characters. Phylogenies resulting from either concatenating the nuclear genes, performing a multispecies coalescent species tree analysis, or combining the morphological and nuclear gene DNA sequences resolve Gonorynchus as the living sister lineage of all other gonorynchiforms, strongly support the monophyly of Cromeria and Grasseichthys, and resolve a clade containing Parakneria, Cromeria, and Grasseichthys. The morphological dataset, which includes 13 gonorynchiform fossil taxa that range in age from Early Cretaceous to Eocene, was analyzed in combination with DNA sequences from the nine nuclear genes and a relaxed molecular clock to estimate times of evolutionary divergence. This "tip dating" strategy accommodates uncertainty in the phylogenetic resolution of fossil taxa that provide calibration information in the relaxed molecular clock analysis. The estimated age of the most recent common ancestor (MRCA) of living gonorynchiforms is slightly older than estimates from previous node dating efforts, but the molecular tip dating estimated ages of Kneriinae (Kneria, Parakneria, Cromeria, and Grasseichthys) and the two paedomorphic lineages, Cromeria and Grasseichthys, are considerably younger. Copyright © 2014 Elsevier Inc. All rights reserved.
Horse cDNA clones encoding two MHC class I genes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Barbis, D.P.; Maher, J.K.; Stanek, J.

1994-12-31

Two full-length clones encoding MHC class I genes were isolated by screening a horse cDNA library, using a probe encoding in human HLA-A2.2Y allele. The library was made in the pcDNA1 vector (Invitrogen, San Diego, CA), using mRNA from peripheral blood lymphocytes obtained from a Thoroughbred stallion (No. 0834) homozygous for a common horse MHC haplotype (ELA-A2, -B2, -D2; Antczak et al. 1984; Donaldson et al. 1988). The clones were sequenced, using SP6 and T7 universal primers and horse-specific oligonucleotides designed to extend previously determined sequences.
Identification and characterization of gene-based SSR markers in date palm (Phoenix dactylifera L.).

PubMed

Zhao, Yongli; Williams, Roxanne; Prakash, C S; He, Guohao

2012-12-15

Date palm (Phoenix dactylifera L.) is an important tree in the Middle East and North Africa due to the nutritional value of its fruit. Molecular Breeding would accelerate genetic improvement of fruit tree through marker assisted selection. However, the lack of molecular markers in date palm restricts the application of molecular breeding. In this study, we analyzed 28,889 EST sequences from the date palm genome database to identify simple-sequence repeats (SSRs) and to develop gene-based markers, i.e. expressed sequence tag-SSRs (EST-SSRs). We identified 4,609 ESTs as containing SSRs, among which, trinucleotide motifs (69.7%) were the most common, followed by tetranucleotide (10.4%) and dinucleotide motifs (9.6%). The motif AG (85.7%) was most abundant in dinucleotides, while motifs AGG (26.8%), AAG (19.3%), and AGC (16.1%) were most common among trinucleotides. A total of 4,967 primer pairs were designed for EST-SSR markers from the computational data. In a follow up laboratory study, we tested a sample of 20 random selected primer pairs for amplification and polymorphism detection using genomic DNA from date palm cultivars. Nearly one-third of these primer pairs detected DNA polymorphism to differentiate the twelve date palm cultivars used. Functional categorization of EST sequences containing SSRs revealed that 3,108 (67.4%) of such ESTs had homology with known proteins. Date palm EST sequences exhibits a good resource for developing gene-based markers. These genic markers identified in our study may provide a valuable genetic and genomic tool for further genetic research and varietal development in date palm, such as diversity study, QTL mapping, and molecular breeding.
Genes That Escape X-Inactivation in Humans Have High Intraspecific Variability in Expression, Are Associated with Mental Impairment but Are Not Slow Evolving

PubMed Central

Zhang, Yuchao; Castillo-Morales, Atahualpa; Jiang, Min; Zhu, Yufei; Hu, Landian; Urrutia, Araxi O.; Kong, Xiangyin; Hurst, Laurence D.

2013-01-01

In female mammals most X-linked genes are subject to X-inactivation. However, in humans some X-linked genes escape silencing, these escapees being candidates for the phenotypic aberrations seen in polyX karyotypes. These escape genes have been reported to be under stronger purifying selection than other X-linked genes. Although it is known that escape from X-inactivation is much more common in humans than in mice, systematic assays of escape in humans have to date employed only interspecies somatic cell hybrids. Here we provide the first systematic next-generation sequencing analysis of escape in a human cell line. We analyzed RNA and genotype sequencing data obtained from B lymphocyte cell lines derived from Europeans (CEU) and Yorubans (YRI). By replicated detection of heterozygosis in the transcriptome, we identified 114 escaping genes, including 76 not previously known to be escapees. The newly described escape genes cluster on the X chromosome in the same chromosomal regions as the previously known escapees. There is an excess of escaping genes associated with mental retardation, consistent with this being a common phenotype of polyX phenotypes. We find both differences between populations and between individuals in the propensity to escape. Indeed, we provide the first evidence for there being both hyper- and hypo-escapee females in the human population, consistent with the highly variable phenotypic presentation of polyX karyotypes. Considering also prior data, we reclassify genes as being always, never, and sometimes escape genes. We fail to replicate the prior claim that genes that escape X-inactivation are under stronger purifying selection than others. PMID:24023392
Assessing fluctuating evolutionary pressure in yeast and mammal evolutionary rate covariation using bioinformatics of meiotic protein genetic sequences

NASA Astrophysics Data System (ADS)

Dehipawala, Sunil; Nguyen, A.; Tremberger, G.; Cheung, E.; Holden, T.; Lieberman, D.; Cheung, T.

2013-09-01

The evolutionary rate co-variation in meiotic proteins has been reported for yeast and mammal using phylogenic branch lengths which assess retention, duplication and mutation. The bioinformatics of the corresponding DNA sequences could be classified as a diagram of fractal dimension and Shannon entropy. Results from biomedical gene research provide examples on the diagram methodology. The identification of adaptive selection using entropy marker and functional-structural diversity using fractal dimension would support a regression analysis where the coefficient of determination would serve as evolutionary pathway marker for DNA sequences and be an important component in the astrobiology community. Comparisons between biomedical genes such as EEF2 (elongation factor 2 human, mouse, etc), WDR85 in epigenetics, HAR1 in human specificity, clinical trial targeted cancer gene CD47, SIRT6 in spermatogenesis, and HLA-C in mosquito bite immunology demonstrate the diagram classification methodology. Comparisons to the SEPT4-XIAP pair in stem cell apoptosis, testesexpressed taste genes TAS1R3-GNAT3 pair, and amyloid beta APLP1-APLP2 pair with the yeast-mammal DNA sequences for meiotic proteins RAD50-MRE11 pair and NCAPD2-ICK pair have accounted for the observed fluctuating evolutionary pressure systematically. Regression with high R-sq values or a triangular-like cluster pattern for concordant pairs in co-variation among the studied species could serve as evidences for the possible location of common ancestors in the entropy-fractal dimension diagram, consistent with an example of the human-chimp common ancestor study using the FOXP2 regulated genes reported in human fetal brain study. The Deinococcus radiodurans R1 Rad-A could be viewed as an outlier in the RAD50 diagram and also in the free energy versus fractal dimension regression Cook's distance, consistent with a non-Earth source for this radiation resistant bacterium. Convergent and divergent fluctuating evolutionary pressure could be studied with extension to genetic sequences in organisms in possible astrobiology conditions, with the assumption that the continuation of a book of life would require meiotic proteins everywhere in the universe.
Investigating Genetic Alterations in Bladder Cancer | Center for Cancer Research

Cancer.gov

Bladder cancer (BC) is the fifth most common cancer worldwide and the sixth most common cancer in the U.S. Mutations in a number of oncogenes and tumor suppressor genes were previously associated with invasive or noninvasive forms of the disease. More recently, next generation sequencing (NGS) of bladder tumors from over 100 Chinese patients revealed alterations in additional
Integrated DNA/RNA targeted genomic profiling of diffuse large B-cell lymphoma using a clinical assay.

PubMed

Intlekofer, Andrew M; Joffe, Erel; Batlevi, Connie L; Hilden, Patrick; He, Jie; Seshan, Venkatraman E; Zelenetz, Andrew D; Palomba, M Lia; Moskowitz, Craig H; Portlock, Carol; Straus, David J; Noy, Ariela; Horwitz, Steven M; Gerecitano, John F; Moskowitz, Alison; Hamlin, Paul; Matasar, Matthew J; Kumar, Anita; van den Brink, Marcel R; Knapp, Kristina M; Pichardo, Janine D; Nahas, Michelle K; Trabucco, Sally E; Mughal, Tariq; Copeland, Amanda R; Papaemmanuil, Elli; Moarii, Mathai; Levine, Ross L; Dogan, Ahmet; Miller, Vincent A; Younes, Anas

2018-06-12

We sought to define the genomic landscape of diffuse large B-cell lymphoma (DLBCL) by using formalin-fixed paraffin-embedded (FFPE) biopsy specimens. We used targeted sequencing of genes altered in hematologic malignancies, including DNA coding sequence for 405 genes, noncoding sequence for 31 genes, and RNA coding sequence for 265 genes (FoundationOne-Heme). Short variants, rearrangements, and copy number alterations were determined. We studied 198 samples (114 de novo, 58 previously treated, and 26 large-cell transformation from follicular lymphoma). Median number of GAs per case was 6, with 97% of patients harboring at least one alteration. Recurrent GAs were detected in genes with established roles in DLBCL pathogenesis (e.g. MYD88, CREBBP, CD79B, EZH2), as well as notable differences compared to prior studies such as inactivating mutations in TET2 (5%). Less common GAs identified potential targets for approved or investigational therapies, including BRAF, CD274 (PD-L1), IDH2, and JAK1/2. TP53 mutations were more frequently observed in relapsed/refractory DLBCL, and predicted for lack of response to first-line chemotherapy, identifying a subset of patients that could be prioritized for novel therapies. Overall, 90% (n = 169) of the patients harbored a GA which could be explored for therapeutic intervention, with 54% (n = 107) harboring more than one putative target.
Spectrum of CFTR gene mutations in Ecuadorian cystic fibrosis patients: the second report of the p.H609R mutation.

PubMed

Ortiz, Sofía C; Aguirre, Santiago J; Flores, Sofía; Maldonado, Claudio; Mejía, Juan; Salinas, Lilian

2017-11-01

High heterogeneity in the CFTR gene mutations disturbs the molecular diagnosis of cystic fibrosis (CF). In order to improve the diagnosis of CF in our country, the present study aims to define a panel of common CFTR gene mutations by sequencing 27 exons of the gene in Ecuadorian Cystic Fibrosis patients. Forty-eight Ecuadorian individuals with suspected/confirmed CF diagnosis were included. Twenty-seven exons of CFTR gene were sequenced to find sequence variations. Prevalence of pathogenic variations were determined and compared with other countries' data. We found 70 sequence variations. Eight of these are CF-causing mutations: p.F508del, p.G85E, p.G330E, p.A455E, p.G970S, W1098X, R1162X, and N1303K. Also this study is the second report of p.H609R in Ecuadorian population. Mutation prevalence differences between Ecuadorian population and other Latin America countries were found. The panel of mutations suggested as an initial screening for the Ecuadorian population with cystic fibrosis should contain the mutations: p.F508del, p.G85E, p.G330E, p.A455E, p.G970S, W1098X, R1162X, and N1303K. © 2017 NETLAB Laboratorios Especializados. Molecular Genetics & Genomic Medicine published by Wiley Periodicals, Inc.
Molecular analysis of two phytohemagglutinin genes and their expression in Phaseolus vulgaris cv. Pinto, a lectin-deficient cultivar of the bean

PubMed Central

Voelker, Toni A.; Staswick, Paul; Chrispeels, Maarten J.

1986-01-01

Phytohemagglutinin (PHA), the seed lectin of the common bean, Phaseolus vulgaris, is encoded by two highly homologous, tandemly linked genes, dlec1 and dlec2, which are coordinately expressed at high levels in developing cotyledons. Their respective transcripts translate into closely related polypeptides, PHA-E and PHA-L, constituents of the tetrameric lectin which accumulates at high levels in developing seeds. In the bean cultivar Pinto UI111, PHA-E is not detectable, and PHA-L accumulates at very reduced levels. To investigate the cause of the Pinto phenotype, we cloned and sequenced the two PHA genes of Pinto, called Pdlec1 and Pdlec2, and determined the abundance of their respective mRNAs in developing cotyledons. Both genes are more than 90% homologous to the normal PHA genes found in other cultivars. Pdlec1 carries a 1-bp frameshift mutation close to the 5' end of its coding sequence. Only very truncated polypeptides could be made from its mRNA. The gene Pdlec2 encodes a polypeptide, which resembles PHA-L and its predicted amino acid sequence agrees with the available Pinto PHA amino acid sequence data. Analysis of the mRNA of developing cotyledons revealed that the Pdlec1 message is reduced 600-fold, and Pdlec2 mRNA is reduced 20-fold with respect to mRNA levels in normal cultivars. A comparison of the sequences which are upstream from the coding sequence shows that Pdlec2 has a 100-bp deletion compared to the other genes (dlec1, dlec2 and Pdlec1). This deletion which contains a large tandem repeat may be responsible for the low level of expression of Pdlec2. The very low expression of Pdlec1 is as yet unexplained. ImagesFig. 5. PMID:16453730
Expressed sequence tag analysis of human RPE/choroid for the NEIBank Project: over 6000 non-redundant transcripts, novel genes and splice variants.

PubMed

Wistow, Graeme; Bernstein, Steven L; Wyatt, M Keith; Fariss, Robert N; Behal, Amita; Touchman, Jeffrey W; Bouffard, Gerald; Smith, Don; Peterson, Katherine

2002-06-15

The retinal pigment epithelium (RPE) and choroid comprise a functional unit of the eye that is essential to normal retinal health and function. Here we describe expressed sequence tag (EST) analysis of human RPE/choroid as part of a project for ocular bioinformatics. A cDNA library (cs) was made from human RPE/choroid and sequenced. Data were analyzed and assembled using the program GRIST (GRouping and Identification of Sequence Tags). Complete sequencing, Northern and Western blots, RH mapping, peptide antibody synthesis and immunofluorescence (IF) have been used to examine expression patterns and genome location for selected transcripts and proteins. Ten thousand individual sequence reads yield over 6300 unique gene clusters of which almost half have no matches with named genes. One of the most abundant transcripts is from a gene (named "alpha") that maps to the BBS1 region of chromosome 11. A number of tissue preferred transcripts are common to both RPE/choroid and iris. These include oculoglycan/opticin, for which an alternative splice form is detected in RPE/choroid, and "oculospanin" (Ocsp), a novel tetraspanin that maps to chromosome 17q. Antiserum to Ocsp detects expression in RPE, iris, ciliary body, and retinal ganglion cells by IF. A newly identified gene for a zinc-finger protein (TIRC) maps to 19q13.4. Variant transcripts of several genes were also detected. Most notably, the predominant form of Bestrophin represented in cs contains a longer open reading frame as a result of splice junction skipping. The unamplified cs library gives a view of the transcriptional repertoire of the adult RPE/choroid. A large number of potentially novel genes and splice forms and candidates for genetic diseases are revealed. Clones from this collection are being included in a large, nonredundant set for cDNA microarray construction.
Cytochrome P450 genes from the aquatic midge Chironomus tentans: Atrazine-induced up-regulation of CtCYP6EX3 contributing to oxidative activation of chlorpyrifos

USDA-ARS?s Scientific Manuscript database

The open reading frames of 19 cytochrome P450 monooxygenase (CYP) genes were sequenced from Chironomus tentans, a commonly used freshwater invertebrate model. Functional analysis of CtCYP6EX3 confirmed its atrazine-induced oxidative activation for chlorpyrifos by using a nanoparticle-based RNA inter...
Refined identification of Vibrio bacterial flora from Acanthasther planci based on biochemical profiling and analysis of housekeeping genes.

PubMed

Rivera-Posada, J A; Pratchett, M; Cano-Gomez, A; Arango-Gomez, J D; Owens, L

2011-09-09

We used a polyphasic approach for precise identification of bacterial flora (Vibrionaceae) isolated from crown-of-thorns starfish (COTS) from Lizard Island (Great Barrier Reef, Australia) and Guam (U.S.A., Western Pacific Ocean). Previous 16S rRNA gene phylogenetic analysis was useful to allocate and identify isolates within the Photobacterium, Splendidus and Harveyi clades but failed in the identification of Vibrio harveyi-like isolates. Species of the V harveyi group have almost indistinguishable phenotypes and genotypes, and thus, identification by standard biochemical tests and 16S rRNA gene analysis is commonly inaccurate. Biochemical profiling and sequence analysis of additional topA and mreB housekeeping genes were carried out for definitive identification of 19 bacterial isolates recovered from sick and wild COTS. For 8 isolates, biochemical profiles and topA and mreB gene sequence alignments with the closest relatives (GenBank) confirmed previous 16S rRNA-based identification: V. fortis and Photobacterium eurosenbergii species (from wild COTS), and V natriegens (from diseased COTS). Further phylogenetic analysis based on topA and mreB concatenated sequences served to identify the remaining 11 V harveyi-like isolates: V. owensii and V. rotiferianus (from wild COTS), and V. owensii, V. rotiferianus, and V. harveyi (from diseased COTS). This study further confirms the reliability of topA-mreB gene sequence analysis for identification of these close species, and it reveals a wider distribution range of the potentially pathogenic V. harveyi group.
Allelic association of sequence variants in the herpes virus entry mediator-B gene (PVRL2) with the severity of multiple sclerosis.

PubMed

Schmidt, S; Pericak-Vance, M A; Sawcer, S; Barcellos, L F; Hart, J; Sims, J; Prokop, A M; van der Walt, J; DeLoa, C; Lincoln, R R; Oksenberg, J R; Compston, A; Hauser, S L; Haines, J L; Gregory, S G

2006-07-01

Discrepant findings have been reported regarding an association of the apolipoprotein E (APOE) gene with the clinical course of multiple sclerosis (MS). To resolve these discrepancies, we examined common sequence variation in six candidate genes residing in a 380-kb genomic region surrounding and including the APOE locus for an association with MS severity. We genotyped at least three polymorphisms in each of six candidate genes in 1,540 Caucasian MS families (729 single-case and multiple-case families from the United States, 811 single-case families from the UK). By applying the quantitative transmission/disequilibrium test to a recently proposed MS severity score, the only statistically significant (P=0.003) association with MS severity was found for an intronic variant in the Herpes Virus Entry Mediator-B Gene PVRL2. Additional genotyping extended the association to a 16.6 kb block spanning intron 1 to intron 2 of the gene. Sequencing of PVRL2 failed to identify variants with an obvious functional role. In conclusion, the analysis of a very large data set suggests that genetic polymorphisms in PVRL2 may influence MS severity and supports the possibility that viral factors may contribute to the clinical course of MS, consistent with previous reports.
High rates of recombination in otitis media isolates of non-typeable Haemophilus influenzae✩

PubMed Central

Cody, Alison J.; Field, Dawn; Feil, Edward J.; Stringer, Suzanna; Deadman, Mary E.; Tsolaki, Anthony G.; Gratz, Brett; Bouchet, Valérie; Goldstein, Richard; Hood, Derek W.; Moxon, E. Richard

2008-01-01

Non-typeable (NT) or capsule-deficient, Haemophilus influenzae (Hi) is a common commensal of the upper respiratory tract of humans and can be pathogenic resulting in diseases such as otitis media, sinusitis and pneumonia. The lipopolysaccharide (LPS) of NTHi is a major virulence factor that displays substantial intra-strain and inter-strain variation of its oligosaccharide structures. To investigate the genetic basis of LPS variation we sequenced internal regions of each of seven genes required for the biosynthesis of either the inner or the outer core oligosaccharide structures. These sequences were obtained from 25 representative NTHi isolates from episodes of otitis media. We found abundant evidence of recombination among LPS genes of NTHi, a finding in marked contrast to previous analyses of biosynthetic genes for capsular polysaccharide, a well-documented virulence factor of Hi. We found mosaic sequences, linkage equilibrium between loci and a lack of congruence between gene trees. These high rates were not confined to LPS genes since evidence for similar amounts of recombination was also found in eight housekeeping genes in a subset of the same 25 isolates. These findings provide a population based foundation for a better understanding of the role of NTHi LPS as a virulence factor and its potential as a candidate vaccine. PMID:12797973
Computing prokaryotic gene ubiquity: rescuing the core from extinction.

PubMed

Charlebois, Robert L; Doolittle, W Ford

2004-12-01

The genomic core concept has found several uses in comparative and evolutionary genomics. Defined as the set of all genes common to (ubiquitous among) all genomes in a phylogenetically coherent group, core size decreases as the number and phylogenetic diversity of the relevant group increases. Here, we focus on methods for defining the size and composition of the core of all genes shared by sequenced genomes of prokaryotes (Bacteria and Archaea). There are few (almost certainly less than 50) genes shared by all of the 147 genomes compared, surely insufficient to conduct all essential functions. Sequencing and annotation errors are responsible for the apparent absence of some genes, while very limited but genuine disappearances (from just one or a few genomes) can account for several others. Core size will continue to decrease as more genome sequences appear, unless the requirement for ubiquity is relaxed. Such relaxation seems consistent with any reasonable biological purpose for seeking a core, but it renders the problem of definition more problematic. We propose an alternative approach (the phylogenetically balanced core), which preserves some of the biological utility of the core concept. Cores, however delimited, preferentially contain informational rather than operational genes; we present a new hypothesis for why this might be so.

Analysis of koi herpesvirus latency in wild common carp and ornamental koi in Oregon, USA.

PubMed

Xu, Jia-Rong; Bently, Jennifer; Beck, Linda; Reed, Aimee; Miller-Morgan, Tim; Heidel, Jerry R; Kent, Michael L; Rockey, Daniel D; Jin, Ling

2013-02-01

Koi herpesvirus (KHV) infection is associated with high mortalities in both common carp (Cyprinus carpio carpio) and koi carp (Cyprinus carpio koi) worldwide. Although acute infection has been reported in both domestic and wild common carp, the status of KHV latent infection is largely unknown in wild common carp. To investigate whether KHV latency is present in wild common carp, the distribution of KHV latent infection was investigated in two geographically distinct populations of wild common carp in Oregon, as well as in koi from an Oregon-based commercial supplier. Latent KHV infection was demonstrated in white blood cells from each of these populations. Although KHV isolated from acute infections has two distinct genetic groups, Asian and European, KHV detected in wild carp has not been genetically characterized. DNA sequences from ORF 25 to 26 that are unique between Asian and European were investigated in this study. KHV from captive koi and some wild common carp were found to have ORF-25-26 sequences similar to KHV-J (Asian), while the majority of KHV DNA detected in wild common carp has similarity to KHV-U/-I (European). In addition, DNA sequences from IL-10, and TNFR were sequenced and compared with no differences found, which suggests immune suppressor genes of KHV are conserved between KHV in wild common carp and koi, and is consistent with KHV-U, -I, -J. Copyright © 2012 Elsevier B.V. All rights reserved.
Identifying resistance gene analogs associated with resistances to different pathogens in common bean.

PubMed

López, Camilo E; Acosta, Iván F; Jara, Carlos; Pedraza, Fabio; Gaitán-Solís, Eliana; Gallego, Gerardo; Beebe, Steve; Tohme, Joe

2003-01-01

ABSTRACT A polymerase chain reaction approach using degenerate primers that targeted the conserved domains of cloned plant disease resistance genes (R genes) was used to isolate a set of 15 resistance gene analogs (RGAs) from common bean (Phaseolus vulgaris). Eight different classes of RGAs were obtained from nucleotide binding site (NBS)-based primers and seven from not previously described Toll/Interleukin-1 receptor-like (TIR)-based primers. Putative amino acid sequences of RGAs were significantly similar to R genes and contained additional conserved motifs. The NBS-type RGAs were classified in two subgroups according to the expected final residue in the kinase-2 motif. Eleven RGAs were mapped at 19 loci on eight linkage groups of the common bean genetic map constructed at Centro Internacional de Agricultura Tropical. Genetic linkage was shown for eight RGAs with partial resistance to anthracnose, angular leaf spot (ALS) and Bean golden yellow mosaic virus (BGYMV). RGA1 and RGA2 were associated with resistance loci to anthracnose and BGYMV and were part of two clusters of R genes previously described. A new major cluster was detected by RGA7 and explained up to 63.9% of resistance to ALS and has a putative contribution to anthracnose resistance. These results show the usefulness of RGAs as candidate genes to detect and eventually isolate numerous R genes in common bean.
RFLP and sequence analysis of the cytochrome b gene of selected animals and man: methodology and forensic application.

PubMed

Zehner, R; Zimmermann, S; Mebs, D

1998-01-01

To identify common animal species by analysis of the cytochrome b gene a method has been developed to obtain PCR products of a large domain of the cytochrome b gene (981 bp out of 1140 bp) in humans, selected mammals and birds using the same specifically designed primers. Species-specific RFLP patterns are generated by co-restriction with the restriction endonucleases ALU I and NCO I. The RFLP patterns obtained are conclusive even in mixtures of two or more species. The results were confirmed by sequence analysis which in addition explained intraspecies variations in the RFLP patterns. The method has been applied to forensic casework studies where the origin of roasted meat, stomach contents and a bone sample has been successfully identified.
Genome-Wide Investigation of WRKY Transcription Factors Involved in Terminal Drought Stress Response in Common Bean

PubMed Central

Wu, Jing; Chen, Jibao; Wang, Lanfen; Wang, Shumin

2017-01-01

WRKY transcription factor plays a key role in drought stress. However, the characteristics of the WRKY gene family in the common bean (Phaseolus vulgaris L.) are unknown. In this study, we identified 88 complete WRKY proteins from the draft genome sequence of the “G19833” common bean. The predicted genes were non-randomly distributed in all chromosomes. Basic information, amino acid motifs, phylogenetic tree and the expression patterns of PvWRKY genes were analyzed, and the proteins were classified into groups 1, 2, and 3. Group 2 was further divided into five subgroups: 2a, 2b, 2c, 2d, and 2e. Finally, we detected 19 WRKY genes that were responsive to drought stress using qRT-PCR; 11 were down-regulated, and 8 were up-regulated under drought stress. This study comprehensively examines WRKY proteins in the common bean, a model food legume, and it provides a foundation for the functional characterization of the WRKY family and opportunities for understanding the mechanisms of drought stress tolerance in this plant. PMID:28386267
Genome-Wide Investigation of WRKY Transcription Factors Involved in Terminal Drought Stress Response in Common Bean.

PubMed

Wu, Jing; Chen, Jibao; Wang, Lanfen; Wang, Shumin

2017-01-01

WRKY transcription factor plays a key role in drought stress. However, the characteristics of the WRKY gene family in the common bean ( Phaseolus vulgaris L.) are unknown. In this study, we identified 88 complete WRKY proteins from the draft genome sequence of the "G19833" common bean. The predicted genes were non-randomly distributed in all chromosomes. Basic information, amino acid motifs, phylogenetic tree and the expression patterns of PvWRKY genes were analyzed, and the proteins were classified into groups 1, 2, and 3. Group 2 was further divided into five subgroups: 2a, 2b, 2c, 2d, and 2e. Finally, we detected 19 WRKY genes that were responsive to drought stress using qRT-PCR; 11 were down-regulated, and 8 were up-regulated under drought stress. This study comprehensively examines WRKY proteins in the common bean, a model food legume, and it provides a foundation for the functional characterization of the WRKY family and opportunities for understanding the mechanisms of drought stress tolerance in this plant.
Genetic architecture of the Delis-Kaplan Executive Function System Trail Making Test: evidence for distinct genetic influences on executive function.

PubMed

Vasilopoulos, Terrie; Franz, Carol E; Panizzon, Matthew S; Xian, Hong; Grant, Michael D; Lyons, Michael J; Toomey, Rosemary; Jacobson, Kristen C; Kremen, William S

2012-03-01

To examine how genes and environments contribute to relationships among Trail Making Test (TMT) conditions and the extent to which these conditions have unique genetic and environmental influences. Participants included 1,237 middle-aged male twins from the Vietnam Era Twin Study of Aging. The Delis-Kaplan Executive Function System TMT included visual searching, number and letter sequencing, and set-shifting components. Phenotypic correlations among TMT conditions ranged from 0.29 to 0.60, and genes accounted for the majority (58-84%) of each correlation. Overall heritability ranged from 0.34 to 0.62 across conditions. Phenotypic factor analysis suggested a single factor. In contrast, genetic models revealed a single common genetic factor but also unique genetic influences separate from the common factor. Genetic variance (i.e., heritability) of number and letter sequencing was completely explained by the common genetic factor while unique genetic influences separate from the common factor accounted for 57% and 21% of the heritabilities of visual search and set shifting, respectively. After accounting for general cognitive ability, unique genetic influences accounted for 64% and 31% of those heritabilities. A common genetic factor, most likely representing a combination of speed and sequencing, accounted for most of the correlation among TMT 1-4. Distinct genetic factors, however, accounted for a portion of variance in visual scanning and set shifting. Thus, although traditional phenotypic shared variance analysis techniques suggest only one general factor underlying different neuropsychological functions in nonpatient populations, examining the genetic underpinnings of cognitive processes with twin analysis can uncover more complex etiological processes.
Single strand conformation polymorphism based SNP and Indel markers for genetic mapping and synteny analysis of common bean (Phaseolus vulgaris L.)

PubMed Central

2009-01-01

Background Expressed sequence tags (ESTs) are an important source of gene-based markers such as those based on insertion-deletions (Indels) or single-nucleotide polymorphisms (SNPs). Several gel based methods have been reported for the detection of sequence variants, however they have not been widely exploited in common bean, an important legume crop of the developing world. The objectives of this project were to develop and map EST based markers using analysis of single strand conformation polymorphisms (SSCPs), to create a transcript map for common bean and to compare synteny of the common bean map with sequenced chromosomes of other legumes. Results A set of 418 EST based amplicons were evaluated for parental polymorphisms using the SSCP technique and 26% of these presented a clear conformational or size polymorphism between Andean and Mesoamerican genotypes. The amplicon based markers were then used for genetic mapping with segregation analysis performed in the DOR364 × G19833 recombinant inbred line (RIL) population. A total of 118 new marker loci were placed into an integrated molecular map for common bean consisting of 288 markers. Of these, 218 were used for synteny analysis and 186 presented homology with segments of the soybean genome with an e-value lower than 7 × 10-12. The synteny analysis with soybean showed a mosaic pattern of syntenic blocks with most segments of any one common bean linkage group associated with two soybean chromosomes. The analysis with Medicago truncatula and Lotus japonicus presented fewer syntenic regions consistent with the more distant phylogenetic relationship between the galegoid and phaseoloid legumes. Conclusion The SSCP technique is a useful and inexpensive alternative to other SNP or Indel detection techniques for saturating the common bean genetic map with functional markers that may be useful in marker assisted selection. In addition, the genetic markers based on ESTs allowed the construction of a transcript map and given their high conservation between species allowed synteny comparisons to be made to sequenced genomes. This synteny analysis may support positional cloning of target genes in common bean through the use of genomic information from these other legumes. PMID:20030833
Molecular cloning and long terminal repeat sequences of human endogenous retrovirus genes related to types A and B retrovirus genes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ono, M.

1986-06-01

By using a DNA fragment primarily encoding the reverse transcriptase (pol) region of the Syrian hamster intracisternal A particle (IAP; type A retrovirus) gene as a probe, human endogenous retrovirus genes, tentatively termed HERV-K genes, were cloned from a fetal human liver gene library. Typical HERV-K genes were 9.1 or 9.4 kilobases in length, having long terminal repeats (LTRs) of ca. 970 base pairs. Many structural features commonly observed on the retrovirus LTRs, such as the TATAA box, polyadenylation signal, and terminal inverted repeats, were present on each LTR, and a lysine (K) tRNA having a CUU anticodon was identifiedmore » as a presumed primer tRNA. The HERV-K LTR, however, had little sequence homology to either the IAP LTR or other typical oncovirus LTRs. By filter hybridization, the number of HERV-K genes was estimated to be ca. 50 copies per haploid human genome. The cloned mouse mammary tumor virus (type B) gene was found to hybridize with both the HERV-K and IAP genes to essentially the same extent.« less
Fuzzy measures on the Gene Ontology for gene product similarity.

PubMed

Popescu, Mihail; Keller, James M; Mitchell, Joyce A

2006-01-01

One of the most important objects in bioinformatics is a gene product (protein or RNA). For many gene products, functional information is summarized in a set of Gene Ontology (GO) annotations. For these genes, it is reasonable to include similarity measures based on the terms found in the GO or other taxonomy. In this paper, we introduce several novel measures for computing the similarity of two gene products annotated with GO terms. The fuzzy measure similarity (FMS) has the advantage that it takes into consideration the context of both complete sets of annotation terms when computing the similarity between two gene products. When the two gene products are not annotated by common taxonomy terms, we propose a method that avoids a zero similarity result. To account for the variations in the annotation reliability, we propose a similarity measure based on the Choquet integral. These similarity measures provide extra tools for the biologist in search of functional information for gene products. The initial testing on a group of 194 sequences representing three proteins families shows a higher correlation of the FMS and Choquet similarities to the BLAST sequence similarities than the traditional similarity measures such as pairwise average or pairwise maximum.
Genomewide analysis of Drosophila circular RNAs reveals their structural and sequence properties and age-dependent neural accumulation

PubMed Central

Westholm, Jakub O.; Miura, Pedro; Olson, Sara; Shenker, Sol; Joseph, Brian; Sanfilippo, Piero; Celniker, Susan E.; Graveley, Brenton R.; Lai, Eric C.

2014-01-01

Circularization was recently recognized to broadly expand transcriptome complexity. Here, we exploit massive Drosophila total RNA-sequencing data, >5 billion paired-end reads from >100 libraries covering diverse developmental stages, tissues and cultured cells, to rigorously annotate >2500 fruitfly circular RNAs. These mostly derive from back-splicing of protein-coding genes and lack poly(A) tails, and circularization of hundreds of genes is conserved across multiple Drosophila species. We elucidate structural and sequence properties of Drosophila circular RNAs, which exhibit commonalities and distinctions from mammalian circles. Notably, Drosophila circular RNAs harbor >1000 well-conserved canonical miRNA seed matches, especially within coding regions, and coding conserved miRNA sites reside preferentially within circularized exons. Finally, we analyze the developmental and tissue specificity of circular RNAs, and note their preferred derivation from neural genes and enhanced accumulation in neural tissues. Interestingly, circular isoforms increase dramatically relative to linear isoforms during CNS aging, and constitute a novel aging biomarker. PMID:25544350
Draft genome sequence of Methylibium sp. strain T29, a novel fuel oxygenate-degrading bacterial isolate from Hungary.

PubMed

Szabó, Zsolt; Gyula, Péter; Robotka, Hermina; Bató, Emese; Gálik, Bence; Pach, Péter; Pekker, Péter; Papp, Ildikó; Bihari, Zoltán

2015-01-01

Methylibium sp. strain T29 was isolated from a gasoline-contaminated aquifer and proved to have excellent capabilities in degrading some common fuel oxygenates like methyl tert-butyl ether, tert-amyl methyl ether and tert-butyl alcohol along with other organic compounds. Here, we report the draft genome sequence of M. sp. strain T29 together with the description of the genome properties and its annotation. The draft genome consists of 608 contigs with a total size of 4,449,424 bp and an average coverage of 150×. The genome exhibits an average G + C content of 68.7 %, and contains 4754 protein coding and 52 RNA genes, including 48 tRNA genes. 71 % of the protein coding genes could be assigned to COG (Clusters of Orthologous Groups) categories. A formerly unknown circular plasmid designated as pT29A was isolated and sequenced separately and found to be 86,856 bp long.
Genome-wide Analysis of Drosophila Circular RNAs Reveals Their Structural and Sequence Properties and Age-Dependent Neural Accumulation

DOE PAGES

Westholm, Jakub O.; Miura, Pedro; Olson, Sara; ...

2014-11-26

Circularization was recently recognized to broadly expand transcriptome complexity. Here, we exploit massive Drosophila total RNA-sequencing data, >5 billion paired-end reads from >100 libraries covering diverse developmental stages, tissues, and cultured cells, to rigorously annotate >2,500 fruit fly circular RNAs. These mostly derive from back-splicing of protein-coding genes and lack poly(A) tails, and the circularization of hundreds of genes is conserved across multiple Drosophila species. We elucidate structural and sequence properties of Drosophila circular RNAs, which exhibit commonalities and distinctions from mammalian circles. Notably, Drosophila circular RNAs harbor >1,000 well-conserved canonical miRNA seed matches, especially within coding regions, and codingmore » conserved miRNA sites reside preferentially within circularized exons. Finally, we analyze the developmental and tissue specificity of circular RNAs and note their preferred derivation from neural genes and enhanced accumulation in neural tissues. Interestingly, circular isoforms increase substantially relative to linear isoforms during CNS aging and constitute an aging biomarker.« less
Genome-wide Analysis of Drosophila Circular RNAs Reveals Their Structural and Sequence Properties and Age-Dependent Neural Accumulation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Westholm, Jakub O.; Miura, Pedro; Olson, Sara

Circularization was recently recognized to broadly expand transcriptome complexity. Here, we exploit massive Drosophila total RNA-sequencing data, >5 billion paired-end reads from >100 libraries covering diverse developmental stages, tissues, and cultured cells, to rigorously annotate >2,500 fruit fly circular RNAs. These mostly derive from back-splicing of protein-coding genes and lack poly(A) tails, and the circularization of hundreds of genes is conserved across multiple Drosophila species. We elucidate structural and sequence properties of Drosophila circular RNAs, which exhibit commonalities and distinctions from mammalian circles. Notably, Drosophila circular RNAs harbor >1,000 well-conserved canonical miRNA seed matches, especially within coding regions, and codingmore » conserved miRNA sites reside preferentially within circularized exons. Finally, we analyze the developmental and tissue specificity of circular RNAs and note their preferred derivation from neural genes and enhanced accumulation in neural tissues. Interestingly, circular isoforms increase substantially relative to linear isoforms during CNS aging and constitute an aging biomarker.« less
Molecular characterization and detection of variants of Taenia multiceps in sheep in Turkey.

PubMed

Sonmez, Betul; Koroglu, Ergun; Simsek, Sami

2017-02-01

Taenia multiceps is a cestode (family Taeniidae) that in its adult stage lives in the small intestine of dogs and other canids. The metacestode, known as Coenurus cerebralis, is usually found in the central nervous system including brain and spinal card in sheep and other ruminants. The presence of cysts typically leads to neurological symptoms that in the majority of cases result in the death of the animal. Coenurosis could cause high losses in sheep farms because the disease commonly affects young animals. A total of 20 C. cerebralis isolates collected from naturally infected sheep in Mardin province of Turkey were characterized through the polymerase chain reaction and sequencing of a fragment of cytochrome c oxidase subunit 1 (CO1) gene. The results showed that the CO1 gene sequences were highly conserved in C. cerebralis isolates. Phylogenetic analysis based on partial CO1 gene sequences revealed that C. cerebralis isolates were composed of three different variants.
[Mutation analysis for a pedigree affected with keratitis-ichthyosis-deafness syndrome].

PubMed

Li, Lulu; Li, Yuan; Lin, Wei; Zhao, Xiuli

2017-10-10

To identify mutation of GJB2 gene and provide genetic counseling for a family affected with keratitis-ichthyosis-deafness (KID) syndrome. Genomic DNA was extracted from peripheral blood samples with a standard phenol-chloroform method. PCR and Sanger sequencing were used to analyze potential mutation in the proband. Suspected mutation was verified with a PCR-high-resolution melting (PCR-HRM) method. T-clone sequencing was applied to determine the parental origin of the mutation. A heterozygous mutation, c.148G>A (p.Asp50Asn), which is located in the exon 1 of the GJB2 gene, was found in the proband. The results was confirmed by HRM analysis. Cloning sequencing suggested that the mutation was derived from the father's germline. The hot-spot mutation c.148G>A (p.Asp50Asn) in the GJB2 gene probably underlies the KID syndrome in this Chinese family. A PCR-HRM method has been established to rapidly detect common mutations associated with this disease.
Identification of IBV QX vaccine markers : Should vaccine acceptance by authorities require similar identifications for all live IBV vaccines?

PubMed

Listorti, Valeria; Laconi, Andrea; Catelli, Elena; Cecchinato, Mattia; Lupini, Caterina; Naylor, Clive J

2017-10-09

IBV genotype QX causes sufficient disease in Europe for several commercial companies to have started developing live attenuated vaccines. Here, one of those vaccines (L1148) was fully consensus sequenced alongside its progenitor field strain (1148-A) to determine vaccine markers, thereby enabling detection on farms. Twenty-eight single nucleotide substitutions were associated with the 1148-A attenuation, of which any combination can identify vaccine L1148 in the field. Sixteen substitutions resulted in amino acid coding changes of which half were in spike. One change in the 1b gene altered the normally highly conserved final 5 nucleotides of the transcription regulatory sequence of the S gene, common to all IBV QX genes. No mutations can currently be associated with the attenuation process. Field vaccination strategies would greatly benefit by such comparative sequence data being mandatorily submitted to regulators prior to vaccine release following a successful registration process. Copyright © 2017. Published by Elsevier Ltd.
Determinism and randomness in the evolution of introns and sine inserts in mouse and human mitochondrial solute carrier and cytokine receptor genes.

PubMed

Cianciulli, Antonia; Calvello, Rosa; Panaro, Maria A

2015-04-01

In the homologous genes studied, the exons and introns alternated in the same order in mouse and human. We studied, in both species: corresponding short segments of introns, whole corresponding introns and complete homologous genes. We considered the total number of nucleotides and the number and orientation of the SINE inserts. Comparisons of mouse and human data series showed that at the level of individual relatively short segments of intronic sequences the stochastic variability prevails in the local structuring, but at higher levels of organization a deterministic component emerges, conserved in mouse and human during the divergent evolution, despite the ample re-editing of the intronic sequences and the fact that processes such as SINE spread had taken place in an independent way in the two species. Intron conservation is negatively correlated with the SINE occupancy, suggesting that virus inserts interfere with the conservation of the sequences inherited from the common ancestor. Copyright © 2015 Elsevier Ltd. All rights reserved.
Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing

PubMed Central

2011-01-01

Background Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and compliment these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. Results A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. Conclusions The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species and its relatives. This study represents a first step in the development of a community resource for further study of plant-insect co-evolution, anti-herbivore defense, floral developmental genetics, reproductive biology, chemical evolution, population genetics, and comparative genomics using milkweeds, and A. syriaca in particular, as ecological and evolutionary models. PMID:21542930
Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing.

PubMed

Straub, Shannon C K; Fishbein, Mark; Livshultz, Tatyana; Foster, Zachary; Parks, Matthew; Weitemier, Kevin; Cronn, Richard C; Liston, Aaron

2011-05-04

Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and compliment these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species and its relatives. This study represents a first step in the development of a community resource for further study of plant-insect co-evolution, anti-herbivore defense, floral developmental genetics, reproductive biology, chemical evolution, population genetics, and comparative genomics using milkweeds, and A. syriaca in particular, as ecological and evolutionary models.
Cloning and sequence analysis of the meso-diaminopimelate decarboxylase gene from Bacillus methanolicus MGA3 and comparison to other decarboxylase genes.

PubMed Central

Mills, D A; Flickinger, M C

1993-01-01

The lysA gene of Bacillus methanolicus MGA3 was cloned by complementation of an auxotrophic Escherichia coli lysA22 mutant with a genomic library of B. methanolicus MGA3 chromosomal DNA. Subcloning localized the B. methanolicus MGA3 lysA gene into a 2.3-kb SmaI-SstI fragment. Sequence analysis of the 2.3-kb fragment indicated an open reading frame encoding a protein of 48,223 Da, which was similar to the meso-diaminopimelate (DAP) decarboxylase amino acid sequences of Bacillus subtilis (62%) and Corynebacterium glutamicum (40%). Amino acid sequence analysis indicated several regions of conservation among bacterial DAP decarboxylases, eukaryotic ornithine decarboxylases, and arginine decarboxylases, suggesting a common structural arrangement for positioning of substrate and the cofactor pyridoxal 5'-phosphate. The B. methanolicus MGA3 DAP decarboxylase was shown to be a dimer (M(r) 86,000) with a subunit molecular mass of approximately 50,000 Da. This decarboxylase is inhibited by lysine (Ki = 0.93 mM) with a Km of 0.8 mM for DAP. The inhibition pattern suggests that the activity of this enzyme in lysine-overproducing strains of B. methanolicus MGA3 may limit lysine synthesis. Images PMID:8215365

Cloning and sequence analysis of the meso-diaminopimelate decarboxylase gene from Bacillus methanolicus MGA3 and comparison to other decarboxylase genes.

PubMed

Mills, D A; Flickinger, M C

1993-09-01

The lysA gene of Bacillus methanolicus MGA3 was cloned by complementation of an auxotrophic Escherichia coli lysA22 mutant with a genomic library of B. methanolicus MGA3 chromosomal DNA. Subcloning localized the B. methanolicus MGA3 lysA gene into a 2.3-kb SmaI-SstI fragment. Sequence analysis of the 2.3-kb fragment indicated an open reading frame encoding a protein of 48,223 Da, which was similar to the meso-diaminopimelate (DAP) decarboxylase amino acid sequences of Bacillus subtilis (62%) and Corynebacterium glutamicum (40%). Amino acid sequence analysis indicated several regions of conservation among bacterial DAP decarboxylases, eukaryotic ornithine decarboxylases, and arginine decarboxylases, suggesting a common structural arrangement for positioning of substrate and the cofactor pyridoxal 5'-phosphate. The B. methanolicus MGA3 DAP decarboxylase was shown to be a dimer (M(r) 86,000) with a subunit molecular mass of approximately 50,000 Da. This decarboxylase is inhibited by lysine (Ki = 0.93 mM) with a Km of 0.8 mM for DAP. The inhibition pattern suggests that the activity of this enzyme in lysine-overproducing strains of B. methanolicus MGA3 may limit lysine synthesis.
The genetic landscape of paediatric de novo acute myeloid leukaemia as defined by single nucleotide polymorphism array and exon sequencing of 100 candidate genes.

PubMed

Olsson, Linda; Zettermark, Sofia; Biloglav, Andrea; Castor, Anders; Behrendtz, Mikael; Forestier, Erik; Paulsson, Kajsa; Johansson, Bertil

2016-07-01

Cytogenetic analyses of a consecutive series of 67 paediatric (median age 8 years; range 0-17) de novo acute myeloid leukaemia (AML) patients revealed aberrations in 55 (82%) cases. The most common subgroups were KMT2A rearrangement (29%), normal karyotype (15%), RUNX1-RUNX1T1 (10%), deletions of 5q, 7q and/or 17p (9%), myeloid leukaemia associated with Down syndrome (7%), PML-RARA (7%) and CBFB-MYH11 (5%). Single nucleotide polymorphism array (SNP-A) analysis and exon sequencing of 100 genes, performed in 52 and 40 cases, respectively (39 overlapping), revealed ≥1 aberration in 89%; when adding cytogenetic data, this frequency increased to 98%. Uniparental isodisomies (UPIDs) were detected in 13% and copy number aberrations (CNAs) in 63% (median 2/case); three UPIDs and 22 CNAs were recurrent. Twenty-two genes were targeted by focal CNAs, including AEBP2 and PHF6 deletions and genes involved in AML-associated gene fusions. Deep sequencing identified mutations in 65% of cases (median 1/case). In total, 60 mutations were found in 30 genes, primarily those encoding signalling proteins (47%), transcription factors (25%), or epigenetic modifiers (13%). Twelve genes (BCOR, CEBPA, FLT3, GATA1, KIT, KRAS, NOTCH1, NPM1, NRAS, PTPN11, SMC3 and TP53) were recurrently mutated. We conclude that SNP-A and deep sequencing analyses complement the cytogenetic diagnosis of paediatric AML. © 2016 John Wiley & Sons Ltd.
Lateral Gene Transfer in a Heavy Metal-Contaminated-Groundwater Microbial Community

PubMed Central

Hemme, Christopher L.; Green, Stefan J.; Rishishwar, Lavanya; Prakash, Om; Pettenato, Angelica; Chakraborty, Romy; Deutschbauer, Adam M.; Van Nostrand, Joy D.; Wu, Liyou; He, Zhili; Jordan, I. King; Arkin, Adam P.; Kostka, Joel E.

2016-01-01

ABSTRACT Unraveling the drivers controlling the response and adaptation of biological communities to environmental change, especially anthropogenic activities, is a central but poorly understood issue in ecology and evolution. Comparative genomics studies suggest that lateral gene transfer (LGT) is a major force driving microbial genome evolution, but its role in the evolution of microbial communities remains elusive. To delineate the importance of LGT in mediating the response of a groundwater microbial community to heavy metal contamination, representative Rhodanobacter reference genomes were sequenced and compared to shotgun metagenome sequences. 16S rRNA gene-based amplicon sequence analysis indicated that Rhodanobacter populations were highly abundant in contaminated wells with low pHs and high levels of nitrate and heavy metals but remained rare in the uncontaminated wells. Sequence comparisons revealed that multiple geochemically important genes, including genes encoding Fe2+/Pb2+ permeases, most denitrification enzymes, and cytochrome c553, were native to Rhodanobacter and not subjected to LGT. In contrast, the Rhodanobacter pangenome contained a recombinational hot spot in which numerous metal resistance genes were subjected to LGT and/or duplication. In particular, Co2+/Zn2+/Cd2+ efflux and mercuric resistance operon genes appeared to be highly mobile within Rhodanobacter populations. Evidence of multiple duplications of a mercuric resistance operon common to most Rhodanobacter strains was also observed. Collectively, our analyses indicated the importance of LGT during the evolution of groundwater microbial communities in response to heavy metal contamination, and a conceptual model was developed to display such adaptive evolutionary processes for explaining the extreme dominance of Rhodanobacter populations in the contaminated groundwater microbiome. PMID:27048805
New enzymes from environmental cassette arrays: Functional attributes of a phosphotransferase and an RNA-methyltransferase

PubMed Central

Nield, Blair S.; Willows, Robert D.; Torda, Andrew E.; Gillings, Michael R.; Holmes, Andrew J.; Nevalainen, K.M. Helena; Stokes, H.W.; Mabbutt, Bridget C.

2004-01-01

By targeting gene cassettes by polymerase chain reaction (PCR) directly from environmentally derived DNA, we are able to amplify entire open reading frames (ORFs) independently of prior sequence knowledge. Approximately 10% of the mobile genes recovered by these means can be attributed to known protein families. Here we describe the characterization of two ORFs which show moderate homology to known proteins: (1) an aminoglycoside phosphotransferase displaying 25% sequence identity with APH(7″) from Streptomyces hygroscopicus, and (2) an RNA methyltransferase sharing 25%–28% identity with a group of recently defined bacterial RNA methyltransferases distinct from the SpoU enzyme family. Our novel genes were expressed as recombinant products and assayed for appropriate enzyme activity. The aminoglycoside phosphotransferase displayed ATPase activity, consistent with the presence of characteristic Mg2+-binding residues. Unlike related APH(4) or APH(7″) enzymes, however, this activity was not enhanced by hygromycin B or kanamycin, suggesting the normal substrate to be a different aminoglycoside. The RNA methyltransferase contains sequence motifs of the RNA methyltransferase superfamily, and our recombinant version showed methyltransferase activity with RNA. Our data confirm that gene cassettes present in the environment encode folded enzymes with novel sequence variation and demonstrable catalytic activity. Our PCR approach (cassette PCR) may be used to identify a diverse range of ORFs from any environmental sample, as well as to directly access the gene pool found in mobile gene cassettes commonly associated with integrons. This gene pool can be accessed from both cultured and uncultured microbial samples as a source of new enzymes and proteins. PMID:15152095
New enzymes from environmental cassette arrays: functional attributes of a phosphotransferase and an RNA-methyltransferase.

PubMed

Nield, Blair S; Willows, Robert D; Torda, Andrew E; Gillings, Michael R; Holmes, Andrew J; Nevalainen, K M Helena; Stokes, H W; Mabbutt, Bridget C

2004-06-01

By targeting gene cassettes by polymerase chain reaction (PCR) directly from environmentally derived DNA, we are able to amplify entire open reading frames (ORFs) independently of prior sequence knowledge. Approximately 10% of the mobile genes recovered by these means can be attributed to known protein families. Here we describe the characterization of two ORFs which show moderate homology to known proteins: (1) an aminoglycoside phosphotransferase displaying 25% sequence identity with APH(7") from Streptomyces hygroscopicus, and (2) an RNA methyltransferase sharing 25%-28% identity with a group of recently defined bacterial RNA methyltransferases distinct from the SpoU enzyme family. Our novel genes were expressed as recombinant products and assayed for appropriate enzyme activity. The aminoglycoside phosphotransferase displayed ATPase activity, consistent with the presence of characteristic Mg(2+)-binding residues. Unlike related APH(4) or APH(7") enzymes, however, this activity was not enhanced by hygromycin B or kanamycin, suggesting the normal substrate to be a different aminoglycoside. The RNA methyltransferase contains sequence motifs of the RNA methyltransferase superfamily, and our recombinant version showed methyltransferase activity with RNA. Our data confirm that gene cassettes present in the environment encode folded enzymes with novel sequence variation and demonstrable catalytic activity. Our PCR approach (cassette PCR) may be used to identify a diverse range of ORFs from any environmental sample, as well as to directly access the gene pool found in mobile gene cassettes commonly associated with integrons. This gene pool can be accessed from both cultured and uncultured microbial samples as a source of new enzymes and proteins.
Selection and Validation of Reference Genes for Quantitative Real-Time PCR in Buckwheat (Fagopyrum esculentum) Based on Transcriptome Sequence Data

PubMed Central

Demidenko, Natalia V.; Logacheva, Maria D.; Penin, Aleksey A.

2011-01-01

Quantitative reverse transcription PCR (qRT-PCR) is one of the most precise and widely used methods of gene expression analysis. A necessary prerequisite of exact and reliable data is the accurate choice of reference genes. We studied the expression stability of potential reference genes in common buckwheat (Fagopyrum esculentum) in order to find the optimal reference for gene expression analysis in this economically important crop. Recently sequenced buckwheat floral transcriptome was used as source of sequence information. Expression stability of eight candidate reference genes was assessed in different plant structures (leaves and inflorescences at two stages of development and fruits). These genes are the orthologs of Arabidopsis genes identified as stable in a genome-wide survey gene of expression stability and a traditionally used housekeeping gene GAPDH. Three software applications – geNorm, NormFinder and BestKeeper - were used to estimate expression stability and provided congruent results. The orthologs of AT4G33380 (expressed protein of unknown function, Expressed1), AT2G28390 (SAND family protein, SAND) and AT5G46630 (clathrin adapter complex subunit family protein, CACS) are revealed as the most stable. We recommend using the combination of Expressed1, SAND and CACS for the normalization of gene expression data in studies on buckwheat using qRT-PCR. These genes are listed among five the most stably expressed in Arabidopsis that emphasizes utility of the studies on model plants as a framework for other species. PMID:21589908
An aureobasidin A resistance gene isolated from Aspergillus is a homolog of yeast AUR1, a gene responsible for inositol phosphorylceramide (IPC) synthase activity.

PubMed

Kuroda, M; Hashida-Okado, T; Yasumoto, R; Gomi, K; Kato, I; Takesako, K

1999-03-01

The AUR1 gene of Saccharomyces cerevisiae, mutations in which confer resistance to the antibiotic aureobasidin A, is necessary for inositol phosphorylceramide (IPC) synthase activity. We report the molecular cloning and characterization of the Aspergillus nidulans aurA gene, which is homologous to AUR1. A single point mutation in the aurA gene of A. nidulans confers a high level of resistance to aureobasidin A. The A. nidulans aurA gene was used to identify its homologs in other Aspergillus species, including A. fumigatus, A. niger, and A. oryzae. The deduced amino acid sequence of an aurA homolog from the pathogenic fungus A. fumigatus showed 87% identity to that of A. nidulans. The AurA proteins of A. nidulans and A. fumigatus shared common characteristics in primary structure, including sequence, hydropathy profile, and N-glycosylation sites, with their S. cerevisiae, Schizosaccharomyces pombe, and Candida albicans counterparts. These results suggest that the aureobasidin resistance gene is conserved evolutionarily in various fungi.
Statistical properties of DNA sequences

NASA Technical Reports Server (NTRS)

Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.

1995-01-01

We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range--indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33301 coding and 29453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
[Detection of gene mutation in glucose-6-phosphate dehydrogenase deficiency by RT-PCR sequencing].

PubMed

Lyu, Rong-Yu; Chen, Xiao-Wen; Zhang, Min; Chen, Yun-Sheng; Yu, Jie; Wen, Fei-Qiu

2016-07-01

Since glucose-6-phosphate dehydrogenase (G6PD) deficiency is the most common hereditary hemolytic erythrocyte enzyme deficiency, most cases have single nucleotide mutations in the coding region, and current test methods for gene mutation have some missed detections, this study aimed to investigate the feasibility of RT-PCR sequencing in the detection of gene mutation in G6PD deficiency. According to the G6PD/6GPD ratio, 195 children with anemia of unknown cause or who underwent physical examination between August 2013 and July 2014 were classified into G6PD-deficiency group with 130 children (G6PD/6GPD ratio <1.00) and control group with 65 children (G6PD/6GPD ratio≥1.00). The primer design and PCR amplification conditions were optimized, and RT-PCR sequencing was used to analyze the complete coding sequence and verify the genomic DNA sequence in the two groups. In the G6PD-deficiency group, the detection rate of gene mutation was 100% and 13 missense mutations were detected, including one new mutation. In the control group, no missense mutation was detected in 28 boys; 13 heterozygous missense mutations, 1 homozygous same-sense mutation (C1191T) which had not been reported in China and abroad, and 14 single nucleotide polymorphisms of C1311T were detected in 37 girls. The control group showed a high rate of missed detection of G6PD deficiency (carriers) in the specimens from girls (35%, 13/37). RT-PCR sequencing has a high detection rate of G6PD gene mutation and a certain value in clinical diagnosis of G6PD deficiency.
Deinococcus geothermalis: The Pool of Extreme Radiation Resistance Genes Shrinks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Makarova, Kira S.; Omelchenko, Marina V.; Gaidamakova, Elena K.

Bacteria of the genus Deinococcus are extremely resistant to ionizing radiation (IR), ultraviolet light (UV) and desiccation. The mesophile Deinococcus radiodurans was the first member of this group whose genome was completely sequenced. Analysis of the genome sequence of D. radiodurans, however, failed to identify unique DNA repair systems. To further delineate the genes underlying the resistance phenotypes, we report the whole-genome sequence of a second Deinococcus species, the thermophile Deinococcus geothermalis, which at itsoptimal growth temperature is as resistant to IR, UV and desiccation as D. radiodurans, and a comparative analysis of the two Deinococcus genomes. Many D. radioduransmore » genes previously implicated in resistance, but for which no sensitive phenotype was observed upon disruption, are absent in D. geothermalis. In contrast, most D. radiodurans genes whose mutants displayed a radiation-sensitive phenotype in D. radiodurans are conserved in D. geothermalis. Supporting the existence of a Deinococcus radiation response regulon, a common palindromic DNA motif was identified in a conserved set of genes associated with resistance, and a dedicated transcriptional regulator was predicted. We present the case that these two species evolved essentially the same diverse set of gene families, and that the extreme stress-resistance phenotypes of the Deinococcus lineage emerged progressively by amassing cell-cleaning systems from different sources, but not by acquisition of novel DNA repair systems. Our reconstruction of the genomic evolution of the Deinococcus-Thermus phylum indicates that the corresponding set of enzymes proliferated mainly in the common ancestor of Deinococcus. Results of the comparative analysis weaken the arguments for a role of higher-order chromosome alignment structures in resistance; more clearly define and substantially revise downward the number of uncharacterized genes that might participate in DNA repair and contribute to resistance; and strengthen the case for a role in survival of systems involved in manganese and iron homeostasis.« less
SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes.

PubMed

Pruesse, Elmar; Peplies, Jörg; Glöckner, Frank Oliver

2012-07-15

In the analysis of homologous sequences, computation of multiple sequence alignments (MSAs) has become a bottleneck. This is especially troublesome for marker genes like the ribosomal RNA (rRNA) where already millions of sequences are publicly available and individual studies can easily produce hundreds of thousands of new sequences. Methods have been developed to cope with such numbers, but further improvements are needed to meet accuracy requirements. In this study, we present the SILVA Incremental Aligner (SINA) used to align the rRNA gene databases provided by the SILVA ribosomal RNA project. SINA uses a combination of k-mer searching and partial order alignment (POA) to maintain very high alignment accuracy while satisfying high throughput performance demands. SINA was evaluated in comparison with the commonly used high throughput MSA programs PyNAST and mothur. The three BRAliBase III benchmark MSAs could be reproduced with 99.3, 97.6 and 96.1 accuracy. A larger benchmark MSA comprising 38 772 sequences could be reproduced with 98.9 and 99.3% accuracy using reference MSAs comprising 1000 and 5000 sequences. SINA was able to achieve higher accuracy than PyNAST and mothur in all performed benchmarks. Alignment of up to 500 sequences using the latest SILVA SSU/LSU Ref datasets as reference MSA is offered at http://www.arb-silva.de/aligner. This page also links to Linux binaries, user manual and tutorial. SINA is made available under a personal use license.
Origin and spread of photosynthesis based upon conserved sequence features in key bacteriochlorophyll biosynthesis proteins.

PubMed

Gupta, Radhey S

2012-11-01

The origin of photosynthesis and how this capability has spread to other bacterial phyla remain important unresolved questions. I describe here a number of conserved signature indels (CSIs) in key proteins involved in bacteriochlorophyll (Bchl) biosynthesis that provide important insights in these regards. The proteins BchL and BchX, which are essential for Bchl biosynthesis, are derived by gene duplication in a common ancestor of all phototrophs. More ancient gene duplication gave rise to the BchX-BchL proteins and the NifH protein of the nitrogenase complex. The sequence alignment of NifH-BchX-BchL proteins contain two CSIs that are uniquely shared by all NifH and BchX homologs, but not by any BchL homologs. These CSIs and phylogenetic analysis of NifH-BchX-BchL protein sequences strongly suggest that the BchX homologs are ancestral to BchL and that the Bchl-based anoxygenic photosynthesis originated prior to the chlorophyll (Chl)-based photosynthesis in cyanobacteria. Another CSI in the BchX-BchL sequence alignment that is uniquely shared by all BchX homologs and the BchL sequences from Heliobacteriaceae, but absent in all other BchL homologs, suggests that the BchL homologs from Heliobacteriaceae are primitive in comparison to all other photosynthetic lineages. Several other identified CSIs in the BchN homologs are commonly shared by all proteobacterial homologs and a clade consisting of the marine unicellular Cyanobacteria (Clade C). These CSIs in conjunction with the results of phylogenetic analyses and pair-wise sequence similarity on the BchL, BchN, and BchB proteins, where the homologs from Clade C Cyanobacteria and Proteobacteria exhibited close relationship, provide strong evidence that these two groups have incurred lateral gene transfers. Additionally, phylogenetic analyses and several CSIs in the BchL-N-B proteins that are uniquely shared by all Chlorobi and Chloroflexi homologs provide evidence that the genes for these proteins have also been laterally transferred between these groups. Other results and observations reported here indicate that the genes for the BchL-N-B proteins in Proteobacteria are derived from the Clade C Cyanobacteria, whereas those in Chlorobi were acquired from Chloroflexus or related bacteria by means of LGTs. Some implications of these observations regarding the origin and spread of photosynthesis are discussed.
Koi herpesvirus epizootic in cultured carp and koi, Cyprinus carpio L., in Taiwan.

PubMed

Cheng, L; Chen, C-Y; Tsai, M-A; Wang, P-C; Hsu, J-P; Chern, R-S; Chen, S-C

2011-07-01

Koi herpesvirus (KHV) poses a significant threat to cultured koi and common carp, both Cyprinus carpio L. Since the first reported case in Israel in 1998, KHV has rapidly spread worldwide. This study investigates the spread of KHV to Taiwan by collecting 49 cases of suspected common carp and koi infections from 2003 to 2005 for analysis. Clinical signs included lethargy, anorexia, increased respiratory movements and uncoordinated swimming. Hyperaemia, haemorrhage on body surface and necrotic gill filaments were recorded. Gill epithelial hyperplasia, necrosis and eosinophilic intranuclear inclusion bodies were observed by histological examination, while virions were detected using transmission electron microscopy. By detecting the presence of the KHV thymidine kinase (TK) gene and the KHV 9/5 gene using polymerase chain reaction (PCR), 37 cases were identified as KHV-positive, and the cumulative mortality of infected fish was 70-100%. Positive cases showed identical sequences for the genes analysed, implying that they were of the same origin. For the KHV 9/5 gene sequence, these cases exhibited 100% identity with the Japanese strain (TUMST1, accession number AP008984) and 99% identity with the Israeli (KHV-I, DQ177346) and US (KHV-U, DQ657948) strains. Additionally, a loop-mediated isothermal amplification (LAMP) assay was performed and found to be more sensitive than PCR tests, suggesting its potential use as a rapid diagnostic method for KHV. This is the first epidemiological study of KHV infection in cultured common carp and koi in Taiwan. © 2011 Blackwell Publishing Ltd.
Transcriptome Profiling of a Multiple Recurrent Muscle-Invasive Urothelial Carcinoma of the Bladder by Deep Sequencing

PubMed Central

Zhang, Shufang; Liu, Yanxuan; Liu, Zhenxiang; Zhang, Chong; Cao, Hui; Ye, Yongqing; Wang, Shunlan; Zhang, Ying'ai; Xiao, Sifang; Yang, Peng; Li, Jindong; Bai, Zhiming

2014-01-01

Urothelial carcinoma of the bladder (UCB) is one of the commonly diagnosed cancers in the world. The UCB has the highest rate of recurrence of any malignancy. A genome-wide screening of transcriptome dysregulation between cancer and normal tissue would provide insight into the molecular basis of UCB recurrence and is a key step to discovering biomarkers for diagnosis and therapeutic targets. Compared with microarray technology, which is commonly used to identify expression level changes, the recently developed RNA-seq technique has the ability to detect other abnormal regulations in the cancer transcriptome, such as alternative splicing. In this study, we performed high-throughput transcriptome sequencing at ∼50× coverage on a recurrent muscle-invasive cisplatin-resistance UCB tissue and the adjacent non-tumor tissue. The results revealed cancer-specific differentially expressed genes between the tumor and non-tumor tissue enriched in the cell adhesion molecules, focal adhesion and ECM-receptor interaction pathway. Five dysregulated genes, including CDH1, VEGFA, PTPRF, CLDN7, and MMP2 were confirmed by Real time qPCR in the sequencing samples and the additional eleven samples. Our data revealed that more than three hundred genes showed differential splicing patterns between tumor tissue and non-tumor tissue. Among these genes, we filtered 24 cancer-associated alternative splicing genes with differential exon usage. The findings from RNA-Seq were validated by Real time qPCR for CD44, PDGFA, NUMB, and LPHN2. This study provides a comprehensive survey of the UCB transcriptome, which provides better insight into the complexity of regulatory changes during recurrence and metastasis. PMID:24622401
Targeted resequencing of candidate genes reveals novel variants associated with severe Behçet's uveitis.

PubMed

Kim, Sang Jin; Lee, Seungbok; Park, Changho; Seo, Jeong-Sun; Kim, Jong-Il; Yu, Hyeong Gon

2013-10-18

Behçet's disease (BD) is a chronic systemic inflammatory disorder characterized by four major manifestations: recurrent uveitis, oral and genital ulcers and skin lesions. To identify some pathogenic variants associated with severe Behçet's uveitis, we used targeted and massively parallel sequencing methods to explore the genetic diversity of target regions. A solution-based target enrichment kit was designed to capture whole-exonic regions of 132 candidate genes. Using a multiplexing strategy, 32 samples from patients with a severe type of Behçet's uveitis were sequenced with a Genome Analyzer IIx. We compared the frequency of each variant with that of 59 normal Korean controls, and selected five rare and eight common single-nucleotide variants as the candidates for a replication study. The selected variants were genotyped in 61 cases and 320 controls and, as a result, two rare and seven common variants showed significant associations with severe Behçet's uveitis (P<0.05). Some of these, including rs199955684 in KIR3DL3, rs1801133 in MTHFR, rs1051790 in MICA and rs1051456 in KIR2DL4, were predicted to be damaging by either the PolyPhen-2 or SIFT prediction program. Variants on FCGR3A (rs396991) and ICAM1 (rs5498) have been previously reported as susceptibility loci of this disease, and those on IFNAR1, MTFHR and MICA also replicated the previous reports at the gene level. The KIR3DL3 and KIR2DL4 genes are novel susceptibility genes that have not been reported in association with BD. In conclusion, this study showed that target enrichment and next-generation sequencing technologies can provide valuable information on the genetic predisposition for Behçet's uveitis.
Genome-wide characterization of Toll-like receptor gene family in common carp (Cyprinus carpio) and their involvement in host immune response to Aeromonas hydrophila infection.

PubMed

Gong, Yiwen; Feng, Shuaisheng; Li, Shangqi; Zhang, Yan; Zhao, Zixia; Hu, Mou; Xu, Peng; Jiang, Yanliang

2017-12-01

The Toll-like receptor (TLR) gene family is a class of conserved pattern recognition receptors, which play an essential role in innate immunity providing efficient defense against invading microbial pathogens. Although TLRs have been extensively characterized in both invertebrates and vertebrates, a comprehensive analysis of TLRs in common carp is lacking. In the present study, we have conducted the first genome-wide systematic analysis of common carp (Cyprinus carpio) TLR genes. A set of 27 common carp TLR genes were identified and characterized. Sequence similarity analysis, functional domain prediction and phylogenetic analysis supported their annotation and orthologies. By examining the gene copy number of TLR genes across several vertebrates, gene duplications and losses were observed. The expression patterns of TLR genes were examined during early developmental stages and in various healthy tissues, and the results showed that TLR genes were ubiquitously expressed, indicating a likely role in maintaining homeostasis. Moreover, the differential expression of TLRs was examined after Aeromons hydrophila infection, and showed that most TLR genes were induced, with diverse patterns. TLR1, TLR4-2, TLR4-3, TLR22-2, TLR22-3 were significantly up-regulated at minimum one timepoint, whereas TLR2-1, TLR4-1, TLR7-1 and TLR7-2 were significantly down-regulated. Our results suggested that TLR genes play critical roles in the common carp immune response. Collectively, our findings provide fundamental genomic resources for future studies on fish disease management and disease-resistance selective breeding strategy development. Copyright © 2017 Elsevier Inc. All rights reserved.
Evolutionary history of the alpha2,8-sialyltransferase (ST8Sia) gene family: Tandem duplications in early deuterostomes explain most of the diversity found in the vertebrate ST8Sia genes

PubMed Central

2008-01-01

Background The animal sialyltransferases, which catalyze the transfer of sialic acid to the glycan moiety of glycoconjugates, are subdivided into four families: ST3Gal, ST6Gal, ST6GalNAc and ST8Sia, based on acceptor sugar specificity and glycosidic linkage formed. Despite low overall sequence identity between each sialyltransferase family, all sialyltransferases share four conserved peptide motifs (L, S, III and VS) that serve as hallmarks for the identification of the sialyltransferases. Currently, twenty subfamilies have been described in mammals and birds. Examples of the four sialyltransferase families have also been found in invertebrates. Focusing on the ST8Sia family, we investigated the origin of the three groups of α2,8-sialyltransferases demonstrated in vertebrates to carry out poly-, oligo- and mono-α2,8-sialylation. Results We identified in the genome of invertebrate deuterostomes, orthologs to the common ancestor for each of the three vertebrate ST8Sia groups and a set of novel genes named ST8Sia EX, not found in vertebrates. All these ST8Sia sequences share a new conserved family-motif, named "C-term" that is involved in protein folding, via an intramolecular disulfide bridge. Interestingly, sequences from Branchiostoma floridae orthologous to the common ancestor of polysialyltransferases possess a polysialyltransferase domain (PSTD) and those orthologous to the common ancestor of oligosialyltransferases possess a new ST8Sia III-specific motif similar to the PSTD. In osteichthyans, we have identified two new subfamilies. In addition, we describe the expression profile of ST8Sia genes in Danio rerio. Conclusion Polysialylation appeared early in the deuterostome lineage. The recent release of several deuterostome genome databases and paralogons combined with synteny analysis allowed us to obtain insight into events at the gene level that led to the diversification of the ST8Sia genes, with their corresponding enzymatic activities, in both invertebrates and vertebrates. The initial expansion and subsequent divergence of the ST8Sia genes resulted as a consequence of a series of ancient duplications and translocations in the invertebrate genome long before the emergence of vertebrates. A second subset of ST8sia genes in the vertebrate genome arose from whole genome duplication (WGD) R1 and R2. Subsequent selective ST8Sia gene loss is responsible for the characteristic ST8Sia gene expression pattern observed today in individual species. PMID:18811928
Evolutionary history of the alpha2,8-sialyltransferase (ST8Sia) gene family: tandem duplications in early deuterostomes explain most of the diversity found in the vertebrate ST8Sia genes.

PubMed

Harduin-Lepers, Anne; Petit, Daniel; Mollicone, Rosella; Delannoy, Philippe; Petit, Jean-Michel; Oriol, Rafael

2008-09-23

The animal sialyltransferases, which catalyze the transfer of sialic acid to the glycan moiety of glycoconjugates, are subdivided into four families: ST3Gal, ST6Gal, ST6GalNAc and ST8Sia, based on acceptor sugar specificity and glycosidic linkage formed. Despite low overall sequence identity between each sialyltransferase family, all sialyltransferases share four conserved peptide motifs (L, S, III and VS) that serve as hallmarks for the identification of the sialyltransferases. Currently, twenty subfamilies have been described in mammals and birds. Examples of the four sialyltransferase families have also been found in invertebrates. Focusing on the ST8Sia family, we investigated the origin of the three groups of alpha2,8-sialyltransferases demonstrated in vertebrates to carry out poly-, oligo- and mono-alpha2,8-sialylation. We identified in the genome of invertebrate deuterostomes, orthologs to the common ancestor for each of the three vertebrate ST8Sia groups and a set of novel genes named ST8Sia EX, not found in vertebrates. All these ST8Sia sequences share a new conserved family-motif, named "C-term" that is involved in protein folding, via an intramolecular disulfide bridge. Interestingly, sequences from Branchiostoma floridae orthologous to the common ancestor of polysialyltransferases possess a polysialyltransferase domain (PSTD) and those orthologous to the common ancestor of oligosialyltransferases possess a new ST8Sia III-specific motif similar to the PSTD. In osteichthyans, we have identified two new subfamilies. In addition, we describe the expression profile of ST8Sia genes in Danio rerio. Polysialylation appeared early in the deuterostome lineage. The recent release of several deuterostome genome databases and paralogons combined with synteny analysis allowed us to obtain insight into events at the gene level that led to the diversification of the ST8Sia genes, with their corresponding enzymatic activities, in both invertebrates and vertebrates. The initial expansion and subsequent divergence of the ST8Sia genes resulted as a consequence of a series of ancient duplications and translocations in the invertebrate genome long before the emergence of vertebrates. A second subset of ST8sia genes in the vertebrate genome arose from whole genome duplication (WGD) R1 and R2. Subsequent selective ST8Sia gene loss is responsible for the characteristic ST8Sia gene expression pattern observed today in individual species.
The floral transcriptomes of four bamboo species (Bambusoideae; Poaceae): support for common ancestry among woody bamboos.

PubMed

Wysocki, William P; Ruiz-Sanchez, Eduardo; Yin, Yanbin; Duvall, Melvin R

2016-05-20

Next-generation sequencing now allows for total RNA extracts to be sequenced in non-model organisms such as bamboos, an economically and ecologically important group of grasses. Bamboos are divided into three lineages, two of which are woody perennials with bisexual flowers, which undergo gregarious monocarpy. The third lineage, which are herbaceous perennials, possesses unisexual flowers that undergo annual flowering events. Transcriptomes were assembled using both reference-based and de novo methods. These two methods were tested by characterizing transcriptome content using sequence alignment to previously characterized reference proteomes and by identifying Pfam domains. Because of the striking differences in floral morphology and phenology between the herbaceous and woody bamboo lineages, MADS-box genes, transcription factors that control floral development and timing, were characterized and analyzed in this study. Transcripts were identified using phylogenetic methods and categorized as A, B, C, D or E-class genes, which control floral development, or SOC or SVP-like genes, which control the timing of flowering events. Putative nuclear orthologues were also identified in bamboos to use as phylogenetic markers. Instances of gene copies exhibiting topological patterns that correspond to shared phenotypes were observed in several gene families including floral development and timing genes. Alignments and phylogenetic trees were generated for 3,878 genes and for all genes in a concatenated analysis. Both the concatenated analysis and those of 2,412 separate gene trees supported monophyly among the woody bamboos, which is incongruent with previous phylogenetic studies using plastid markers.
Massive losses of taste receptor genes in toothed and baleen whales.

PubMed

Feng, Ping; Zheng, Jinsong; Rossiter, Stephen J; Wang, Ding; Zhao, Huabin

2014-05-06

Taste receptor genes are functionally important in animals, with a surprising exception in the bottlenose dolphin, which shows extensive losses of sweet, umami, and bitter taste receptor genes. To examine the generality of taste gene loss, we examined seven toothed whales and five baleen whales and sequenced the complete repertoire of three sweet/umami (T1Rs) and ten bitter (T2Rs) taste receptor genes. We found all amplified T1Rs and T2Rs to be pseudogenes in all 12 whales, with a shared premature stop codon in 10 of the 13 genes, which demonstrated massive losses of taste receptor genes in the common ancestor of whales. Furthermore, we analyzed three genome sequences from two toothed whales and one baleen whale and found that the sour taste marker gene Pkd2l1 is a pseudogene, whereas the candidate salty taste receptor genes are intact and putatively functional. Additionally, we examined three genes that are responsible for taste signal transduction and found the relaxation of functional constraints on taste signaling pathways along the ancestral branch leading to whales. Together, our results strongly suggest extensive losses of sweet, umami, bitter, and sour tastes in whales, and the relaxation of taste function most likely arose in the common ancestor of whales between 36 and 53 Ma. Therefore, whales represent the first animal group to lack four of five primary tastes, probably driven by the marine environment with high concentration of sodium, the feeding behavior of swallowing prey whole, and the dietary switch from plants to meat in the whale ancestor. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.